View alloc fixes for P3 #2980

Merged
merged 3 commits into from
Sep 10, 2024

tcclevenger
Contributor

@tcclevenger tcclevenger commented Sep 3, 2024

Both small-kernel (SK) and monolithic P3 allocate views during the run phase. This PR removes those allocations.

Main changes:

  • Add a Temporaries struct for P3 temporaries when using SK. These views are then added to the Buffer.
  • Remove the get_latent_heat() function and the global views for latent heat values. Use physics constants instead where appropriate.
  • Instead of declaring a global view of bools (size (ncol, 2)), use Kokkos' scratch space to allocate 2 bools per team.
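
The last item can be pictured with a small, dependency-free C++ sketch. It is only an analogy and not the code from this PR: `process_column`, `TeamScratch`, and both `run_*` functions are hypothetical names, and a plain loop stands in for a Kokkos team policy. In Kokkos itself the corresponding pieces would presumably be a `TeamPolicy` configured with `set_scratch_size` and an unmanaged view placed in `team.team_scratch(level)`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

using Columns = std::vector<std::vector<double>>;

// Toy per-column kernel (hypothetical, not P3 code). It records two flags for
// the column and returns a value derived from them.
//   flags[0]: some level has qc > 0
//   flags[1]: some level has qr > 0
int process_column(const std::vector<double>& qc,
                   const std::vector<double>& qr,
                   bool* flags) {
  flags[0] = false;
  flags[1] = false;
  for (std::size_t k = 0; k < qc.size(); ++k) {
    if (qc[k] > 0.0) flags[0] = true;
    if (qr[k] > 0.0) flags[1] = true;
  }
  return (flags[0] ? 1 : 0) + (flags[1] ? 2 : 0);
}

// Before: every run call allocates a flag array of size (ncol, 2).
void run_with_global_flags(const Columns& qc, const Columns& qr,
                           std::vector<int>& out) {
  const std::size_t ncol = qc.size();
  std::unique_ptr<bool[]> flags(new bool[ncol * 2]);  // run-phase allocation
  for (std::size_t i = 0; i < ncol; ++i) {
    out[i] = process_column(qc[i], qr[i], &flags[2 * i]);
  }
}

// After: each team owns 2 bools of scratch. The scratch is sized once, outside
// the run call, and reused for every column that team processes.
struct TeamScratch {
  bool flags[2];
};

void run_with_team_scratch(const Columns& qc, const Columns& qr,
                           std::vector<TeamScratch>& scratch,  // one per team
                           std::vector<int>& out) {
  const std::size_t ncol = qc.size();
  const std::size_t nteams = scratch.size();
  for (std::size_t i = 0; i < ncol; ++i) {
    TeamScratch& s = scratch[i % nteams];  // team that handles column i
    out[i] = process_column(qc[i], qr[i], s.flags);
  }
}
```

In this model the flag storage scales with the number of teams rather than with ncol, and the run call itself performs no allocation. Both variants produce the same per-column results because the flags are only ever needed while a column is being processed.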

Viewing changes with "Hide whitespace" will help.

@tcclevenger tcclevenger self-assigned this Sep 3, 2024
@tcclevenger tcclevenger changed the title from "[WIP] View alloc fixes for P3 small kernels" to "[WIP] View alloc fixes for P3" Sep 3, 2024
@tcclevenger tcclevenger force-pushed the tcclevenger/view_allocs_in_p3 branch from c2c072d to 83bcc0c Compare September 3, 2024 18:21
@tcclevenger tcclevenger changed the title from "[WIP] View alloc fixes for P3" to "View alloc fixes for P3" Sep 3, 2024
@tcclevenger tcclevenger added the p3 (regarding p3 microphysics) and AT: RETEST labels Sep 3, 2024
@E3SM-Bot
Collaborator

E3SM-Bot commented Sep 3, 2024

Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing.

@E3SM-Bot
Collaborator

E3SM-Bot commented Sep 3, 2024

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING (click to expand)

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5801
  • Status: STARTED

Jenkins Parameters

| Parameter Name | Value |
|---|---|
| PR_LABELS | p3;AT: RETEST |
| PULLREQUESTNUM | 2980 |
| SCREAM_SOURCE_REPO | https://github.com/E3SM-Project/scream |
| SCREAM_SOURCE_SHA | 1a540cf |
| SCREAM_TARGET_BRANCH | master |
| SCREAM_TARGET_REPO | https://github.com/E3SM-Project/scream |
| SCREAM_TARGET_SHA | 68f648a |
| TEST_REPO_ALIAS | SCREAM |

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6027
  • Status: STARTED

Jenkins Parameters

| Parameter Name | Value |
|---|---|
| PR_LABELS | p3;AT: RETEST |
| PULLREQUESTNUM | 2980 |
| SCREAM_SOURCE_REPO | https://github.com/E3SM-Project/scream |
| SCREAM_SOURCE_SHA | 1a540cf |
| SCREAM_TARGET_BRANCH | master |
| SCREAM_TARGET_REPO | https://github.com/E3SM-Project/scream |
| SCREAM_TARGET_SHA | 68f648a |
| TEST_REPO_ALIAS | SCREAM |

Using Repos:

Repo: SCREAM (E3SM-Project/scream)
  • Branch: tcclevenger/view_allocs_in_p3
  • SHA: 1a540cf
  • Mode: TEST_REPO

Pull Request Author: tcclevenger

@E3SM-Bot
Collaborator

E3SM-Bot commented Sep 3, 2024

Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED

Note: Testing will normally be attempted again in approx. 2 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.

Pull Request Auto Testing has FAILED (click to expand)

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5801
  • Status: FAILED

Jenkins Parameters

| Parameter Name | Value |
|---|---|
| PR_LABELS | p3;AT: RETEST |
| PULLREQUESTNUM | 2980 |
| SCREAM_SOURCE_REPO | https://github.com/E3SM-Project/scream |
| SCREAM_SOURCE_SHA | 1a540cf |
| SCREAM_TARGET_BRANCH | master |
| SCREAM_TARGET_REPO | https://github.com/E3SM-Project/scream |
| SCREAM_TARGET_SHA | 68f648a |
| TEST_REPO_ALIAS | SCREAM |

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6027
  • Status: FAILED

Jenkins Parameters

| Parameter Name | Value |
|---|---|
| PR_LABELS | p3;AT: RETEST |
| PULLREQUESTNUM | 2980 |
| SCREAM_SOURCE_REPO | https://github.com/E3SM-Project/scream |
| SCREAM_SOURCE_SHA | 1a540cf |
| SCREAM_TARGET_BRANCH | master |
| SCREAM_TARGET_REPO | https://github.com/E3SM-Project/scream |
| SCREAM_TARGET_SHA | 68f648a |
| TEST_REPO_ALIAS | SCREAM |

SCREAM_PullRequest_Autotester_Mappy # 5801 FAILED (click to see last 100 lines of console output)

++ export PATH=/ascldap/users/jgfouca/packages/Python-3.8.5/bin:/usr/lib64/qt-3.3/bin:/usr/condabin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:/opt/dell/srvadmin/sbin:/ascldap/users/e3sm-jenkins/.local/bin:/ascldap/users/e3sm-jenkins/bin
++ PATH=/ascldap/users/jgfouca/packages/Python-3.8.5/bin:/usr/lib64/qt-3.3/bin:/usr/condabin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:/opt/dell/srvadmin/sbin:/ascldap/users/e3sm-jenkins/.local/bin:/ascldap/users/e3sm-jenkins/bin
++ module load sems-archive-git/2.10.1
+++ /projects/sems/install/rhel7-x86_64/sems/v2/lmod/lmod/8.3/gcc/10.1.0/zbzzu7k/lmod/lmod/libexec/lmod sh load sems-archive-git/2.10.1
++ eval '__LMOD_REF_COUNT_LD_LIBRARY_PATH=/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/lib:1;' export '__LMOD_REF_COUNT_LD_LIBRARY_PATH;' 'LD_LIBRARY_PATH=/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/lib;' export 'LD_LIBRARY_PATH;' '__LMOD_REF_COUNT_LOADEDMODULES=sems-archive-env:1\;sems-archive-git/2.10.1:1;' export '__LMOD_REF_COUNT_LOADEDMODULES;' 'LOADEDMODULES=sems-archive-env:sems-archive-git/2.10.1;' export 'LOADEDMODULES;' '__LMOD_REF_COUNT_MANPATH=/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/share/man:1\;/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/man:1;' export '__LMOD_REF_COUNT_MANPATH;' 'MANPATH=/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/share/man:/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/man;' export 'MANPATH;' 'MODULEPATH=/projects/sems/modulefiles/rhel7-x86_64/sems-archive/devpack:/projects/sems/modulefiles/rhel7-x86_64/sems-archive/compiler:/projects/sems/modulefiles/rhel7-x86_64/sems-archive/tpl:/projects/sems/modulefiles/rhel7-x86_64/sems-archive/utility:/projects/sems/modulefiles/projects:/projects/sems/cee-sierra-modules:/projects/sems/modulefiles/rhel7-x86_64/sems/linux-rhel7-x86_64/compilers:/projects/sems/modulefiles/rhel7-x86_64/sems/linux-rhel7-x86_64/Core:/projects/sems/modulefiles/rhel7-x86_64/sems/linux-rhel7-x86_64/project-modulefiles:/projects/aue/modules/cee/x86_64/rhel7:/usr/share/Modules/modulefiles:/etc/modulefiles;' export 'MODULEPATH;' '__LMOD_REF_COUNT_PATH=/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/sbin:1\;/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/bin:1\;/ascldap/users/jgfouca/packages/Python-3.8.5/bin:1\;/usr/lib64/qt-3.3/bin:1\;/usr/condabin:1\;/usr/local/sbin:1\;/usr/local/bin:1\;/sbin:1\;/bin:1\;/usr/sbin:1\;/usr/bin:1\;/root/bin:1\;/opt/dell/srvadmin/sbin:1\;/ascldap/users/e3sm-jenkins/.local/bin:1\;/ascldap/users/e3sm-jenkins/bin:1;' export '__LMOD_REF_COUNT_PATH;' 
'PATH=/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/sbin:/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/bin:/ascldap/users/jgfouca/packages/Python-3.8.5/bin:/usr/lib64/qt-3.3/bin:/usr/condabin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:/opt/dell/srvadmin/sbin:/ascldap/users/e3sm-jenkins/.local/bin:/ascldap/users/e3sm-jenkins/bin;' export 'PATH;' 'SEMS_GIT_LOCAL_COMPILER_VERSION=4.8.5;' export 'SEMS_GIT_LOCAL_COMPILER_VERSION;' 'SEMS_GIT_LOCAL_PYTHON_VERSION=2.7.5;' export 'SEMS_GIT_LOCAL_PYTHON_VERSION;' 'SEMS_GIT_ROOT=/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1;' export 'SEMS_GIT_ROOT;' 'SEMS_GIT_VERSION=2.10.1;' export 'SEMS_GIT_VERSION;' '__LMOD_REF_COUNT__LMFILES_=/projects/sems/modulefiles/projects/sems-archive-env:1\;/projects/sems/modulefiles/rhel7-x86_64/sems-archive/utility/sems-archive-git/2.10.1:1;' export '__LMOD_REF_COUNT__LMFILES_;' '_LMFILES_=/projects/sems/modulefiles/projects/sems-archive-env:/projects/sems/modulefiles/rhel7-x86_64/sems-archive/utility/sems-archive-git/2.10.1;' export '_LMFILES_;' '_ModuleTable001_=X01vZHVsZVRhYmxlXz17WyJNVHZlcnNpb24iXT0zLFsiY19yZWJ1aWxkVGltZSJdPWZhbHNlLFsiY19zaG9ydFRpbWUiXT1mYWxzZSxkZXB0aFQ9e30sZmFtaWx5PXt9LG1UPXtbInNlbXMtYXJjaGl2ZS1lbnYiXT17WyJmbiJdPSIvcHJvamVjdHMvc2Vtcy9tb2R1bGVmaWxlcy9wcm9qZWN0cy9zZW1zLWFyY2hpdmUtZW52IixbImZ1bGxOYW1lIl09InNlbXMtYXJjaGl2ZS1lbnYiLFsibG9hZE9yZGVyIl09MSxwcm9wVD17fSxbInN0YWNrRGVwdGgiXT0wLFsic3RhdHVzIl09ImFjdGl2ZSIsWyJ1c2VyTmFtZSJdPSJzZW1zLWFyY2hpdmUtZW52Iix9LFsic2Vtcy1hcmNoaXZlLWdpdCJdPXtbImZuIl09Ii9wcm9qZWN0cy9zZW1zL21vZHVsZWZpbGVzL3JoZWw3LXg4Nl82NC9z;' export '_ModuleTable001_;' 
'_ModuleTable002_=ZW1zLWFyY2hpdmUvdXRpbGl0eS9zZW1zLWFyY2hpdmUtZ2l0LzIuMTAuMSIsWyJmdWxsTmFtZSJdPSJzZW1zLWFyY2hpdmUtZ2l0LzIuMTAuMSIsWyJsb2FkT3JkZXIiXT0yLHByb3BUPXt9LFsic3RhY2tEZXB0aCJdPTAsWyJzdGF0dXMiXT0iYWN0aXZlIixbInVzZXJOYW1lIl09InNlbXMtYXJjaGl2ZS1naXQvMi4xMC4xIix9LH0sbXBhdGhBPXsiL3Byb2plY3RzL3NlbXMvbW9kdWxlZmlsZXMvcmhlbDcteDg2XzY0L3NlbXMtYXJjaGl2ZS9kZXZwYWNrIiwiL3Byb2plY3RzL3NlbXMvbW9kdWxlZmlsZXMvcmhlbDcteDg2XzY0L3NlbXMtYXJjaGl2ZS9jb21waWxlciIsIi9wcm9qZWN0cy9zZW1zL21vZHVsZWZpbGVzL3JoZWw3LXg4Nl82NC9zZW1zLWFyY2hpdmUvdHBsIiwi;' export '_ModuleTable002_;' '_ModuleTable003_=L3Byb2plY3RzL3NlbXMvbW9kdWxlZmlsZXMvcmhlbDcteDg2XzY0L3NlbXMtYXJjaGl2ZS91dGlsaXR5IiwiL3Byb2plY3RzL3NlbXMvbW9kdWxlZmlsZXMvcHJvamVjdHMiLCIvcHJvamVjdHMvc2Vtcy9jZWUtc2llcnJhLW1vZHVsZXMiLCIvcHJvamVjdHMvc2Vtcy9tb2R1bGVmaWxlcy9yaGVsNy14ODZfNjQvc2Vtcy9saW51eC1yaGVsNy14ODZfNjQvY29tcGlsZXJzIiwiL3Byb2plY3RzL3NlbXMvbW9kdWxlZmlsZXMvcmhlbDcteDg2XzY0L3NlbXMvbGludXgtcmhlbDcteDg2XzY0L0NvcmUiLCIvcHJvamVjdHMvc2Vtcy9tb2R1bGVmaWxlcy9yaGVsNy14ODZfNjQvc2Vtcy9saW51eC1yaGVsNy14ODZfNjQvcHJvamVjdC1tb2R1bGVmaWxlcyIsIi9wcm9qZWN0cy9hdWUv;' export '_ModuleTable003_;' '_ModuleTable004_=bW9kdWxlcy9jZWUveDg2XzY0L3JoZWw3IiwiL3Vzci9zaGFyZS9Nb2R1bGVzL21vZHVsZWZpbGVzIiwiL2V0Yy9tb2R1bGVmaWxlcyIsfSxbInN5c3RlbUJhc2VNUEFUSCJdPSIvdXNyL3NoYXJlL01vZHVsZXMvbW9kdWxlZmlsZXM6L2V0Yy9tb2R1bGVmaWxlcyIsfQ==;' export '_ModuleTable004_;' '_ModuleTable_Sz_=4;' export '_ModuleTable_Sz_;'
+++ __LMOD_REF_COUNT_LD_LIBRARY_PATH=/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/lib:1
+++ export __LMOD_REF_COUNT_LD_LIBRARY_PATH
+++ LD_LIBRARY_PATH=/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/lib
+++ export LD_LIBRARY_PATH
+++ __LMOD_REF_COUNT_LOADEDMODULES='sems-archive-env:1;sems-archive-git/2.10.1:1'
+++ export __LMOD_REF_COUNT_LOADEDMODULES
+++ LOADEDMODULES=sems-archive-env:sems-archive-git/2.10.1
+++ export LOADEDMODULES
+++ __LMOD_REF_COUNT_MANPATH='/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/share/man:1;/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/man:1'
+++ export __LMOD_REF_COUNT_MANPATH
+++ MANPATH=/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/share/man:/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/man
+++ export MANPATH
+++ MODULEPATH=/projects/sems/modulefiles/rhel7-x86_64/sems-archive/devpack:/projects/sems/modulefiles/rhel7-x86_64/sems-archive/compiler:/projects/sems/modulefiles/rhel7-x86_64/sems-archive/tpl:/projects/sems/modulefiles/rhel7-x86_64/sems-archive/utility:/projects/sems/modulefiles/projects:/projects/sems/cee-sierra-modules:/projects/sems/modulefiles/rhel7-x86_64/sems/linux-rhel7-x86_64/compilers:/projects/sems/modulefiles/rhel7-x86_64/sems/linux-rhel7-x86_64/Core:/projects/sems/modulefiles/rhel7-x86_64/sems/linux-rhel7-x86_64/project-modulefiles:/projects/aue/modules/cee/x86_64/rhel7:/usr/share/Modules/modulefiles:/etc/modulefiles
+++ export MODULEPATH
+++ __LMOD_REF_COUNT_PATH='/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/sbin:1;/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/bin:1;/ascldap/users/jgfouca/packages/Python-3.8.5/bin:1;/usr/lib64/qt-3.3/bin:1;/usr/condabin:1;/usr/local/sbin:1;/usr/local/bin:1;/sbin:1;/bin:1;/usr/sbin:1;/usr/bin:1;/root/bin:1;/opt/dell/srvadmin/sbin:1;/ascldap/users/e3sm-jenkins/.local/bin:1;/ascldap/users/e3sm-jenkins/bin:1'
+++ export __LMOD_REF_COUNT_PATH
+++ PATH=/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/sbin:/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1/bin:/ascldap/users/jgfouca/packages/Python-3.8.5/bin:/usr/lib64/qt-3.3/bin:/usr/condabin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:/opt/dell/srvadmin/sbin:/ascldap/users/e3sm-jenkins/.local/bin:/ascldap/users/e3sm-jenkins/bin
+++ export PATH
+++ SEMS_GIT_LOCAL_COMPILER_VERSION=4.8.5
+++ export SEMS_GIT_LOCAL_COMPILER_VERSION
+++ SEMS_GIT_LOCAL_PYTHON_VERSION=2.7.5
+++ export SEMS_GIT_LOCAL_PYTHON_VERSION
+++ SEMS_GIT_ROOT=/projects/sems/install/rhel7-x86_64/sems/utility/git/2.10.1
+++ export SEMS_GIT_ROOT
+++ SEMS_GIT_VERSION=2.10.1
+++ export SEMS_GIT_VERSION
+++ __LMOD_REF_COUNT__LMFILES_='/projects/sems/modulefiles/projects/sems-archive-env:1;/projects/sems/modulefiles/rhel7-x86_64/sems-archive/utility/sems-archive-git/2.10.1:1'
+++ export __LMOD_REF_COUNT__LMFILES_
+++ _LMFILES_=/projects/sems/modulefiles/projects/sems-archive-env:/projects/sems/modulefiles/rhel7-x86_64/sems-archive/utility/sems-archive-git/2.10.1
+++ export _LMFILES_
+++ _ModuleTable001_=X01vZHVsZVRhYmxlXz17WyJNVHZlcnNpb24iXT0zLFsiY19yZWJ1aWxkVGltZSJdPWZhbHNlLFsiY19zaG9ydFRpbWUiXT1mYWxzZSxkZXB0aFQ9e30sZmFtaWx5PXt9LG1UPXtbInNlbXMtYXJjaGl2ZS1lbnYiXT17WyJmbiJdPSIvcHJvamVjdHMvc2Vtcy9tb2R1bGVmaWxlcy9wcm9qZWN0cy9zZW1zLWFyY2hpdmUtZW52IixbImZ1bGxOYW1lIl09InNlbXMtYXJjaGl2ZS1lbnYiLFsibG9hZE9yZGVyIl09MSxwcm9wVD17fSxbInN0YWNrRGVwdGgiXT0wLFsic3RhdHVzIl09ImFjdGl2ZSIsWyJ1c2VyTmFtZSJdPSJzZW1zLWFyY2hpdmUtZW52Iix9LFsic2Vtcy1hcmNoaXZlLWdpdCJdPXtbImZuIl09Ii9wcm9qZWN0cy9zZW1zL21vZHVsZWZpbGVzL3JoZWw3LXg4Nl82NC9z
+++ export _ModuleTable001_
+++ _ModuleTable002_=ZW1zLWFyY2hpdmUvdXRpbGl0eS9zZW1zLWFyY2hpdmUtZ2l0LzIuMTAuMSIsWyJmdWxsTmFtZSJdPSJzZW1zLWFyY2hpdmUtZ2l0LzIuMTAuMSIsWyJsb2FkT3JkZXIiXT0yLHByb3BUPXt9LFsic3RhY2tEZXB0aCJdPTAsWyJzdGF0dXMiXT0iYWN0aXZlIixbInVzZXJOYW1lIl09InNlbXMtYXJjaGl2ZS1naXQvMi4xMC4xIix9LH0sbXBhdGhBPXsiL3Byb2plY3RzL3NlbXMvbW9kdWxlZmlsZXMvcmhlbDcteDg2XzY0L3NlbXMtYXJjaGl2ZS9kZXZwYWNrIiwiL3Byb2plY3RzL3NlbXMvbW9kdWxlZmlsZXMvcmhlbDcteDg2XzY0L3NlbXMtYXJjaGl2ZS9jb21waWxlciIsIi9wcm9qZWN0cy9zZW1zL21vZHVsZWZpbGVzL3JoZWw3LXg4Nl82NC9zZW1zLWFyY2hpdmUvdHBsIiwi
+++ export _ModuleTable002_
+++ _ModuleTable003_=L3Byb2plY3RzL3NlbXMvbW9kdWxlZmlsZXMvcmhlbDcteDg2XzY0L3NlbXMtYXJjaGl2ZS91dGlsaXR5IiwiL3Byb2plY3RzL3NlbXMvbW9kdWxlZmlsZXMvcHJvamVjdHMiLCIvcHJvamVjdHMvc2Vtcy9jZWUtc2llcnJhLW1vZHVsZXMiLCIvcHJvamVjdHMvc2Vtcy9tb2R1bGVmaWxlcy9yaGVsNy14ODZfNjQvc2Vtcy9saW51eC1yaGVsNy14ODZfNjQvY29tcGlsZXJzIiwiL3Byb2plY3RzL3NlbXMvbW9kdWxlZmlsZXMvcmhlbDcteDg2XzY0L3NlbXMvbGludXgtcmhlbDcteDg2XzY0L0NvcmUiLCIvcHJvamVjdHMvc2Vtcy9tb2R1bGVmaWxlcy9yaGVsNy14ODZfNjQvc2Vtcy9saW51eC1yaGVsNy14ODZfNjQvcHJvamVjdC1tb2R1bGVmaWxlcyIsIi9wcm9qZWN0cy9hdWUv
+++ export _ModuleTable003_
+++ _ModuleTable004_=bW9kdWxlcy9jZWUveDg2XzY0L3JoZWw3IiwiL3Vzci9zaGFyZS9Nb2R1bGVzL21vZHVsZWZpbGVzIiwiL2V0Yy9tb2R1bGVmaWxlcyIsfSxbInN5c3RlbUJhc2VNUEFUSCJdPSIvdXNyL3NoYXJlL01vZHVsZXMvbW9kdWxlZmlsZXM6L2V0Yy9tb2R1bGVmaWxlcyIsfQ==
+++ export _ModuleTable004_
+++ _ModuleTable_Sz_=4
+++ export _ModuleTable_Sz_
++ source /ascldap/users/e3sm-jenkins/jenkins-ws/workspace/SCREAM_PullRequest_Autotester_Mappy/5801/scream/components/eamxx/scripts/jenkins/sandia_son_proxy
+++ export http_proxy=http://proxy.sandia.gov:80
+++ http_proxy=http://proxy.sandia.gov:80
+++ export RSYNC_PROXY=proxy.sandia.gov:80
+++ RSYNC_PROXY=proxy.sandia.gov:80
+++ export rsync_proxy=proxy.sandia.gov:80
+++ rsync_proxy=proxy.sandia.gov:80
+++ export HTTPS_PROXY=http://proxy.sandia.gov:80
+++ HTTPS_PROXY=http://proxy.sandia.gov:80
+++ export https_proxy=http://proxy.sandia.gov:80
+++ https_proxy=http://proxy.sandia.gov:80
+++ export HTTP_PROXY=http://proxy.sandia.gov:80
+++ HTTP_PROXY=http://proxy.sandia.gov:80
++ SCREAM_MACHINE=mappy
+ [[ 0 == 1 ]]
+ [[ 0 == 1 ]]
+ [[ 0 == 1 ]]
++ whoami
+ [[ e3sm-jenkins == \e\3\s\m\-\j\e\n\k\i\n\s ]]
+ git config --local user.email [email protected]
+ git config --local user.name 'Jenkins Jenkins'
+ declare -i fails=0
+ BASELINES_DIR=AUTO
+ TAS_ARGS='--baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m $machine'
+ [[ mappy == \p\m\-\g\p\u ]]
+ set +e
+ '[' -n 2980 ']'
+ is_at_run=1
+ SA_FAILURES_DETAILS=
+ '[' 1 -eq 1 ']'
++ ./scripts/gather-all-data './scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m $machine' -l -m mappy
Build timed out (after 120 minutes). Marking the build as failed.
$ ssh-agent -k
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 97674 killed;
[ssh-agent] Stopped.
Build was aborted
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash -le

cd $WORKSPACE/${BUILD_ID}/

./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh

# We're having issues with some test-launcher jobs hanging forever. So let's make sure we clean all pending test-launcher jobs

squeue -o"%.7i %u %40j" | grep e3sm-jenkins | grep test-launcher | awk '{ print $1 }' | xargs -r scancel

[SCREAM_PullRequest_Autotester_Mappy] $ /bin/bash -le /tmp/jenkins1961906281920379537.sh
POST BUILD TASK : SUCCESS
END OF POST BUILD TASK : 0
Sending e-mails to: [email protected]
Finished: FAILURE

SCREAM_PullRequest_Autotester_Weaver # 6027 FAILED (click to see last 100 lines of console output)

129:shoc_p3_nudged
130:shoc_p3_nudged_remapped
131:shoc_p3_nudging_glob_novert
132:homme_shoc_cld_p3_rrtmgp_np1
133:homme_shoc_cld_p3_rrtmgp_baseline_cmp
134:homme_shoc_cld_p3_rrtmgp_pg2_np1
135:homme_shoc_cld_p3_rrtmgp_pg2_baseline_cmp
136:model_baseline
137:model_initial
138:model_restart
139:restarted_vs_monolithic_check_np1
140:homme_shoc_cld_spa_p3_rrtmgp_np1
141:homme_shoc_cld_spa_p3_rrtmgp_baseline_cmp
142:homme_shoc_cld_spa_p3_rrtmgp_128levels_np1
143:homme_shoc_cld_spa_p3_rrtmgp_128levels_tend_check_np1
144:homme_shoc_cld_spa_p3_rrtmgp_128levels_baseline_cmp
145:homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_np1
146:homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_baseline_cmp
147:homme_shoc_cld_p3_mam_optics_rrtmgp_np1
148:homme_shoc_cld_p3_mam_optics_rrtmgp_baseline_cmp
149:homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_np1
150:homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_baseline_cmp
151:homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_np1
152:homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_baseline_cmp

Build type full_sp_debug failed at testing time. Here's a list of failed tests:
30:p3_tests_omp1
34:p3_run_and_cmp_cxx
75:p3_standalone_np1
76:p3_tend_check_np1
77:p3_standalone_baseline_cmp
98:shoc_p3_subcycled
99:shoc_p3_monolithic
100:check_subcycling
101:check_subcycling_tend_check
103:shoc_p3_source
104:shoc_p3_nudged
105:shoc_p3_nudged_remapped
106:shoc_p3_nudging_glob_novert

Build type release failed at testing time. Here's a list of failed tests:
82:p3_standalone_np1
83:p3_tend_check_np1
84:p3_standalone_baseline_cmp
112:shoc_cld_p3_rrtmgp_np1
113:shoc_cld_spa_p3_rrtmgp_np1
117:shoc_cldfrac_mam4_aci_p3_np1
118:shoc_cldfrac_mam4_aci_p3_rrtmgp_np1
119:shoc_cldfrac_mam4_aci_p3_mam4_optics_rrtmgp_np1
120:p3_mam4_wetscav_np1
121:shoc_cldfrac_p3_wetscav_np1
122:shoc_p3_subcycled
123:shoc_p3_monolithic
124:check_subcycling
125:check_subcycling_tend_check
127:shoc_p3_source
128:shoc_p3_nudged
129:shoc_p3_nudged_remapped
130:shoc_p3_nudging_glob_novert
131:homme_shoc_cld_p3_rrtmgp_np1
132:homme_shoc_cld_p3_rrtmgp_baseline_cmp
133:homme_shoc_cld_p3_rrtmgp_pg2_np1
134:homme_shoc_cld_p3_rrtmgp_pg2_baseline_cmp
135:model_baseline
136:model_initial
137:model_restart
138:restarted_vs_monolithic_check_np1
139:homme_shoc_cld_spa_p3_rrtmgp_np1
140:homme_shoc_cld_spa_p3_rrtmgp_baseline_cmp
141:homme_shoc_cld_spa_p3_rrtmgp_128levels_np1
142:homme_shoc_cld_spa_p3_rrtmgp_128levels_tend_check_np1
143:homme_shoc_cld_spa_p3_rrtmgp_128levels_baseline_cmp
146:homme_shoc_cld_p3_mam_optics_rrtmgp_np1
147:homme_shoc_cld_p3_mam_optics_rrtmgp_baseline_cmp
148:homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_np1
149:homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_baseline_cmp
150:homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_np1
151:homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_baseline_cmp

Error(s) occurred during test phase
OVERALL STATUS: FAIL
Starting analysis on weaver with cmd: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6027/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
RUN: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6027/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6027/scream/components/eamxx
weaver failed
######################################################
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash -le

cd $WORKSPACE/${BUILD_ID}/

./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh
[SCREAM_PullRequest_Autotester_Weaver] $ /bin/bash -le /tmp/jenkins978948057627589426.sh
POST BUILD TASK : SUCCESS
END OF POST BUILD TASK : 0
Sending e-mails to: [email protected]
Finished: FAILURE

@tcclevenger tcclevenger force-pushed the tcclevenger/view_allocs_in_p3 branch from 1a540cf to fa5351b Compare September 4, 2024 17:55
@tcclevenger tcclevenger force-pushed the tcclevenger/view_allocs_in_p3 branch 2 times, most recently from b29f360 to acb9de5 Compare September 4, 2024 19:51
@tcclevenger tcclevenger force-pushed the tcclevenger/view_allocs_in_p3 branch from acb9de5 to 3a85062 Compare September 4, 2024 19:55
@E3SM-Bot
Collaborator

E3SM-Bot commented Sep 4, 2024

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING (click to expand)

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5808
  • Status: STARTED

Jenkins Parameters

| Parameter Name | Value |
|---|---|
| PR_LABELS | p3 |
| PULLREQUESTNUM | 2980 |
| SCREAM_SOURCE_REPO | https://github.com/E3SM-Project/scream |
| SCREAM_SOURCE_SHA | 3a85062 |
| SCREAM_TARGET_BRANCH | master |
| SCREAM_TARGET_REPO | https://github.com/E3SM-Project/scream |
| SCREAM_TARGET_SHA | 504eb56 |
| TEST_REPO_ALIAS | SCREAM |

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6034
  • Status: STARTED

Jenkins Parameters

| Parameter Name | Value |
|---|---|
| PR_LABELS | p3 |
| PULLREQUESTNUM | 2980 |
| SCREAM_SOURCE_REPO | https://github.com/E3SM-Project/scream |
| SCREAM_SOURCE_SHA | 3a85062 |
| SCREAM_TARGET_BRANCH | master |
| SCREAM_TARGET_REPO | https://github.com/E3SM-Project/scream |
| SCREAM_TARGET_SHA | 504eb56 |
| TEST_REPO_ALIAS | SCREAM |

Using Repos:

Repo: SCREAM (E3SM-Project/scream)
  • Branch: tcclevenger/view_allocs_in_p3
  • SHA: 3a85062
  • Mode: TEST_REPO

Pull Request Author: tcclevenger

@E3SM-Bot
Collaborator

E3SM-Bot commented Sep 4, 2024

Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED

Note: Testing will normally be attempted again in approx. 2 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.

Pull Request Auto Testing has FAILED (click to expand)

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5808
  • Status: FAILED

Jenkins Parameters

| Parameter Name | Value |
|---|---|
| PR_LABELS | p3 |
| PULLREQUESTNUM | 2980 |
| SCREAM_SOURCE_REPO | https://github.com/E3SM-Project/scream |
| SCREAM_SOURCE_SHA | 3a85062 |
| SCREAM_TARGET_BRANCH | master |
| SCREAM_TARGET_REPO | https://github.com/E3SM-Project/scream |
| SCREAM_TARGET_SHA | 504eb56 |
| TEST_REPO_ALIAS | SCREAM |

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6034
  • Status: FAILED

Jenkins Parameters

| Parameter Name | Value |
|---|---|
| PR_LABELS | p3 |
| PULLREQUESTNUM | 2980 |
| SCREAM_SOURCE_REPO | https://github.com/E3SM-Project/scream |
| SCREAM_SOURCE_SHA | 3a85062 |
| SCREAM_TARGET_BRANCH | master |
| SCREAM_TARGET_REPO | https://github.com/E3SM-Project/scream |
| SCREAM_TARGET_SHA | 504eb56 |
| TEST_REPO_ALIAS | SCREAM |

SCREAM_PullRequest_Autotester_Mappy # 5808 FAILED (click to see last 100 lines of console output)

DIFF ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_p3--scream-output-preset-5 (phase BASELINE)
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_p3--scream-output-preset-5.C.20240904_144102_lfh1ct
DIFF ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_shoc--scream-output-preset-5 (phase BASELINE)
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_shoc--scream-output-preset-5.C.20240904_144102_lfh1ct
PASS ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2.C.20240904_144102_lfh1ct
PASS ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-arm97 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-arm97.C.20240904_144102_lfh1ct
PASS ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-comble RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-comble.C.20240904_144102_lfh1ct
PASS ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-dycomsrf01 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-dycomsrf01.C.20240904_144102_lfh1ct
PASS ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu.C.20240904_144102_lfh1ct
PASS PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1.C.20240904_144102_lfh1ct
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci.C.20240904_144102_lfh1ct
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-drydep RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-drydep.C.20240904_144102_lfh1ct
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics.C.20240904_144102_lfh1ct
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-wetscav RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-wetscav.C.20240904_144102_lfh1ct
PASS SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3.C.20240904_144102_lfh1ct
test-scheduler took 1596.877445936203 seconds'
+ set +x
######################################################
FAILS DETECTED:
  SCREAM STANDALONE TESTING FAILED!
Build type full_debug failed at testing time. Here's a list of failed tests:
340:cosp_standalone_np1
341:cosp_standalone_np2
342:cosp_standalone_np3
343:cosp_standalone_np4
344:cosp_standalone_baseline_cmp
384:mam4_srf_online_emiss_standalone_baseline_cmp
392:mam4_constituent_fluxes_standalone_baseline_cmp
  SCREAM V1 TESTING FAILED!
Waiting for tests to finish
PASS ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1.C.20240904_144102_lfh1ct
PASS ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4.C.20240904_144102_lfh1ct
PASS ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5.C.20240904_144102_lfh1ct
DIFF ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels--scream-output-preset-5 (phase BASELINE)
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels--scream-output-preset-5.C.20240904_144102_lfh1ct
DIFF ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_p3--scream-output-preset-5 (phase BASELINE)
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_p3--scream-output-preset-5.C.20240904_144102_lfh1ct
DIFF ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_shoc--scream-output-preset-5 (phase BASELINE)
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-small_kernels_shoc--scream-output-preset-5.C.20240904_144102_lfh1ct
PASS ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2.C.20240904_144102_lfh1ct
PASS ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-arm97 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-arm97.C.20240904_144102_lfh1ct
PASS ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-comble RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-comble.C.20240904_144102_lfh1ct
PASS ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-dycomsrf01 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FIOP-SCREAMv1-DP.mappy_gnu.scream-dpxx-dycomsrf01.C.20240904_144102_lfh1ct
PASS ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30pg2_ne30pg2.FRCE-SCREAMv1-DP.mappy_gnu.C.20240904_144102_lfh1ct
PASS PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1.C.20240904_144102_lfh1ct
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci.C.20240904_144102_lfh1ct
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-drydep RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-drydep.C.20240904_144102_lfh1ct
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics.C.20240904_144102_lfh1ct
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-wetscav RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-wetscav.C.20240904_144102_lfh1ct
PASS SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3.C.20240904_144102_lfh1ct
test-scheduler took 1596.877445936203 seconds
######################################################
Build step 'Execute shell' marked build as failure
$ ssh-agent -k
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 82385 killed;
[ssh-agent] Stopped.
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash -le

cd $WORKSPACE/${BUILD_ID}/

./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh

We're having issues with some test-launcher job hanging forever. So let's make sure we clean all penting test-launcher jobs

squeue -o"%.7i %u %40j" | grep e3sm-jenkins | grep test-launcher | awk '{ print $1 }' | xargs -r scancel

[SCREAM_PullRequest_Autotester_Mappy] $ /bin/bash -le /tmp/jenkins18419719684687356011.sh
POST BUILD TASK : SUCCESS
END OF POST BUILD TASK : 0
Sending e-mails to: [email protected]
Finished: FAILURE

SCREAM_PullRequest_Autotester_Weaver # 6034 FAILED (click to see last 100 lines of console output)

RUN: taskset -c 104-155 sh -c '\''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6034/scream/components/eamxx/ctest-build/release/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6034/scream/components/eamxx/ctest-build/release -DBUILD_NAME_MOD=release -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6034/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6034/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Release -DEKAT_DISABLE_TPL_WARNINGS='\''\'\'''\''ON'\''\'\'''\'' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/release" '\''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6034/scream/components/eamxx/ctest-build/release
===============================================================================
Testing '\''8e1287abd1a8753638bb27b9b97eedf9e28ed015'\'' for test '\''full_debug'\''
===============================================================================
RUN: taskset -c 0-51 sh -c '\''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6034/scream/components/eamxx/ctest-build/full_debug/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6034/scream/components/eamxx/ctest-build/full_debug -DBUILD_NAME_MOD=full_debug -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6034/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6034/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Debug -DEKAT_DEFAULT_BFB=True -DKokkos_ENABLE_DEBUG_BOUNDS_CHECK=True -DEKAT_DISABLE_TPL_WARNINGS='\''\'\'''\''ON'\''\'\'''\'' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/full_debug" '\''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6034/scream/components/eamxx/ctest-build/full_debug
Build type full_debug failed at testing time. Here'\''s a list of failed tests:
114:mam4_srf_online_emiss_standalone_baseline_cmp
116:mam4_constituent_fluxes_standalone_baseline_cmp

Build type full_sp_debug failed at testing time. Here'''s a list of failed tests:
99:mam4_srf_online_emiss_standalone_baseline_cmp
101:mam4_constituent_fluxes_standalone_baseline_cmp

Build type release failed at testing time. Here'''s a list of failed tests:
113:mam4_srf_online_emiss_standalone_baseline_cmp
115:mam4_constituent_fluxes_standalone_baseline_cmp

Error(s) occurred during test phase
OVERALL STATUS: FAIL
Starting analysis on weaver with cmd: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6034/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
RUN: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6034/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6034/scream/components/eamxx
weaver failed'

  • [[ 1 == 0 ]]
  • [[ weaver == \m\a\p\p\y ]]
  • set +x
    ######################################################
    FAILS DETECTED:
    SCREAM STANDALONE TESTING FAILED!
    Build type full_debug failed at testing time. Here's a list of failed tests:
    114:mam4_srf_online_emiss_standalone_baseline_cmp
    116:mam4_constituent_fluxes_standalone_baseline_cmp

Build type full_sp_debug failed at testing time. Here's a list of failed tests:
99:mam4_srf_online_emiss_standalone_baseline_cmp
101:mam4_constituent_fluxes_standalone_baseline_cmp

Build type release failed at testing time. Here's a list of failed tests:
113:mam4_srf_online_emiss_standalone_baseline_cmp
115:mam4_constituent_fluxes_standalone_baseline_cmp

Error(s) occurred during test phase
OVERALL STATUS: FAIL
Starting analysis on weaver with cmd: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6034/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
RUN: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6034/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6034/scream/components/eamxx
weaver failed
######################################################
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash -le

cd $WORKSPACE/${BUILD_ID}/

./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh
[SCREAM_PullRequest_Autotester_Weaver] $ /bin/bash -le /tmp/jenkins1885532067426590222.sh
POST BUILD TASK : SUCCESS
END OF POST BUILD TASK : 0
Sending e-mails to: [email protected]
Finished: FAILURE

@E3SM-Bot
Collaborator

E3SM-Bot commented Sep 5, 2024

Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing.

Contributor

@mahf708 mahf708 left a comment

This looks good to me. I defer to Luca though for a final say.

Do you want me to run a quick profile for each SK/not before/after this PR to be sure?

Also copying @ndkeen for awareness

@@ -361,7 +361,11 @@ class P3Microphysics : public AtmosphereProcess
// 1d view scalar, size (ncol)
static constexpr int num_1d_scalar = 2; //no 2d vars now, but keeping 1d struct for future expansion
// 2d view packed, size (ncol, nlev_packs)
#ifdef SCREAM_P3_SMALL_KERNELS
static constexpr int num_2d_vector = 64;

Contributor

I suppose at some point we should check whether all these views need to be distinct, or whether a few of them can alias each other. I suspect some can, but knowing which ones would require an analysis of P3. Also, allowing aliasing puts constraints on the small kernel execution order, since rearranging the kernels may leave a view holding data other than what a later kernel expects.

Still, food for thought.

Contributor Author

Yes, almost certainly there could be some reduction. I'll make an issue.
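
To make the aliasing constraint discussed above concrete, here is a minimal host-only sketch. It is not P3 code: `Slot`, `produce`, `use_as_scratch`, `consume`, and `run_aliased` are hypothetical names, and a `std::vector` stands in for a Kokkos view taken from the buffer. It only illustrates why sharing one temporary between kernels ties correctness to their execution order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one preallocated temporary (a "slot").
// Real code would use Kokkos views carved out of a buffer instead.
using Slot = std::vector<double>;

// Writes a value that a later kernel is expected to read.
void produce(Slot& s) {
  for (double& x : s) x = 1.0;
}

// Uses the slot as throwaway scratch, overwriting whatever was there.
void use_as_scratch(Slot& s) {
  for (double& x : s) x = 2.0;
}

// Reads the slot and returns the sum of its entries.
double consume(const Slot& s) {
  double sum = 0.0;
  for (double x : s) sum += x;
  return sum;
}

// Runs three "small kernels" that all alias one shared slot.
// reorder == false: produce, consume, use_as_scratch (scratch use after the last read).
// reorder == true:  produce, use_as_scratch, consume (scratch clobbers live data).
double run_aliased(bool reorder, std::size_t ncol, std::size_t nlev) {
  assert(ncol > 0 && nlev > 0);
  Slot shared(ncol * nlev, 0.0);  // allocated once, before the "run phase"
  produce(shared);
  double result = 0.0;
  if (reorder) {
    use_as_scratch(shared);
    result = consume(shared);
  } else {
    result = consume(shared);
    use_as_scratch(shared);
  }
  return result;
}
```

With `ncol = 2` and `nlev = 3`, `run_aliased(false, 2, 3)` returns 6, while `run_aliased(true, 2, 3)` returns 12 because the scratch kernel overwrote the data before it was read. Any real aliasing scheme would need a lifetime analysis of each temporary to rule out the second ordering.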

@ndkeen
Contributor

ndkeen commented Sep 6, 2024

I was following this PR, but was waiting for it to be ready for me to test. I can try using the profiler to verify we no longer see the cuda mallocs if ready?

@tcclevenger
Contributor Author

@mahf708 @ndkeen Yes, y'all testing this would be fantastic! I may also reach out to learn some of the profiler commands/tricks; this would be a good exercise for me in using the tool.
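
As a rough host-side analogue of the check discussed above (not a substitute for a GPU profiler, and not code from this PR), the sketch below counts heap allocations that occur during a run phase by replacing the global `operator new`. `ToyProcess`, `run_preallocated`, `run_allocating`, and `allocs_during_run` are hypothetical names chosen for illustration. It assumes the temporaries are allocated through the default C++ allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Global counter bumped by every call to the replaced operator new below.
static std::size_t g_alloc_count = 0;

// Replace the global allocation function so heap allocations can be counted.
void* operator new(std::size_t size) {
  ++g_alloc_count;
  if (void* p = std::malloc(size == 0 ? 1 : size)) return p;
  throw std::bad_alloc();
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Hypothetical process with an init phase and a run phase.
struct ToyProcess {
  std::vector<double> state;
  std::vector<double> temp;

  // Init phase: allocating here is fine.
  void init(std::size_t n) {
    state.assign(n, 0.0);
    temp.assign(n, 0.0);
  }

  // Run phase that reuses the preallocated temporary.
  void run_preallocated() {
    for (std::size_t i = 0; i < state.size(); ++i) {
      temp[i] = 1.0;
      state[i] += temp[i];
    }
  }

  // Run phase that creates a fresh temporary on every call.
  void run_allocating() {
    temp = std::vector<double>(state.size(), 1.0);
    for (std::size_t i = 0; i < state.size(); ++i) {
      state[i] += temp[i];
    }
  }
};

// Returns how many allocations happened during nsteps run-phase calls.
std::size_t allocs_during_run(bool allocate_in_run, int nsteps) {
  assert(nsteps >= 0);
  ToyProcess p;
  p.init(16);
  const std::size_t before = g_alloc_count;  // snapshot after init
  for (int s = 0; s < nsteps; ++s) {
    if (allocate_in_run) p.run_allocating();
    else p.run_preallocated();
  }
  return g_alloc_count - before;
}
```

Calling `allocs_during_run(false, 5)` should report zero allocations, while the allocating variant reports a nonzero count. On a GPU build the equivalent evidence would come from the profiler's record of device allocation calls rather than from a host counter like this one.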

@E3SM-Bot
Collaborator

E3SM-Bot commented Sep 9, 2024

Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing.

@E3SM-Bot
Collaborator

E3SM-Bot commented Sep 9, 2024

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING (click to expand)

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5821
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS p3;AT: RETEST
PULLREQUESTNUM 2980
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3a85062
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 504eb56
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6045
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS p3;AT: RETEST
PULLREQUESTNUM 2980
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3a85062
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 504eb56
TEST_REPO_ALIAS SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)
  • Branch: tcclevenger/view_allocs_in_p3
  • SHA: 3a85062
  • Mode: TEST_REPO

Pull Request Author: tcclevenger

@E3SM-Bot
Collaborator

E3SM-Bot commented Sep 9, 2024

Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED

Note: Testing will normally be attempted again in approx. 2 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.

Pull Request Auto Testing has FAILED (click to expand)

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5821
  • Status: FAILED

Jenkins Parameters

Parameter Name Value
PR_LABELS p3;AT: RETEST
PULLREQUESTNUM 2980
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3a85062
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 504eb56
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6045
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS p3;AT: RETEST
PULLREQUESTNUM 2980
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3a85062
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 504eb56
TEST_REPO_ALIAS SCREAM
SCREAM_PullRequest_Autotester_Mappy # 5821 FAILED (click to see last 100 lines of console output)

	at java.base/java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2915)
	at java.base/java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3410)
	at java.base/java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:954)
	at java.base/java.io.ObjectInputStream.<init>(ObjectInputStream.java:392)
	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:50)
	at hudson.remoting.Command.readFrom(Command.java:142)
	at hudson.remoting.Command.readFrom(Command.java:128)
	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:61)
Caused: java.io.IOException: Unexpected termination of the channel
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:75)
Caused: java.io.IOException: Backing channel 'mappy' is disconnected.
	at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:215)
	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:285)
	at jdk.proxy2/jdk.proxy2.$Proxy99.isAlive(Unknown Source)
	at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1212)
	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1204)
	at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:195)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:145)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:818)
	at hudson.model.Build$BuildExecution.build(Build.java:199)
	at hudson.model.Build$BuildExecution.doRun(Build.java:164)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:526)
	at hudson.model.Run.execute(Run.java:1894)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
	at hudson.model.ResourceController.execute(ResourceController.java:101)
	at hudson.model.Executor.run(Executor.java:446)
FATAL: Unable to delete script file /tmp/jenkins15886939882553630332.sh
java.io.EOFException
	at java.base/java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2915)
	at java.base/java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3410)
	at java.base/java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:954)
	at java.base/java.io.ObjectInputStream.<init>(ObjectInputStream.java:392)
	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:50)
	at hudson.remoting.Command.readFrom(Command.java:142)
	at hudson.remoting.Command.readFrom(Command.java:128)
	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:61)
Caused: java.io.IOException: Unexpected termination of the channel
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:75)
Caused: hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@166b5bc0:mappy": Remote call on mappy failed. The channel is closing down or has closed down
	at hudson.remoting.Channel.call(Channel.java:1035)
	at hudson.FilePath.act(FilePath.java:1229)
	at hudson.FilePath.act(FilePath.java:1218)
	at hudson.FilePath.delete(FilePath.java:1765)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:163)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:818)
	at hudson.model.Build$BuildExecution.build(Build.java:199)
	at hudson.model.Build$BuildExecution.doRun(Build.java:164)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:526)
	at hudson.model.Run.execute(Run.java:1894)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
	at hudson.model.ResourceController.execute(ResourceController.java:101)
	at hudson.model.Executor.run(Executor.java:446)
Build step 'Execute shell' marked build as failure
ERROR: Unable to tear down: Channel "hudson.remoting.Channel@166b5bc0:mappy": Remote call on mappy failed. The channel is closing down or has closed down
java.io.EOFException
	at java.base/java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2915)
	at java.base/java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3410)
	at java.base/java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:954)
	at java.base/java.io.ObjectInputStream.<init>(ObjectInputStream.java:392)
	at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:50)
	at hudson.remoting.Command.readFrom(Command.java:142)
	at hudson.remoting.Command.readFrom(Command.java:128)
	at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:35)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:61)
Caused: java.io.IOException: Unexpected termination of the channel
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:75)
Caused: hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@166b5bc0:mappy": Remote call on mappy failed. The channel is closing down or has closed down
	at hudson.remoting.Channel.call(Channel.java:1035)
	at hudson.Launcher$RemoteLauncher.launch(Launcher.java:1121)
	at hudson.Launcher$ProcStarter.start(Launcher.java:506)
	at PluginClassLoader for ssh-agent//com.cloudbees.jenkins.plugins.sshagent.exec.ExecRemoteAgent.stop(ExecRemoteAgent.java:116)
	at PluginClassLoader for ssh-agent//com.cloudbees.jenkins.plugins.sshagent.SSHAgentBuildWrapper$SSHAgentEnvironment.tearDown(SSHAgentBuildWrapper.java:343)
	at hudson.model.AbstractBuild$AbstractBuildExecution.tearDownBuildEnvironments(AbstractBuild.java:566)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:530)
	at hudson.model.Run.execute(Run.java:1894)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
	at hudson.model.ResourceController.execute(ResourceController.java:101)
	at hudson.model.Executor.run(Executor.java:446)
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash -le

cd $WORKSPACE/${BUILD_ID}/

./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh

We're having issues with some test-launcher job hanging forever. So let's make sure we clean all penting test-launcher jobs

squeue -o"%.7i %u %40j" | grep e3sm-jenkins | grep test-launcher | awk '{ print $1 }' | xargs -r scancel

Exception when executing the batch command : no workspace from node hudson.slaves.DumbSlave[mappy] which is computer hudson.slaves.SlaveComputer@65e5f986 and has channel null
Build step 'Post build task' marked build as failure
Sending e-mails to: [email protected]
Finished: FAILURE

SCREAM_PullRequest_Autotester_Weaver # 6045 PASSED (click to see last 100 lines of console output)

        Start 143: model_restart
143/157 Test #143: model_restart .........................................................   Passed    7.02 sec
        Start 144: restarted_vs_monolithic_check_np1
144/157 Test #144: restarted_vs_monolithic_check_np1 .....................................   Passed    0.11 sec
        Start 145: homme_shoc_cld_spa_p3_rrtmgp_np1
145/157 Test #145: homme_shoc_cld_spa_p3_rrtmgp_np1 ......................................   Passed   11.73 sec
        Start 146: homme_shoc_cld_spa_p3_rrtmgp_baseline_cmp
146/157 Test #146: homme_shoc_cld_spa_p3_rrtmgp_baseline_cmp .............................   Passed    0.12 sec
        Start 147: homme_shoc_cld_spa_p3_rrtmgp_128levels_np1
147/157 Test #147: homme_shoc_cld_spa_p3_rrtmgp_128levels_np1 ............................   Passed   14.14 sec
        Start 148: homme_shoc_cld_spa_p3_rrtmgp_128levels_tend_check_np1
148/157 Test #148: homme_shoc_cld_spa_p3_rrtmgp_128levels_tend_check_np1 .................   Passed    1.53 sec
        Start 149: homme_shoc_cld_spa_p3_rrtmgp_128levels_baseline_cmp
149/157 Test #149: homme_shoc_cld_spa_p3_rrtmgp_128levels_baseline_cmp ...................   Passed    0.60 sec
        Start 150: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_np1
150/157 Test #150: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_np1 ...............................   Passed   18.20 sec
        Start 151: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_baseline_cmp
151/157 Test #151: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_baseline_cmp ......................   Passed    0.09 sec
        Start 152: homme_shoc_cld_p3_mam_optics_rrtmgp_np1
152/157 Test #152: homme_shoc_cld_p3_mam_optics_rrtmgp_np1 ...............................   Passed   17.33 sec
        Start 153: homme_shoc_cld_p3_mam_optics_rrtmgp_baseline_cmp
153/157 Test #153: homme_shoc_cld_p3_mam_optics_rrtmgp_baseline_cmp ......................   Passed    0.15 sec
        Start 154: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_np1
154/157 Test #154: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_np1 ............   Passed   18.09 sec
        Start 155: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_baseline_cmp
155/157 Test #155: homme_shoc_cld_mam_aci_p3_mam_optics_rrtmgp_mam_drydep_baseline_cmp ...   Passed    0.14 sec
        Start 156: homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_np1
156/157 Test #156: homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_np1 .........................   Passed   31.93 sec
        Start 157: homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_baseline_cmp
157/157 Test #157: homme_shoc_cld_spa_p3_rrtmgp_mam4_wetscav_baseline_cmp ................   Passed    0.19 sec

100% tests passed, 0 tests failed out of 157

Label Time Summary:
baseline_cmp = 137.38 sec*proc (23 tests)
baseline_gen = 328.40 sec*proc (25 tests)
bfbhash = 0.91 sec*proc (1 test)
check = 0.90 sec*proc (1 test)
cld = 48.76 sec*proc (7 tests)
cld_fraction = 1.17 sec*proc (1 test)
cxx baseline_cmp = 6.47 sec*proc (2 tests)
diagnostics = 50.84 sec*proc (23 tests)
driver = 97.60 sec*proc (16 tests)
dynamics = 5.72 sec*proc (3 tests)
fail = 39.68 sec*proc (5 tests)
io = 53.59 sec*proc (14 tests)
mam4_aci = 34.70 sec*proc (4 tests)
mam4_constituent_fluxes = 8.23 sec*proc (1 test)
mam4_drydep = 3.60 sec*proc (1 test)
mam4_optics = 8.64 sec*proc (1 test)
mam4_srf_online_emiss = 8.23 sec*proc (1 test)
mam4_wetscav = 21.91 sec*proc (2 tests)
nudging = 8.84 sec*proc (2 tests)
p3 = 109.18 sec*proc (12 tests)
p3_sk = 55.41 sec*proc (2 tests)
physics = 206.86 sec*proc (27 tests)
remap = 5.39 sec*proc (1 test)
rrtmgp = 48.31 sec*proc (11 tests)
shoc = 60.34 sec*proc (13 tests)
spa = 8.31 sec*proc (4 tests)
surface_coupling = 4.93 sec*proc (1 test)

Total Test time (real) = 816.04 sec

Testing '''05cbe3bc2a71ffc5825aefcb5bc3e46a27e9f886''' for test '''full_sp_debug'''

RUN: taskset -c 52-103 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6045/scream/components/eamxx/ctest-build/full_sp_debug/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6045/scream/components/eamxx/ctest-build/full_sp_debug -DBUILD_NAME_MOD=full_sp_debug -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6045/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6045/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Debug -DEKAT_DEFAULT_BFB=True -DSCREAM_DOUBLE_PRECISION=False -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/full_sp_debug" '''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6045/scream/components/eamxx/ctest-build/full_sp_debug

Testing '''05cbe3bc2a71ffc5825aefcb5bc3e46a27e9f886''' for test '''release'''

RUN: taskset -c 104-155 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6045/scream/components/eamxx/ctest-build/release/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6045/scream/components/eamxx/ctest-build/release -DBUILD_NAME_MOD=release -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6045/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6045/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Release -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/release" '''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6045/scream/components/eamxx/ctest-build/release

Testing '''05cbe3bc2a71ffc5825aefcb5bc3e46a27e9f886''' for test '''full_debug'''

RUN: taskset -c 0-51 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6045/scream/components/eamxx/ctest-build/full_debug/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6045/scream/components/eamxx/ctest-build/full_debug -DBUILD_NAME_MOD=full_debug -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6045/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6045/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Debug -DEKAT_DEFAULT_BFB=True -DKokkos_ENABLE_DEBUG_BOUNDS_CHECK=True -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/full_debug" '''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6045/scream/components/eamxx/ctest-build/full_debug
OVERALL STATUS: PASS
Starting analysis on weaver with cmd: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6045/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
RUN: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6045/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/6045/scream/components/eamxx
Completed analysis on weaver'

  • [[ 0 != 0 ]]
  • [[ 1 == 0 ]]
  • [[ weaver == \m\a\p\p\y ]]
  • set +x
    Performing Post build task...
    Match found for : : True
    Logical operation result is TRUE
    Running script : #!/bin/bash -le

cd $WORKSPACE/${BUILD_ID}/

./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh
[SCREAM_PullRequest_Autotester_Weaver] $ /bin/bash -le /tmp/jenkins13898537437701095435.sh
POST BUILD TASK : SUCCESS
END OF POST BUILD TASK : 0
Sending e-mails to: [email protected]
Finished: SUCCESS

@E3SM-Bot
Collaborator

Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing.

@E3SM-Bot
Collaborator

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5822
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS p3;AT: RETEST
PULLREQUESTNUM 2980
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3a85062
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 504eb56
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6046
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS p3;AT: RETEST
PULLREQUESTNUM 2980
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3a85062
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 504eb56
TEST_REPO_ALIAS SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)
  • Branch: tcclevenger/view_allocs_in_p3
  • SHA: 3a85062
  • Mode: TEST_REPO

Pull Request Author: tcclevenger

@E3SM-Bot
Collaborator

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5822
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS p3;AT: RETEST
PULLREQUESTNUM 2980
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3a85062
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 504eb56
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6046
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS p3;AT: RETEST
PULLREQUESTNUM 2980
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3a85062
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 504eb56
TEST_REPO_ALIAS SCREAM

@E3SM-Bot
Collaborator

Status Flag 'Pull Request AutoTester' - AutoMerge IS ENABLED, but the Label AT: AUTOMERGE is not set. Either set Label AT: AUTOMERGE or manually merge the PR...

@E3SM-Bot
Collaborator

The base branch has been updated since the last successful testing.

  • last PASS base branch sha: 4d52213
  • current base branch sha : 474edd8
    The AutoTester will discard the last PASS, and re-test the PR from scratch

@E3SM-Bot
Collaborator

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5824
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS p3;AT: AUTOMERGE
PULLREQUESTNUM 2980
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3a85062
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 504eb56
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6048
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS p3;AT: AUTOMERGE
PULLREQUESTNUM 2980
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3a85062
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 504eb56
TEST_REPO_ALIAS SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)
  • Branch: tcclevenger/view_allocs_in_p3
  • SHA: 3a85062
  • Mode: TEST_REPO

Pull Request Author: tcclevenger

@E3SM-Bot
Collaborator

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5824
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS p3;AT: AUTOMERGE
PULLREQUESTNUM 2980
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3a85062
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 504eb56
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 6048
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS p3;AT: AUTOMERGE
PULLREQUESTNUM 2980
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 3a85062
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 504eb56
TEST_REPO_ALIAS SCREAM

@E3SM-Bot E3SM-Bot merged commit e65666f into master Sep 10, 2024
7 checks passed
@E3SM-Bot E3SM-Bot deleted the tcclevenger/view_allocs_in_p3 branch September 10, 2024 18:09
@ndkeen
Contributor

ndkeen commented Sep 13, 2024

I haven't tried the profiler yet, but here is what I see from the performance of some cases I've been running recently:

For pm-gpu, using P3 monolithic
ne30  on 1 node     ~15% improvement in p3::run
ne120 on 8 nodes    ~9% improvement in p3::run
ne256 on 48 nodes   ~11% improvement in p3::run

More benchmarking would be needed to give more detail. I am not yet seeing an overall improvement, but that could be timing noise.

@tcclevenger
Contributor Author

Thanks @ndkeen! Currently testing to see if this broke the pm-gpu tests.

@ndkeen
Contributor

ndkeen commented Sep 13, 2024

Ah, yep, the ne30, ne120, and ne256 cases that I tried above are all non-BFB with previous cases.

tcclevenger pushed a commit that referenced this pull request Sep 13, 2024
…/view_allocs_in_p3"

This reverts commit e65666f, reversing
changes made to 474edd8.
@ndkeen
Contributor

ndkeen commented Sep 16, 2024

FYI, I just experimented with SMALL_KERNELS: turning on small kernels for both P3 and SHOC yields BFB results with the other cases. That is, in the checkout before this PR, both the ne30 and ne120 cases are BFB with and without small kernels (as we would hope), and the same holds for the checkout that includes this PR. Not sure how much that helps with the issue here.

@tcclevenger
Contributor Author

@ndkeen So SK is BFB with monolithic for this merge commit (and therefore both are non-BFB with the previous merge commit). That is good to know.

Labels
p3 regarding p3 microphysics
Development

Successfully merging this pull request may close these issues.

  • Fix Kokkos View allocations in small-kernels P3.
  • Kokkos View allocations in monolithic P3