Skip to content

Don't merge: static args and hacks #731

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 143 commits into
base: main
Choose a base branch
from

Conversation

havogt
Copy link
Contributor

@havogt havogt commented Apr 29, 2025

Consider extracting some of this into separate PRs and possibly merge:

  • all compile with static args could be extracted into a separate PR and tested, here it's only done for the "combined programs", the individual ones in the dycore are only compiled but not with static args. At some point this worked with the granule, but didn't manage to get it working now.
  • the -1 hacks need to be cleaned, see the separate PR Don't merge: hack skip values #728
  • the changes in the greenline driver could be merged separately or just used for performance measurements in the greenline (there is an expensive debug print in the current main)

@@ -20,7 +20,7 @@
import cProfile
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these changes are in #729

Comment on lines +160 to +161
if __debug__:
log.debug(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

important to have this change for performance measurement in the greenline

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you want to compare to fortran openacc or granule:
We noticed that there is an expensive debug print in the blueline with loglevel > 7?
So for performance run with Fortran ICON log level 1

@@ -167,6 +179,9 @@ def time_integration(
second_order_divdamp_factor,
do_prep_adv,
)

cuda.runtime.deviceSynchronize()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and this syncthread

@@ -145,8 +145,9 @@ def _mch_ch_r04b09_config():

def _jablownoski_Williamson_config():
icon_run_config = Icon4pyRunConfig(
dtime=datetime.timedelta(seconds=300.0),
dtime=datetime.timedelta(seconds=150.0),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adjusted timestep for r2b7, but probably not needed for performance measurement

@@ -132,7 +133,24 @@ def offset_providers(self):
for key, value in self.offset_provider_mapping.items():
try:
method, *args = value
offset_providers[key] = method(*args) if args else method()
offset_provider = method(*args) if args else method()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the -1 hacks, see also #728

Comment on lines +491 to +496
#cpd=[constants.CPD],
#iau_wgt_dyn=[self._config.iau_wgt_dyn],
#is_iau_active=[self._config.is_iau_active],
#limited_area=[self._grid.limited_area],
#iadv_rhotheta=[self._config.iadv_rhotheta],
#igradp_method=[self._config.igradp_method],
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if all of these are enabled, then gtfn will error in some transformation (for the mch_dsl experiment), the current selection is a random set, not the minimal set...

Comment on lines +629 to +632
xp = data_alloc.import_array_ns(self._backend)
tmp = xp.where(self._metric_state_nonhydro.scaling_factor_for_3d_divdamp.ndarray > 0.0)
print(tmp, flush=True)
self.starting_vertical_index_for_3d_divdamp = xp.min(tmp)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is broken, triggered in mch_icon_ch2

xp.where(self._metric_state_nonhydro.scaling_factor_for_3d_divdamp.ndarray > 0.0)
)
xp = data_alloc.import_array_ns(self._backend)
tmp = xp.where(self._metric_state_nonhydro.scaling_factor_for_3d_divdamp.ndarray > 0.0)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this where is missing 2 arguments? but didn't have time what the code should do...

Copy link

github-actions bot commented May 8, 2025

Mandatory Tests

Please make sure you run these tests via comment before you merge!

  • cscs-ci run default
  • launch jenkins spack

Optional Tests

To run benchmarks you can use:

  • cscs-ci run benchmark-bencher

To run tests and benchmarks with the DaCe backend you can use:

  • cscs-ci run dace

In case your change might affect downstream icon-exclaim, please consider running

  • launch jenkins icon

For more detailed information please look at CI in the EXCLAIM universe.

@edopao
Copy link
Contributor

edopao commented May 8, 2025

cscs-ci run dace

nfarabullini added a commit that referenced this pull request May 28, 2025
Following @havogt 's PR: #731 

Added pre-compiled static args to stencils, e.g.:
```python
 self.calculate_nabla2_for_theta = calculate_nabla2_for_theta.with_backend(
            self._backend
        ).compile(
            vertical_start=[0],
            vertical_end=[self._grid.num_levels],
            offset_provider=self._grid.offset_providers,
        )
```

---------

Co-authored-by: Edoardo Paone <[email protected]>
@edopao
Copy link
Contributor

edopao commented Jun 13, 2025

This PR should be rebased (to keep the edits in the driver for perf measurements) or closed.

iomaganaris pushed a commit that referenced this pull request Jun 16, 2025
Following @havogt 's PR: #731 

Added pre-compiled static args to stencils, e.g.:
```python
 self.calculate_nabla2_for_theta = calculate_nabla2_for_theta.with_backend(
            self._backend
        ).compile(
            vertical_start=[0],
            vertical_end=[self._grid.num_levels],
            offset_provider=self._grid.offset_providers,
        )
```

---------

Co-authored-by: Edoardo Paone <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants