GPU compiler error in tendency computation using DiscreteForcing with GPU + Float32 + immersed RectilinearGrid #4192

Closed
ali-ramadhan opened this issue Mar 10, 2025 · 23 comments · Fixed by #4193
Labels
bug 🐞 Even a perfect program still has bugs GPU 👾 Where Oceananigans gets its powers from

Comments

@ali-ramadhan
Member

ali-ramadhan commented Mar 10, 2025

This might be similar to issue #4165, but this time the GPU compiler error is in the tendency computation, specifically in div_𝐯u.

Switching to the CPU, to Float64, to a LatitudeLongitudeGrid, or to a non-immersed RectilinearGrid makes the MWE run without error. So the error only comes up in this very specific configuration.


MWE:

using Oceananigans

underlying_grid = RectilinearGrid(GPU(), Float32;
    topology = (Bounded, Bounded, Bounded),
    size = (10, 10, 10),
    x = (0, 1),
    y = (0, 1),
    z = (-1, 0)
)

height = 1/5
width = 1/5
mount(x, y) = height * exp(-x^2 / 2width^2) * exp(-y^2 / 2width^2)
bottom(x, y) = -1 + mount(x, y)

grid = ImmersedBoundaryGrid(underlying_grid, GridFittedBottom(bottom))

@inline relax(i, j, k, grid, clock, fields, p) = - p.rate * (fields.u[i, j, k] - p.u★)

params = (
    rate = 1.0,
    u★ = 0.0
)

u_forcing = Forcing(relax; discrete_form=true, parameters=params)

forcing = (;
    u = u_forcing
)

model = NonhydrostaticModel(; grid, forcing)

simulation = Simulation(model, Δt=0.01, stop_iteration=1)

run!(simulation)

Error:

ERROR: InvalidIRError: compiling MethodInstance for Oceananigans.Models.NonhydrostaticModels.gpu_compute_Gu!(::KernelAbstractions.CompilerMetadata{…}, ::OffsetArrays.OffsetArray{…}, ::ImmersedBoundaryGrid{…}, ::Nothing, ::Tuple{…}) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to +)
Stacktrace:
 [1] div_𝐯u
   @ ~/atdepth/Oceananigans.jl/src/Advection/momentum_advection_operators.jl:47
 [2] u_velocity_tendency
   @ ~/atdepth/Oceananigans.jl/src/Models/NonhydrostaticModels/nonhydrostatic_tendency_kernel_functions.jl:68
 [3] gpu_compute_Gu!
   @ ~/.julia/packages/KernelAbstractions/sWSE0/src/macros.jl:322
 [4] gpu_compute_Gu!
   @ ./none:0
Hint: catch this exception as `err` and call `code_typed(err; interactive = true)` to introspect the erronous code with Cthulhu.jl
Stacktrace:
  [1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}, args::LLVM.Module)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/mgx54/src/validation.jl:167
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/mgx54/src/driver.jl:382 [inlined]
  [3] emit_llvm(job::GPUCompiler.CompilerJob; kwargs::@Kwargs{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/mgx54/src/utils.jl:110
  [4] emit_llvm
    @ ~/.julia/packages/GPUCompiler/mgx54/src/utils.jl:108 [inlined]
  [5] compile_unhooked(output::Symbol, job::GPUCompiler.CompilerJob; kwargs::@Kwargs{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/mgx54/src/driver.jl:95
  [6] compile_unhooked
    @ ~/.julia/packages/GPUCompiler/mgx54/src/driver.jl:80 [inlined]
  [7] compile(target::Symbol, job::GPUCompiler.CompilerJob; kwargs::@Kwargs{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/mgx54/src/driver.jl:67
  [8] compile
    @ ~/.julia/packages/GPUCompiler/mgx54/src/driver.jl:55 [inlined]
  [9] #1171
    @ ~/.julia/packages/CUDA/jkvdc/src/compiler/compilation.jl:255 [inlined]
 [10] JuliaContext(f::CUDA.var"#1171#1174"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}}; kwargs::@Kwargs{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/mgx54/src/driver.jl:34
 [11] JuliaContext(f::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/mgx54/src/driver.jl:25
 [12] compile(job::GPUCompiler.CompilerJob)
    @ CUDA ~/.julia/packages/CUDA/jkvdc/src/compiler/compilation.jl:254
 [13] actual_compilation(cache::Dict{Any, CUDA.CuFunction}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}, compiler::typeof(CUDA.compile), linker::typeof(CUDA.link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/mgx54/src/execution.jl:245
 [14] cached_compilation(cache::Dict{Any, CUDA.CuFunction}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}, compiler::Function, linker::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/mgx54/src/execution.jl:159
 [15] macro expansion
    @ ~/.julia/packages/CUDA/jkvdc/src/compiler/execution.jl:373 [inlined]
 [16] macro expansion
    @ ./lock.jl:267 [inlined]
 [17] cufunction(f::typeof(Oceananigans.Models.NonhydrostaticModels.gpu_compute_Gu!), tt::Type{Tuple{KernelAbstractions.CompilerMetadata{…}, OffsetArrays.OffsetArray{…}, ImmersedBoundaryGrid{…}, Nothing, Tuple{…}}}; kwargs::@Kwargs{always_inline::Bool, maxthreads::Int64})
    @ CUDA ~/.julia/packages/CUDA/jkvdc/src/compiler/execution.jl:368
 [18] macro expansion
    @ ~/.julia/packages/CUDA/jkvdc/src/compiler/execution.jl:112 [inlined]
 [19] (::KernelAbstractions.Kernel{…})(::Field{…}, ::Vararg{…}; ndrange::Nothing, workgroupsize::Nothing)
    @ CUDA.CUDAKernels ~/.julia/packages/CUDA/jkvdc/src/CUDAKernels.jl:103
 [20] (::KernelAbstractions.Kernel{…})(::Field{…}, ::Vararg{…})
    @ CUDA.CUDAKernels ~/.julia/packages/CUDA/jkvdc/src/CUDAKernels.jl:89
 [21] _launch!(::GPU{…}, ::ImmersedBoundaryGrid{…}, ::Symbol, ::Function, ::Field{…}, ::ImmersedBoundaryGrid{…}, ::Vararg{…}; exclude_periphery::Bool, reduced_dimensions::Tuple{}, active_cells_map::Nothing)
    @ Oceananigans.Utils ~/atdepth/Oceananigans.jl/src/Utils/kernel_launching.jl:298
 [22] _launch!
    @ ~/atdepth/Oceananigans.jl/src/Utils/kernel_launching.jl:275 [inlined]
 [23] launch!
    @ ~/atdepth/Oceananigans.jl/src/Utils/kernel_launching.jl:258 [inlined]
 [24] #compute_interior_tendency_contributions!#17
    @ ~/atdepth/Oceananigans.jl/src/Models/NonhydrostaticModels/compute_nonhydrostatic_tendencies.jl:105 [inlined]
 [25] compute_interior_tendency_contributions!
    @ ~/atdepth/Oceananigans.jl/src/Models/NonhydrostaticModels/compute_nonhydrostatic_tendencies.jl:57 [inlined]
 [26] compute_tendencies!(model::NonhydrostaticModel{…}, callbacks::Vector{…})
    @ Oceananigans.Models.NonhydrostaticModels ~/atdepth/Oceananigans.jl/src/Models/NonhydrostaticModels/compute_nonhydrostatic_tendencies.jl:35
 [27] #apply_regionally!#56
    @ ~/atdepth/Oceananigans.jl/src/Utils/multi_region_transformation.jl:121 [inlined]
 [28] apply_regionally!
    @ ~/atdepth/Oceananigans.jl/src/Utils/multi_region_transformation.jl:118 [inlined]
 [29] macro expansion
    @ ~/atdepth/Oceananigans.jl/src/Utils/multi_region_transformation.jl:206 [inlined]
 [30] update_state!(model::NonhydrostaticModel{…}, callbacks::Vector{…}; compute_tendencies::Bool)
    @ Oceananigans.Models.NonhydrostaticModels ~/atdepth/Oceananigans.jl/src/Models/NonhydrostaticModels/update_nonhydrostatic_model_state.jl:53
 [31] update_state! (repeats 2 times)
    @ ~/atdepth/Oceananigans.jl/src/Models/NonhydrostaticModels/update_nonhydrostatic_model_state.jl:20 [inlined]
 [32] initialize!(sim::Simulation{NonhydrostaticModel{…}, Float32, Float32, OrderedCollections.OrderedDict{…}, OrderedCollections.OrderedDict{…}, OrderedCollections.OrderedDict{…}})
    @ Oceananigans.Simulations ~/atdepth/Oceananigans.jl/src/Simulations/run.jl:208
 [33] time_step!(sim::Simulation{NonhydrostaticModel{…}, Float32, Float32, OrderedCollections.OrderedDict{…}, OrderedCollections.OrderedDict{…}, OrderedCollections.OrderedDict{…}})
    @ Oceananigans.Simulations ~/atdepth/Oceananigans.jl/src/Simulations/run.jl:138
 [34] run!(sim::Simulation{NonhydrostaticModel{…}, Float32, Float32, OrderedCollections.OrderedDict{…}, OrderedCollections.OrderedDict{…}, OrderedCollections.OrderedDict{…}}; pickup::Bool)
    @ Oceananigans.Simulations ~/atdepth/Oceananigans.jl/src/Simulations/run.jl:105
 [35] run!(sim::Simulation{NonhydrostaticModel{…}, Float32, Float32, OrderedCollections.OrderedDict{…}, OrderedCollections.OrderedDict{…}, OrderedCollections.OrderedDict{…}})
    @ Oceananigans.Simulations ~/atdepth/Oceananigans.jl/src/Simulations/run.jl:92
 [36] top-level scope
    @ REPL[15]:1
Some type information was truncated. Use `show(err)` to see complete types.

Environment: Oceananigans main branch (v0.95.23, commit 40e0a8733) with

julia> versioninfo()
Julia Version 1.10.8
Commit 4c16ff44be8 (2025-01-22 10:06 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 48 × AMD Ryzen Threadripper 7960X 24-Cores
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 16 default, 0 interactive, 8 GC (on 48 virtual cores)
Environment:
  LD_PRELOAD = /usr/NX/lib/libnxegl.so
julia> CUDA.versioninfo()
CUDA runtime 12.8, artifact installation
CUDA driver 12.8
NVIDIA driver 570.86.16

CUDA libraries: 
- CUBLAS: 12.8.3
- CURAND: 10.3.9
- CUFFT: 11.3.3
- CUSOLVER: 11.7.2
- CUSPARSE: 12.5.7
- CUPTI: 2025.1.0 (API 26.0.0)
- NVML: 12.0.0+570.86.16

Julia packages: 
- CUDA: 5.6.1
- CUDA_Driver_jll: 0.12.0+0
- CUDA_Runtime_jll: 0.16.0+0

Toolchain:
- Julia: 1.10.8
- LLVM: 15.0.7

1 device:
  0: NVIDIA GeForce RTX 4090 (sm_89, 20.040 GiB / 23.988 GiB available)
@ali-ramadhan ali-ramadhan added bug 🐞 Even a perfect program still has bugs GPU 👾 Where Oceananigans gets its powers from labels Mar 10, 2025
@ali-ramadhan
Member Author

I know it's not easy to debug this stuff and it's probably an upstream issue. So I'm just opening this issue to document the bug.

It's also a pretty specific configuration so this is a low impact issue/bug.

@glwagner
Member

That's pretty interesting though.

Is it fixed by using

params = (
    rate = 1,
    u★ = 0
)

?

@ali-ramadhan
Member Author

Good catch! Should always be careful about types. Still getting the same error with Ints and also with

params = (
    rate = 1.0f0,
    u★ = 0.0f0
)

@simone-silvestri
Collaborator

Is this a forcing-related issue or advection? What happens if you remove the forcing?

@glwagner
Member

Ah but actually you aren't adding forcing to the model anyways

@ali-ramadhan
Member Author

ali-ramadhan commented Mar 10, 2025

Ah sorry for the typo. The forcing should be in there. I edited the MWE to include it. I was about to test with/without forcing.

Turns out actually you don't need the forcing! This MWE without the forcing produces the same error:

using Oceananigans

underlying_grid = RectilinearGrid(GPU(), Float32;
    topology = (Bounded, Bounded, Bounded),
    size = (10, 10, 10),
    x = (0, 1),
    y = (0, 1),
    z = (-1, 0)
)

height = 1/5
width = 1/5
mount(x, y) = height * exp(-x^2 / 2width^2) * exp(-y^2 / 2width^2)
bottom(x, y) = -1 + mount(x, y)

grid = ImmersedBoundaryGrid(underlying_grid, GridFittedBottom(bottom))

model = NonhydrostaticModel(; grid)

simulation = Simulation(model, Δt=0.01, stop_iteration=1)

run!(simulation)

@glwagner
Member

You can also use bottom(x, y) = -0.5. Quite weird!

@glwagner
Member

glwagner commented Mar 10, 2025

I'm finding there is a type instability in div_𝐯u; sometimes it is Float64, other times Float32. It's unclear whether this is what produces the error, but it seems suspicious.
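
For readers unfamiliar with the term, here is a minimal plain-Julia sketch of a Float32/Float64 type instability. The functions `toy_flux` and `stable_flux` are hypothetical and are not Oceananigans' div_𝐯u; they only illustrate how a return type can depend on a runtime value, and how `Base.return_types` can expose that.

```julia
# Hypothetical toy function (not Oceananigans code). The literal 0.5 is a
# Float64, so one branch returns Float32 and the other returns Float64.
toy_flux(u::Float32, upwind::Bool) = upwind ? u : 0.5 * u

# A stable variant: the literal is converted to the type of `u`,
# so both branches return the same type.
stable_flux(u::T, upwind::Bool) where {T} = upwind ? u : T(0.5) * u

# Runtime view: the result type of `toy_flux` depends on the flag.
@show typeof(toy_flux(1.0f0, true))
@show typeof(toy_flux(1.0f0, false))
@show typeof(stable_flux(1.0f0, false))

# Compile-time view: what inference concludes for each function.
@show Base.return_types(toy_flux, (Float32, Bool))
@show Base.return_types(stable_flux, (Float32, Bool))
```

On the CPU such a function still runs, just less efficiently. Whether this kind of instability is what triggers the GPU error here is exactly what the comment above leaves open.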

@simone-silvestri
Collaborator

simone-silvestri commented Mar 10, 2025

Maybe using Metal and a hydrostatic free surface model might shed light on the instability.
I guess we probably have to change Oceananigans.defaults.FloatType to Float32.

@glwagner
Member

I tried that, but there is still promotion

@glwagner
Member

may have found the bug

@glwagner
Member

#4193

@glwagner
Member

Yeah so #4193 closes this, provided that we add

Oceananigans.defaults.FloatType = Float32

Another way is to specify Float32 in the advection scheme (this should work, but I didn't test it explicitly).

In the context of #4193, the error can be reproduced by setting the default to Float64 (or leaving it alone) and manually setting the grid to Float32.

So basically, sometimes promotion works (which is actually bad, but does not error) and other times it throws an error (which is actually what we want, but is surprising).
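
The promotion being described can be seen in plain Julia. The sketch below is an illustration under assumptions, not Oceananigans code: `ToyScheme` and `toy_advect` are hypothetical names standing in for an advection scheme that stores Float64 coefficients while the field values are Float32.

```julia
# Hypothetical stand-in for a scheme that carries a coefficient of type FT.
struct ToyScheme{FT}
    coefficient :: FT
end

# Hypothetical kernel-like function: scales a field value by the coefficient.
toy_advect(scheme::ToyScheme, u) = scheme.coefficient * u

u = 1.0f0                                   # a Float32 "field value"

mixed   = toy_advect(ToyScheme(0.5), u)     # Float64 scheme, Float32 field
uniform = toy_advect(ToyScheme(0.5f0), u)   # Float32 scheme, Float32 field

@show promote_type(Float32, Float64)
@show typeof(mixed)
@show typeof(uniform)
```

On the CPU the mixed case silently computes in Float64. The thread suggests that on the GPU the same mismatch can either promote quietly or fail to compile.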

@glwagner
Member

I think we should document how to change number type somewhere

@ali-ramadhan
Member Author

Quick catch! Weird how sometimes it promotes and sometimes it errors haha.

But definitely good to document how to properly change number type (probably mostly to Float32 for GPUs at least).

@glwagner
Member

Right I think we are hitting a compiler heuristic. When promotion is not completely inlined, we get an error.

@glwagner
Member

> Quick catch! Weird how sometimes it promotes and sometimes it errors haha.
>
> But definitely good to document how to properly change number type (probably mostly to Float32 for GPUs at least).

Where should we put this in the docs?

@ali-ramadhan
Member Author

Honestly I would advocate for a top-level page alongside the grids and fields pages. Could be called "Number type" or "Float precision"?

Right now I think we just have this in the legacy docs which definitely need embellishing: https://clima.github.io/OceananigansDocumentation/stable/model_setup/number_type/

@glwagner
Member

> Honestly I would advocate for a top-level page alongside the grids and fields pages. Could be called "Number type" or "Float precision"?
>
> Right now I think we just have this in the legacy docs which definitely need embellishing: https://clima.github.io/OceananigansDocumentation/stable/model_setup/number_type/

I was hoping we would eventually get around to adding tutorials for models. It might belong in such a page, or perhaps after it. Because then we can illustrate how it will affect all types simultaneously, not just one.

That said, building a tutorial for the models is a little daunting, whereas a simple page commenting on number type is easy, so maybe we should just put it up and worry about a model tutorial in the longer run.

@glwagner
Member

Although actually, a tutorial for models would mostly be lifting a lot of that existing material (e.g., just reorganizing it).

@ali-ramadhan
Member Author

ali-ramadhan commented Mar 14, 2025

@glwagner Does #4193 fix the MWE for you? I'm still getting the same error with Oceananigans v0.95.27.

This more minimal MWE (as suggested above) still produces the GPU compilation error:

using Oceananigans

underlying_grid = RectilinearGrid(GPU(), Float32;
    topology = (Bounded, Bounded, Bounded),
    size = (10, 10, 10),
    x = (0, 1),
    y = (0, 1),
    z = (-1, 0)
)

bottom(x, y) = -0.5

grid = ImmersedBoundaryGrid(underlying_grid, GridFittedBottom(bottom))

model = NonhydrostaticModel(; grid)

simulation = Simulation(model, Δt=0.01, stop_iteration=1)

run!(simulation)

@ali-ramadhan ali-ramadhan reopened this Mar 14, 2025
@glwagner
Member

Ah sorry! I could have been clearer. I think the correct answer is that what you have written is not supported. However, you can try this:

using Oceananigans

Oceananigans.defaults.FloatType = Float32

underlying_grid = RectilinearGrid(GPU();
    topology = (Bounded, Bounded, Bounded),
    size = (10, 10, 10),
    x = (0, 1),
    y = (0, 1),
    z = (-1, 0)
)

bottom(x, y) = -0.5

grid = ImmersedBoundaryGrid(underlying_grid, GridFittedBottom(bottom))

model = NonhydrostaticModel(; grid)

simulation = Simulation(model, Δt=0.01, stop_iteration=1)

run!(simulation)

One can still override the default FloatType for specific purposes / research, but this may cause compilation to fail.

Another way to make this code pass, without changing the default FloatType, is to also specify the advection scheme:

using Oceananigans

underlying_grid = RectilinearGrid(GPU(), Float32;
    topology = (Bounded, Bounded, Bounded),
    size = (10, 10, 10),
    x = (0, 1),
    y = (0, 1),
    z = (-1, 0)
)

bottom(x, y) = -0.5
grid = ImmersedBoundaryGrid(underlying_grid, GridFittedBottom(bottom))
advection = Centered(Float32, order=2)
model = NonhydrostaticModel(; grid, advection)

simulation = Simulation(model, Δt=0.01, stop_iteration=1)

run!(simulation)

I didn't test that so let me know if it works.

@ali-ramadhan
Member Author

ali-ramadhan commented Mar 14, 2025

Ah, thanks for clarifying! I thought changing the default to Centered(FT::DataType=Oceananigans.defaults.FloatType, ...) would fix the issue, but the default was still Float64 since I didn't change it, which makes sense.

I can confirm that both your examples work so I'll re-close the issue!
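
The behaviour confirmed here follows from how a default argument that reads a mutable setting works. The plain-Julia sketch below uses hypothetical names (`ToyDefaults`, `toy_defaults`, `ToyGrid`, `ToyCentered`) and only mimics the pattern `Centered(FT::DataType=Oceananigans.defaults.FloatType, ...)` quoted above; it is not the Oceananigans implementation.

```julia
# Hypothetical mutable settings object, mimicking `Oceananigans.defaults`.
mutable struct ToyDefaults
    FloatType :: DataType
end

const toy_defaults = ToyDefaults(Float64)

# Hypothetical stand-ins for a grid and an advection scheme.
struct ToyGrid{FT} end
struct ToyCentered{FT} end

# The default argument is evaluated at call time, so it reads the current setting.
ToyGrid(FT::DataType = toy_defaults.FloatType) = ToyGrid{FT}()
ToyCentered(FT::DataType = toy_defaults.FloatType) = ToyCentered{FT}()

# Case 1: only the grid is told to use Float32.
# The scheme still picks up the Float64 default, so the two disagree.
grid_a = ToyGrid(Float32)
scheme_a = ToyCentered()

# Case 2: change the default first; both objects then agree.
toy_defaults.FloatType = Float32
grid_b = ToyGrid()
scheme_b = ToyCentered()

@show typeof(grid_a) typeof(scheme_a)
@show typeof(grid_b) typeof(scheme_b)
```

Case 1 corresponds to the failing MWE (Float32 grid, untouched default), and case 2 to the first working example. Passing the type explicitly to the scheme, as in `Centered(Float32, order=2)`, is the other route shown in the thread.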
