Skip to content

Tilted BBL example not working on GPU #4438

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
xiaozhour opened this issue Apr 23, 2025 · 8 comments · Fixed by #4439
Closed

Tilted BBL example not working on GPU #4438

xiaozhour opened this issue Apr 23, 2025 · 8 comments · Fixed by #4439

Comments

@xiaozhour
Copy link
Collaborator

xiaozhour commented Apr 23, 2025

Hi! I’m trying to design a simulation based on the tilted BBL example, but it’s not working properly on GPUs. The error message is attached below. If I remove b = B∞_field from the background_fields = (; b = B∞_field, v = V∞_field) line, the simulation runs. Any help would be appreciated! (thanks for including this example - it’s been very helpful for me @tomchor )

ERROR: LoadError: GPU compilation of MethodInstance for Oceananigans.Models.NonhydrostaticModels.gpu_compute_Gu!(::KernelAbstractions.CompilerMetadata{…}, ::OffsetArrays.OffsetArray{…}, ::RectilinearGrid{…}, ::Nothing, ::Tuple{…}) failed
KernelError: passing non-bitstype argument

Argument 6 to your kernel function is of type Tuple{UpwindBiased{3, Float64, Nothing, Nothing, Nothing, UpwindBiased{2, Float64, Nothing, Nothing, Nothing, UpwindBiased{1, Float64, Nothing, Nothing, Nothing, Nothing, Centered{1, Float64, Nothing, Nothing, Nothing, Nothing}}, Centered{1, Float64, Nothing, Nothing, Nothing, Nothing}}, Centered{2, Float64, Nothing, Nothing, Nothing, Centered{1, Float64, Nothing, Nothing, Nothing, Nothing}}}, ConstantCartesianCoriolis{Float64}, Nothing, ScalarDiffusivity{Oceananigans.TurbulenceClosures.ExplicitTimeDiscretization, Oceananigans.TurbulenceClosures.ThreeDimensionalFormulation, @NamedTuple{b::Float64}, Float64, @NamedTuple{b::Float64}}, BoundaryCondition{Oceananigans.BoundaryConditions.Flux, Nothing}, BuoyancyForce{BuoyancyTracer, Tuple{Float64, Float64, Float64}}, @NamedTuple{velocities::@NamedTuple{u::Oceananigans.Fields.ZeroField{Int64, 3}, v::Oceananigans.Fields.ConstantField{Float64, 3}, w::Oceananigans.Fields.ZeroField{Int64, 3}}, tracers::@NamedTuple{b::Oceananigans.Fields.FunctionField{Center, Center, Center, @NamedTuple{time::Float64, last_Δt::Float64, last_stage_Δt::Float64, iteration::Int64, stage::Int64}, @NamedTuple{ĝ::Vector{Float64}, N²::Float64}, typeof(constant_stratification), RectilinearGrid{Float64, Periodic, Flat, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetArrays.OffsetVector{Float64, CUDA.CuDeviceVector{Float64, 1}}, OffsetArrays.OffsetVector{Float64, CUDA.CuDeviceVector{Float64, 1}}, OffsetArrays.OffsetVector{Float64, CUDA.CuDeviceVector{Float64, 1}}, OffsetArrays.OffsetVector{Float64, CUDA.CuDeviceVector{Float64, 1}}}, Float64, Float64, OffsetArrays.OffsetVector{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, Nothing, Nothing}, Float64}}}, @NamedTuple{u::OffsetArrays.OffsetArray{Float64, 3, CUDA.CuDeviceArray{Float64, 3, 1}}, v::OffsetArrays.OffsetArray{Float64, 3, CUDA.CuDeviceArray{Float64, 3, 1}}, w::OffsetArrays.OffsetArray{Float64, 3, CUDA.CuDeviceArray{Float64, 3, 1}}}, @NamedTuple{b::OffsetArrays.OffsetArray{Float64, 3, CUDA.CuDeviceArray{Float64, 3, 1}}}, @NamedTuple{}, Nothing, OffsetArrays.OffsetArray{Float64, 3, CUDA.CuDeviceArray{Float64, 3, 1}}, @NamedTuple{time::Float64, last_Δt::Float64, last_stage_Δt::Float64, iteration::Int64, stage::Int64}, typeof(Oceananigans.Forcings.zeroforcing)}, which is not a bitstype:
  .7 is of type @NamedTuple{velocities::@NamedTuple{u::Oceananigans.Fields.ZeroField{Int64, 3}, v::Oceananigans.Fields.ConstantField{Float64, 3}, w::Oceananigans.Fields.ZeroField{Int64, 3}}, tracers::@NamedTuple{b::Oceananigans.Fields.FunctionField{Center, Center, Center, @NamedTuple{time::Float64, last_Δt::Float64, last_stage_Δt::Float64, iteration::Int64, stage::Int64}, @NamedTuple{ĝ::Vector{Float64}, N²::Float64}, typeof(constant_stratification), RectilinearGrid{Float64, Periodic, Flat, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetArrays.OffsetVector{Float64, CUDA.CuDeviceVector{Float64, 1}}, OffsetArrays.OffsetVector{Float64, CUDA.CuDeviceVector{Float64, 1}}, OffsetArrays.OffsetVector{Float64, CUDA.CuDeviceVector{Float64, 1}}, OffsetArrays.OffsetVector{Float64, CUDA.CuDeviceVector{Float64, 1}}}, Float64, Float64, OffsetArrays.OffsetVector{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, Nothing, Nothing}, Float64}}} which is not isbits.
    .tracers is of type @NamedTuple{b::Oceananigans.Fields.FunctionField{Center, Center, Center, @NamedTuple{time::Float64, last_Δt::Float64, last_stage_Δt::Float64, iteration::Int64, stage::Int64}, @NamedTuple::Vector{Float64}, N²::Float64}, typeof(constant_stratification), RectilinearGrid{Float64, Periodic, Flat, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetArrays.OffsetVector{Float64, CUDA.CuDeviceVector{Float64, 1}}, OffsetArrays.OffsetVector{Float64, CUDA.CuDeviceVector{Float64, 1}}, OffsetArrays.OffsetVector{Float64, CUDA.CuDeviceVector{Float64, 1}}, OffsetArrays.OffsetVector{Float64, CUDA.CuDeviceVector{Float64, 1}}}, Float64, Float64, OffsetArrays.OffsetVector{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, Nothing, Nothing}, Float64}} which is not isbits.
      .b is of type Oceananigans.Fields.FunctionField{Center, Center, Center, @NamedTuple{time::Float64, last_Δt::Float64, last_stage_Δt::Float64, iteration::Int64, stage::Int64}, @NamedTuple::Vector{Float64}, N²::Float64}, typeof(constant_stratification), RectilinearGrid{Float64, Periodic, Flat, Bounded, Oceananigans.Grids.StaticVerticalDiscretization{OffsetArrays.OffsetVector{Float64, CUDA.CuDeviceVector{Float64, 1}}, OffsetArrays.OffsetVector{Float64, CUDA.CuDeviceVector{Float64, 1}}, OffsetArrays.OffsetVector{Float64, CUDA.CuDeviceVector{Float64, 1}}, OffsetArrays.OffsetVector{Float64, CUDA.CuDeviceVector{Float64, 1}}}, Float64, Float64, OffsetArrays.OffsetVector{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, Nothing, Nothing}, Float64} which is not isbits.
        .parameters is of type @NamedTuple::Vector{Float64}, N²::Float64} which is not isbits.
          .ĝ is of type Vector{Float64} which is not isbits.


Only bitstypes, which are "plain data" types that are immutable
and contain no references to other values, can be used in GPU kernels.
For more information, see the `Base.isbitstype` function.
@simone-silvestri
Copy link
Collaborator

simone-silvestri commented Apr 23, 2025

I think that example is not GPU-compatible. The problem is that is a Vector{Float64}, i.e. a CPU vector which is not an isbits type.
To make it such, you should change

ĝ = [sind(θ), 0, cosd(θ)]

to

= [sind(θ), 0, cosd(θ)]
ĝ = Oceananigans.on_architecture(arch, ĝ) # assuming you have an `arch` variable with `arch == GPU()`

Another way to do it is to convert the to an immutable isbits type like a tuple:

= (sind(θ), 0, cosd(θ))

@xiaozhour
Copy link
Collaborator Author

Thanks Simone! Converting to a tuple gives the following error:
ERROR: LoadError: MethodError: no method matching -(::Tuple{Float64, Int64, Float64})

Using the first change gives the following error:
ERROR: LoadError: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations do not execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.

@simone-silvestri
Copy link
Collaborator

ok, we should go the tuple route I think. Can you show me the location of the error? (i.e. at which line the errors are encountered in the stacktrace) Can you also attach the snippet of your code?

@xiaozhour
Copy link
Collaborator Author

The error appears in the next line "buoyancy = BuoyancyForce(BuoyancyTracer(), gravity_unit_vector = -ĝ)"

@xiaozhour
Copy link
Collaborator Author

xiaozhour commented Apr 23, 2025

using Oceananigans
using Oceananigans.Units

Lx = 200meters
Lz = 100meters
Nx = 64
Nz = 64

# Creates a grid with near-constant spacing `refinement * Lz / Nz`
# near the bottom:
refinement = 1.8 # controls spacing near surface (higher means finer spaced)
stretching = 10  # controls rate of stretching at bottom

# "Warped" height coordinate
h(k) = (Nz + 1 - k) / Nz

# Linear near-surface generator
ζ(k) = 1 + (h(k) - 1) / refinement

# Bottom-intensified stretching function
Σ(k) = (1 - exp(-stretching * h(k))) / (1 - exp(-stretching))

# Generating function
z_faces(k) = - Lz * (ζ(k) * Σ(k) - 1)


grid = RectilinearGrid(GPU(), topology = (Periodic, Flat, Bounded),
                       size = (Nx, Nz),
                       x = (0, Lx),
                       z = z_faces)


θ = 3 # degrees
ĝ = [sind(θ), 0, cosd(θ)]
# ĝ = Oceananigans.on_architecture(GPU(), ĝ)

buoyancy = BuoyancyForce(BuoyancyTracer(), gravity_unit_vector = -ĝ)
coriolis = ConstantCartesianCoriolis(f = 1e-4, rotation_axis = ĝ)

@inline constant_stratification(x, z, t, p) = p.* (x * p.ĝ[1] + z * p.ĝ[3])
N² = 1e-5 # s⁻² # background vertical buoyancy gradient
B∞_field = BackgroundField(constant_stratification, parameters=(; ĝ, N² = N²))

∂z_b_bottom = -* cosd(θ)
negative_background_diffusive_flux = GradientBoundaryCondition(∂z_b_bottom)
b_bcs = FieldBoundaryConditions(bottom = negative_background_diffusive_flux)


V∞ = 0.1 # m s⁻¹


u_bcs = FieldBoundaryConditions(top = FluxBoundaryCondition(nothing),
                               bottom = ValueBoundaryCondition(0.0))
v_bcs = FieldBoundaryConditions(top = FluxBoundaryCondition(nothing),
                               bottom = ValueBoundaryCondition(-V∞))



V∞_field = BackgroundField(V∞)


ν = 1e-4
κ = 1e-4
closure = ScalarDiffusivity=ν, κ=κ)

model = NonhydrostaticModel(; grid, buoyancy, coriolis, closure,
                            advection = UpwindBiased(order=5),
                            tracers = :b,
                            boundary_conditions = (u=u_bcs, v=v_bcs, b=b_bcs),
                            background_fields = (; b=B∞_field, v=V∞_field))


noise(x, z) = 1e-3 * randn() * exp(-(10z)^2 / grid.Lz^2)
set!(model, u=noise, w=noise)


# Δt₀ = 0.5 * minimum([minimum_zspacing(grid) / V∞, minimum_zspacing(grid)^2/κ])
# simulation = Simulation(model, Δt = Δt₀, stop_time = 1day)

simulation = Simulation(model, Δt=0.1, stop_time=8days)


wizard = TimeStepWizard(max_change=1.1, cfl=0.7)
simulation.callbacks[:wizard] = Callback(wizard, IterationInterval(4))


using Printf

start_time = time_ns() # so we can print the total elapsed wall time

progress_message(sim) =
    @printf("Iteration: %04d, time: %s, Δt: %s, max|w|: %.1e m s⁻¹, wall time: %s\n",
            iteration(sim), prettytime(time(sim)),
            prettytime(sim.Δt), maximum(abs, sim.model.velocities.w),
            prettytime((time_ns() - start_time) * 1e-9))

simulation.callbacks[:progress] = Callback(progress_message, IterationInterval(200))

u, v, w = model.velocities
b = model.tracers.b
B∞ = model.background_fields.tracers.b

B = b + B∞
V = v + V∞


outputs = (; u, V, w, B)


simulation.output_writers[:fields] =  JLD2Writer(model, outputs;
                                                    filename = "bottom_drag.jld2",
                                                    schedule = TimeInterval(180minutes),
                                                    overwrite_existing = true)

run!(simulation) 

@glwagner
Copy link
Member

Yeah, we just need to use a tuple, plus .- to compute the opposing vector.

@glwagner
Copy link
Member

which is done here: #4439

@xiaozhour
Copy link
Collaborator Author

Works now - thanks all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants