Description
Hi,
I built Palace with CUDA enabled and ran examples/cpw/cpw_lumped_uniform.json after changing config["Solver"]["Device"] to "GPU".
When execution reaches the power calculation, a segmentation fault occurs (see the log below).
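For reference, the config change described above can be sketched in Python as follows. This is a minimal sketch, not part of Palace itself: the helper names `with_device` and `rewrite_config` are hypothetical, and it assumes the config file is plain JSON that `json.load` can parse.

```python
import copy
import json


def with_device(config: dict, device: str = "GPU") -> dict:
    """Return a copy of a config dict with config["Solver"]["Device"] set."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    updated.setdefault("Solver", {})["Device"] = device
    return updated


def rewrite_config(src_path: str, dst_path: str, device: str = "GPU") -> None:
    """Read a JSON config, set the solver device, and write a new file.

    Assumption: the file at src_path is plain JSON (no comments).
    """
    with open(src_path) as f:
        config = json.load(f)
    with open(dst_path, "w") as f:
        json.dump(with_device(config, device), f, indent=2)


if __name__ == "__main__":
    # In-memory stand-in for the real example config.
    example = {"Solver": {"Order": 1, "Device": "CPU"}}
    print(json.dumps(with_device(example), indent=2))
```

To apply it to the example from this report, one could call `rewrite_config("examples/cpw/cpw_lumped_uniform.json", "cpw_lumped_uniform_gpu.json")`, where the output file name is only an illustration.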
Running with 1 MPI process, 1 OpenMP thread
Detected 2 CUDA devices
Device configuration: cuda,omp,cpu
Memory configuration: host-std,cuda
libCEED backend: /gpu/cuda/magma
Added 1796 elements in 2 iterations of local bisection for under-resolved interior boundaries
Added 980 duplicate vertices for interior boundaries in the mesh
Added 2752 duplicate boundary elements for interior boundaries in the mesh
Added 466 boundary elements for material interfaces to the mesh
Finished partitioning mesh into 1 subdomain
Characteristic length and time scales:
L₀ = 4.500e-03 m, t₀ = 1.501e-02 ns
Mesh curvature order: 1
Mesh bounding box:
(Xmin, Ymin, Zmin) = (-2.500e-04, -5.000e-04, -1.000e-03) m
(Xmax, Ymax, Zmax) = (+4.250e-03, +2.432e-03, +1.000e-03) m
Parallel Mesh Stats:
               minimum     average     maximum       total
vertices          4171        4171        4171        4171
edges            24081       24081       24081       24081
faces            36718       36718       36718       36718
elements         16810       16810       16810       16810
neighbors            0           0           0
               minimum     maximum
h           0.00281235    0.167882
kappa          1.06818      14.748
Configuring Robin absorbing BC (order 1) at attributes:
4
Configuring Robin impedance BC for lumped ports at attributes:
5: Rs = 2.241e+02 Ω/sq, n = (+0.0,+0.0,+1.0)
9: Rs = 2.241e+02 Ω/sq, n = (-0.0,+0.0,+1.0)
6: Rs = 2.241e+02 Ω/sq, n = (+0.0,+0.0,+1.0)
10: Rs = 2.241e+02 Ω/sq, n = (-0.0,+0.0,+1.0)
7: Rs = 2.241e+02 Ω/sq, n = (+0.0,+0.0,+1.0)
11: Rs = 2.241e+02 Ω/sq, n = (+0.0,+0.0,+1.0)
8: Rs = 2.241e+02 Ω/sq, n = (-0.0,+0.0,+1.0)
12: Rs = 2.241e+02 Ω/sq, n = (+0.0,+0.0,+1.0)
Configuring lumped port circuit properties:
Index = 1: R = 5.602e+01 Ω
Index = 2: R = 5.602e+01 Ω
Index = 3: R = 5.602e+01 Ω
Index = 4: R = 5.602e+01 Ω
Configuring lumped port excitation source term at attributes:
5: Index = 1
9: Index = 1
6: Index = 2
10: Index = 2
Configuring Dirichlet PEC BC at attributes:
13
Computing frequency response for:
Excitation 1/2 with index 1 has contributions from:
Lumped port 1
Excitation 2/2 with index 2 has contributions from:
Lumped port 2
Assembling system matrices, number of global unknowns:
H1 (p = 1): 4171, ND (p = 1): 24081, RT (p = 1): 36718
Operator assembly level: Partial
Mesh geometries:
Tetrahedron: P = 6, Q = 4 (quadrature order = 2)
Sweeping excitation index 1 (1/2):
It 1/3: ω/2π = 2.000e+00 GHz (total elapsed time = 1.37e-04 s)
Assembling multigrid hierarchy:
Level 0 (p = 1): 24081 unknowns
Level 0 (auxiliary) (p = 1): 4171 unknowns
Residual norms for GMRES solve
0 (restart 0) KSP residual norm 3.781952e+01
1 (restart 0) KSP residual norm 2.726848e+01
2 (restart 0) KSP residual norm 7.693393e-01
3 (restart 0) KSP residual norm 5.982337e-01
4 (restart 0) KSP residual norm 5.411356e-01
5 (restart 0) KSP residual norm 8.857215e-03
6 (restart 0) KSP residual norm 1.755391e-03
7 (restart 0) KSP residual norm 1.007856e-03
8 (restart 0) KSP residual norm 7.717752e-05
9 (restart 0) KSP residual norm 2.299467e-05
10 (restart 0) KSP residual norm 4.280393e-06
11 (restart 0) KSP residual norm 9.448374e-07
12 (restart 0) KSP residual norm 1.959289e-07
GMRES solver converged in 12 iterations (avg. reduction factor: 2.040e-01)
Sol. ||E|| = 1.153545e+01 (||RHS|| = 3.261025e-01)
Field energy E (3.602e-02 J) + H (7.019e-03 J) = 4.304e-02 J
Segmentation fault (core dumped)
Do you have any suggestions for resolving this?
Thanks!