CUDA_ERROR_LAUNCH_FAILED when running GMMNLSE_driver_gpu_1550_linear.m #3

Open
@SeveNOlogy7

Hi,

I came across a CUDA_ERROR_LAUNCH_FAILED error when trying to run the GMMNLSE_driver_gpu_1550_linear.m file. The full error output is as follows.

Warning: An unexpected error occurred during CUDA execution. The CUDA error was:
CUDA_ERROR_LAUNCH_FAILED 
> In GMMNLSE_propagate (line 37)
  In GMMNLSE_driver_gpu_1550_linear (line 74) 
Error using gpuArray/ifft
An error occurred during PTX compilation of <image>.
The information log was:

The error log was:

The CUDA error code was: CUDA_ERROR_LAUNCH_FAILED.

Error in GMMNLSE_MPA_step (line 96)
    Vpl = dt*fft(hrw.*ifft(Vpl));

Error in GMMNLSE_propagate (line 308)
        [num_it, last_result] = GMMNLSE_MPA_step(last_result, initial_condition.dt, sim, nonlin_const, mode_info, omegas, D_pos,
        D_neg, hrw);

Error in GMMNLSE_driver_gpu_1550_linear (line 74)
prop_output = GMMNLSE_propagate(fiber, initial_condition, sim); % This actually does the propagation

I believe my CUDA and Visual C++ toolchains are working correctly, since other driver scripts such as GMMNLSE_driver_gpu_1550_GRINMMS_XPM.m run normally on my PC.

The error may be related to Line 22 of GMMNLSE_driver_gpu_1550_linear.m, fiber.SR = 0*SR; . Once I changed that line to something like fiber.SR = 1e-6*SR; , the error disappeared and the results were almost the same as those shown in the Advanced Example pdf.
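
The change to Line 22 can be sketched as below. The stand-in value for SR is hypothetical and only there so the snippet runs on its own; in the actual driver, SR is already defined before Line 22. The factor 1e-6 is simply the value tried here, not a recommended constant.

```matlab
% Hypothetical stand-in: in the real driver script SR already exists at this point.
SR = ones(2, 2, 2, 2);

fiber = struct();
% fiber.SR = 0*SR;     % original Line 22: all-zero SR, the case that raised CUDA_ERROR_LAUNCH_FAILED
fiber.SR = 1e-6*SR;    % workaround: small but nonzero scaling (1e-6 is an arbitrary choice)
```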

This works as a temporary workaround for me, but I don't know whether other people have hit the same issue. It would also be nice if anyone knows a more proper solution for this error. Thanks.

I will post my computing environment here for your reference.

Intel Xeon CPU E5-2620

CUDADevice with properties:

                  Name: 'Quadro K620'
                 Index: 1
     ComputeCapability: '5.0'
        SupportsDouble: 1
         DriverVersion: 9.1000
        ToolkitVersion: 8
    MaxThreadsPerBlock: 1024
      MaxShmemPerBlock: 49152
    MaxThreadBlockSize: [1024 1024 64]
           MaxGridSize: [2.1475e+09 65535 65535]
             SIMDWidth: 32
           TotalMemory: 2.1475e+09
   MultiprocessorCount: 3
          ClockRateKHz: 1124000
           ComputeMode: 'Default'
  GPUOverlapsTransfers: 1
KernelExecutionTimeout: 0
      CanMapHostMemory: 1
       DeviceSupported: 1
        DeviceSelected: 1

Windows 10, version 1709

Cuda compilation tools, release 9.1, V9.1.85

Visual C++ 2015 toolset
