The main target built with CMakeLists.txt runs much slower than the main built with the Makefile #440

chenqianhe · 2023-01-24T06:12:19Z

Output of main built with make:

whisper_init_from_file: loading model from 'ggml-large.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1280
whisper_model_load: n_audio_head  = 20
whisper_model_load: n_audio_layer = 32
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1280
whisper_model_load: n_text_head   = 20
whisper_model_load: n_text_layer  = 32
whisper_model_load: n_mels        = 80
whisper_model_load: f16           = 1
whisper_model_load: type          = 5
whisper_model_load: mem required  = 4641.00 MB (+   71.00 MB per decoder)
whisper_model_load: kv self size  =   70.00 MB
whisper_model_load: kv cross size =  234.38 MB
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx     = 2950.97 MB
whisper_model_load: model size    = 2950.66 MB

system_info: n_threads = 4 / 8 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 | 

main: processing 'samples_jfk.wav' (176000 samples, 11.0 sec), 4 threads, 1 processors, lang = zh, task = transcribe, timestamps = 1 ...


[00:00:00.000 --> 00:00:11.000]  And so my fellow Americans, ask not what your country can do for you, ask what you can do for your country.


whisper_print_timings:     fallbacks =   0 p /   0 h
whisper_print_timings:     load time =  2146.14 ms
whisper_print_timings:      mel time =    19.26 ms
whisper_print_timings:   sample time =    11.52 ms /    27 runs (    0.43 ms per run)
whisper_print_timings:   encode time =  6265.27 ms /     1 runs ( 6265.27 ms per run)
whisper_print_timings:   decode time =  1646.73 ms /    27 runs (   60.99 ms per run)
whisper_print_timings:    total time = 10142.16 ms

Output of main built with cmake:

whisper_init_from_file: loading model from '../../ggml-large.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1280
whisper_model_load: n_audio_head  = 20
whisper_model_load: n_audio_layer = 32
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1280
whisper_model_load: n_text_head   = 20
whisper_model_load: n_text_layer  = 32
whisper_model_load: n_mels        = 80
whisper_model_load: f16           = 1
whisper_model_load: type          = 5
whisper_model_load: mem required  = 4641.00 MB (+   71.00 MB per decoder)
whisper_model_load: kv self size  =   70.00 MB
whisper_model_load: kv cross size =  234.38 MB
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx     = 2950.97 MB
whisper_model_load: model size    = 2950.66 MB

system_info: n_threads = 4 / 8 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 | 

main: processing '../../samples_jfk.wav' (176000 samples, 11.0 sec), 4 threads, 1 processors, lang = zh, task = transcribe, timestamps = 1 ...


[00:00:00.000 --> 00:00:11.000]  And so my fellow Americans, ask not what your country can do for you, ask what you can do for your country.


whisper_print_timings:     fallbacks =   0 p /   0 h
whisper_print_timings:     load time = 27118.13 ms
whisper_print_timings:      mel time =   107.82 ms
whisper_print_timings:   sample time =    46.74 ms /    27 runs (    1.73 ms per run)
whisper_print_timings:   encode time = 63709.25 ms /     1 runs (63709.25 ms per run)
whisper_print_timings:   decode time =  8359.73 ms /    27 runs (  309.62 ms per run)
whisper_print_timings:    total time = 99418.73 ms

Both results were obtained on the same device.

I wonder why the CMake-built main is so much slower. Am I doing something wrong?
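
To put a number on the gap, the ratios between the two timing reports above can be computed directly. This is only a small sketch: `slowdown` is a hypothetical helper name, and the values passed to it are copied from the `whisper_print_timings` lines of the two runs.

```shell
# slowdown SLOW_MS FAST_MS
# Hypothetical helper: prints how many times larger SLOW_MS is than FAST_MS.
slowdown() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.2f\n", slow / fast }'
}

slowdown 99418.73 10142.16   # total time:  cmake build vs make build
slowdown 63709.25 6265.27    # encode time: cmake build vs make build
slowdown 8359.73 1646.73     # decode time: cmake build vs make build
```

The cmake build comes out roughly 10x slower in total and encode time and roughly 5x slower in decode time.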

chenqianhe · 2023-01-24T06:36:01Z

My goal is to optimize #260: I am implementing a Node.js addon that calls the Whisper inference implemented in C++, but it depends on CMake.

This is another addon that I have implemented. It is crucial that I get CMakeLists.txt working.

ggerganov · 2023-01-24T09:16:48Z

Most likely your CMake build is in Debug mode.
Try the following:

rm CMakeCache.txt
cmake -DCMAKE_BUILD_TYPE=Release ../../
make
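
Before deleting the cache, it can help to confirm which build type the existing build directory was configured with. The sketch below assumes the usual `NAME:TYPE=VALUE` entry format of `CMakeCache.txt`; `cmake_build_type` is a hypothetical helper name, not part of CMake or whisper.cpp.

```shell
# cmake_build_type [CACHE_FILE]
# Hypothetical helper: prints the CMAKE_BUILD_TYPE value recorded in a CMake
# cache file (defaults to ./CMakeCache.txt). Prints an empty line or nothing
# when no build type was selected at configure time.
cmake_build_type() {
    cache_file="${1:-CMakeCache.txt}"
    # Cache entries look like NAME:TYPE=VALUE; strip everything up to '='.
    sed -n 's/^CMAKE_BUILD_TYPE:[A-Z]*=//p' "$cache_file"
}
```

Run from the build directory, `cmake_build_type` should print `Release` after the commands above. An empty result means no build type was selected, in which case CMake adds no build-type-specific optimization flags.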

chenqianhe · 2023-01-24T09:57:38Z

Thank you very much!
That was indeed the cause.

chenqianhe closed this as completed Jan 24, 2023
