Description
Current Behavior
The image below causes a floating-point exception (SIGFPE) under Ubuntu (WSL) when using the legacy model (--oem 0) with psm_mode = 7. The cause is an invalid input to the asinf function: the argument is -1.00000012, which lies slightly outside the valid range [-1, 1]. The program then terminates with SIGFPE. Notably, this issue does not occur under Windows.
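The domain error itself can be reproduced outside of tesseract. The following is a minimal sketch, not code from tesseract: the helper names are made up for illustration. It checks whether asinf raises the FE_INVALID flag for the float immediately below -1.0f, which is the value gdb prints as -1.00000012. By default this only sets a flag and returns NaN; a SIGFPE like the one reported here would additionally require floating-point traps to be enabled in the process, which is an assumption about this build and is not shown in the backtrace.

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper: returns true if asin(x) raises the FE_INVALID flag.
bool asinf_raises_invalid(float x) {
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile float input = x;                  // volatile keeps the call from being folded away
  volatile float result = std::asin(input);  // float overload, i.e. asinf
  (void)result;
  return std::fetestexcept(FE_INVALID) != 0;
}

// Hypothetical helper: the float immediately below -1.0f (-1 - 2^-23).
float just_below_minus_one() {
  return std::nextafter(-1.0f, -2.0f);
}
```

Calling asinf_raises_invalid(just_below_minus_one()) is expected to return true on an IEEE 754 conforming libm, while asinf_raises_invalid(-1.0f) is expected to return false.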
Backtrace:
The backtrace indicates that the error originates from the tesseract::Wordrec::angle_change function (see "Other Information" below for the full output).
tesseract command:
tesseract.exe -l eng+deu "tesseract_fail.png" stdout --tessdata-dir "<TESSDATA_DIR>" --oem 0 --psm 7
I used the legacy models for English and German from tesseract-ocr/tessdata.
Interestingly, when the single "d" in the bottom part of the image is moved one pixel up or to the right, the exception is no longer thrown.
I will gladly provide additional information if needed.
Image to reproduce the behavior:
Expected Behavior
Tesseract should handle the input gracefully without causing a floating-point exception.
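One way "gracefully" could look, as a sketch only: clamp the argument into [-1, 1] before calling asin, so that a value that overshoots the domain by rounding error is mapped to the nearest valid input. The helper clamped_asin below is hypothetical and not part of tesseract, and it assumes C++17 for std::clamp. How angle_change actually computes its argument is not shown in this report.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper (not tesseract code): evaluates asin after clamping
// the argument into the valid domain [-1, 1]. Inputs such as -1.00000012
// are treated as -1 instead of producing NaN and an FE_INVALID exception.
float clamped_asin(float x) {
  return std::asin(std::clamp(x, -1.0f, 1.0f));
}
```

With this guard, clamped_asin(-1.00000012f) yields the same result as asin(-1.0f), and in-range inputs are unaffected. A NaN input still propagates as NaN, so the guard only covers the rounding overshoot described above.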
Suggested Fix
No response
tesseract -v
tesseract 5.3.4
leptonica-1.83.1
libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 3.0.0) : libpng 1.6.43 : libtiff 4.6.0 : zlib 1.2.13 : libwebp 1.4.0 : libopenjp2 2.5.2
Found AVX512BW
Found AVX512F
Found AVX512VNNI
Found AVX2
Found AVX
Found FMA
Found SSE4.1
Found OpenMP 201511
Found libarchive 3.7.2 zlib/1.2.13 liblzma/5.2.6 bz2lib/1.0.8 liblz4/1.9.3 libzstd/1.5.5
Operating System
No response
Other Operating System
Ubuntu inside Windows Subsystem for Linux (WSL)
Distributor ID: Ubuntu
Description: Ubuntu 22.04.4 LTS
Release: 22.04
Codename: jammy
uname -a
Linux 5.15.146.1-microsoft-standard-WSL2 #1 SMP Thu Jan 11 04:09:03 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Compiler
No response
CPU
No response
Virtualization / Containers
No response
Other Information
This is the output of bt:
(gdb) bt
#0 0x00007f66916bc552 in __GI___feraiseexcept (excepts=excepts@entry=1)
at ../sysdeps/x86_64/fpu/fraiseexcpt.c:36
#1 0x00007f66916c2590 in __asinf (x=-1.00000012) at ./math/w_asinf_compat.c:34
#2 __asinf (x=-1.00000012) at ./math/w_asinf_compat.c:28
#3 0x00007f6691f8dd63 in tesseract::Wordrec::angle_change(tesseract::EDGEPT*, tesseract::EDGEPT*, tesseract::EDGEPT*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#4 0x00007f6691f8e243 in tesseract::Wordrec::pick_close_point(tesseract::EDGEPT*, tesseract::EDGEPT*, int*)
() from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#5 0x00007f6691f8e66b in tesseract::Wordrec::vertical_projection_point(tesseract::EDGEPT*, tesseract::EDGEPT*, tesseract::EDGEPT**, tesseract::EDGEPT_CLIST*) ()
from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#6 0x00007f6691f935c6 in tesseract::Wordrec::try_vertical_splits(tesseract::EDGEPT**, short, tesseract::EDGEPT_CLIST*, tesseract::GenericHeap<tesseract::KDPtrPairInc<float, tesseract::SEAM> >, tesseract::GenericHeap<tesseract::KDPtrPairDec<float, tesseract::SEAM> >, tesseract::SEAM**, tesseract::TBLOB*) ()
from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#7 0x00007f6691f93c56 in tesseract::Wordrec::pick_good_seam(tesseract::TBLOB*) ()
from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#8 0x00007f6691f8fa43 in tesseract::Wordrec::attempt_blob_chop(tesseract::TWERD*, tesseract::TBLOB*, int, bool, std::vector<tesseract::SEAM*, std::allocatortesseract::SEAM* > const&) ()
from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#9 0x00007f6691f909b2 in tesseract::Wordrec::improve_one_blob(std::vector<tesseract::BLOB_CHOICE*, std::allocatortesseract::BLOB_CHOICE* > const&, std::vector<tesseract::DANGERR_INFO, std::allocatortesseract::DANGERR_INFO >, bool, bool, tesseract::WERD_RES, unsigned int*) ()
from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#10 0x00007f6691f90bd0 in tesseract::Wordrec::improve_by_chopping(float, tesseract::WERD_RES*, tesseract::BestChoiceBundle*, tesseract::BlamerBundle*, tesseract::LMPainPoints*, std::vector<tesseract::SegSearchPending, std::allocatortesseract::SegSearchPending >) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#11 0x00007f6691fa0a78 in tesseract::Wordrec::SegSearch(tesseract::WERD_RES, tesseract::BestChoiceBundle*, tesseract::BlamerBundle*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#12 0x00007f6691f8f0c8 in tesseract::Wordrec::chop_word_main(tesseract::WERD_RES*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#13 0x00007f6691f8cc6d in tesseract::Wordrec::cc_recog(tesseract::WERD_RES*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#14 0x00007f6691e5f71c in tesseract::Tesseract::recog_word_recursive(tesseract::WERD_RES*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#15 0x00007f6691e5f8c4 in tesseract::Tesseract::recog_word(tesseract::WERD_RES*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#16 0x00007f6691e5cb62 in tesseract::Tesseract::tess_segment_pass_n(int, tesseract::WERD_RES*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#17 0x00007f6691e04b52 in tesseract::Tesseract::match_word_pass_n(int, tesseract::WERD_RES*, tesseract::ROW*, tesseract::BLOCK*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#18 0x00007f6691e04d0b in tesseract::Tesseract::classify_word_pass1(tesseract::WordData const&, tesseract::WERD_RES**, tesseract::PointerVectortesseract::WERD_RES) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#19 0x00007f6691e0810a in tesseract::Tesseract::RetryWithLanguage(tesseract::WordData const&, void (tesseract::Tesseract::)(tesseract::WordData const&, tesseract::WERD_RES**, tesseract::PointerVectortesseract::WERD_RES), bool, tesseract::WERD_RES**, tesseract::PointerVectortesseract::WERD_RES) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#20 0x00007f6691e08b22 in tesseract::Tesseract::classify_word_and_language(int, tesseract::PAGE_RES_IT*, tesseract::WordData*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#21 0x00007f6691e0d41d in tesseract::Tesseract::RecogAllWordsPassN(int, tesseract::ETEXT_DESC*, tesseract::PAGE_RES_IT*, std::vector<tesseract::WordData, std::allocatortesseract::WordData >) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#22 0x00007f6691e0e464 in tesseract::Tesseract::recog_all_words(tesseract::PAGE_RES, tesseract::ETEXT_DESC*, tesseract::TBOX const*, char const*, int) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#23 0x00007f6691ddff64 in tesseract::TessBaseAPI::Recognize(tesseract::ETEXT_DESC*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#24 0x00007f6691de056b in tesseract::TessBaseAPI::ProcessPage(Pix*, int, char const*, char const*, int, tesseract::TessResultRenderer*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#25 0x00007f6691de18e1 in tesseract::TessBaseAPI::ProcessPagesInternal(char const*, char const*, int, tesseract::TessResultRenderer*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#26 0x00007f6691de1adf in tesseract::TessBaseAPI::ProcessPages(char const*, char const*, int, tesseract::TessResultRenderer*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#27 0x0000556ecc08455b in main ()