Floating-point exception (SIGFPE) due to out-of-range input to asinf in Wordrec::angle_change #4242

@ChristianOsta

Current Behavior

The image below causes a floating-point exception (SIGFPE) on Ubuntu (WSL) when using the legacy model with psm_mode = 7. The exception is triggered because the argument passed to asinf is slightly outside its valid range of [-1, 1], specifically -1.00000012, and the program terminates with SIGFPE. Notably, the issue does not occur on Windows.
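
To illustrate the failure mode outside Tesseract, here is a minimal C++ sketch. It assumes a glibc-style math library in which asinf returns NaN and raises FE_INVALID for arguments outside [-1, 1]. The names AsinProbe, probe_asinf and just_below_minus_one are made up for this example and do not exist in Tesseract. Whether the raised flag turns into SIGFPE depends on floating-point traps being enabled in the process, which this sketch deliberately does not do.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Hypothetical helper type: the result of one asinf call plus the FE_INVALID flag.
struct AsinProbe {
  float value;        // what asinf returned
  bool invalid_flag;  // whether FE_INVALID was set by the call
};

// Calls the float overload of asin (equivalent to asinf) and reports the outcome.
// The volatile copy keeps the compiler from folding the call at compile time.
AsinProbe probe_asinf(float x) {
  volatile float input = x;
  std::feclearexcept(FE_ALL_EXCEPT);
  float result = std::asin(input);
  bool invalid = std::fetestexcept(FE_INVALID) != 0;
  return {result, invalid};
}

// The largest float strictly below -1.0f, i.e. -(1 + 2^-23).
// Printed with nine significant digits it reads -1.00000012.
float just_below_minus_one() {
  float v = std::nextafter(-1.0f, -2.0f);
  assert(v < -1.0f);  // sanity check: we really left the valid domain
  return v;
}
```

Calling probe_asinf(just_below_minus_one()) should yield a NaN value, while probe_asinf(-1.0f) stays finite. In the backtrace, frame #0 shows feraiseexcept being called with excepts=1, which on x86-64 glibc is the value of FE_INVALID, so this appears to be the same condition.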

Backtrace:
The backtrace (see "Other Information" below) shows that the error originates in the tesseract::Wordrec::angle_change function.

Tesseract command:
tesseract.exe -l eng+deu "tesseract_fail.png" stdout --tessdata-dir "<TESSDATA_DIR>" --oem 0 --psm 7

I used the legacy models for English and German from tesseract-ocr/tessdata.

Interestingly, when the single "d" in the bottom part of the image is moved one pixel up or to the right, the exception is no longer thrown.

I will gladly provide additional information if needed.

Image to reproduce the behavior:
tesseract_crash

Expected Behavior

Tesseract should handle the input gracefully without causing a floating-point exception.
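
One possible way to get there is sketched below. It assumes the out-of-range value is only rounding error from a single-precision computation, in which case the argument can be clamped to [-1, 1] before asinf is called. clamped_asinf is a hypothetical helper, not existing Tesseract code, and where exactly such a guard would belong inside Wordrec::angle_change is not established here.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of Tesseract): arcsine that tolerates a small
// overshoot beyond [-1, 1] by clamping the argument first.
// NaN fails both comparisons, so it passes through unchanged and the result is NaN.
float clamped_asinf(float x) {
  if (x < -1.0f) {
    x = -1.0f;
  } else if (x > 1.0f) {
    x = 1.0f;
  }
  // Postcondition: x is inside the domain (written so that NaN does not trip it).
  assert(!(x < -1.0f) && !(x > 1.0f));
  return std::asin(x);  // float overload, equivalent to asinf
}
```

With this guard, clamped_asinf(-1.00000012f) returns the same value as asinf(-1.0f) instead of raising an invalid-operation exception. Clamping would also mask a genuinely wrong input, so it only seems appropriate if the overshoot is known to be rounding noise.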

Suggested Fix

No response

tesseract -v

tesseract 5.3.4
leptonica-1.83.1
libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 3.0.0) : libpng 1.6.43 : libtiff 4.6.0 : zlib 1.2.13 : libwebp 1.4.0 : libopenjp2 2.5.2
Found AVX512BW
Found AVX512F
Found AVX512VNNI
Found AVX2
Found AVX
Found FMA
Found SSE4.1
Found OpenMP 201511
Found libarchive 3.7.2 zlib/1.2.13 liblzma/5.2.6 bz2lib/1.0.8 liblz4/1.9.3 libzstd/1.5.5

Operating System

No response

Other Operating System

Ubuntu inside Windows Subsystem for Linux (WSL)

Distributor ID: Ubuntu
Description: Ubuntu 22.04.4 LTS
Release: 22.04
Codename: jammy

uname -a

Linux 5.15.146.1-microsoft-standard-WSL2 #1 SMP Thu Jan 11 04:09:03 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Compiler

No response

CPU

No response

Virtualization / Containers

No response

Other Information

This is the output of bt:
(gdb) bt
#0 0x00007f66916bc552 in __GI___feraiseexcept (excepts=excepts@entry=1)
at ../sysdeps/x86_64/fpu/fraiseexcpt.c:36
#1 0x00007f66916c2590 in __asinf (x=-1.00000012) at ./math/w_asinf_compat.c:34
#2 __asinf (x=-1.00000012) at ./math/w_asinf_compat.c:28
#3 0x00007f6691f8dd63 in tesseract::Wordrec::angle_change(tesseract::EDGEPT*, tesseract::EDGEPT*, tesseract::EDGEPT*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#4 0x00007f6691f8e243 in tesseract::Wordrec::pick_close_point(tesseract::EDGEPT*, tesseract::EDGEPT*, int*)
() from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#5 0x00007f6691f8e66b in tesseract::Wordrec::vertical_projection_point(tesseract::EDGEPT*, tesseract::EDGEPT*, tesseract::EDGEPT**, tesseract::EDGEPT_CLIST*) ()
from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#6 0x00007f6691f935c6 in tesseract::Wordrec::try_vertical_splits(tesseract::EDGEPT**, short, tesseract::EDGEPT_CLIST*, tesseract::GenericHeap<tesseract::KDPtrPairInc<float, tesseract::SEAM> >, tesseract::GenericHeap<tesseract::KDPtrPairDec<float, tesseract::SEAM> >, tesseract::SEAM**, tesseract::TBLOB*) ()
from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#7 0x00007f6691f93c56 in tesseract::Wordrec::pick_good_seam(tesseract::TBLOB*) ()
from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#8 0x00007f6691f8fa43 in tesseract::Wordrec::attempt_blob_chop(tesseract::TWERD*, tesseract::TBLOB*, int, bool, std::vector<tesseract::SEAM*, std::allocator<tesseract::SEAM*> > const&) ()
from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#9 0x00007f6691f909b2 in tesseract::Wordrec::improve_one_blob(std::vector<tesseract::BLOB_CHOICE*, std::allocator<tesseract::BLOB_CHOICE*> > const&, std::vector<tesseract::DANGERR_INFO, std::allocator<tesseract::DANGERR_INFO> >, bool, bool, tesseract::WERD_RES, unsigned int*) ()
from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#10 0x00007f6691f90bd0 in tesseract::Wordrec::improve_by_chopping(float, tesseract::WERD_RES*, tesseract::BestChoiceBundle*, tesseract::BlamerBundle*, tesseract::LMPainPoints*, std::vector<tesseract::SegSearchPending, std::allocator<tesseract::SegSearchPending> >) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#11 0x00007f6691fa0a78 in tesseract::Wordrec::SegSearch(tesseract::WERD_RES*, tesseract::BestChoiceBundle*, tesseract::BlamerBundle*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#12 0x00007f6691f8f0c8 in tesseract::Wordrec::chop_word_main(tesseract::WERD_RES*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#13 0x00007f6691f8cc6d in tesseract::Wordrec::cc_recog(tesseract::WERD_RES*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#14 0x00007f6691e5f71c in tesseract::Tesseract::recog_word_recursive(tesseract::WERD_RES*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#15 0x00007f6691e5f8c4 in tesseract::Tesseract::recog_word(tesseract::WERD_RES*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#16 0x00007f6691e5cb62 in tesseract::Tesseract::tess_segment_pass_n(int, tesseract::WERD_RES*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#17 0x00007f6691e04b52 in tesseract::Tesseract::match_word_pass_n(int, tesseract::WERD_RES*, tesseract::ROW*, tesseract::BLOCK*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#18 0x00007f6691e04d0b in tesseract::Tesseract::classify_word_pass1(tesseract::WordData const&, tesseract::WERD_RES**, tesseract::PointerVector<tesseract::WERD_RES>) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#19 0x00007f6691e0810a in tesseract::Tesseract::RetryWithLanguage(tesseract::WordData const&, void (tesseract::Tesseract::*)(tesseract::WordData const&, tesseract::WERD_RES**, tesseract::PointerVector<tesseract::WERD_RES>), bool, tesseract::WERD_RES**, tesseract::PointerVector<tesseract::WERD_RES>) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#20 0x00007f6691e08b22 in tesseract::Tesseract::classify_word_and_language(int, tesseract::PAGE_RES_IT*, tesseract::WordData*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#21 0x00007f6691e0d41d in tesseract::Tesseract::RecogAllWordsPassN(int, tesseract::ETEXT_DESC*, tesseract::PAGE_RES_IT*, std::vector<tesseract::WordData, std::allocator<tesseract::WordData> >) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#22 0x00007f6691e0e464 in tesseract::Tesseract::recog_all_words(tesseract::PAGE_RES*, tesseract::ETEXT_DESC*, tesseract::TBOX const*, char const*, int) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#23 0x00007f6691ddff64 in tesseract::TessBaseAPI::Recognize(tesseract::ETEXT_DESC*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#24 0x00007f6691de056b in tesseract::TessBaseAPI::ProcessPage(Pix*, int, char const*, char const*, int, tesseract::TessResultRenderer*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#25 0x00007f6691de18e1 in tesseract::TessBaseAPI::ProcessPagesInternal(char const*, char const*, int, tesseract::TessResultRenderer*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#26 0x00007f6691de1adf in tesseract::TessBaseAPI::ProcessPages(char const*, char const*, int, tesseract::TessResultRenderer*) () from /home/chris/mambaforge/envs/tess_bug/bin/../lib/libtesseract.so.5
#27 0x0000556ecc08455b in main ()