Description
Note by @amitdo. The original title was: Inconsistency Between --oem and -c "tessedit_ocr_engine_mode"
Current Behavior
As far as I can tell, these two options should be equivalent, but for the attached image they produce different output: with --oem 0 the result is 11, while with the config variable it is 1. I should add that any --oem value other than 0 also returns '1', not '11'.
Full commands:
tesseract -l eng --oem 0 --psm 7 -c "tessedit_char_whitelist=0123456789" ramdisk/test_0.png -
tesseract -l eng --psm 7 -c "tessedit_char_whitelist=0123456789" -c "tessedit_ocr_engine_mode=0" ramdisk/test_0.png -
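For anyone trying to reproduce this, the two invocations can be wrapped in a small script that runs both and reports whether the outputs match. This is only a sketch: the function names (run_with_oem_flag, run_with_config_var, compare_outputs) are made up for illustration, the image path defaults to the one used above, and stderr is discarded so that only the recognized text is compared.

```shell
#!/usr/bin/env bash
# Sketch: run both invocations from this report and compare their output.
# The image path defaults to the one in the report; pass another as $1.
IMAGE="${1:-ramdisk/test_0.png}"

# Engine mode selected with the --oem command-line option.
run_with_oem_flag() {
  tesseract -l eng --oem 0 --psm 7 \
    -c "tessedit_char_whitelist=0123456789" "$1" -
}

# Engine mode selected with the tessedit_ocr_engine_mode config variable.
run_with_config_var() {
  tesseract -l eng --psm 7 \
    -c "tessedit_char_whitelist=0123456789" \
    -c "tessedit_ocr_engine_mode=0" "$1" -
}

# Hypothetical helper: print whether the two recognized strings match.
compare_outputs() {
  if [ "$1" = "$2" ]; then
    echo "same: $1"
  else
    echo "different: --oem gave '$1', config var gave '$2'"
  fi
}

# Only run the comparison when tesseract and the image are available.
if command -v tesseract >/dev/null 2>&1 && [ -f "$IMAGE" ]; then
  oem_out="$(run_with_oem_flag "$IMAGE" 2>/dev/null)"
  cfg_out="$(run_with_config_var "$IMAGE" 2>/dev/null)"
  compare_outputs "$oem_out" "$cfg_out"
fi
```

With the attached image, the behavior described above would show up as a "different" line.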
Expected Behavior
Both commands should return the same result, ideally 11.
Suggested Fix
Either both options should return the same result, or the docs should be updated to clarify the difference.
tesseract -v
tesseract 5.4.1
leptonica-1.84.1
libgif 5.2.2 : libjpeg 6b (libjpeg-turbo 3.0.2) : libpng 1.6.40 : libtiff 4.6.0 : zlib 1.3.1.zlib-ng : libwebp 1.4.0
Found AVX2
Found AVX
Found FMA
Found SSE4.1
Found libcurl/8.9.1 OpenSSL/3.2.2 zlib/1.3.1.zlib-ng libidn2/2.3.7 nghttp2/1.62.1
Operating System
No response
Other Operating System
Fedora 42.
uname -a
Linux deux-ex 6.11.7-300.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Nov 8 19:23:10 UTC 2024 x86_64 GNU/Linux
Compiler
No response
CPU
Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz (12)
Virtualization / Containers
None.
Other Information
No response