@@ -17,12 +17,6 @@ between 1985 and 1995. In 1995, this engine was among the top 3 evaluated by
UNLV. It was open-sourced by HP and UNLV in 2005, and has been developed
at Google since then.

- Tesseract 4 adds a new neural net (LSTM) based OCR engine which is focused
- on line recognition, but also still supports the legacy Tesseract OCR engine of
- Tesseract 3 which works by recognizing character patterns. Compatibility with
- Tesseract 3 is enabled by --oem 0. It also needs traineddata files which support
- the legacy engine, for example those from the tessdata repository.
-

IN/OUT ARGUMENTS
----------------
@@ -97,7 +91,7 @@ OPTIONS
* hocr - Output in hOCR format instead of as a text file.
* pdf - Output in pdf instead of a text file.

- *Nota Bene:* The options '-l lang' and '--psm N' must occur
+ *Nota Bene:* The options `-l lang` and `--psm N` must occur
before any 'configfile'.

@@ -116,7 +110,7 @@ SINGLE OPTIONS
Returns the current version of the tesseract(1) executable.

'--list-langs'::
- List available languages for tesseract engine. Can be used with --tessdata-dir.
+ List available languages for tesseract engine. Can be used with `--tessdata-dir`.

'--print-parameters'::
Print tesseract parameters.
@@ -251,7 +245,7 @@ for the following languages are in
To use a non-standard language pack named *foo.traineddata*, set the
*TESSDATA_PREFIX* environment variable so the file can be found at
*TESSDATA_PREFIX*/tessdata/*foo*.traineddata and give Tesseract the
- argument '-l foo'.
+ argument `-l foo`.

SCRIPTS
-------
@@ -377,7 +371,15 @@ language data.
Tesseract 3.02 adds BiDirectional text support, the ability to recognize
multiple languages in a single image, and improved layout analysis.

- For further details, see the file ReleaseNotes included with the distribution.
+ Tesseract 4 adds a new neural net (LSTM) based OCR engine which is focused
+ on line recognition, but also still supports the legacy Tesseract OCR engine of
+ Tesseract 3 which works by recognizing character patterns. Compatibility with
+ Tesseract 3 is enabled by `--oem 0`. It also needs traineddata files which
+ support the legacy engine, for example those from the tessdata repository.
+
+ For further details, see the file ReleaseNotes in the Tesseract wiki
+ (<https://github.com/tesseract-ocr/tesseract/wiki/ReleaseNotes>).
+

RESOURCES
---------
@@ -402,6 +404,9 @@ Pingping Xiu, Pong Eksombatchai (Chantat), Ranjith Unnikrishnan, Raquel
Romano, Ray Smith, Rika Antonova, Robert Moss, Samuel Charron, Sheelagh
Lloyd, Shobhit Saxena, and Thomas Kielbus.

+ For a list of contributors see
+ <https://github.com/tesseract-ocr/tesseract/blob/master/AUTHORS>.
+
COPYING
-------
Licensed under the Apache License, Version 2.0