@@ -34,7 +34,9 @@ IN/OUT ARGUMENTS

'outputbase' ::
The basename of the output file (to which the appropriate extension
- will be appended). By default the output will be named 'outbase.txt' .
+ will be appended). By default the output will be a text file
+ with `.txt` added to the basename unless there are one or more
+ 'configfile' options which explicitly specify the desired output.

'stdout' ::
Instruction to send output data to standard output
@@ -88,8 +90,19 @@ OPTIONS
contains a list of variables and their values, one per line, with a
space separating variable from value. Interesting config files
include: +
- * hocr - Output in hOCR format instead of as a text file.
- * pdf - Output in pdf instead of a text file.
+ * `hocr` - Output in hOCR format (file extension `.hocr`).
+ * `pdf` - Output PDF (file extension `.pdf`).
+ * `tsv` - Output TSV (file extension `.tsv`).
+ * `txt` - Output plain text (file extension `.txt`).
+ * `get.images` - Write images.
+ * `logfile` - Write debug file `tesseract.log`.
+ * `lstm.train` - Used for LSTM training.
+ * `makebox` - Output box file.
+ * `quiet` - Write debug file to /dev/null.
+
+ It is possible to select several config files, for example
+ `tesseract image.png demo hocr pdf txt` will create three output files
+ `demo.hocr`, `demo.pdf` and `demo.txt` with the OCR results.

*Nota Bene:* The options `-l lang` and `--psm N` must occur
before any 'configfile'.
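
The naming and ordering rules in this hunk can be sketched in a few lines of Python. This is an illustrative sketch only, not part of tesseract: the helper names `expected_outputs` and `build_command` are hypothetical, and the mapping covers only the four config files whose extension is stated in the text. Treating every other config file (for example `makebox` or `quiet`) as not selecting an output format is an assumption.

```python
# Extensions stated in the documentation for output-selecting config files.
OUTPUT_EXTENSIONS = {"hocr": ".hocr", "pdf": ".pdf", "tsv": ".tsv", "txt": ".txt"}


def expected_outputs(outputbase, configfiles=()):
    """Return the output file names implied by the documented rules.

    Assumption: config files without a documented extension are ignored here.
    """
    names = [outputbase + OUTPUT_EXTENSIONS[c]
             for c in configfiles if c in OUTPUT_EXTENSIONS]
    # With no explicit output config file, the default is a .txt file.
    return names or [outputbase + ".txt"]


def build_command(image, outputbase, lang=None, psm=None, configfiles=()):
    """Build an argument list with -l and --psm placed before any config file."""
    cmd = ["tesseract", image, outputbase]
    if lang is not None:
        cmd += ["-l", lang]
    if psm is not None:
        cmd += ["--psm", str(psm)]
    cmd += list(configfiles)  # config files always come last
    return cmd
```

For the example from the text, `expected_outputs("demo", ["hocr", "pdf", "txt"])` yields `demo.hocr`, `demo.pdf` and `demo.txt`, and `expected_outputs("demo")` falls back to `demo.txt`.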
@@ -122,7 +135,7 @@ LANGUAGES

The currently available traineddata files for tesseract 4.0
for the following languages are in
- (in https://github.com/tesseract-ocr/tessdata_fast) :
+ https://github.com/tesseract-ocr/tessdata_fast:

*afr* (Afrikaans),
*amh* (Amharic),