Last change on this file since 34178 was 34178, checked in by ak19, 4 years ago
CASCADE-MAKE for Tesseract, the OCR tool. I'm thinking of expanding the UnknownPlugin tutorial to cover using it with Tika to process docx, and using the plugin with Tika and Tesseract to OCR image-only PDFs. I have tested the compiled Tesseract on a sample tif image, and it works, but I have yet to test the Tika with Tesseract combination. The libz, libpng, (lib)jpeg, (lib)tif and jpeg2000 packages are from ImageMagick. Leptonica needs them (not sure about jpeg2000) and libgif; there is no libgif yet. Libtool and Leptonica are the dependencies for Tesseract itself. I'm including just the English language data in the tessdata folder. Others are available from https://github.com/tesseract-ocr/tessdata . I've added a file called LinksAndNotesOnCompilingManually.txt documenting reading on TikaOCR, how to compile Tesseract, and my pre-cascade-make attempts to compile Tesseract on Ubuntu. I then followed the existing use of Cascade-Make in the GS2-extensions gnome-lib and imagemagick to get Tesseract compiled. I don't know how to add support for cross compilation.
Property svn:executable set to *
File size: 1.4 KB
@echo off
pushd "%CD%"
CD /D "%~dp0"
set GSDLLANG=en
set extdesc="the Tesseract OCR support library"

if "%OS%" == "Windows_NT" goto WinNT
if "%OS%" == "" goto Win95
if "%GSDLLANG%" == "en" echo Setup failed - your PATH has not been set
if "%GSDLLANG%" == "es" echo No se pudo realizar la configuración - no se ha establecido la RUTA.
if "%GSDLLANG%" == "fr" echo Échec de l'installation - votre variable PATH n'a pas été ajustée
if "%GSDLLANG%" == "ru" echo Установка не удалась - не был установлен ПУТЬ
goto End

:WinNT
set GEXTTESS=%CD%
set GEXTTESS_INSTALLED=%GEXTTESS%\windows

set PATH=%GEXTTESS_INSTALLED%\bin;%PATH%
set GS_CP_SET=yes
goto Success

:Win95
if "%1" == "SetEnv" goto Win95Env
REM We'll invoke a second copy of the command processor to make
REM sure there's enough environment space
COMMAND /E:2048 /K %0 SetEnv
goto End

:Win95Env
set GEXTTESS=%CD%
REM GEXTTESS_INSTALLED must be set here as well, since this branch never runs :WinNT
set GEXTTESS_INSTALLED=%GEXTTESS%\windows
set PATH="%GEXTTESS_INSTALLED%\bin";"%PATH%"
set GS_CP_SET=yes
goto Success

:Success
rem tesseract needs the TESSDATA_PREFIX env var set to the languages folder (tessdata)
rem no quotes around the value: set would store them as part of the variable
set TESSDATA_PREFIX=%GEXTTESS_INSTALLED%\tessdata

set fulldir=%~dp0

:: strip off everything up to (and including) ext dir
set extdir=%fulldir:*ext\=%

:: remove trailing slash (this substitution removes every backslash in the value)
set extdir=%extdir:\=%


if "x%GSDLEXTS%" == "x" (
  set GSDLEXTS=%extdir%
) else (
  set GSDLEXTS=%GSDLEXTS%:%extdir%
)

echo +Your environment is now setup for %extdesc%

:End

popd
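
The two string substitutions that derive extdir are easy to misread, so here is a minimal stand-alone sketch of what they do. The install path C:\greenstone\ext\tesseract\ is a hypothetical value chosen for illustration; in the script itself the value comes from %~dp0.

```bat
@echo off
rem Hypothetical install location, used only for illustration.
set fulldir=C:\greenstone\ext\tesseract\

rem %var:*text=% deletes everything up to and including the first "text".
set extdir=%fulldir:*ext\=%
echo After stripping the prefix: %extdir%

rem %var:\=% deletes every backslash, not only a trailing one.
set extdir=%extdir:\=%
echo After removing backslashes: %extdir%
```

With this path the first echo shows tesseract\ and the second shows tesseract. Because the second substitution removes every backslash, an extension that sat in a nested folder such as ext\sub\dir\ would come out as subdir, so the approach assumes a single-level extension directory.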
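
The commit message mentions testing the compiled Tesseract on a sample tif image. A sketch of how such a check might look once the environment is set up is shown here. The script name setup.bat and the image name sample.tif are assumptions made for illustration, since neither is given on this page.

```bat
@echo off
rem setup.bat is an assumed name for the environment script shown above.
call setup.bat

rem Confirm where Tesseract will look for its language data.
echo TESSDATA_PREFIX is %TESSDATA_PREFIX%

rem sample.tif is a placeholder image; Tesseract writes the text to sample_out.txt.
tesseract sample.tif sample_out -l eng
type sample_out.txt
```

The script is run with call so that the variables it sets remain in the calling session, and -l eng matches the English-only language data included in the tessdata folder.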