-------------------------------------------------
COMPILING TESSERACT GS2-EXTENSION
& CREATING THE CUT-DOWN BINARY-ONLY TARBALL
-------------------------------------------------

To compile the Tesseract gs2-extension and then create the "binary" tarball needed to run
Tesseract, follow steps equivalent to the instructions for the imagemagick gs2-extension
at http://trac.greenstone.org/browser/gs2-extensions/imagemagick/trunk/README

1. Choose a working directory on your machine.

2. Check out the tesseract extension from gs2-extensions:
    svn co http://trac.greenstone.org/browser/gs2-extensions/tesseract/trunk tesseract

3. Compile it all up (tesseract and dependencies):
    cd tesseract
    ./CASCADE-MAKE.sh

4. Open a fresh terminal and check that the tesseract now installed in src/linux/bin works:

    cd src
    source ./setup.bash

This should set environment variables such as GEXTTESS, GEXTTESS_INSTALLED and TESSDATA_PREFIX,
which Tesseract needs.

    tesseract --list-langs
    tesseract sample.tif out

The second command OCRs sample.tif and writes the result to out.txt:

    cat out.txt
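
A quick way to confirm that setup.bash did its job is to test the three variables named
above before running tesseract. The following is only a sketch: the function name
check_tess_env is hypothetical and not part of the extension, and it assumes those three
variables are the ones that matter.

```shell
# Hypothetical helper (not part of the extension): report which of the
# variables mentioned in step 4 are unset or empty in the current shell.
check_tess_env() {
    missing=0
    for var in GEXTTESS GEXTTESS_INSTALLED TESSDATA_PREFIX; do
        # Look up the value of the variable whose name is stored in $var.
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "missing: $var" >&2
            missing=1
        fi
    done
    # Exit status 0 if all variables are set, 1 otherwise.
    return "$missing"
}
```

After sourcing setup.bash, something like "check_tess_env && tesseract --list-langs" would
then only run tesseract once the environment looks complete.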

5. If successful, create a folder at the same level as src called tesseract:
    cd src
    cd ..
    mkdir tesseract

COPY the setup files and MOVE the installed folder (src/linux) into the new cut-down tesseract folder:

    cp src/setup.ba* tesseract/.
    mv src/linux tesseract/.

COPY the TESSERACT-APACHE-LICENSE and LEPTONICA-LICENSE txt files (note the
American spelling of LICENSE!) from src/packages into the cut-down tesseract/linux:

    cp src/packages/*LICENSE.txt tesseract/linux/.
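
Before packaging, it may be worth checking that the cut-down folder contains what step 5
put there. The sketch below assumes the layout described above (setup.bash at the top,
the tesseract binary under linux/bin, and at least one *LICENSE.txt file in linux); the
function name check_cutdown_layout is hypothetical.

```shell
# Hypothetical helper: verify the expected layout of the cut-down folder.
# Usage: check_cutdown_layout [folder]   (folder defaults to ./tesseract)
check_cutdown_layout() {
    dir=${1:-tesseract}
    status=0
    for path in "$dir/setup.bash" "$dir/linux/bin/tesseract"; do
        [ -e "$path" ] || { echo "missing: $path" >&2; status=1; }
    done
    # Expand the licence glob into the positional parameters; if nothing
    # matched, $1 is the unexpanded pattern and the existence test fails.
    set -- "$dir"/linux/*LICENSE.txt
    [ -e "$1" ] || { echo "missing: $dir/linux/*LICENSE.txt" >&2; status=1; }
    return "$status"
}
```

Run from the directory that contains both src and the new tesseract folder, it prints
any missing path to stderr and returns a non-zero status.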

6. Create a tarball of the cut-down tesseract folder, naming it tesseract-<os>-<arch>.tar.gz:
    tar -cvzf tesseract-linux-x64.tar.gz tesseract
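
The <os> and <arch> parts of the name can be derived from uname rather than typed by hand.
This is a sketch under assumptions: the function name tess_tarball_name is hypothetical,
and the mapping of x86_64 to "x64" is inferred from the example name above. Names for
other platforms are a guess and should be checked against existing tarballs.

```shell
# Hypothetical helper: build a tarball name of the form
# tesseract-<os>-<arch>.tar.gz for the machine it runs on.
tess_tarball_name() {
    os=$(uname -s | tr '[:upper:]' '[:lower:]')
    arch=$(uname -m)
    # Assumption: 64-bit x86 is labelled "x64", as in tesseract-linux-x64.tar.gz.
    case "$arch" in
        x86_64|amd64) arch=x64 ;;
    esac
    echo "tesseract-${os}-${arch}.tar.gz"
}
```

With that in place, tar -cvzf "$(tess_tarball_name)" tesseract creates the archive, and
tar -tzf "$(tess_tarball_name)" lists its contents for a final check.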

7. (svn up, add if needed, and) commit the tarball to svn:
    svn up
    svn add tesseract-linux-x64.tar.gz
    (or, if an earlier version is already in svn, run svn diff tesseract-linux-x64.tar.gz instead to confirm it shows as modified)
    svn commit -m "MESSAGE" tesseract-linux-x64.tar.gz

-------------------------------------------------