Changeset 34192

Timestamp:
16.06.2020 17:53:04
Author:
ak19
Message:

Further useful links before I rename the tika-config file

Files:
1 modified

  • main/trunk/greenstone2/ext/tika/tika-config.xml

    r34188 r34192  
    8  8     - new way of one tika-config.xml: https://github.com/o19s/pdf-discovery-demo/blob/crazy_tika_tesseract_inside_of_solr/ocr/tika-config.xml
    9  9     - old way of 2 props files: https://github.com/o19s/pdf-discovery-demo/tree/6f5b37305dd863a73af4617db64cbe853c5ecd2a/ocr/tika-properties/org/apache/tika/parser
    10
    11       https://tika.apache.org/1.16/configuring.html
    12       https://issues.apache.org/jira/browse/TIKA-2624
       10
       11    Further useful information on configuring tika for OCR (or no OCR) at:
       12    - https://tika.apache.org/1.16/configuring.html
       13    - https://issues.apache.org/jira/browse/TIKA-2624
       14    - https://stackoverflow.com/questions/51655510/how-do-you-enable-the-tesseractocrparser-using-tikaconfig-and-the-tika-command-l#51668962 (out of date?)
       15    - https://stackoverflow.com/questions/56232720/is-there-a-way-to-disable-ocr-mode-in-tika-without-uninstalling-tesseract
    13 16    -->
    14 17    <properties>