Opened 3 years ago

Last modified 4 weeks ago

#940 new defect

Apache Tika - see if Sam's GS2 extension works and write up tutorial

Reported by: ak19
Owned by: nobody
Priority: moderate
Milestone: 3.11 Release
Component: Collection Building
Severity: enhancement
Keywords:
Cc:

Description

One of the questions Tom Ip asked on the mailing list was whether GS supports Apache Tika's comprehensive document format conversion tool.

It turns out that Sam had written an extension for Tika, including a document conversion plugin (a .pm file); see http://trac.greenstone.org/changeset/22690

  1. Try to download his jar, http://trac.greenstone.org/browser/gs2-extensions/tika/trunk/tika-java.tar.gz, and see if the existing version works.
  2. If it does not, try to get it working.
  3. Maybe upgrade to the latest version of Tika and ensure the extension still works.
  4. Write up a tutorial, or at least a wiki page, on how to use this extension with GLI.
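
As a starting point for the first step, here is a minimal sketch that fetches the archive and lists the jar files it contains. It does not run Tika or the Greenstone plugin. The helper names `download` and `list_jars` are hypothetical, and it is an assumption that the URL above serves the raw archive: it is a Trac browser link and may return an HTML page instead, in which case `tarfile.open` raises `tarfile.ReadError`.

```python
import tarfile
import urllib.request
from pathlib import Path
from typing import List

# URL copied from this ticket. Assumption: it returns the archive itself
# rather than Trac's HTML browser page.
TIKA_EXT_URL = (
    "http://trac.greenstone.org/browser/gs2-extensions/tika/trunk/tika-java.tar.gz"
)


def download(url: str, dest: Path) -> Path:
    """Fetch url, write the response body to dest, and return dest."""
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        out.write(response.read())
    return dest


def list_jars(archive: Path) -> List[str]:
    """Return the sorted names of the .jar files in a gzip-compressed tar archive."""
    with tarfile.open(archive, "r:gz") as tar:
        return sorted(
            member.name
            for member in tar.getmembers()
            if member.isfile() and member.name.endswith(".jar")
        )


if __name__ == "__main__":
    archive_path = download(TIKA_EXT_URL, Path("tika-java.tar.gz"))
    for jar_name in list_jars(archive_path):
        print(jar_name)
```

If the listing comes back empty or the archive fails to open, that is worth noting here before moving on to the second step.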

Change History (1)

comment:1 by kjdon, 4 weeks ago

Milestone: 3.10 Release → 3.11 Release

Ticket retargeted after milestone closed
