Opened 15 years ago
Closed 14 years ago
#426 closed defect (fixed)
Investigate new conversion tools for Word and other Office documents
Reported by: | ak19 | Owned by: | nobody |
---|---|---|---|
Priority: | moderate | Milestone: | 2.84 Release |
Component: | Collection Building | Severity: | major |
Keywords: | Word, office, conversion, wvware | Cc: |
Description
The latest news item on the wvWare page dates from 2006. Since then Office 2007 has come along, and wvWare cannot convert Word 2007 documents. There has also been at least one email saying that PPT conversion in GLI is not going too smoothly either.
Dr Nichols thinks it is time we try to find alternative conversion tools.
Here are some of the URLs he found:
http://www.nativewinds.montana.com/software/docx2rtf.html "NW Docx Converter, Docx2Rtf v3.2"
http://swik.net/Word+conversion Some links to conversion software, including of Word
http://www.xml.com/pub/a/2003/12/31/qa.html "From Word to XML"
http://poi.apache.org/ "Apache POI - Java API To Access Microsoft Format Files"
http://drupal.org/node/139851 "Word Doc to HTML Converters?"
http://sourceforge.net/projects/wordhtml/ "WordHTML CV"
http://pastcounts.wordpress.com/2008/01/30/word-to-latex/ "Word to LaTeX"
http://m.linuxjournal.com/article/9493 "Cooking with Linux - Words, Words, Words..."

WvWare related pages:
http://www.abisource.com/ "AbiWord"
http://wvware.sourceforge.net/
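
For context on the incompatibility described above: legacy Office files (.doc, .ppt, .xls) are OLE2 compound files, while Office 2007 files (.docx, .pptx, .xlsx) are ZIP packages containing XML, so a tool that only parses the old binary layout cannot read them. The following is a minimal sketch, not Greenstone code, of telling the two apart by their leading bytes before choosing a converter. The function name `sniff_office_format` and its return labels are hypothetical.

```python
import zipfile

# Signature of an OLE2 compound file (legacy .doc/.ppt/.xls container).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Local-file-header signature that starts an ordinary ZIP archive.
ZIP_MAGIC = b"PK\x03\x04"


def sniff_office_format(path):
    """Return 'ole2', 'ooxml', or 'unknown' (hypothetical labels)."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    if header.startswith(ZIP_MAGIC) and zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as package:
            # Office 2007+ packages carry this part at the archive root.
            if "[Content_Types].xml" in package.namelist():
                return "ooxml"
    return "unknown"
```

A file reported as `ooxml` would need to be routed to one of the alternative tools rather than to wvWare. Note that the check identifies only the container, not whether the file is a Word, PowerPoint, or Excel document.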
Change History (5)
comment:1 by , 15 years ago
Milestone: | Release 2.82 → Release 2.83 |
---|
comment:2 by , 14 years ago
Milestone: | Greenstone 2 wishlist → Collection building wishlist |
---|
comment:3 by , 14 years ago
Milestone: | Collection building wishlist → 2.84 Release |
---|
comment:4 by , 14 years ago
comment:5 by , 14 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Katherine has developed OpenOfficeConverter and OpenOfficePlugin. These are available as an extension to Greenstone. At 200 MB installed, OpenOffice is too large to bundle with the extension, so we expect users to install it themselves. Apache Tika (100% Java) is of a more modest size, and Sam has started a comparable extension based on it. The difference is that the relevant JAR file is included in the extension.
Will probably use OpenOffice and/or Tika. See #430, #664.