Timeline



2020-06-16:

20:01 Changeset [34203] by ak19
Reminder.
19:43 Changeset [34202] by ak19
I think the svn:externals is working, so removing the tmp folder
19:42 Changeset [34201] by ak19
Have now attempted to set the svn:externals property on tesseract's …
19:27 Changeset [34200] by ak19
Going to attempt to set svn externals to grab zlib, libpng, tiff, jpg, …
19:24 Changeset [34199] by ak19
A makedists.sh script for gstika to make the cutdown zip and tarball. …
18:59 Changeset [34198] by ak19
1. Added a script to generate the cut-down ('binary only') tesseract …
18:15 Changeset [34197] by ak19
Name of Tika config file for ocr-ing pdfs has been updated.
18:13 Changeset [34196] by ak19
Updating gstika tarballs too with the latest changes to the tika …
18:05 Changeset [34195] by ak19
Renaming config files so one is configured for OCR-ing PDFs, the other …
18:03 Changeset [34194] by ak19
Renaming config files so one is configured for OCR-ing PDFs, the other …
17:54 Changeset [34193] by ak19
Further useful links before I rename the tika-config file
17:53 Changeset [34192] by ak19
Further useful links before I rename the tika-config file
17:44 Changeset [34191] by ak19
Model collectionConfig.xml with commented-out UnknownConverterPlugin
17:20 Changeset [34190] by ak19
1. The tessdata folder was being created when compiling tesseract, and …
16:32 Changeset [34189] by ak19
Added osd (orientation and script detection) OCR language support. Said to be the …
15:23 Changeset [34188] by ak19
Tika config file to get Tika+Tesseract to OCR PDFs. This file must be …
15:22 Changeset [34187] by ak19
Committing the tika-config.xml that sets up Tika's PDFParser and …
15:00 Changeset [34186] by ak19
In order to get tika + tesseract to OCR PDFs (note that tesseract …
09:53 Changeset [34185] by kjdon
changed doClassifier to doClassifiers as it was misleading and bugged …
00:36 Changeset [34184] by ak19
The Leptonica license reminded me I had forgotten to look into the …
00:28 Changeset [34183] by ak19
Forgot to commit the distinct license of Leptonica.

2020-06-15:

23:48 Changeset [34182] by ak19
A sample image to test the built tesseract extension on. After …
23:40 Changeset [34181] by ak19
Committing the cut-down, binaries-only tesseract tarball for x64 linux …
23:16 Changeset [34180] by ak19
Gnome-lib has setup.bash_old and setup.bat_old, but imagemagick and …
22:51 Changeset [34179] by ak19
Imitating the gnome-lib gs2-extension by setting the svn externals …
22:44 Changeset [34178] by ak19
CASCADE-MAKE for Tesseract, the OCR tool. I'm thinking of expanding …
03:51 Changeset [34177] by ak19
Minor
03:34 Changeset [34176] by ak19
Zipping and tarring just the binary version of the extension
03:28 Changeset [34175] by ak19
Minor changes to folder names
03:23 Changeset [34174] by ak19
1. Created GSTikaCLI.java based off TikaCLI.java of the apache …
01:34 Changeset [34173] by ak19
The more general way of launching the apache tika-app jar file. This …

2020-06-14:

19:11 Changeset [34172] by ak19
Some minor improvements to the UnknownConverterPlugin settings for …
03:50 Changeset [34171] by ak19
Minor
03:46 Changeset [34170] by ak19
Helpful instruction
03:40 Changeset [34169] by ak19
All GS3 needs to convert docx files to basic html (no images) out of …
02:34 Changeset [34168] by ak19
Stupid oversight on my part yesterday: when fixing up client-GLI so …

2020-06-13:

21:05 Changeset [34167] by ak19
Added svn externals properties to gs3colcfg module for the Italian …
20:52 Changeset [34166] by ak19
Adding Italian language translations of the gs3colcfg module. Many …
20:40 Changeset [34165] by ak19
Italian language updates to GS2 core module and translations for GS3 …
18:13 Changeset [34164] by ak19
Adding warning comments about where stderr messages n …
16:49 Changeset [34163] by ak19
Minor changes to the recent commit
16:37 Changeset [34162] by ak19
Tutorial document now contains the crucial id=gs_content in the …
08:43 Changeset [34161] by ak19
Fixed last of the client-gli/remote GS3 bugs discovered yesterday. When …
06:50 Changeset [34160] by ak19
Completing TODO from Kathy's commit message for 34116 for …
06:35 Changeset [34159] by ak19
Kathy had earlier requested that I recommit the gliserver.pl file she …

2020-06-12:

23:47 Changeset [34158] by ak19
Fixing discovery of client-gli issues with previewing a different …
23:36 Changeset [34157] by ak19
Undoing accidental commit of unintended files, part3
23:32 Changeset [34156] by ak19
Undoing accidental commit of unintended files, part2
23:30 Changeset [34155] by ak19
Undoing accidental commit of unintended files
23:23 Changeset [34154] by ak19
Useful debugging statement. Would have helped me solve a bug sooner by …
21:23 Changeset [34153] by ak19
While trying to debug client-gli to remote issues, found some more …
20:40 Changeset [34152] by ak19
When sending a request to activate and deactivate, can request …

2020-06-11:

14:16 Changeset [34151] by ak19
Part 1. Untested. (The commit didn't go through previously for whatever …
14:14 Changeset [34150] by ak19
Part 2. Untested. Setting svn externals property to pull …
13:16 Changeset [34149] by ak19
Adding note to stress the importance of ensuring the div containing …
05:43 Changeset [34148] by ak19
Fix for broken remote greenstone server, which wouldn't load …
02:11 Changeset [34147] by ak19
Related to prev commit. Part 2 of: The Expat.so I built on the uni …
02:08 Changeset [34146] by ak19
The Expat.so I built on the uni machine against a perl 5.30 I compiled …
02:01 Changeset [34145] by ak19
Copyright info and link to OS templates must remain intact as seen in …

2020-06-10:

14:57 Changeset [34144] by ak19
Adding in the Depositor link (visible only to logged in users) to the …
14:56 Changeset [34143] by ak19
Some interface changes made by Dr Bainbridge and some by me on his request.

2020-06-09:

15:15 Changeset [34142] by ak19
Manually force-adding the Expat.so to svn which STILL didn't get …
15:11 Changeset [34141] by ak19
Redoing commit for XML-Parser of perl-5.30 as the previous commit …
15:07 Changeset [34140] by ak19
Recommitting since there are all kinds of question marks about what I …
13:43 Changeset [34139] by ak19
Not sure why the perl-5.22's Expat folder was empty, adding in …

2020-06-08:

19:00 Changeset [34138] by ak19
XML-Parser for perl version 5.30 (specifically perl 5.30.3 was …

2020-06-05:

19:51 Changeset [34137] by ak19
Have only been able to incorporate one of Dr Bainbridge's improvements …

2020-06-03:

15:52 Changeset [34136] by ak19
Incorporating Anita Kurei's improvements to display strings for …

2020-05-30:

16:42 Changeset [34135] by ak19
Changed the name of a collection making it more descriptive and also …
16:15 Changeset [34134] by ak19
Added an empty text file with instruction for the allismri collection too
16:14 Changeset [34133] by ak19
Added an empty text file with instruction
16:01 Changeset [34132] by ak19
Committing the commoncrawl site of Nutch recrawls of our CC data where …
15:18 Changeset [34131] by ak19
Allowing input keep-urls-file to contain a comma followed by country …
01:27 Changeset [34130] by ak19
Some more tidying up while isMRI filtered collection rebuilding
01:01 Changeset [34129] by ak19
Implemented Kathy's suggestions: 1. Explicit ex prefix to ex meta …

2020-05-27:

20:06 Changeset [34128] by ak19
When rebuilding the opotiki site today, had noticed that …
19:43 Changeset [34127] by ak19
Spelling correction in filename: screeMshot to screeNshot
19:10 Changeset [34126] by ak19
When I'd modified the code to make the keep_urls_file non-compulsory, …
18:07 Changeset [34125] by ak19
Commit message went awry. Cleaned up some comments to recommit with …
18:03 Changeset [34124] by ak19
Decoding the title and text using the encoding seemed to have turned …

2020-05-26:

02:18 Changeset [34123] by ak19
Some more minor changes
01:13 Changeset [34122] by ak19
1. After some testing of building the complete commoncrawl collection, …

2020-05-25:

23:53 Changeset [34121] by ak19
1. Introducing NutchTextDumpPlugin to process the records …

2020-05-21:

17:47 Changeset [34120] by ak19
CSV version of .ods file, so openoffice isn't required
17:28 Changeset [34119] by ak19
Committing the auto-generated analysis results folder, …
14:16 Changeset [34118] by ak19
Kathy's hard work for commit 34117 was done on a Windows machine where …

2020-05-20:

15:53 Changeset [34117] by kjdon
tidied up the code. Moved a few commands that don't actually need site …
14:44 Changeset [34116] by kjdon
use global.properties, not build.properties. therefore call this with …

2020-05-19:

15:03 Changeset [34115] by kjdon
a couple changes. 1 don't explicitly need to remove the lock file from …
13:22 Ticket #945 (GS3 needs to allow for https URLs) closed by ak19
fixed: GLI updated to use ProtocolPortProperties when GS3 to work out port …
12:25 Changeset [34114] by kjdon
for gs3, gwcgi is the tomcat context, i.e. greenstone3 by default. If …
11:34 Changeset [34113] by ak19
1. tomcat.port no longer exists in build.properties after https also …

2020-05-18:

13:40 Changeset [34112] by ak19
GS3 source code seems to already use FileInputStream with UTF-8 …
11:24 Changeset [34111] by ak19
Undoing additions surrounding JAVA_TOOL_OPTIONS where file.encoding is …