Timeline
2019-11-20:
- 23:23 Changeset [33710] by
- Working queries and map coords for geojson.tools (ironically, Lat and …
- 18:49 Changeset [33709] by
- Forgot to commit the zip file before deleting it
- 11:30 Changeset [33708] by
- Changed code so api key can be in separate file, and passed in on the …
- 11:27 Changeset [33707] by
- Adds in maven to path if untarred in java/packages area
- 11:14 Changeset [33706] by
- Template file
2019-11-19:
- 14:08 Changeset [33705] by
- reindented the file, no code changes
- 14:04 Changeset [33704] by
- added some authentication error strings. some are used by the …
- 14:03 Changeset [33703] by
- added more breadcrumbs for ease of finding your way back to the start
- 14:00 Changeset [33702] by
- added depositorTitleAndLink template. TODO - get depositor text from …
- 13:59 Changeset [33701] by
- added more breadcrumbs to the page. And now it displays an error if …
- 13:56 Changeset [33700] by
- starting to put some of the strings into a dictionary - using …
- 13:53 Changeset [33699] by
- first stab at requiring a user to be logged in to use the depositor, …
2019-11-15:
- 23:14 Changeset [33698] by
- Links to more reading
- 23:10 Changeset [33697] by
- Changes to the README to provide instructions on making the fewest …
- 22:09 Changeset [33696] by
- Moved the individual READMEs into the top level too along with …
- 20:29 Changeset [33695] by
- Minor corrections and file rename
- 20:03 Changeset [33694] by
- interfaces\images folder structure can be customised for a site as an …
- 20:01 Changeset [33693] by
- Forgot we needed a toplevel collect folder
- 19:57 Changeset [33692] by
- Math collection toplevel folder restructure
- 19:56 Changeset [33691] by
- Math collection renames and moving things about
- 19:54 Changeset [33690] by
- Moving the science collection to the top level
- 19:52 Changeset [33689] by
- Rename again
- 19:51 Changeset [33688] by
- Moving the science collection related README, screenshot and …
- 19:49 Changeset [33687] by
- Renaming science collection to not have period mark in its name, on …
- 19:46 Changeset [33686] by
- 18:50 Changeset [33685] by
- Upstream change related to Solr ext
- 18:49 Changeset [33684] by
- Changes made around the time of the launch
- 18:46 Changeset [33683] by
- Updated to process latest version of spreadsheet
- 18:45 Changeset [33682] by
- Changes made around the time of the launch
- 18:44 Changeset [33681] by
- Added in flock technique to avoid multiple people running the same script
- 18:43 Changeset [33680] by
- Greenstone3 is fixed, so don't need to print out message about running …
- 18:38 Changeset [33679] by
- Folder for working on updates (PDFs to del, PDFs to add) from Kiri
- 17:57 Changeset [33678] by
- setup for greenstone ext
- 17:57 Changeset [33677] by
- Intro text
- 17:55 Changeset [33676] by
- Some initial work getting a plugin going that calls Alex's VirusTotal …
- 00:22 Changeset [33675] by
- Committing the newer query results (but from before today's …
- 00:21 Changeset [33674] by
- Changes to support the top 5 predicted langcodes and their confidence …
- 00:17 Changeset [33673] by
- Waikato Education Department's Science Activities and Maths Activities …
2019-11-14:
- 14:14 Changeset [33672] by
- modified slightly so that the error messages come from the dictionary …
- 14:12 Changeset [33671] by
- added a static getTextString method - currently this is in Action.java …
- 14:10 Changeset [33670] by
- added editEnabled att string
- 14:10 Changeset [33669] by
- removed an annoying debug message
- 10:03 Changeset [33668] by
- a few changes to debuginfo texts
- 09:55 Changeset [33667] by
- preProcess.xsl renamed to expand-gslib.xsl to better indicate what it does
2019-11-13:
- 23:08 Changeset [33666] by
- Having finished sending all the crawl data to mongodb 1. Recrawled the …
- 17:18 Changeset [33665] by
- Fixed jar name
- 17:17 Changeset [33664] by
- Initial version code for running VirusTotal API against files, CLI scripts
- 17:12 Changeset [33663] by
- Changes after testing the scripts
- 17:04 Changeset [33662] by
- Scripts to compile and run java code
- 16:54 Changeset [33661] by
- Compiling needs to use Maven
- 16:53 Changeset [33660] by
- For Java source code
- 16:40 Changeset [33659] by
- Top-level folder for new extension based on VirusTotal API which scans …
- 16:40 Changeset [33658] by
- Top-level folder for new extension based on VirusTotal API which scans …
2019-11-12:
- 21:33 Changeset [33657] by
- Some fixes after brief testing against 1/3 of the crawl. Restarted …
- 21:11 Changeset [33656] by
- Final minor changes before I start processing the crawls of node2.
- 20:56 Changeset [33655] by
- Minor change to print statement
- 20:54 Changeset [33654] by
- Removing jar file that wasn't used after all.
- 20:51 Changeset [33653] by
- 1. As suggested by Dr Bainbridge, made the code changes to use Morphia …
- 20:41 Changeset [33652] by
- Introducing morphia subpackage
- 18:11 Changeset [33651] by
- 1. Bugfix: overlappingSentences works. 2. storing numSentencesInMaor
- 12:06 Changeset [33650] by
- updated to match the new xsl file names; lots of variable renames to …
- 12:04 Changeset [33649] by
- renamed config_format and text_fragment_format to better represent …
- 12:04 Changeset [33648] by
- changed the debuginfo xsl and strings to match the new o=xxx debug options
- 09:30 Changeset [33647] by
- added/changed a few of the output values for debugging the transform
2019-11-11:
- 18:46 Changeset [33646] by
- Saving the mongodb queries and learning links that Dr Bainbridge found …
- 18:45 Changeset [33645] by
- Fix to 2 bugs when sending data to MongoDB: 1. overlappingSentences …
- 11:50 Changeset [33644] by
- Just committing the growing mongodb.txt file with links and …
- 11:46 Changeset [33643] by
- Brought the template log4j.properties.in back up to speed. I forgot it …
- 11:06 Changeset [33642] by
- Forgot to commit the java driver for mongodb when I committed the Java …
- 10:53 Changeset [33641] by
- commented out some debug statements
- 10:48 Changeset [33640] by
- oops, I must have 'tidied' up the file and then not compiled it to …
- 10:23 Changeset [33639] by
- need to select child nodes, otherwise the gsf:default node ends up in …
- 10:22 Changeset [33638] by
- gslib doesn't use xml-to-string.xsl. it's only used by formatmanager, …
- 10:21 Changeset [33637] by
- we can now use gsf and gslib in layout files.
- 10:04 Changeset [33636] by
- include means the stylesheet gets added inline, import means it gets …
- 09:38 Changeset [33635] by
- Maori-language-detection doesn't use Greenstone 3 at present, it's not …
2019-11-08:
- 23:59 Changeset [33634] by
- Rewrote NutchTextDumpProcessor as NutchTextDumpToMongoDB.java, which …
- 19:43 Changeset [33633] by
- 1. TextLanguageDetector now has methods for collecting all sentences …
2019-11-07:
- 14:53 Changeset [33632] by
- overhaul of TransformingReceptionist. changed the order of inlining …
- 14:52 Changeset [33631] by
- added a bit more error reporting
- 14:44 Changeset [33630] by
- minor comment changes
- 14:20 Changeset [33629] by
- added methods using Parameter2 - for params with text node values
- 13:52 Changeset [33628] by
- not sure why documentNode was a gsf:template here. Can't be like that …
- 09:28 Changeset [33627] by
- removed unnecessary comments
2019-11-05:
- 21:59 Changeset [33626] by
- TODOs
- 21:58 Changeset [33625] by
- A file listing domains with seedurls containing /mi(/) that are …
- 21:48 Changeset [33624] by
- Some cleanup surrounding the now renamed function createSeedURLsFile, …
- 21:04 Changeset [33623] by
- 1. Incorporated Dr Nichols' earlier suggestion of storing page modified …
- 15:42 Changeset [33622] by
- File rename
2019-11-04:
- 20:35 Changeset [33621] by
- Committing jotted down mongodb related instructions from what Dr …
- 14:24 Changeset [33620] by
- Final crawl, done on vagrant VM node6. Crawl site IDs 01407-01462.
- 11:36 Changeset [33619] by
- need to handle the case where a collection file (eg image) gets …
2019-11-01:
- 20:14 Changeset [33618] by
- Adding in the download URL
- 17:13 Changeset [33617] by
- Node5 is now full and here is the finished crawl (up to and including …
2019-10-31:
- 20:05 Changeset [33616] by
- Beginnings of Java class that is to interact with MongoDB. I don't yet …
- 20:03 Changeset [33615] by
- 1. Worked out how to configure log4j to log both to console and …
- 11:22 Changeset [33614] by
- added a new line
- 11:18 Changeset [33613] by
- added allowdocumentediting and allowmapgpsediting options, plus also …
- 11:00 Changeset [33612] by
- work to do with params. add in default values to params if they are …
- 10:55 Changeset [33611] by
- added global setting to params - these are for params that are valid …
- 10:54 Changeset [33610] by
- USER_SESSION_CACHE_ATT moved to GSParams, as it is stored in session …
2019-10-30:
- 23:03 Changeset [33609] by
- The tar files containing the crawled sites data shouldn't be called …
- 23:02 Changeset [33608] by
- 1. New script to export from HBase so that we could in theory reimport …
2019-10-29:
- 18:33 Changeset [33607] by
- Updated with the remaining successfully crawled sites on node4 before …
- 15:18 Changeset [33606] by
- 1. Committing crawl data from node3 (2nd VM for nutch crawling). 2. …
- 14:54 Changeset [33605] by
- Node 4 VM still works, but committing first set of crawled sites on there
2019-10-24:
- 23:22 Changeset [33604] by
- 1. Better output into possible-product-sites.txt including the …
- 22:04 Changeset [33603] by
- Incorporating Dr Nichols' suggestion to help weed out product sites: if …
2019-10-23:
- 23:49 Changeset [33602] by
- 1. The final csv file, mri-sentences.csv, is now written out. 2. Only …
- 23:22 Changeset [33601] by
- Creates the 2nd csv file, with info about webpages. At present stores …
- 23:05 Changeset [33600] by
- Work in progress of writing out CSV files. In future, may write the …
2019-10-22:
- 20:49 Changeset [33599] by
- First one-third sites crawled. Committing to SVN despite the tarred …
- 20:19 Changeset [33598] by
- More instructions on setting up Nutch now that I've remembered to …
- 20:05 Changeset [33597] by
- Committing active version of template file which has a newline at end …
- 18:44 Changeset [33596] by
- Adding in the nutch-site.xml and regex-urlfilter.GS_TEMPLATE template …
- 14:05 Changeset [33595] by
- new displayBaskets template - to avoid replicating code in query and …
- 14:00 Changeset [33594] by
- call gslib:displayBasket instead of replicating the code here
- 13:59 Changeset [33593] by
- the test for facets should be facetList/facet/count, as the facets get …
- 13:51 Changeset [33592] by
- reindented the file
- 11:51 Changeset [33591] by
- added in some strings for 'this collection contains x documents and …
- 11:12 Changeset [33590] by
- added 'this collection contains X documents and was last built Y days …
2019-10-21:
- 21:45 Changeset [33589] by
- final01. Need Map results still