#
# ChangeLog for gs2-extensions/parallel-building
#
# Generated by Trac 1.4.2
# 2024-03-29T00:04:25+13:00

Sun, 18 Oct 2015 21:46:21 GMT jmt12 [30308]
    * gs2-extensions/parallel-building/trunk/src/packages/cpan/Sort-Naturally-1.03.tar.gz (added)

    Another CPAN module required by the Gantt Chart generation script

Sun, 11 Oct 2015 22:24:21 GMT jmt12 [30306]
    * gs2-extensions/parallel-building/trunk/src/bin/script/GDBMServer.pl (modified)

    Making the setup of CPAN path more robust based on the better control ...

Sun, 11 Oct 2015 22:23:27 GMT jmt12 [30305]
    * gs2-extensions/parallel-building/trunk/src/perllib/dbutil/gdbmserver.pm (modified)

    replacing deprecated function calls to newer ones in FileUtils

Fri, 09 Oct 2015 04:30:43 GMT jmt12 [30302]
    * gs2-extensions/parallel-building/trunk/src/build.xml (modified)

    Altering os-specific installed path and trying to improve clean to ...

Fri, 09 Oct 2015 04:29:42 GMT jmt12 [30301]
    * gs2-extensions/parallel-building/trunk/src/src/mpiterrierfileindexer-src/mpiterrierfileindexer.cpp (modified)

    Missed displaying one variable in a fprintf statement (threw a ...

Fri, 09 Oct 2015 04:28:22 GMT jmt12 [30300]
    * gs2-extensions/parallel-building/trunk/src/src/mpidspacemediafilter-src/configure (modified)

    Bunch of changes (including whitespace safety ones) due to different ...

Fri, 09 Oct 2015 04:27:09 GMT jmt12 [30299]
    * gs2-extensions/parallel-building/trunk/src/src/CASCADE-MAKE/HADOOPGREENSTONEINGEST.sh (modified)

    Making the removal of the jar file conditional on it actually being ...

Fri, 09 Oct 2015 04:26:26 GMT jmt12 [30298]
    * gs2-extensions/parallel-building/trunk/src/src/mpiimport-src/mpiimport.cpp (modified)

    Correcting path to manifest files for use in Greenstone3. Started to ...

Fri, 09 Oct 2015 04:05:03 GMT jmt12 [30297]
    * gs2-extensions/parallel-building/trunk/src/src/db2txtl-src/Makefile.in (modified)
    * gs2-extensions/parallel-building/trunk/src/src/gdbmcli-src/Makefile.in (modified)
    * gs2-extensions/parallel-building/trunk/src/src/txt2dbl-src/Makefile.in (modified)

    Altering the Makefile.in to determine whether it is in GSDL2 or GSDL3 ...

Fri, 09 Oct 2015 04:03:19 GMT jmt12 [30296]
    * gs2-extensions/parallel-building/trunk/src/setup.bash (modified)

    extending to support GSDL3 as well

Fri, 09 Oct 2015 03:21:57 GMT jmt12 [30295]
    * gs2-extensions/parallel-building/trunk/src/perllib/parallelbuildinginexport.pm (modified)

    Using the proper environment variable, GSDL3SRCHOME, rather than ...

Fri, 09 Oct 2015 02:18:49 GMT jmt12 [30294]
    * gs2-extensions/parallel-building/trunk/src/src/txt2dbl-src/configure.ac (modified)

    Typo would have prevented the generated configure script from working ...

Fri, 09 Oct 2015 02:17:33 GMT jmt12 [30293]
    * gs2-extensions/parallel-building/trunk/src/src/txt2dbl-src/missing (added)

    The missing script was, ironically, missing

Fri, 09 Oct 2015 01:01:38 GMT jmt12 [30292]
    * gs2-extensions/parallel-building/trunk/src/perllib/inexport.pm (modified)

    Removing reference to debugging module Devel::Peek

Fri, 09 Oct 2015 00:32:54 GMT jmt12 [30291]
    * gs2-extensions/parallel-building/trunk/src/src/mpibuildcol-src/configure (modified)
    * gs2-extensions/parallel-building/trunk/src/src/mpiterrierfileindexer-src/configure (modified)

    Minor changes in generated configs mostly to do with whitespace safety

Fri, 09 Oct 2015 00:30:39 GMT jmt12 [30290]
    * gs2-extensions/parallel-building/trunk/src/src/CASCADE-MAKE/HADOOPGREENSTONEINGEST.sh (modified)

    Extended build script to make Hadoop support optional. If no ...

Fri, 09 Oct 2015 00:29:06 GMT jmt12 [30289]
    * gs2-extensions/parallel-building/trunk/src/perllib/plugins/DirectoryPlugin.pm (modified)

    Significant changes to read() function - essentially split in half ...

Fri, 09 Oct 2015 00:27:43 GMT jmt12 [30288]
    * gs2-extensions/parallel-building/trunk/src/perllib/plugouts/BasePlugout.pm (deleted)

    No longer different than the vanilla Greenstone version

Fri, 09 Oct 2015 00:24:30 GMT jmt12 [30287]
    * gs2-extensions/parallel-building/trunk/src/perllib/FileUtils.pm (modified)

    Extending error messages a bit to differentiate between linking that ...

Fri, 09 Oct 2015 00:23:38 GMT jmt12 [30286]
    * gs2-extensions/parallel-building/trunk/src/perllib/inexport.pm (added)

    Adding a customized version of inexport.pm allowing us to handle ...

Fri, 09 Oct 2015 00:20:47 GMT jmt12 [30285]
    * gs2-extensions/parallel-building/trunk/src/packages/CASCADE-MAKE.sh (modified)

    Adding in a call to untar/compile/install Hadoop support package

Fri, 09 Oct 2015 00:17:18 GMT jmt12 [30284]
    * gs2-extensions/parallel-building/trunk/src/packages/cpan (modified)

    updated svnignore

Fri, 09 Oct 2015 00:16:57 GMT jmt12 [30283]
    * gs2-extensions/parallel-building/trunk/src/packages/cpan/.svnignore (modified)

    Ignoring the unpacked versions of a couple of new packages used to ...

Fri, 09 Oct 2015 00:15:04 GMT jmt12 [30282]
    * gs2-extensions/parallel-building/trunk/src/packages/CASCADE-MAKE/CPAN.sh (modified)

    Ensure the perl/cpan install directories exist before trying to copy ...

Fri, 09 Oct 2015 00:13:22 GMT jmt12 [30281]
    * gs2-extensions/parallel-building/trunk/src/packages/CASCADE-MAKE/HADOOP.sh (added)

    Cascade-Make file to provide Hadoop functionality

Fri, 09 Oct 2015 00:11:53 GMT jmt12 [30280]
    * gs2-extensions/parallel-building/trunk/src/CASCADE-MAKE.sh (modified)

    Ensure the platform specific directory for built files exists. It may ...

Sun, 27 Sep 2015 22:53:59 GMT jmt12 [30278]
    * gs2-extensions/parallel-building/trunk/src/setup.bash (added)

    Might as well add this, with default setting for Hadoop, to SVN... ...

Thu, 18 Dec 2014 23:30:03 GMT jmt12 [29663]
    * gs2-extensions/parallel-building/trunk/src/bin/script/generate_gantt.pl (modified)

    Supporting grayscale printing, fixing mismatched tags and ...

Thu, 18 Dec 2014 23:28:36 GMT jmt12 [29662]
    * gs2-extensions/parallel-building/trunk/src/bin/script/rm_archives.pl (modified)

    Now removes building and index directories if found

Thu, 18 Dec 2014 23:26:36 GMT jmt12 [29661]
    * gs2-extensions/parallel-building/trunk/src/bin/script/deletinator.pl (added)

    A helper script to clean-up the bogus directories sometimes created ...

Thu, 18 Dec 2014 23:16:18 GMT jmt12 [29660]
    * gs2-extensions/parallel-building/trunk/src/perllib/dbutil/tdbcluster.pm (modified)

    making the debug variable global... can't remember why though

Thu, 18 Dec 2014 22:30:42 GMT jmt12 [29649]
    * gs2-extensions/parallel-building/trunk/src/opt/Perseus (added)
    * gs2-extensions/parallel-building/trunk/src/opt/Perseus/perseus-medusa.pl (added)
    * gs2-extensions/parallel-building/trunk/src/opt/Perseus/perseus.pl (added)
    * gs2-extensions/parallel-building/trunk/src/opt/Perseus/perseusclient.pl (added)

    Perseus was an attempt to add functionality to automatically and ...

Thu, 11 Sep 2014 22:43:44 GMT jmt12 [29276]
    * gs2-extensions/parallel-building/trunk/src/perllib/parallelbuildinginexport.pm (modified)

    I need to measure the time spent on generating the initial manifest, ...

Tue, 09 Sep 2014 22:32:54 GMT jmt12 [29261]
    * gs2-extensions/parallel-building/trunk/src/perllib/plugins/CPULoadTextPlugin.pm (modified)

    Removing some of the extraneous IO from high cpu importing... ...

Tue, 09 Sep 2014 22:31:15 GMT jmt12 [29260]
    * gs2-extensions/parallel-building/trunk/src/perllib/plugins/DirectoryPlugin.pm (modified)

    Replacing the obsolete call to util::file_lastmodified() with the ...

Tue, 09 Sep 2014 22:30:18 GMT jmt12 [29259]
    * gs2-extensions/parallel-building/trunk/src/perllib/Kea.pm (modified)

    Kea override allowing for fixed processor affinity if necessary ...

Tue, 09 Sep 2014 22:29:14 GMT jmt12 [29258]
    * gs2-extensions/parallel-building/trunk/src/perllib/dbutil/tdbcluster.pm (added)

    Initial checkin of a new TDB infodb that allows each worker thread in ...

Tue, 09 Sep 2014 22:19:19 GMT jmt12 [29257]
    * gs2-extensions/parallel-building/trunk/src/perllib/parallelbuildinginexport.pm (modified)

    Allow for collection configuration to be passed down to parallel ...

Thu, 28 Aug 2014 02:18:25 GMT jmt12 [29243]
    * gs2-extensions/parallel-building/trunk/src/perllib/FileUtils.pm (modified)

    Allowing for file linking to be disabled

Thu, 24 Jul 2014 02:15:14 GMT jmt12 [29162]
    * gs2-extensions/parallel-building/trunk/src/packages/cpan/Lingua-EN-Syllable-0.251.tar.gz (added)

    The Lingua module for detecting syllables - used when determining ...

Thu, 24 Jul 2014 02:13:19 GMT jmt12 [29161]
    * gs2-extensions/parallel-building/trunk/src/perllib/plugins/CPULoadTextPlugin.pm (modified)

    Some modules aren't available on cluster... add test and include path ...

Thu, 24 Jul 2014 01:56:47 GMT jmt12 [29160]
    * gs2-extensions/parallel-building/trunk/src/packages/cpan/Crypt-Blowfish_PP-1.12.tar.gz (added)

    Adding blowfish encryption package to give text processing some work ...

Mon, 21 Jul 2014 22:46:42 GMT jmt12 [29158]
    * gs2-extensions/parallel-building/trunk/src/bin/script/logreportinator.pl (added)

    Initial checkin of script to convert a number of Greenstone|| logs ...

Thu, 19 Jun 2014 05:28:20 GMT jmt12 [29106]
    * gs2-extensions/parallel-building/trunk/src/bin/script/linkinator.pl (added)

    Check-in of script to symlink lorem files to matching files in ...

Wed, 18 Jun 2014 23:26:28 GMT jmt12 [29104]
    * gs2-extensions/parallel-building/trunk/src/bin/script/text_metricinator.pl (added)

    A script for extracting textual metrics from a collection of text ...

Wed, 18 Jun 2014 23:26:01 GMT jmt12 [29103]
    * gs2-extensions/parallel-building/trunk/src/bin/script/importsubsetinator.pl (modified)

    updated - not any more efficient (Schlemiel the painter performance) ...

Wed, 18 Dec 2013 00:02:19 GMT jmt12 [28779]
    * gs2-extensions/parallel-building/trunk/src/perllib/parallelbuildinginexport.pm (modified)

    Making timing message all sorts of purty

Tue, 17 Dec 2013 23:58:04 GMT jmt12 [28778]
    * gs2-extensions/parallel-building/trunk/src/perllib/parallelbuildinginexport.pm (modified)

    Typo - underscore where I meant hyphen

Tue, 17 Dec 2013 23:56:49 GMT jmt12 [28777]
    * gs2-extensions/parallel-building/trunk/src/perllib/parallelbuildinginexport.pm (modified)

    Need to include path to mpiimport on Medusa

Tue, 17 Dec 2013 22:11:16 GMT jmt12 [28771]
    * gs2-extensions/parallel-building/trunk/src/perllib/plugouts/BasePlugout.pm (added)

    A version of BasePlugout where the RSS feed update attempts to write ...

Tue, 17 Dec 2013 22:08:13 GMT jmt12 [28770]
    * gs2-extensions/parallel-building/trunk/src/perllib/parallelbuildinginexport.pm (modified)

    Adding microtiming... a little tricky what with TDBServer taking ...

Tue, 17 Dec 2013 21:53:57 GMT jmt12 [28769]
    * gs2-extensions/parallel-building/trunk/src/bin/script/parallel_import.pl (deleted)

    No longer used. import.pl now smart enough to dynamically load ...

Tue, 17 Dec 2013 21:53:15 GMT jmt12 [28768]
    * gs2-extensions/parallel-building/trunk/src/bin/script/parallel_import.pl (modified)

    Initially added microtime to this script, but then remembered it ...

Tue, 17 Dec 2013 21:21:53 GMT jmt12 [28767]
    * gs2-extensions/parallel-building/trunk/src/bin/script/import_with_io_metric.pl (modified)

    Drastically increased the script to allow 1) battery of imports ...

Tue, 17 Dec 2013 21:20:09 GMT jmt12 [28766]
    * gs2-extensions/parallel-building/trunk/src/bin/script/strace_to_tsv.pl (modified)

    Removing an occasional few characters of garbage that turn up in the ...

Mon, 16 Dec 2013 23:08:10 GMT jmt12 [28764]
    * gs2-extensions/parallel-building/trunk/src/bin/script/parallel_dspace_filtermedia.pl (modified)

    Adding microsecond timing messages

Thu, 21 Nov 2013 00:36:40 GMT jmt12 [28666]
    * gs2-extensions/parallel-building/trunk/src/bin/script/strace_to_tsv.pl (added)

    A script to transform a strace.out into a Tab separated file worthy ...

Thu, 21 Nov 2013 00:35:52 GMT jmt12 [28665]
    * gs2-extensions/parallel-building/trunk/src/bin/script/import_with_io_metric.pl (modified)

    Latest changes to work around resumed syscalls massive duration problem

Wed, 20 Nov 2013 00:00:09 GMT jmt12 [28654]
    * gs2-extensions/parallel-building/trunk/src/perllib/parallelbuildinginexport.pm (modified)

    Removed recordEarliestDatestamp() function as that now lurks in the ...

Tue, 19 Nov 2013 23:58:26 GMT jmt12 [28653]
    * gs2-extensions/parallel-building/trunk/src/perllib/FileUtils.pm (modified)

    Changed the way a require was 'eval'd - but I have no idea why

Tue, 19 Nov 2013 23:57:27 GMT jmt12 [28652]
    * gs2-extensions/parallel-building/trunk/src/bin/script/hadoop_report.pl (modified)

    Changes to support running the reports over logs produced from ...

Tue, 19 Nov 2013 23:53:02 GMT jmt12 [28649]
    * gs2-extensions/parallel-building/trunk/src/perllib/plugins/CPULoadTextPlugin.pm (added)

    A version of a Textfile reading plugin that has a configurable load ...

Tue, 19 Nov 2013 23:51:45 GMT jmt12 [28648]
    * gs2-extensions/parallel-building/trunk/src/bin/script/flush_caches.pl (modified)

    Adding a short delay after writing to the flush_cache file just to ...

Tue, 19 Nov 2013 23:49:26 GMT jmt12 [28647]
    * gs2-extensions/parallel-building/trunk/src/bin/script/update_data_locality.pl (modified)

    Adding progress messages and making a debug message optional

Tue, 19 Nov 2013 22:31:31 GMT jmt12 [28646]
    * gs2-extensions/parallel-building/trunk/src/bin/script/import_with_io_metric.pl (added)

    A script that uses strace to produce IO metrics of a Greenstone import

Tue, 19 Nov 2013 22:31:07 GMT jmt12 [28645]
    * gs2-extensions/parallel-building/trunk/src/bin/script/dlreport.pl (added)

    Script to generate a report on data locality from GreenstoneHadoop logs

Sun, 06 Oct 2013 21:04:32 GMT jmt12 [28358]
    * gs2-extensions/parallel-building/trunk/src/bin/script/generate_gantt.pl (modified)

    Replacing my earlier decision to only have data locality information ...

Sun, 06 Oct 2013 21:02:54 GMT jmt12 [28357]
    * gs2-extensions/parallel-building/trunk/src/bin/script/update_data_locality.pl (added)

    used to update the data_locality.csv file in the case where other ...

Sun, 06 Oct 2013 21:01:39 GMT jmt12 [28356]
    * gs2-extensions/parallel-building/trunk/src/bin/script/hadoop_report.pl (modified)

    Support the legacy version of taskno in the data_locality.csv file ...

Wed, 25 Sep 2013 23:13:14 GMT jmt12 [28312]
    * gs2-extensions/parallel-building/trunk/src/src/java/org/nzdl/gsdl/HadoopGreenstoneIngest2.java (modified)

    Working on finer control over data locality - so I can configure a ...

Thu, 29 Aug 2013 21:21:30 GMT jmt12 [28192]
    * gs2-extensions/parallel-building/trunk/src/src/java/org/nzdl/gsdl/HadoopGreenstoneIngest2.java (modified)

    Need to still output Greenstone messages to log otherwise I can't ...

Thu, 29 Aug 2013 21:18:21 GMT jmt12 [28191]
    * gs2-extensions/parallel-building/trunk/src/bin/script/replication_tests.pl (modified)

    Removing redundant error stream redirect - this wasn't causing the ...

Thu, 29 Aug 2013 21:08:04 GMT jmt12 [28190]
    * gs2-extensions/parallel-building/trunk/src/bin/script/replication_tests.pl (modified)

    Had accidentally hardcoded the max replication number - allow it to be ...

Thu, 29 Aug 2013 21:06:56 GMT jmt12 [28189]
    * gs2-extensions/parallel-building/trunk/src/bin/script/generate_gantt.pl (modified)

    Replace the newer (and faster) while(@file) loop with the older (and ...

Thu, 29 Aug 2013 20:58:33 GMT jmt12 [28188]
    * gs2-extensions/parallel-building/trunk/src/bin/script/generate_gantt.pl (modified)

    Minor fix to allow for tasks that start in the same second (now each ...

Thu, 29 Aug 2013 20:56:57 GMT jmt12 [28187]
    * gs2-extensions/parallel-building/trunk/src/perllib/Kea.pm (added)

    A customized version of Kea.pm that looks in the correct place for ...

Thu, 29 Aug 2013 20:55:57 GMT jmt12 [28186]
    * gs2-extensions/parallel-building/trunk/src/bin/script/iotop_report.pl (added)

    A (failed) attempt to use the unix iotop tool to determine IO percentage

Fri, 09 Aug 2013 01:30:35 GMT jmt12 [28018]
    * gs2-extensions/parallel-building/trunk/src/bin/script/replication_tests.pl (modified)

    Try really hard to capture the output from 'time' function as Medusa ...

Fri, 09 Aug 2013 01:26:02 GMT jmt12 [28017]
    * gs2-extensions/parallel-building/trunk/src/bin/script/replication_tests.pl (modified)

    Forgot to add processing comment before call to hadoop_import.pl

Fri, 09 Aug 2013 01:16:44 GMT jmt12 [28016]
    * gs2-extensions/parallel-building/trunk/src/bin/script/hadoop_report.pl (modified)

    Allow the hadoop report generator to parse start and end times ...

Fri, 09 Aug 2013 01:16:06 GMT jmt12 [28015]
    * gs2-extensions/parallel-building/trunk/src/bin/script/hadoop_import.pl (modified)

    Add an extra option that allows me to pass in the directory to write ...

Fri, 09 Aug 2013 01:15:02 GMT jmt12 [28014]
    * gs2-extensions/parallel-building/trunk/src/bin/script/parse_task_info_from_hadoop_log.pl (modified)

    Remove tasks that have had data locality established from the array ...

Fri, 09 Aug 2013 01:14:22 GMT jmt12 [28013]
    * gs2-extensions/parallel-building/trunk/src/bin/script/replication_tests.pl (added)

    A new script to run a battery of Hadoop ingests at varying ...

Fri, 09 Aug 2013 01:13:50 GMT jmt12 [28012]
    * gs2-extensions/parallel-building/trunk/src/src/java/org/nzdl/gsdl/HadoopGreenstoneIngest2.java (modified)

    Express start time as a double as well

Fri, 09 Aug 2013 01:13:01 GMT jmt12 [28011]
    * gs2-extensions/parallel-building/trunk/src/src/java/org/nzdl/gsdl/GSInfoDB.java (modified)

    Turn off debugging in the copy in SVN

Fri, 09 Aug 2013 01:11:46 GMT jmt12 [28010]
    * gs2-extensions/parallel-building/trunk/src/src/java/org/nzdl/gsdl/GSInfoDB.java (modified)

    Correctly set up the environment for calls to txt2tdb and also ...

Thu, 08 Aug 2013 00:46:06 GMT jmt12 [28001]
    * gs2-extensions/parallel-building/trunk/src/perllib/parallelbuildinginexport.pm (modified)

    Write datestamp using dbutil if applicable

Wed, 07 Aug 2013 22:13:59 GMT jmt12 [27996]
    * gs2-extensions/parallel-building/trunk/src/packages/hdfs-nfs-proxy-release-0.8.1.tar.gz (modified)

    A new version of the archive with minor changes to log4j configuration

Wed, 07 Aug 2013 22:12:52 GMT jmt12 [27995]
    * gs2-extensions/parallel-building/trunk/src/perllib/parallelbuildinginexport.pm (modified)

    Just adding some code comments

Sun, 21 Jul 2013 22:40:02 GMT jmt12 [27915]
    * gs2-extensions/parallel-building/trunk/src/perllib/dbutil/stdoutxml.pm (added)

    A new PlugOut that doesn't write any intermediate files (bar those ...

Sun, 21 Jul 2013 22:38:06 GMT jmt12 [27914]
    * gs2-extensions/parallel-building/trunk/src/bin/script/generate_gantt.pl (modified)

    Trying to get around a couple of divide-by-zero issues when ...

Sun, 21 Jul 2013 22:37:02 GMT jmt12 [27913]
    * gs2-extensions/parallel-building/trunk/src/bin/script/hadoop_import.pl (modified)

    Made the ingester to be used (version 1 without reduce phase, or ...

Sun, 21 Jul 2013 22:36:02 GMT jmt12 [27912]
    * gs2-extensions/parallel-building/trunk/src/src/CASCADE-MAKE/HADOOPGREENSTONEINGEST.sh (modified)

    Modified the compilation to include the new ingester and its co- ...

Sun, 21 Jul 2013 22:35:43 GMT jmt12 [27911]
    * gs2-extensions/parallel-building/trunk/src/src/java/org/nzdl/gsdl/compile.sh (modified)

    Modified the compilation to include the new ingester and its co- ...

Sun, 21 Jul 2013 22:35:04 GMT jmt12 [27910]
    * gs2-extensions/parallel-building/trunk/src/src/java/org/nzdl/gsdl/GSGroupingComparator.java (added)
    * gs2-extensions/parallel-building/trunk/src/src/java/org/nzdl/gsdl/GSInfoDB.java (added)
    * gs2-extensions/parallel-building/trunk/src/src/java/org/nzdl/gsdl/GSPartitioner.java (added)
    * gs2-extensions/parallel-building/trunk/src/src/java/org/nzdl/gsdl/HadoopGreenstoneIngest2.java (added)

    Extended the existing HadoopGreenstoneIngest with proper Reduce phase ...

Thu, 04 Jul 2013 01:45:08 GMT jmt12 [27753]
    * gs2-extensions/parallel-building/trunk/src/bin/script/generate_gantt.pl (modified)

    Adding Handbrake's percentage complete to report - although this is ...

Thu, 04 Jul 2013 01:44:22 GMT jmt12 [27752]
    * gs2-extensions/parallel-building/trunk/src/bin/script/hadoop_report.pl (modified)

    Data locality file not being found is no longer fatal (HDFS-NFS-Proxy ...

Tue, 02 Jul 2013 02:35:42 GMT jmt12 [27732]
    * gs2-extensions/parallel-building/trunk/src/bin/script/hadoop_import.pl (modified)

    Nice the copy itself too

Fri, 21 Jun 2013 00:25:32 GMT jmt12 [27686]
    * gs2-extensions/parallel-building/trunk/src/bin/script/hadoop_import.pl (modified)

    A little more progress comments

Fri, 21 Jun 2013 00:24:54 GMT jmt12 [27685]
    * gs2-extensions/parallel-building/trunk/src/bin/script/parse_task_info_from_hadoop_log.pl (modified)

    in the case of multiple attempts you need to retain the information ...