source: gs2-extensions/parallel-building/trunk

Diff Rev Age Author Log Message
(edit) @30296   9 years jmt12 extending to support GSDL3 as well
(edit) @30295   9 years jmt12 Using the proper environment variable, GSDL3SRCHOME, rather than GSDL3HOME
(edit) @30294   9 years jmt12 Typo would have prevented the generated configure script from working …
(edit) @30293   9 years jmt12 The missing script was, ironically, missing
(edit) @30292   9 years jmt12 Removing reference to debugging module Devel::Peek
(edit) @30291   9 years jmt12 Minor changes in generated configs mostly to do with whitespace safety
(edit) @30290   9 years jmt12 Extended build script to make Hadoop support optional. If no …
(edit) @30289   9 years jmt12 Significant changes to read() function - essentially split in half …
(edit) @30288   9 years jmt12 No longer different than the vanilla Greenstone version
(edit) @30287   9 years jmt12 Extending error messages a bit to differentiate between linking that …
(edit) @30286   9 years jmt12 Adding a customized version of inexport.pm allowing us to handle …
(edit) @30285   9 years jmt12 Adding in a call to untar/compile/install Hadoop support package
(edit) @30284   9 years jmt12 updated svnignore
(edit) @30283   9 years jmt12 Ignoring the unpacked versions of a couple of new packages used to …
(edit) @30282   9 years jmt12 Ensure the perl/cpan install directories exist before trying to copy …
(edit) @30281   9 years jmt12 Cascade-Make file to provide Hadoop functionality
(edit) @30280   9 years jmt12 Ensure the platform specific directory for built files exists. It may …
(edit) @30278   9 years jmt12 Might as well add this, with default setting for Hadoop, to SVN... …
(edit) @29663   9 years jmt12 Supporting grayscale printing, fixing mismatched tags and speechmarks, …
(edit) @29662   9 years jmt12 Now removes building and index directories if found
(edit) @29661   9 years jmt12 A helper script to clean-up the bogus directories sometimes created by …
(edit) @29660   9 years jmt12 making the debug variable global... can't remember why though
(edit) @29649   9 years jmt12 Perseus was an attempt to add functionality to automatically and …
(edit) @29276   10 years jmt12 I need to measure the time spent on generating the initial manifest, …
(edit) @29261   10 years jmt12 Removing some of the extraneous IO from high cpu importing... altering …
(edit) @29260   10 years jmt12 Replacing the obsolete call to util::file_lastmodified() with the …
(edit) @29259   10 years jmt12 Kea override allowing for fixed processor affinity if necessary …
(edit) @29258   10 years jmt12 Initial checkin of a new TDB infodb that allows each worker thread in …
(edit) @29257   10 years jmt12 Allow for collection configuration to be passed down to parallel …
(edit) @29243   10 years jmt12 Allowing for file linking to be disabled
(edit) @29162   10 years jmt12 The Lingua module for detecting syllables - used when determining …
(edit) @29161   10 years jmt12 Some modules aren't available on cluster... add test and include path …
(edit) @29160   10 years jmt12 Adding blowfish encryption package to give text processing some work to do
(edit) @29158   10 years jmt12 Initial checkin of script to convert a number of Greenstone|| logs …
(edit) @29106   10 years jmt12 Check-in of script to symlink lorem files to matching files in another …
(edit) @29104   10 years jmt12 A script for extracting textual metrics from a collection of text …
(edit) @29103   10 years jmt12 updated - not any more efficient (Schlemiel the painter performance) …
(edit) @28779   10 years jmt12 Making timing message all sorts of purty
(edit) @28778   10 years jmt12 Typo - underscore where I meant hyphen
(edit) @28777   10 years jmt12 Need to include path to mpiimport on Medusa
(edit) @28771   10 years jmt12 A version of BasePlugout where the RSS feed update attempts to write …
(edit) @28770   10 years jmt12 Adding microtiming... a little tricky what with TDBServer taking …
(edit) @28769   10 years jmt12 No longer used. import.pl now smart enough to dynamically load …
(edit) @28768   10 years jmt12 Initially added microtime to this script, but then remembered it isn't …
(edit) @28767   10 years jmt12 Drastically increased the script to allow 1) battery of imports backed …
(edit) @28766   10 years jmt12 Removing an occasional few characters of garbage that turn up in the …
(edit) @28764   10 years jmt12 Adding microsecond timing messages
(edit) @28666   10 years jmt12 A script to transform a strace.out into a Tab separated file worthy of …
(edit) @28665   10 years jmt12 Latest changes to workaround resumed syscalls massive duration problem
(edit) @28654   10 years jmt12 Removed recordEarliestDatestamp() function as that now lurks in the …
(edit) @28653   10 years jmt12 Changed the way a require was 'eval'd - but I have no idea why
(edit) @28652   10 years jmt12 Changes to support running the reports over logs produced from …
(edit) @28649   10 years jmt12 A version of a Textfile reading plugin that has a configurable load …
(edit) @28648   10 years jmt12 Adding a short delay after writing to the flush_cache file just to …
(edit) @28647   10 years jmt12 Adding progress messages and making a debug message optional
(edit) @28646   10 years jmt12 A script that uses strace to produce IO metrics of a Greenstone import
(edit) @28645   10 years jmt12 Script to generate a report on data locality from GreenstoneHadoop logs
(edit) @28358   11 years jmt12 Replacing my earlier decision to only have data locality information …
(edit) @28357   11 years jmt12 used to update the data_locality.csv file in the case where other …
(edit) @28356   11 years jmt12 Support the legacy version of taskno in the data_locality.csv file (we …
(edit) @28312   11 years jmt12 Working on finer control over data locality - so I can configure a run …
(edit) @28192   11 years jmt12 Need to still output Greenstone messages to log otherwise I can't …
(edit) @28191   11 years jmt12 Removing redundant error stream redirect - this wasn't causing the …
(edit) @28190   11 years jmt12 Had accidentally hardcoded the max replication number - allow it to be …
(edit) @28189   11 years jmt12 Replace the newer (and faster) while(@file) loop with the older (and …
(edit) @28188   11 years jmt12 Minor fix to allow for tasks that start in the same second (now each …
(edit) @28187   11 years jmt12 A customized version of Kea.pm that looks in the correct place for …
(edit) @28186   11 years jmt12 A (failed) attempt to use the unix iotop tool to determine IO percentage
(edit) @28018   11 years jmt12 Try really hard to capture the output from 'time' function as Medusa …
(edit) @28017   11 years jmt12 Forgot to add processing comment before call to hadoop_import.pl
(edit) @28016   11 years jmt12 Allow the hadoop report generator to parse start and end times …
(edit) @28015   11 years jmt12 Add an extra option that allows me to pass in the directory to write …
(edit) @28014   11 years jmt12 Remove tasks that have had data locality established from the array of …
(edit) @28013   11 years jmt12 A new script to run a battery of Hadoop ingests at varying replication …
(edit) @28012   11 years jmt12 Express start time as a double as well
(edit) @28011   11 years jmt12 Turn off debugging in the copy in SVN
(edit) @28010   11 years jmt12 Correctly set up the environment for calls to txt2tdb and also replace …
(edit) @28001   11 years jmt12 Write datestamp using dbutil if applicable
(edit) @27996   11 years jmt12 A new version of the archive with minor changes to log4j configuration
(edit) @27995   11 years jmt12 Just adding some code comments
(edit) @27915   11 years jmt12 A new PlugOut that doesn't write any intermediate files (bar those …
(edit) @27914   11 years jmt12 Trying to get around a couple of divide-by-zero issues when generating …
(edit) @27913   11 years jmt12 Made the ingester to be used (version 1 without reduce phase, or …
(edit) @27912   11 years jmt12 Modified the compilation to include the new ingester and its co-requisites.
(edit) @27911   11 years jmt12 Modified the compilation to include the new ingester and its co-requisites
(edit) @27910   11 years jmt12 Extended the existing HadoopGreenstoneIngest with proper Reduce phase …
(edit) @27753   11 years jmt12 Adding Handbrake's percentage complete to report - although this is …
(edit) @27752   11 years jmt12 Data locality file not being found is no longer fatal (HDFS-NFS-Proxy …
(edit) @27732   11 years jmt12 Nice the copy itself too
(edit) @27686   11 years jmt12 A little more progress comments
(edit) @27685   11 years jmt12 in the case of multiple attempts you need to retain the information …
(edit) @27684   11 years jmt12 Adding natural sorting into report generation - so also needed to add …
(edit) @27683   11 years jmt12 moving a few more headings around to help with information block layout
(edit) @27682   11 years jmt12 Copying makeAllDirectories() from vanilla FileUtils.pm
(edit) @27669   11 years jmt12 Sort compute nodes naturally before labelling them with incremental …
(edit) @27654   11 years jmt12 Add the ability to stagger the starting of Mappers by placing a …
(edit) @27653   11 years jmt12 Forgot to pull self off the head of arguments
(edit) @27652   11 years jmt12 Changing buffer to 128K (slightly faster) and adding a comment …
(edit) @27651   11 years jmt12
(edit) @27650   11 years jmt12