root/gs2-extensions/parallel-building

Rev Chgset Date Author Log Message
@28653 [28653] 6 years jmt12 Changed the way a require was 'eval'd - but I have no idea why
@28652 [28652] 6 years jmt12 Changes to support running the reports over logs produced from multicore …
@28649 [28649] 6 years jmt12 A version of a Textfile reading plugin that has a configurable load …
@28648 [28648] 6 years jmt12 Adding a short delay after writing to the flush_cache file just to ensure …
@28647 [28647] 6 years jmt12 Adding progress messages and making a debug message optional
@28646 [28646] 6 years jmt12 A script that uses strace to produce IO metrics of a Greenstone import
@28645 [28645] 6 years jmt12 Script to generate a report on data locality from GreenstoneHadoop logs
@28358 [28358] 6 years jmt12 Replacing my earlier decision to only have data locality information …
@28357 [28357] 6 years jmt12 used to update the data_locality.csv file in the case where other …
@28356 [28356] 6 years jmt12 Support the legacy version of taskno in the data_locality.csv file (we now …
@28312 [28312] 6 years jmt12 Working on finer control over data locality - so I can configure a run …
@28192 [28192] 6 years jmt12 Need to still output Greenstone messages to log otherwise I can't …
@28191 [28191] 6 years jmt12 Removing redundant error stream redirect - this wasn't causing the issue I …
@28190 [28190] 6 years jmt12 Had accidentally hardcoded the max replication number - allow it to be …
@28189 [28189] 6 years jmt12 Replace the newer (and faster) while(@file) loop with the older (and more …
@28188 [28188] 6 years jmt12 Minor fix to allow for tasks that start in the same second (now each …
@28187 [28187] 6 years jmt12 A customized version of Kea.pm that looks in the correct place for newer …
@28186 [28186] 6 years jmt12 A (failed) attempt to use the unix iotop tool to determine IO percentage
@28018 [28018] 6 years jmt12 Try really hard to capture the output from 'time' function as Medusa lets …
@28017 [28017] 6 years jmt12 Forgot to add processing comment before call to hadoop_import.pl
@28016 [28016] 6 years jmt12 Allow the hadoop report generator to parse start and end times expressed …
@28015 [28015] 6 years jmt12 Add an extra option that allows me to pass in the directory to write log …
@28014 [28014] 6 years jmt12 Remove tasks that have had data locality established from the array of …
@28013 [28013] 6 years jmt12 A new script to run a battery of Hadoop ingests at varying replication …
@28012 [28012] 6 years jmt12 Express start time as a double as well
@28011 [28011] 6 years jmt12 Turn off debugging in the copy in SVN
@28010 [28010] 6 years jmt12 Correctly set up the environment for calls to txt2tdb and also replace …
@28001 [28001] 6 years jmt12 Write datestamp using dbutil if applicable
@27996 [27996] 6 years jmt12 A new version of the archive with minor changes to log4j configuration
@27995 [27995] 6 years jmt12 Just adding some code comments
@27915 [27915] 6 years jmt12 A new PlugOut that doesn't write any intermediate files (bar those …
@27914 [27914] 6 years jmt12 Trying to get around a couple of divide-by-zero issues when generating …
@27913 [27913] 6 years jmt12 Made the ingester to be used (version 1 without reduce phase, or version 2 …
@27912 [27912] 6 years jmt12 Modified the compilation to include the new ingester and its …
@27911 [27911] 6 years jmt12 Modified the compilation to include the new ingester and its co-requisites
@27910 [27910] 6 years jmt12 Extended the existing HadoopGreenstoneIngest with proper Reduce phase - …
@27753 [27753] 6 years jmt12 Adding Handbrake's percentage complete to report - although this is …
@27752 [27752] 6 years jmt12 Data locality file not being found is no longer fatal (HDFS-NFS-Proxy …
@27732 [27732] 6 years jmt12 Nice the copy itself too
@27686 [27686] 6 years jmt12 A few more progress comments
@27685 [27685] 6 years jmt12 in the case of multiple attempts you need to retain the information about …
@27684 [27684] 6 years jmt12 Adding natural sorting into report generation - so also needed to add INC …
@27683 [27683] 6 years jmt12 moving a few more headings around to help with information block layout
@27682 [27682] 6 years jmt12 Copying makeAllDirectories() from vanilla FileUtils.pm
@27669 [27669] 6 years jmt12 Sort compute nodes naturally before labelling them with incremental worker …
@27654 [27654] 6 years jmt12 Add the ability to stagger the starting of Mappers by placing a 'delay.me' …
@27653 [27653] 6 years jmt12 Forgot to pull self off the head of arguments
@27652 [27652] 6 years jmt12 Changing buffer to 128K (slightly faster) and adding a comment explaining …
@27651 [27651] 6 years jmt12
@27650 [27650] 6 years jmt12
@27649 [27649] 6 years jmt12 No longer in SVN control
@27648 [27648] 6 years jmt12 Template for setup.bash - a user will have to populate Hadoop fields
@27645 [27645] 6 years jmt12
@27644 [27644] 6 years jmt12 Extended to support HDFS-access via NFS. This applies to both the call to …
@27643 [27643] 6 years jmt12 Changed the script generator so it can recurse through directories and …
@27642 [27642] 6 years jmt12 A script I downloaded that successfully splits video files - something I …
@27641 [27641] 6 years jmt12 Altered order of arguments and allow archives dir to be passed as argument …
@27640 [27640] 6 years jmt12
@27638 [27638] 6 years jmt12 Change it so failure to open a filehandle isn't fatal - leave it up to the …
@27631 [27631] 6 years jmt12 A proxy to allow NFS access to HDFS
@27595 [27595] 7 years jmt12 Updating list of untarred directories to ignore
@27594 [27594] 7 years jmt12 Extend hadoop_import.pl to be able to start and stop the Thrift server(s)
@27593 [27593] 7 years jmt12 Need Class Accessor for Thrift client under Rocks
@27592 [27592] 7 years jmt12 Adding in a script to allow a daemon version of Thrift to be started (and …
@27591 [27591] 7 years jmt12 Ensure Thrift will, by default, attempt to connect to the local machine …
@27590 [27590] 7 years jmt12 Adding statistics about data locality, and highlighting tasks where file …
@27589 [27589] 7 years jmt12 Fixing up some minor bugs in regexes
@27588 [27588] 7 years jmt12 Extend parser to support jobs that are split over several logs. Also …
@27587 [27587] 7 years jmt12 Allow debug mode to be enabled from the command line
@27586 [27586] 7 years jmt12 Updating script to take date of hadoop job into account when searching for …
@27585 [27585] 7 years jmt12 The perl on Medusa won't let you immediately treat a returned array in a …
@27584 [27584] 7 years jmt12 I wasn't doing -r when attempting to clear directories left in /tmp by …
@27583 [27583] 7 years jmt12 Adding code to differentiate between workers in a cluster - all of which …
@27571 [27571] 7 years jmt12 increase timeout to 4 hours per map
@27570 [27570] 7 years jmt12 Make the warning about binmode() not being applicable more meaningful, and …
@27569 [27569] 7 years jmt12 Trying to streamline the error messages from failing to link (otherwise I …
@27568 [27568] 7 years jmt12 Testing on Medusa suggests optimal buffer size around 128K
@27567 [27567] 7 years jmt12 Found a printWarning that I hadn't changed to use the FileUtils version
@27566 [27566] 7 years jmt12 Making the getcpu optional - as it isn't available on Medusa (but then I …
@27561 [27561] 7 years jmt12 Adding very basic compile file for getcpu - can't be bothered going …
@27560 [27560] 7 years jmt12 Fixing typo in regexp that meant filenames sometimes ignored
@27559 [27559] 7 years jmt12 Changed mime-type away from binary - I hope. Meanwhile, generate …
@27558 [27558] 7 years jmt12 Forgot that Hadoop Map processes no longer have the environment …
@27551 [27551] 7 years jmt12 Altered so that it expects to be given a CSV containing parallel …
@27550 [27550] 7 years jmt12 Ensure the hostname is added to the Hadoop logs so we can identify the …
@27549 [27549] 7 years jmt12 Extract information from the logs generated by parallel Greenstone using …
@27548 [27548] 7 years jmt12 Extract information from the logs generated by parallel Greenstone using …
@27547 [27547] 7 years jmt12 Rejigging some processing comments
@27546 [27546] 7 years jmt12 Adding the ability for the Hadoop Mapper to determine what CPU number it …
@27545 [27545] 7 years jmt12 Ignoring just the compiled file (for now)
@27544 [27544] 7 years jmt12 A tiny C script to guesstimate the CPU the calling Process is on
@27543 [27543] 7 years jmt12 Adding generate_gantt.pl script in its original form - i.e. directly reads …
@27532 [27532] 7 years jmt12 Add the ability to configure the Thrift connector using a 'thrift.conf' …
@27531 [27531] 7 years jmt12 Only output the message about using copy instead of hard/soft link once
@27530 [27530] 7 years jmt12 Clear out old logs, and adding more comments about what the script is …
@27526 [27526] 7 years jmt12 Adding in an 'isHDFS()' function so that some plugins (SimpleVideoPlug) can …
@27525 [27525] 7 years jmt12 Adding in an 'isHDFS()' function so that some plugins (SimpleVideoPlug) can …
@27515 [27515] 7 years jmt12 Making the file used during buffer tests be configurable
@27514 [27514] 7 years jmt12 Altering code to allow configurable length of read/write buffer when …
@27512 [27512] 7 years jmt12 Adding in a special test for measuring the effect of altering ThriftFS …