source: gs2-extensions/parallel-building

Diff Rev Age Author Log Message
(edit) @28764   7 years jmt12 Adding microsecond timing messages
(edit) @28666   7 years jmt12 A script to transform a strace.out into a Tab separated file worthy of …
(edit) @28665   7 years jmt12 Latest changes to workaround resumed syscalls massive duration problem
(edit) @28654   7 years jmt12 Removed recordEarliestDatestamp() function as that now lurks in the …
(edit) @28653   7 years jmt12 Changed the way a require was 'eval'd - but I have no idea why
(edit) @28652   7 years jmt12 Changes to support running the reports over logs produced from …
(edit) @28649   7 years jmt12 A version of a Textfile reading plugin that has a configurable load …
(edit) @28648   7 years jmt12 Adding a short delay after writing to the flush_cache file just to …
(edit) @28647   7 years jmt12 Adding progress messages and making a debug message optional
(edit) @28646   7 years jmt12 A script that uses strace to produce IO metrics of a Greenstone import
(edit) @28645   7 years jmt12 Script to generate a report on data locality from GreenstoneHadoop logs
(edit) @28358   7 years jmt12 Replacing my earlier decision to only have data locality information …
(edit) @28357   7 years jmt12 used to update the data_locality.csv file in the case where other …
(edit) @28356   7 years jmt12 Support the legacy version of taskno in the data_locality.csv file (we …
(edit) @28312   7 years jmt12 Working on finer control over data locality - so I can configure a run …
(edit) @28192   7 years jmt12 Need to still output Greenstone messages to log otherwise I can't …
(edit) @28191   7 years jmt12 Removing redundant error stream redirect - this wasn't causing the …
(edit) @28190   7 years jmt12 Had accidentally hardcoded the max replication number - allow it to be …
(edit) @28189   7 years jmt12 Replace the newer (and faster) while(@file) loop with the older (and …
(edit) @28188   7 years jmt12 Minor fix to allow for tasks that start in the same second (now each …
(edit) @28187   7 years jmt12 A customized version of Kea.pm that looks in the correct place for …
(edit) @28186   7 years jmt12 A (failed) attempt to use the unix iotop tool to determine IO percentage
(edit) @28018   7 years jmt12 Try really hard to capture the output from 'time' function as Medusa …
(edit) @28017   7 years jmt12 Forgot to add processing comment before call to hadoop_import.pl
(edit) @28016   7 years jmt12 Allow the hadoop report generator to parse start and end times …
(edit) @28015   7 years jmt12 Add an extra option that allows me to pass in the directory to write …
(edit) @28014   7 years jmt12 Remove tasks that have had data locality established from the array of …
(edit) @28013   7 years jmt12 A new script to run a battery of Hadoop ingests at varying replication …
(edit) @28012   7 years jmt12 Express start time as a double as well
(edit) @28011   7 years jmt12 Turn off debugging in the copy in SVN
(edit) @28010   7 years jmt12 Correctly set up the environment for calls to txt2tdb and also replace …
(edit) @28001   7 years jmt12 Write datestamp using dbutil if applicable
(edit) @27996   7 years jmt12 A new version of the archive with minor changes to log4j configuration
(edit) @27995   7 years jmt12 Just adding some code comments
(edit) @27915   7 years jmt12 A new PlugOut that doesn't write any intermediate files (bar those …
(edit) @27914   7 years jmt12 Trying to get around a couple of divide-by-zero issues when generating …
(edit) @27913   7 years jmt12 Made the ingester to be used (version 1 without reduce phase, or …
(edit) @27912   7 years jmt12 Modified the compilation to include the new ingester and its co-requisites.
(edit) @27911   7 years jmt12 Modified the compilation to include the new ingester and its co-requisites
(edit) @27910   7 years jmt12 Extended the existing HadoopGreenstoneIngest with proper Reduce phase …
(edit) @27753   7 years jmt12 Adding Handbrake's percentage complete to report - although this is …
(edit) @27752   7 years jmt12 Data locality file not being found is no longer fatal (HDFS-NFS-Proxy …
(edit) @27732   7 years jmt12 Nice the copy itself too
(edit) @27686   7 years jmt12 A little more progress comments
(edit) @27685   7 years jmt12 in the case of multiple attempts you need to retain the information …
(edit) @27684   7 years jmt12 Adding natural sorting into report generation - so also needed to add …
(edit) @27683   7 years jmt12 moving a few more headings around to help with information block layout
(edit) @27682   7 years jmt12 Copying makeAllDirectories() from vanilla FileUtils.pm
(edit) @27669   7 years jmt12 Sort compute nodes naturally before labelling them with incremental …
(edit) @27654   7 years jmt12 Add the ability to stagger the starting of Mappers by placing a …
(edit) @27653   7 years jmt12 Forgot to pull self off the head of arguments
(edit) @27652   7 years jmt12 Changing buffer to 128K (slightly faster) and adding a comment …
(edit) @27651   7 years jmt12
(edit) @27650   7 years jmt12
(edit) @27649   7 years jmt12 No longer in SVN control
(edit) @27648   7 years jmt12 Template for setup.bash - a user will have to populate Hadoop fields
(edit) @27645   7 years jmt12
(edit) @27644   7 years jmt12 Extended to support HDFS-access via NFS. This applies to both the call …
(edit) @27643   7 years jmt12 Changed the script generator so it can recurse through directories and …
(edit) @27642   7 years jmt12 A script I downloaded that successfully splits video files - something …
(edit) @27641   7 years jmt12 Altered order of arguments and allow archives dir to be passed as …
(edit) @27640   7 years jmt12
(edit) @27638   7 years jmt12 Change it so failure to open a filehandle isn't fatal - leave it up to …
(edit) @27631   7 years jmt12 A proxy to allow NFS access to HDFS
(edit) @27595   7 years jmt12 Updating list of untarred directories to ignore
(edit) @27594   7 years jmt12 Extend hadoop_import.pl to be able to start and stop the Thrift server(s)
(edit) @27593   7 years jmt12 Need Class Accessor for Thrift client under Rocks
(edit) @27592   7 years jmt12 Adding in a script to allow a daemon version of Thrift to be started …
(edit) @27591   7 years jmt12 Ensure Thrift will, by default, attempt to connect to the local …
(edit) @27590   7 years jmt12 Adding statistics about data locality, and highlighting tasks where …
(edit) @27589   7 years jmt12 Fixing up some minor bugs in regexes
(edit) @27588   7 years jmt12 Extend parser to support jobs that are split over several logs. Also …
(edit) @27587   7 years jmt12 Allow debug mode to be enabled from the command line
(edit) @27586   7 years jmt12 Updating script to take date of hadoop job into account when searching …
(edit) @27585   7 years jmt12 The perl on Medusa won't let you immediately treat a returned array in …
(edit) @27584   7 years jmt12 I wasn't doing -r when attempting to clear directories left in /tmp by …
(edit) @27583   7 years jmt12 Adding code to differentiate between workers in a cluster - all of …
(edit) @27571   7 years jmt12 increase timeout to 4 hours per map
(edit) @27570   7 years jmt12 Make the warning about binmode() not being applicable more meaningful, …
(edit) @27569   7 years jmt12 Trying to streamline the error messages from failing to link …
(edit) @27568   7 years jmt12 Testing on Medusa suggests optimal buffer size around 128K
(edit) @27567   7 years jmt12 Found a printWarning that I hadn't changed to use the FileUtils version
(edit) @27566   7 years jmt12 Making the getcpu optional - as it isn't available on Medusa (but then …
(edit) @27561   7 years jmt12 Adding very basic compile file for getcpu - can't be bothered going …
(edit) @27560   7 years jmt12 Fixing typo in regexp that meant filenames sometimes ignored
(edit) @27559   7 years jmt12 Changed mime-type away from binary - I hope. Meanwhile, generate …
(edit) @27558   7 years jmt12 Forgot that Hadoop Map processes no longer have the environment …
(edit) @27551   7 years jmt12 Altered so that it expects to be given a CSV containing parallel …
(edit) @27550   7 years jmt12 Ensure the hostname is added to the Hadoop logs so we can identify the …
(edit) @27549   7 years jmt12 Extract information from the logs generated by parallel Greenstone …
(edit) @27548   7 years jmt12 Extract information from the logs generated by parallel Greenstone …
(edit) @27547   7 years jmt12 Rejigging some processing comments
(edit) @27546   7 years jmt12 Adding the ability for the Hadoop Mapper to determine what CPU number …
(edit) @27545   7 years jmt12 Ignoring just the compiled file (for now)
(edit) @27544   7 years jmt12 A tiny C script to guesstimate the CPU the calling Process is on
(edit) @27543   7 years jmt12 Adding generate_gantt.pl script in its original form - i.e. directly …
(edit) @27532   7 years jmt12 Add the ability to configure the Thrift connector using a …
(edit) @27531   7 years jmt12 Only output the message about using copy instead of hard/soft link once
(edit) @27530   7 years jmt12 Clear out old logs, and adding more comments about what the script is …
(edit) @27526   7 years jmt12 Adding in a 'isHDFS()' function so that some plugins (SimpleVideoPlug) …