source: gs2-extensions/parallel-building

Diff Rev Age Author Log Message
(edit) @27915   11 years jmt12 A new PlugOut that doesn't write any intermediate files (bar those …
(edit) @27914   11 years jmt12 Trying to get around a couple of divide-by-zero issues when generating …
(edit) @27913   11 years jmt12 Made the ingester to be used (version 1 without reduce phase, or …
(edit) @27912   11 years jmt12 Modified the compilation to include the new ingester and its co-requisites.
(edit) @27911   11 years jmt12 Modified the compilation to include the new ingester and its co-requisites
(edit) @27910   11 years jmt12 Extended the existing HadoopGreenstoneIngest with proper Reduce phase …
(edit) @27753   11 years jmt12 Adding Handbrake's percentage complete to report - although this is …
(edit) @27752   11 years jmt12 Data locality file not being found is no longer fatal (HDFS-NFS-Proxy …
(edit) @27732   11 years jmt12 Nice the copy itself too
(edit) @27686   11 years jmt12 A little more progress comments
(edit) @27685   11 years jmt12 in the case of multiple attempts you need to retain the information …
(edit) @27684   11 years jmt12 Adding natural sorting into report generation - so also needed to add …
(edit) @27683   11 years jmt12 moving a few more headings around to help with information block layout
(edit) @27682   11 years jmt12 Copying makeAllDirectories() from vanilla FileUtils.pm
(edit) @27669   11 years jmt12 Sort compute nodes naturally before labelling them with incremental …
(edit) @27654   11 years jmt12 Add the ability to stagger the starting of Mappers by placing a …
(edit) @27653   11 years jmt12 Forgot to pull self off the head of arguments
(edit) @27652   11 years jmt12 Changing buffer to 128K (slightly faster) and adding a comment …
(edit) @27651   11 years jmt12
(edit) @27650   11 years jmt12
(edit) @27649   11 years jmt12 No longer in SVN control
(edit) @27648   11 years jmt12 Template for setup.bash - a user will have to populate Hadoop fields
(edit) @27645   11 years jmt12
(edit) @27644   11 years jmt12 Extended to support HDFS-access via NFS. This applies to both the call …
(edit) @27643   11 years jmt12 Changed the script generator so it can recurse through directories and …
(edit) @27642   11 years jmt12 A script I downloaded that successfully splits video files - something …
(edit) @27641   11 years jmt12 Altered order of arguments and allow archives dir to be passed as …
(edit) @27640   11 years jmt12
(edit) @27638   11 years jmt12 Change it so failure to open a filehandle isn't fatal - leave it up to …
(edit) @27631   11 years jmt12 A proxy to allow NFS access to HDFS
(edit) @27595   11 years jmt12 Updating list of untarred directories to ignore
(edit) @27594   11 years jmt12 Extend hadoop_import.pl to be able to start and stop the Thrift server(s)
(edit) @27593   11 years jmt12 Need Class Accessor for Thrift client under Rocks
(edit) @27592   11 years jmt12 Adding in a script to allow a daemon version of Thrift to be started …
(edit) @27591   11 years jmt12 Ensure Thrift will, by default, attempt to connect to the local …
(edit) @27590   11 years jmt12 Adding statistics about data locality, and highlighting tasks where …
(edit) @27589   11 years jmt12 Fixing up some minor bugs in regex's
(edit) @27588   11 years jmt12 Extend parser to support jobs that are split over several logs. Also …
(edit) @27587   11 years jmt12 Allow debug mode to be enabled from the command line
(edit) @27586   11 years jmt12 Updating script to take date of hadoop job into account when searching …
(edit) @27585   11 years jmt12 The perl on Medusa won't let you immediately treat a returned array in …
(edit) @27584   11 years jmt12 I wasn't doing -r when attempting to clear directories left in /tmp by …
(edit) @27583   11 years jmt12 Adding code to differentiate between workers in a cluster - all of …
(edit) @27571   11 years jmt12 increase timeout to 4 hours per map
(edit) @27570   11 years jmt12 Make the warning about binmode() not being applicable more meaningful, …
(edit) @27569   11 years jmt12 Trying to streamline the error messages from failing to link …
(edit) @27568   11 years jmt12 Testing on Medusa suggests optimal buffer size around 128K
(edit) @27567   11 years jmt12 Found a printWarning that I hadn't changed to use the FileUtils version
(edit) @27566   11 years jmt12 Making the getcpu optional - as it isn't available on Medusa (but then …
(edit) @27561   11 years jmt12 Adding very basic compile file for getcpu - can't be bothered going …
(edit) @27560   11 years jmt12 Fixing typo in regexp that meant filenames sometimes ignored
(edit) @27559   11 years jmt12 Changed mime-type away from binary - I hope. Meanwhile, generate …
(edit) @27558   11 years jmt12 Forgot that Hadoop Map processes no longer have the environment …
(edit) @27551   11 years jmt12 Altered so that it expects to be given a CSV containing parallel …
(edit) @27550   11 years jmt12 Ensure the hostname is added to the Hadoop logs so we can identify the …
(edit) @27549   11 years jmt12 Extract information from the logs generated by parallel Greenstone …
(edit) @27548   11 years jmt12 Extract information from the logs generated by parallel Greenstone …
(edit) @27547   11 years jmt12 Rejigging some processing comments
(edit) @27546   11 years jmt12 Adding the ability for the Hadoop Mapper to determine what CPU number …
(edit) @27545   11 years jmt12 Ignoring just the compiled file (for now)
(edit) @27544   11 years jmt12 A tiny C script to guesstimate the CPU the calling Process is on
(edit) @27543   11 years jmt12 Adding generate_gantt.pl script in its original form - i.e. directly …
(edit) @27532   11 years jmt12 Add the ability to configure the Thrift connector using a …
(edit) @27531   11 years jmt12 Only output the message about using copy instead of hard/soft link once
(edit) @27530   11 years jmt12 Clear out old logs, and adding more comments about what the script is …
(edit) @27526   11 years jmt12 Adding in a 'isHDFS()' function so that some plugins (SimpleVideoPlug) …
(edit) @27525   11 years jmt12 Adding in a 'isHDFS()' function so that some plugins (SimpleVideoPlug) …
(edit) @27515   11 years jmt12 Making the file used during buffer tests be configurable
(edit) @27514   11 years jmt12 Altering code to allow configurable length of read/write buffer when …
(edit) @27512   11 years jmt12 Adding in a special test for measuring the effect of altering ThriftFS …
(edit) @27496   11 years jmt12 Replacing a smelly old util::file_exists() with a snazzy new …
(edit) @27495   11 years jmt12 removing doubled up debug comments and putting some paths in …
(edit) @27494   11 years jmt12 Fixing a truncated comment - or maybe I never wrote an end to it…
(edit) @27493   11 years jmt12 No longer required - not that sure why it was required in the first place
(edit) @27492   11 years jmt12 Some versions of Hadoop add host and protocol information into paths - …
(edit) @27491   11 years jmt12 Repairing three bugs in makeAllDirectories - incorrect pattern meant …
(edit) @27490   11 years jmt12 No longer required
(edit) @27489   11 years jmt12 Shouldn't have been here
(edit) @27488   11 years jmt12 Since I've got rid of the thousand DateTime prereq modules, I can …
(edit) @27487   11 years jmt12 Ensure Parallel Building path in environment (for ThriftFS) and that …
(edit) @27481   11 years jmt12 Adding makeAllDirectories() (which I'd only implemented in LocalFS) to …
(edit) @27480   11 years jmt12 Removing DateTime dependency (so HDFSShell will always fail …
(edit) @27479   11 years jmt12 Remove time parsing as DateTime is a fricking nightmare to install …
(edit) @27478   11 years jmt12 Be a bit smarter about locating Perl version if not provided (rather …
(edit) @27477   11 years jmt12 Changed the order of additions to java classpath to ensure that my …
(edit) @27476   11 years jmt12
(edit) @27475   11 years jmt12
(edit) @27474   11 years jmt12 Ignoring extracted modules
(edit) @27473   11 years jmt12 I give up - I'll just do without DateTime
(edit) @27472   11 years jmt12
(edit) @27471   11 years jmt12
(edit) @27470   11 years jmt12 No longer need to ignore all the files I moved to cpan
(edit) @27469   11 years jmt12 A dedicated folder to hold the many, many CPAN modules I've ended up …
(edit) @27468   11 years jmt12 The (repeated) functionality of all these scripts moved into CPAN.sh
(edit) @27467   11 years jmt12 An advanced cascade script that will loop through an (evergrowing) …
(edit) @27466   11 years jmt12 moved to cpan folder in packages
(edit) @27465   11 years jmt12 Moved to cpan directory so I can use smarter cascade make
(edit) @27463   11 years jmt12
(edit) @27462   11 years jmt12
(edit) @27461   11 years jmt12