root/gs2-extensions/parallel-building

Rev Chgset Date Author Log Message
@28013 [28013] 6 years jmt12 A new script to run a battery of Hadoop ingests at varying replication …
@28012 [28012] 6 years jmt12 Express start time as a double as well
@28011 [28011] 6 years jmt12 Turn off debugging in the copy in SVN
@28010 [28010] 6 years jmt12 Correctly set up the environment for calls to txt2tdb and also replace …
@28001 [28001] 6 years jmt12 Write datestamp using dbutil if applicable
@27996 [27996] 6 years jmt12 A new version of the archive with minor changes to log4j configuration
@27995 [27995] 6 years jmt12 Just adding some code comments
@27915 [27915] 6 years jmt12 A new PlugOut that doesn't write any intermediate files (bar those …
@27914 [27914] 6 years jmt12 Trying to get around a couple of divide-by-zero issues when generating …
@27913 [27913] 6 years jmt12 Made the ingester to be used (version 1 without reduce phase, or version 2 …
@27912 [27912] 6 years jmt12 Modified the compilation to include the new ingester and its …
@27911 [27911] 6 years jmt12 Modified the compilation to include the new ingester and its co-requisites
@27910 [27910] 6 years jmt12 Extended the existing HadoopGreenstoneIngest with proper Reduce phase - …
@27753 [27753] 6 years jmt12 Adding Handbrake's percentage complete to report - although this is …
@27752 [27752] 6 years jmt12 Data locality file not being found is no longer fatal (HDFS-NFS-Proxy …
@27732 [27732] 6 years jmt12 Nice the copy itself too
@27686 [27686] 6 years jmt12 A little more progress comments
@27685 [27685] 6 years jmt12 in the case of multiple attempts you need to retain the information about …
@27684 [27684] 6 years jmt12 Adding natural sorting into report generation - so also needed to add INC …
@27683 [27683] 6 years jmt12 moving a few more headings around to help with information block layout
@27682 [27682] 6 years jmt12 Copying makeAllDirectories() from vanilla FileUtils.pm
@27669 [27669] 6 years jmt12 Sort compute nodes naturally before labelling them with incremental worker …
@27654 [27654] 6 years jmt12 Add the ability to stagger the starting of Mappers by placing a 'delay.me' …
@27653 [27653] 6 years jmt12 Forgot to pull self off the head of arguments
@27652 [27652] 6 years jmt12 Changing buffer to 128K (slightly faster) and adding a comment explaining …
@27651 [27651] 6 years jmt12
@27650 [27650] 6 years jmt12
@27649 [27649] 6 years jmt12 No longer in SVN control
@27648 [27648] 6 years jmt12 Template for setup.bash - a user will have to populate Hadoop fields
@27645 [27645] 6 years jmt12
@27644 [27644] 6 years jmt12 Extended to support HDFS-access via NFS. This applies to both the call to …
@27643 [27643] 6 years jmt12 Changed the script generator so it can recurse through directories and …
@27642 [27642] 6 years jmt12 A script I downloaded that successfully splits video files - something I …
@27641 [27641] 6 years jmt12 Altered order of arguments and allow archives dir to be passed as argument …
@27640 [27640] 6 years jmt12
@27638 [27638] 6 years jmt12 Change it so failure to open a filehandle isn't fatal - leave it up to the …
@27631 [27631] 6 years jmt12 A proxy to allow NFS access to HDFS
@27595 [27595] 7 years jmt12 Updating list of untarred directories to ignore
@27594 [27594] 7 years jmt12 Extend hadoop_import.pl to be able to start and stop the Thrift server(s)
@27593 [27593] 7 years jmt12 Need Class Accessor for Thrift client under Rocks
@27592 [27592] 7 years jmt12 Adding in a script to allow a daemon version of Thrift to be started (and …
@27591 [27591] 7 years jmt12 Ensure Thrift will, by default, attempt to connect to the local machine …
@27590 [27590] 7 years jmt12 Adding statistics about data locality, and highlighting tasks where file …
@27589 [27589] 7 years jmt12 Fixing up some minor bugs in regex's
@27588 [27588] 7 years jmt12 Extend parser to support jobs that are split over several logs. Also …
@27587 [27587] 7 years jmt12 Allow debug mode to be enabled from the command line
@27586 [27586] 7 years jmt12 Updating script to take date of hadoop job into account when searching for …
@27585 [27585] 7 years jmt12 The perl on Medusa won't let you immediately treat a returned array in a …
@27584 [27584] 7 years jmt12 I wasn't doing -r when attempting to clear directories left in /tmp by …
@27583 [27583] 7 years jmt12 Adding code to differentiate between workers in a cluster - all of which …
@27571 [27571] 7 years jmt12 increase timeout to 4 hours per map
@27570 [27570] 7 years jmt12 Make the warning about binmode() not being applicable more meaningful, and …
@27569 [27569] 7 years jmt12 Trying to streamline the error messages from failing to link (otherwise I …
@27568 [27568] 7 years jmt12 Testing on Medusa suggests optimal buffer size around 128K
@27567 [27567] 7 years jmt12 Found a printWarning that I hadn't changed to use the FileUtils version
@27566 [27566] 7 years jmt12 Making the getcpu optional - as it isn't available on Medusa (but then I …
@27561 [27561] 7 years jmt12 Adding very basic compile file for getcpu - can't be bothered going …
@27560 [27560] 7 years jmt12 Fixing typo in regexp that meant filenames sometimes ignored
@27559 [27559] 7 years jmt12 Changed mime-type away from binary - I hope. Meanwhile, generate …
@27558 [27558] 7 years jmt12 Forgot that Hadoop Map processes no longer have the environment …
@27551 [27551] 7 years jmt12 Altered so that it expects to be given a CSV containing parallel …
@27550 [27550] 7 years jmt12 Ensure the hostname is added to the Hadoop logs so we can identify the …
@27549 [27549] 7 years jmt12 Extract information from the logs generated by parallel Greenstone using …
@27548 [27548] 7 years jmt12 Extract information from the logs generated by parallel Greenstone using …
@27547 [27547] 7 years jmt12 Rejigging some processing comments
@27546 [27546] 7 years jmt12 Adding the ability for the Hadoop Mapper to determine what CPU number it …
@27545 [27545] 7 years jmt12 Ignoring just the compiled file (for now)
@27544 [27544] 7 years jmt12 A tiny C script to guesstimate the CPU the calling Process is on
@27543 [27543] 7 years jmt12 Adding generate_gantt.pl script in its original form - i.e. directly reads …
@27532 [27532] 7 years jmt12 Add the ability to configure the Thrift connector using a 'thrift.conf' …
@27531 [27531] 7 years jmt12 Only output the message about using copy instead of hard/soft link once
@27530 [27530] 7 years jmt12 Clear out old logs, and adding more comments about what the script is …
@27526 [27526] 7 years jmt12 Adding in an 'isHDFS()' function so that some plugins (SimpleVideoPlug) can …
@27525 [27525] 7 years jmt12 Adding in an 'isHDFS()' function so that some plugins (SimpleVideoPlug) can …
@27515 [27515] 7 years jmt12 Making the file used during buffer tests configurable
@27514 [27514] 7 years jmt12 Altering code to allow configurable length of read/write buffer when …
@27512 [27512] 7 years jmt12 Adding in a special test for measuring the effect of altering ThriftFS …
@27496 [27496] 7 years jmt12 Replacing a smelly old util::file_exists() with a snazzy new …
@27495 [27495] 7 years jmt12 removing doubled up debug comments and putting some paths in speechmarks …
@27494 [27494] 7 years jmt12 Fixing a truncated comment - or maybe I never wrote an end to it…
@27493 [27493] 7 years jmt12 No longer required - not that sure why it was required in the first place
@27492 [27492] 7 years jmt12 Some versions of Hadoop add host and protocol information into paths - and …
@27491 [27491] 7 years jmt12 Repairing three bugs in makeAllDirectories - incorrect pattern meant paths …
@27490 [27490] 7 years jmt12 No longer requires
@27489 [27489] 7 years jmt12 Shouldn't have been here
@27488 [27488] 7 years jmt12 Since I've got rid of the thousand DateTime prereq modules, I can revert …
@27487 [27487] 7 years jmt12 Ensure Parallel Building path in environment (for ThriftFS) and that the …
@27481 [27481] 7 years jmt12 Adding makeAllDirectories() (which I'd only implemented in LocalFS) to …
@27480 [27480] 7 years jmt12 Removing DateTime dependency (so HDFSShell will always fail …
@27479 [27479] 7 years jmt12 Remove time parsing as DateTime is a fricking nightmare to install without …
@27478 [27478] 7 years jmt12 Be a bit smarter about locating Perl version if not provided (rather than …
@27477 [27477] 7 years jmt12 Changed the order of additions to java classpath to ensure that my …
@27476 [27476] 7 years jmt12
@27475 [27475] 7 years jmt12
@27474 [27474] 7 years jmt12 Ignoring extracted modules
@27473 [27473] 7 years jmt12 I give up - I'll just do without DateTime
@27472 [27472] 7 years jmt12
@27471 [27471] 7 years jmt12
@27470 [27470] 7 years jmt12 No longer need to ignore all the files I moved to cpan
@27469 [27469] 7 years jmt12 A dedicated folder to hold the many, many CPAN modules I've ended up …