Changeset 27420 for gs2-extensions/parallel-building
- Timestamp: 2013-05-24T09:34:37+12:00
- File: 1 edited
gs2-extensions/parallel-building/trunk/src/README.txt
--- gs2-extensions/parallel-building/trunk/src/README.txt (r24675)
+++ gs2-extensions/parallel-building/trunk/src/README.txt (r27420)
@@ -45,4 +45,6 @@
 ===== bin/script and perllib =====
 
+**Note:** The following is included for historic reasons - these changes have now been merged (or otherwise dealt with) by major changes to the way import and build scripts are run. The actual customized files in Parallel Building's perllib are now fewer in number, and tend to depend upon proper class inheritance and overriding.
+
 In order to try and make this compatible with the latest advances in the main trunk (so not the 64bit version I've been testing on), I've implemented the parallel building using a SVN head version of import.pl, buildcol.pl and perllib. I'll try to keep a list of the files I've changed here to aid in merging this code back into Greenstone:
 
@@ -62,8 +64,6 @@
 * perllib/plugin.pm: see IncrementalBuildTools
 * perllib/util.pm: made it only complain about periods (.) in the Identifier once - rather than once per document (which is a PITA when building one million documents).
-
 * perllib/dbutil/gdbm.pm: changed to call lock enabled versions of txt2db and db2txt.
 * perllib/dbutil/sqlite.pm: added WAL Pragma (for all the good it did). Also needed to redirect output (like for db_fast) as the WAL reports each type of action ("add","update", and "delete") that it has queued - very quickly becoming annoying.
-
 * perllib/plugins/DirectoryPlugin.pm: making the "Global file scan..." comment obey verbosity.
 * perllib/plugins/MARCPlugin.pm: see IncrementalBuildTools (in this case the path to cpan)
@@ -71,2 +71,38 @@
 * perllib/plugins/OAIMetadataXMLPlugin.pm: see IncrementalBuildTools (in this case the path to cpan)
 * perllib/plugins/ReadXMLPlugin.pm: see IncrementalBuildTools (in this case the path to cpan)
+
+===== Packages =====
+
+==== Bit-Vector-7.2 ====
+
+Required by Thrift's Perl API.
+
+==== Hadoop-1.1.0 ====
+
+Provides Hadoop capabilities to the extension - you can then either run Greenstone in parallel (using OpenMPI as the parallel framework) pulling the files out of HDFS, or you can run the alternate Hadoop framework import (and maybe build if I can be bothered) and make even better use of HDFS.
+
+==== IPC-Run-0.90 ====
+
+Used in the server daemons (GDBM and TDB) to provide a handle to running applications that allows bi-directional piping and better process control (get child PIDs etc).
+
+==== OpenMPI-1.4.3 ====
+
+Provides a framework within which to run Greenstone in parallel.
+
+==== Proc-Daemon-0.14 ====
+
+Perl module to allow proper daemonization of Perl processes.
+
+==== Sort-Key-1.32 ====
+
+Perl module providing better sorting algorithms, including natural sort of keys.
+
+==== ThriftFS-0.9.0 ====
+
+A custom collection of files extracted from a src install of Hadoop and Thrift providing a persistent Hadoop-Thrift server (in Java), and an API for communicating with the server from Perl.
+
+Includes a Java file providing slightly more efficient Base91 encoding/decoding (as compared to Base64). Required by tweaks to Thrift to allow binary data to be passed around as Java Strings without UTF8 encoding accidentally mangling things (if only they'd used Java Byte[]s instead).
+
+==== Tinyxml-gs-2.6.2 ====
+
+Used to parse XML 'build recipes' in the parallel version of buildcol.
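The perllib/dbutil/sqlite.pm entry above mentions adding a WAL pragma. As a rough illustration only, here is a minimal Python sketch of switching an SQLite database to write-ahead logging. It is not the extension's Perl code; the function names, file name and table layout are all hypothetical and chosen just for this example.

```python
import os
import sqlite3
import tempfile


def open_wal_database(path):
    """Open (or create) an SQLite database and ask for write-ahead logging."""
    conn = sqlite3.connect(path)
    # PRAGMA journal_mode returns one row holding the mode actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode


def demo():
    # Hypothetical database file; WAL applies to on-disk databases only
    # (an in-memory database keeps reporting the "memory" journal mode).
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "example.db")
    conn, mode = open_wal_database(path)
    # Hypothetical table, only here to show a write going through.
    conn.execute("CREATE TABLE IF NOT EXISTS docs (id TEXT PRIMARY KEY, body TEXT)")
    conn.execute("INSERT INTO docs VALUES (?, ?)", ("D1", "hello"))
    conn.commit()
    count = conn.execute("SELECT COUNT(*) FROM docs").fetchone()[0]
    conn.close()
    return mode, count


if __name__ == "__main__":
    print(demo())
```

Checking the value returned by the pragma is worthwhile, since SQLite reports the journal mode that actually took effect rather than the one requested.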