source: gsdl/trunk/perllib

Diff Rev Age Author Log Message
(edit) @17668   15 years kjdon added a counter into the filenames for downloaded documents - …
(edit) @17666   15 years max Added file extensions for JPEG2000
(edit) @17664   15 years ak19 In the case of OAIDownloads, WgetDownload no longer uses sockets for …
(edit) @17591   15 years kjdon changed a comment
(edit) @17590   15 years kjdon commit 17320 means that DirectoryPlugin now assumes that filepaths in …
(edit) @17588   15 years kjdon OAIPlugin wasn't calling set_Source_metadata
(edit) @17587   15 years kjdon added a couple of missing plugin strings
(edit) @17586   15 years kjdon added buildcol.incremental_default_builddir
(edit) @17579   15 years kjdon removed a debug print statement
(edit) @17575   15 years kjdon implemented init_for_incremental_build to read in indexfields and …
(edit) @17574   15 years kjdon now calls read_build_cfg() instead of having the code here
(edit) @17573   15 years kjdon moved a couple of things around, added read_build_cfg which finds and …
(edit) @17572   15 years kjdon moved the make_absolute method to here from buildcol.pl
(edit) @17568   15 years kjdon recoding of the text method. more closely matches mgpp one. ZZ field …
(edit) @17567   15 years kjdon if metadata is specified, only add in the ones that are not already …
(edit) @17566   15 years kjdon lucene no longer does anything with paragraphs, so we print a warning …
(edit) @17565   15 years kjdon removed some debug statements, and no longer load in the default …
(edit) @17564   15 years kjdon fixed up some stuff to do with indexfieldmap. still working on it, but …
(edit) @17549   16 years ak19 Changes to sudden wget download termination when OAIDownload.pm is …
(edit) @17547   16 years ak19 When subroutines useWget and useWgetMonitored receive the STOP signal …
(edit) @17543   16 years mdewsnip Fixed the block_exp regular expression to move the $ symbol, so it …
(edit) @17537   16 years ak19 Subroutine useWgetMonitored updated to include the modifications made …
(edit) @17533   16 years oranfry protect against a particular error message polluting XML output
(edit) @17531   16 years ak19 Now works with OAIDownload.pm for downloading over OAI. The variable …
(edit) @17530   16 years ak19 Fixed not being able to run wget from the cmd-line via downloadfrom.pl …
(edit) @17529   16 years ak19 Now WgetDownload.pm uses Sockets to communicate with GLI which …
(edit) @17528   16 years ak19 New subroutine setIsGLI to store whether or not the download is run …
(edit) @17527   16 years ak19 Now calls new subroutine setIsGLI on the download_obj to indicate …
(edit) @17513   16 years kjdon extrametadata keys need to be regexs, so windows paths need converting
(edit) @17512   16 years kjdon added a method to turn windows filename paths (with single back slash) …
(edit) @17483   16 years kjdon I just discovered that if image magick was not installed, you weren't …
(edit) @17480   16 years kjdon removed the pc namespace. the metadata is now extracted metadata, and …
(edit) @17479   16 years kjdon put this back to using block expression for now - on windows sets up …
(edit) @17476   16 years mdewsnip Support for using MSSQL for infodb databases, many thanks to Jeffrey …
(edit) @17463   16 years kjdon some mods to make this a bit more useful in response to request from …
(edit) @17462   16 years kjdon added ProCite.entry_separator
(edit) @17354   16 years ak19 Added SIGTERM and SIGINT handlers to terminate wget child process …
(edit) @17330   16 years kjdon added default values for self->input_encoding and …
(edit) @17322   16 years kjdon added a -f test on filename in can_process_this_file to prevent this …
(edit) @17321   16 years anna Removed a line break at the end of a French element.
(edit) @17320   16 years kjdon found and fixed what I think is a bug - in the metadata structures for …
(edit) @17319   16 years kjdon tidied this up and removed some old code
(edit) @17313   16 years kjdon this seemed to have been forgotten in the 'removing metadata from …
(edit) @17300   16 years kjdon removed the metadata argument from metadata_read as it's not used and …
(edit) @17294   16 years kjdon added a fix for a bug John T discovered where in get_new_doc_dir you …
(edit) @17293   16 years davidb fixed typo in function call: parsefile -> parse_file
(edit) @17290   16 years kjdon previous changes to get exploding working (using metadata_read) meant …
(edit) @17289   16 years kjdon moved the actual parsing from read into parse_file so other plugins …
(edit) @17288   16 years kjdon in add_section_content, we are regenerating doc objs from gdbm …
(edit) @17287   16 years kjdon added 'if verbosity > 3' to some print statements, and set doctype to …
(edit) @17286   16 years kjdon renamed make_infodatabase to make_infodatabase_dlc so that it's not …
(edit) @17285   16 years kjdon fixed a couple of typos in function calls
(edit) @17284   16 years ak19 The PerlDoc seems to indicate that it is necessary to call waitpid …
(edit) @17283   16 years kjdon changed a couple of print statements to be more informative
(edit) @17267   16 years anna Updated French translations. Many thanks to John Rose.
(edit) @17250   16 years kjdon forgot to pass the arguments to ImageConverter::begin()
(edit) @17249   16 years kjdon need to ignore Manifest tag in xml_start_tag
(edit) @17247   16 years ak19 The Server Information button produced nothing for some urls, since …
(edit) @17246   16 years ak19 Clearer description for the display string OAIDownload.get_doc_exts, …
(edit) @17234   16 years ak19 The GLI java class DownloadPane.java has been changed to alter the …
(edit) @17233   16 years ak19 Shorter strings for the various Download Settings in the Download …
(edit) @17232   16 years ak19 Added an extra comment to the new quit_yaz subroutine to indicate why …
(edit) @17231   16 years ak19 In subroutine quit_yaz(), while flushing yaz-client's outputstream, …
(edit) @17230   16 years ak19 SRWDownload now finally quits once it has finished. It's no longer …
(edit) @17229   16 years ak19 Moved code for starting up (including opening connections) and …
(edit) @17220   16 years ak19 One more occasion where the quit command needs to be sent
(edit) @17219   16 years ak19 Previously the yaz-client cmd-line program would not quit (still …
(edit) @17218   16 years ak19 Previously the yaz-client cmd-line program would not quit (still …
(edit) @17216   16 years kjdon trying to get OAI files exploding. Have copied in some code from one …
(edit) @17214   16 years ak19 Significant changes: 1. Textcat can be restricted to a given encoding …
(edit) @17213   16 years ak19 Significant changes to subroutine get_language_encoding to better work …
(edit) @17212   16 years ak19 Removed some unnecessary commented-out code
(edit) @17210   16 years kjdon BasDownload changed to BaseDownload
(edit) @17209   16 years kjdon BasClas renamed to BaseClassifier, tidied up constructors
(edit) @17208   16 years kjdon file rename BasClas.pm to BaseClassifier.pm
(edit) @17207   16 years kjdon BasDownload renamed to BaseDownload, also tidied up the constructors
(edit) @17206   16 years kjdon renamed file BasDownload.pm to BaseDownload.pm
(edit) @17205   16 years kjdon reordered strings - put all the plugout ones together
(edit) @17204   16 years kjdon in use_collection now always set GSDLCOLLECTION. previously was unless …
(edit) @17203   16 years kjdon BasPlugout renamed to BasePlugout. And tidied up the constructors
(edit) @17202   16 years kjdon changed BasPlugout to BasePlugout in line with package renaming. Also …
(edit) @17200   16 years kjdon renamed BasPlugout to BasePlugout
(edit) @17197   16 years kjdon previous metadata changes meant that there was no longer URL metadata …
(edit) @17196   16 years kjdon set cover_image to false as it makes no sense for images
(edit) @17144   16 years kjdon modified export.params, added scripts.gli
(edit) @17143   16 years kjdon modified check_removeold_and_keepold so that you don't need to pass in …
(edit) @17127   16 years kjdon want to block body background, so added it into tabbg_matches regex …
(edit) @17126   16 years kjdon inherit and use args from ReadTextFile because we want the file encoding stuff
(edit) @17120   16 years ak19 archivesinf_gdbm commented out until more testing under Windows has …
(edit) @17117   16 years kjdon when indexing a combined field, put the field tags around the whole …
(edit) @17112   16 years kjdon CJK text segmentation now done at indexing level (in buildproc), not …
(edit) @17111   16 years kjdon added a comment
(edit) @17110   16 years kjdon changed way cjk separation is done. Not done in plugins any more, but …
(edit) @17109   16 years kjdon moved separate_cjk from colcfg to buildcfg
(edit) @17106   16 years mdewsnip No longer writes out the document/section number entries for Lucene, …
(edit) @17105   16 years mdewsnip Not sure why "gdbm-txtgz" was made the default, particularly since …
(edit) @17104   16 years mdewsnip Arrrgghhh, someone uglied up my nice tidy code…
(edit) @17103   16 years ak19 OAI files should be explodable, so added that back in as an option
(edit) @17099   16 years kjdon in get_language_encoding, we extract head from html files. if it's not …
(edit) @17088   16 years davidb Plugin modified to only print out URL encoded filename if different to …