|
|
@13318
|
18 years |
mdewsnip |
Now removes all whitespace from the empty line between the chunks.
|
|
|
@13296
|
18 years |
mdewsnip |
Utility script to convert Excel Unicode text files into the GTI …
|
|
|
@13223
|
18 years |
shaoqun |
used utf-8 coding for the input stream
|
|
|
@13216
|
18 years |
mdewsnip |
Now makes the chunk keys XML safe also, to prevent XML errors with the …
|
|
|
@13186
|
18 years |
kjdon |
removed RecPlug -use_metadata_files option, added MetadataXMLPlug
|
|
|
@13169
|
18 years |
kjdon |
debug mode now passes debug flag to plugout rather than using …
|
|
|
@13166
|
18 years |
mdewsnip |
Can now obtain documents from the web containing spaces.
|
|
|
@13165
|
18 years |
mdewsnip |
Need to unescape filename separators on Windows.
|
|
|
@13163
|
18 years |
kjdon |
changed groupsize mode back to 3 as per John Rose comment
|
|
|
@13072
|
18 years |
kjdon |
fixed a bug where the first page was given pagenumber of 2
|
|
|
@13067
|
18 years |
kjdon |
if we are appending, and a lucene collection, then set builddir to be …
|
|
|
@13054
|
18 years |
mdewsnip |
Now puts the terms through xmlSafe() as well, to prevent invalid XML …
|
|
|
@13053
|
18 years |
kjdon |
removed some 'use xxx' statements for modules which are not used
|
|
|
@12993
|
18 years |
mdewsnip |
Now stores the query results XML in a string buffer before outputting …
|
|
|
@12991
|
18 years |
mdewsnip |
Ooops... managed to lose the header of the XML output in my recent changes.
|
|
|
@12989
|
18 years |
mdewsnip |
Follow to close the searcher object.
|
|
|
@12987
|
18 years |
mdewsnip |
You can now specify the query string as a command-line argument to …
|
|
|
@12983
|
18 years |
mdewsnip |
Moved the stuff for running the query into a new runQuery function, in …
|
|
|
@12981
|
18 years |
mdewsnip |
Tidied up command-line option parsing in preparation for allowing the …
|
|
|
@12980
|
18 years |
mdewsnip |
Now passes the endresults value (if defined) into the …
|
|
|
@12976
|
18 years |
mdewsnip |
Rearranged some code to make the fact that the term information is now …
|
|
|
@12965
|
18 years |
kjdon |
scriptutil::check_removeold_and_keepold now has an incremental argument …
|
|
|
@12964
|
18 years |
kjdon |
added a new option: -incremental, which invokes David's archives.inf …
|
|
|
@12903
|
18 years |
kjdon |
remove the trailing slash from cache_dir - on Windows this stuffs things up
|
|
|
@12878
|
18 years |
mdewsnip |
Fixed a bug where '&' characters in filenames aren't made XML safe.
|
|
|
@12874
|
18 years |
mdewsnip |
No longer sets the plugin's input encoding back to auto, to prevent it …
|
|
|
@12873
|
18 years |
mdewsnip |
Can now obtain multiple documents for a record (and assign the …
|
|
|
@12848
|
18 years |
nzdl |
try the JAVA_HOME variable first to find java, otherwise just use …
|
|
|
@12846
|
18 years |
mdewsnip |
Minor changes.
|
|
|
@12844
|
18 years |
mdewsnip |
Incremental building and dynamic GDBM updating code, many thanks to …
|
|
|
@12821
|
18 years |
kjdon |
changed the gli modes for some options
|
|
|
@12820
|
18 years |
kjdon |
made index option glimode 4
|
|
|
@12819
|
18 years |
kjdon |
moved some options around
|
|
|
@12776
|
18 years |
mdewsnip |
Fixed a bug where misspelled words could be marked as stop words with …
|
|
|
@12775
|
18 years |
mdewsnip |
Fixed bug where some terms have zero frequency (because they don't …
|
|
|
@12774
|
18 years |
kjdon |
new jar file after xmlSafe change, see log of GS2LuceneQuery.java
|
|
|
@12770
|
18 years |
mdewsnip |
Changed the Lucene "-fuzzy" argument to "-fuzziness <value>", for more …
|
|
|
@12706
|
18 years |
mdewsnip |
Added a "-records_per_folder" option to explode_metadata_database.pl, …
|
|
|
@12704
|
18 years |
davidb |
RTF conversion upgraded so it can also use the Windows scripting option.
|
|
|
@12691
|
18 years |
kjdon |
added OIDtype and OIDmetadata to the option list. it was using OIDtype …
|
|
|
@12656
|
18 years |
mdewsnip |
Put old range filter stuff back, and added "-startresults" and …
|
|
|
@12653
|
18 years |
mdewsnip |
Made it a little bit easier to use a custom set of stop words with Lucene.
|
|
|
@12640
|
18 years |
mdewsnip |
Now returns valid XML instead of an error when -listall and …
|
|
|
@12639
|
18 years |
mdewsnip |
Changed the "-collect" option to "-collection", because it's a million …
|
|
|
@12629
|
18 years |
mdewsnip |
Merged the "-listall" and "-describeall" code, and made both always …
|
|
|
@12625
|
18 years |
mdewsnip |
Removed the DTD stuff from the top of the XML output... it's just one …
|
|
|
@12622
|
18 years |
jrm21 |
if we can't open an output file, also give the operating system's …
|
|
|
@12619
|
18 years |
kjdon |
it seems that when I added in the option OIDmetadata, I didn't …
|
|
|
@12616
|
18 years |
kjdon |
now accepts -h as well as --help
|
|
|
@12615
|
18 years |
kjdon |
changed slightly the checking of how many args we have left after …
|
|
|
@12614
|
18 years |
kjdon |
changed plugin to classifier in the cut and pasted text
|
|
|
@12613
|
18 years |
kjdon |
changed slightly the checking of how many args we have left after …
|
|
|
@12598
|
18 years |
shaoqun |
added mapping_file option for MARCXML plugout
|
|
|
@12594
|
18 years |
shaoqun |
added code that uses MARCXML mapping file
|
|
|
@12593
|
18 years |
shaoqun |
a util class that converts a string to its lowercase
|
|
|
@12574
|
18 years |
shaoqun |
remove the default value for cache_dir because the HOME environment variable is not set
|
|
|
@12566
|
18 years |
mdewsnip |
Added fix for warnings when submitting to a new file.
|
|
|
@12545
|
18 years |
kjdon |
changed parse2::parse so that it returns -1 on error, 0 on success, or …
|
|
|
@12500
|
18 years |
kjdon |
hide the keepold and removeold options from gli
|
|
|
@12484
|
18 years |
mdewsnip |
The username of the person who did the translations is now recorded in …
|
|
|
@12483
|
18 years |
mdewsnip |
Changed the way the Updated comments are dealt with, in preparation …
|
|
|
@12481
|
18 years |
mdewsnip |
Turned tutorial translation off by default.
|
|
|
@12458
|
18 years |
kjdon |
gzip option is only a flag, so don't pass a value to plugouts
|
|
|
@12429
|
18 years |
mdewsnip |
Changed the "-filter" argument to use a general Lucene QueryFilter, …
|
|
|
@12425
|
18 years |
mdewsnip |
Fixed a bug where buildcol would try to continue when invalid …
|
|
|
@12418
|
18 years |
mdewsnip |
Now returns parse exceptions and too many clauses exceptions as …
|
|
|
@12415
|
18 years |
mdewsnip |
Moved the code that messes around with the query to add the fuzziness …
|
|
|
@12408
|
18 years |
mdewsnip |
Added a "-filter" option which can currently be used for specifying …
|
|
|
@12406
|
18 years |
shaoqun |
uses the cache_dir parameter rather than $ENV{'HOME'} to get the cache dir
|
|
|
@12399
|
18 years |
kjdon |
I rearranged some stuff so that all the essential checks are done …
|
|
|
@12394
|
18 years |
shaoqun |
class for xslt transformation
|
|
|
@12390
|
18 years |
mdewsnip |
More fixes, many thanks to John Thompson and DL Consulting Ltd.
|
|
|
@12387
|
18 years |
mdewsnip |
Fixes for fuzzy searching, many thanks to John Thompson and DL …
|
|
|
@12377
|
18 years |
mdewsnip |
Now returns query term occurrences correctly, and does fuzzy searching …
|
|
|
@12375
|
18 years |
mdewsnip |
Ooops... StopWord output went to STDERR instead of STDOUT.
|
|
|
@12373
|
18 years |
mdewsnip |
Bit of a tidy up, particularly regarding command-line option parsing.
|
|
|
@12372
|
18 years |
mdewsnip |
Now returns the stop words that have been removed from the query.
|
|
|
@12370
|
18 years |
kjdon |
now create the archives directory here rather than expecting plugouts …
|
|
|
@12364
|
18 years |
mdewsnip |
Now uses the t variable to control whether a "some" or "all" search is …
|
|
|
@12361
|
18 years |
kjdon |
changed a comment
|
|
|
@12360
|
18 years |
kjdon |
changed sortmeta type to metadata instead of metadatum, cos the latter …
|
|
|
@12359
|
18 years |
kjdon |
added a comment
|
|
|
@12358
|
18 years |
kjdon |
added back in the unshift that I removed in last commit
|
|
|
@12357
|
18 years |
mdewsnip |
Put back in adding GSDLCOLLECTDIR/perllib to INC, so …
|
|
|
@12355
|
18 years |
kjdon |
updated to use plugouts instead of docsave
|
|
|
@12354
|
18 years |
kjdon |
removed docsave reference
|
|
|
@12342
|
18 years |
kjdon |
added modegli=3 to maxnumeric option
|
|
|
@12341
|
18 years |
shaoqun |
fixed the bugs
|
|
|
@12338
|
18 years |
kjdon |
added the maxnumeric option to buildcol. so it can be set in GLI. …
|
|
|
@12335
|
18 years |
shaoqun |
now it uses plugouts
|
|
|
@12334
|
18 years |
shaoqun |
a module that displays plugouts info
|
|
|
@12333
|
18 years |
shaoqun |
now it uses plugout
|
|
|
@12290
|
18 years |
kjdon |
had to add 'use FileHandle' to this file - was getting an error about …
|
|
|
@12275
|
18 years |
mdewsnip |
Added a command-line option for sorting the search results.
|
|
|
@12266
|
18 years |
kjdon |
added a new option: OIDmetadata, which is used with OIDtype=assigned, …
|
|
|
@12264
|
18 years |
mdewsnip |
New classes to support incremental building with Lucene, many thanks …
|
|
|
@12261
|
18 years |
mdewsnip |
Changed the way query terms are output. Also shows the number of query …
|
|
|
@12260
|
18 years |
mdewsnip |
Updated to include the new query term stuff, and the classes in packages.
|
|
|
@12258
|
18 years |
mdewsnip |
Now references the GS2Lucene classes in the org.nzdl.gsdl.LuceneWrap …
|
|
|
@12255
|
18 years |
mdewsnip |
Upgraded the version of Lucene from 1.4.1 to 2.0.0... what's the worst …
|
|
|