|
|
@35401
|
3 years |
anupama |
Committing Dr Bainbridge's improvements to the Tika-preconfigured …
|
|
|
@35173
|
3 years |
kjdon |
renamed gsConvert.pl option to verbosity (instead of verbose) - why …
|
|
|
@35166
|
3 years |
kjdon |
added code that handles utf16 surrogate pair entities.
|
|
|
@35164
|
3 years |
kjdon |
xpdf seems to output surrogate pairs into the html - these end up …
|
|
|
@34999
|
3 years |
davidb |
Indented to better align with map block
|
|
|
@34998
|
3 years |
davidb |
These changes have now been committed into SVN
|
|
|
@34997
|
3 years |
davidb |
When working with orthogonal indexes, these plugins' constructors get …
|
|
|
@34921
|
3 years |
anupama |
Committing the improvements to EmbeddedMetaPlugin's processing of …
|
|
|
@34878
|
3 years |
davidb |
Further changes to work more smoothly with JSONSparqlResultsPlugin, …
|
|
|
@34840
|
3 years |
davidb |
Changed to apply extra-metadata before trying to work out doc-id. …
|
|
|
@34690
|
3 years |
davidb |
When using an orthogonal index, the constructor is run for a second …
|
|
|
@34643
|
3 years |
davidb |
Version of file that is designed to work with planned changes in GS v3.11
|
|
|
@34250
|
4 years |
ak19 |
Having tested the incorporation of Kathy's bugfix to CSVPlugin from …
|
|
|
@34249
|
4 years |
ak19 |
Dr Bainbridge in his commit 32810 had expressed that he intended to …
|
|
|
@34221
|
4 years |
ak19 |
Undid the change of converting tabstops to their entities in …
|
|
|
@34220
|
4 years |
ak19 |
1. TextPlugin takes care to preserve whitespace formatting when …
|
|
|
@34137
|
4 years |
ak19 |
Have only been able to incorporate one of Dr Bainbridge's improvements …
|
|
|
@34131
|
4 years |
ak19 |
Allowing input keep-urls-file to contain a comma followed by country …
|
|
|
@34130
|
4 years |
ak19 |
Some more tidying up while isMRI filtered collection rebuilding
|
|
|
@34129
|
4 years |
ak19 |
Implemented Kathy's suggestions: 1. Explicit ex prefix to ex meta …
|
|
|
@34126
|
4 years |
ak19 |
When I'd modified the code to make the keep_urls_file non-compulsory, …
|
|
|
@34125
|
4 years |
ak19 |
Commit message went awry. Cleaned up some comments to recommit with …
|
|
|
@34124
|
4 years |
ak19 |
Decoding the title and text using the encoding seemed to have turned …
|
|
|
@34123
|
4 years |
ak19 |
Some more minor changes
|
|
|
@34122
|
4 years |
ak19 |
1. After some testing of building the complete commoncrawl collection, …
|
|
|
@34121
|
4 years |
ak19 |
1. Introducing NutchTextDumpPlugin to process the records …
|
|
|
@33721
|
4 years |
ak19 |
Inactive but committing to svn: Newer Locale.pm file, and introducing …
|
|
|
@33389
|
5 years |
kjdon |
store csv field array associated with filename, because you might have …
|
|
|
@33309
|
5 years |
ak19 |
More workarounds for HTML conversion results from Word's …
|
|
|
@33301
|
5 years |
ak19 |
Incorporating Dr Bainbridge's suggested fix for dealing with Word docs …
|
|
|
@33299
|
5 years |
ak19 |
1. Committing Dr Bainbridge's fix to remove duplicated heading titles …
|
|
|
@32984
|
5 years |
davidb |
Commented out debug print statements
|
|
|
@32819
|
5 years |
kjdon |
fixed a comment
|
|
|
@32790
|
5 years |
ak19 |
Suffixing .inactive to the GreenstoneSQL plugs because if perl package …
|
|
|
@32783
|
5 years |
kjdon |
adding missing strings and tidying up some mislabelling
|
|
|
@32778
|
5 years |
ak19 |
Need to set the surrounding div width to be the same/no more than the …
|
|
|
@32777
|
5 years |
ak19 |
Minor. Correction to plugin name displayed
|
|
|
@32761
|
5 years |
kjdon |
when printing out arg values for some other thing, I noticed that site …
|
|
|
@32760
|
5 years |
kjdon |
merge_inheritance, if it finds a conflict in option values, will keep …
|
|
|
@32643
|
5 years |
ak19 |
1. Previous commit (r32640) reintroduced an earlier bug in attempting …
|
|
|
@32640
|
5 years |
ak19 |
Important changes (and commented out debugging statements) to get …
|
|
|
@32595
|
5 years |
ak19 |
Major tidying up: last remaining debug statements, lots of comments, …
|
|
|
@32592
|
5 years |
ak19 |
Renamed gssql.pm to gsmysql.pm. Not subclassing the old gssql into …
|
|
|
@32591
|
5 years |
ak19 |
1. gssql destructor DESTROY doesn't really do anything now, as DBI's …
|
|
|
@32589
|
5 years |
ak19 |
1. SQL db password is not compulsory. 2. Forgot to add the …
|
|
|
@32586
|
5 years |
ak19 |
Renaming 'site_name' parameter used by GS SQL Plugout and Plugin to …
|
|
|
@32584
|
5 years |
ak19 |
Some more tidying up of the code.
|
|
|
@32583
|
5 years |
ak19 |
1. Some tidying up of the code. 2. Removing unnecessary calls to …
|
|
|
@32582
|
5 years |
ak19 |
Now that previous commit(s) put sig handlers in place in gs_sql, have …
|
|
|
@32580
|
5 years |
ak19 |
1. support for port param when connecting to SQL DB. 2. GS SQL Plugout …
|
|
|
@32578
|
5 years |
ak19 |
Optimising. The gssql class internally has only one shared connection …
|
|
|
@32577
|
5 years |
ak19 |
Forgot to call superclass in overridden removeall(). Nothing broke so …
|
|
|
@32575
|
5 years |
ak19 |
1. gssql now does fetching all rows internally upon select. With this …
|
|
|
@32571
|
5 years |
ak19 |
Optimised the SQL DB delete operations in case there are several in …
|
|
|
@32570
|
5 years |
ak19 |
1. Bugfix for when renaming an imported doc and …
|
|
|
@32565
|
5 years |
ak19 |
I think this is a bugfix to plugin.pm::remove_some(): when processing …
|
|
|
@32563
|
5 years |
ak19 |
1. Overhaul of GreenstoneSQLPlugs to handle removeold and incremental …
|
|
|
@32562
|
5 years |
ak19 |
Before major changes to GSSQLPlugs, committing useful comments to …
|
|
|
@32560
|
6 years |
ak19 |
gssql constructor accepts a verbosity parameter
|
|
|
@32559
|
6 years |
ak19 |
Removing db_encoding as parameters to GreenstoneSQLPlugout and …
|
|
|
@32556
|
6 years |
ak19 |
Tested to find DBI connection attempt fails immediately when MySQL …
|
|
|
@32555
|
6 years |
ak19 |
1. In GreenstoneSQLPlugout, removeold is now parameterised (as are …
|
|
|
@32544
|
6 years |
ak19 |
1. GreenstoneSQLPlugin: now sub read() calls the new lazy_get_gssql() …
|
|
|
@32543
|
6 years |
ak19 |
Tidying up and adjusting TODO statements
|
|
|
@32542
|
6 years |
ak19 |
Instead of the docoid being stored in the docsql-<OID>.xml filename, …
|
|
|
@32541
|
6 years |
ak19 |
Using proper parameters to GreenstoneSQLPlugin/Plugout instead of …
|
|
|
@32539
|
6 years |
ak19 |
New plugin parameter site_name (only set for GS3) that is passed to …
|
|
|
@32538
|
6 years |
ak19 |
Previous commit message meant to be: string names of strings shared by …
|
|
|
@32537
|
6 years |
ak19 |
First commit to do with reading back in from the SQL DB. This commit …
|
|
|
@32536
|
6 years |
ak19 |
First commit to do with reading back in from the SQL DB. This commit …
|
|
|
@32535
|
6 years |
ak19 |
Fixing a couple of typos before major commit.
|
|
|
@32501
|
6 years |
Georgiy Litvinov |
Workaround to set assign metadata via csv metadata plugin. "Section" …
|
|
|
@32500
|
6 years |
ak19 |
For a test case, best_encoding comes out with prefix/suffix to utf_8, …
|
|
|
@32499
|
6 years |
ak19 |
Fix for PDFv2 plugin's page buckets.
|
|
|
@32343
|
6 years |
kjdon |
quantifiers mustn't have \ before {
|
|
|
@32341
|
6 years |
ak19 |
1. Fixing up regex syntax in DirectoryPlugin for perl 5.26 that comes …
|
|
|
@32332
|
6 years |
kjdon |
removed replace_images function. this inherits from HTMLPlugin, and …
|
|
|
@32325
|
6 years |
ak19 |
Dr Bainbridge worked out the solution to HTMLPlugin not handling …
|
|
|
@32305
|
6 years |
ak19 |
1. When a plugin's built on multiple inheritance, the first n-1 …
|
|
|
@32303
|
6 years |
ak19 |
Forgot to update the plugin descriptions for the PDF plugins.
|
|
|
@32290
|
6 years |
ak19 |
1. Making paged_pretty_html the default rather than pretty_html, since …
|
|
|
@32289
|
6 years |
ak19 |
The PDFPlugin is being deprecated (since PDFv1 and PDFv2 plugins are …
|
|
|
@32287
|
6 years |
ak19 |
Cleaning up unused strings, some debug statements and recently …
|
|
|
@32286
|
6 years |
ak19 |
PDFv2Plugin will only work out of the box for GS3 now: PDFBoxConverter …
|
|
|
@32285
|
6 years |
ak19 |
Fix to sectionalising xpdftools' produced paged_pretty_html: Dr …
|
|
|
@32284
|
6 years |
ak19 |
PDFv2Plugin doesn't offer a zoom flag anymore, replaced with a dpi …
|
|
|
@32283
|
6 years |
ak19 |
More stable behaviour by PDFv2Plugin: 1. when pdfbox_conversion is on, …
|
|
|
@32281
|
6 years |
ak19 |
Undoing accidental commit
|
|
|
@32280
|
6 years |
ak19 |
Implementing PDFv2paged_text (with pdfbox)
|
|
|
@32277
|
6 years |
ak19 |
First attempt at PDFv2Plugin.pm.
|
|
|
@32275
|
6 years |
ak19 |
Moving another fixed English language string into strings.properties …
|
|
|
@32274
|
6 years |
ak19 |
Related to previous commit, forgot to commit with previous revision. A …
|
|
|
@32273
|
6 years |
ak19 |
First of the commits to do with restructuring and refactoring the …
|
|
|
@32224
|
6 years |
ak19 |
Adding PDF to text support for Windows using Xpdf's pdftotext tool. …
|
|
|
@32223
|
6 years |
ak19 |
When no output mode for PDFPlugin has been set by the user, the output …
|
|
|
@32222
|
6 years |
ak19 |
q
|
|
|
@32215
|
6 years |
ak19 |
Before reorganising our PDFPlugin in whatever way we ultimately …
|
|
|
@32210
|
6 years |
ak19 |
When PDFPlugin is set to paged_html output mode, it now finally …
|
|
|
@32206
|
6 years |
ak19 |
1. ConvertBinaryFile.pm no longer knows more than necessary about …
|
|
|
@32205
|
6 years |
ak19 |
First set of commits to do with implementing the new 'paged_html' …
|
|
|