|
|
@34250
|
4 years |
ak19 |
Having tested the incorporation of Kathy's bugfix to CSVPlugin from …
|
|
|
@34249
|
4 years |
ak19 |
Dr Bainbridge in his commit 32810 had expressed that he intended to …
|
|
|
@34221
|
4 years |
ak19 |
Undid the change of converting tabstops to their entities in …
|
|
|
@34220
|
4 years |
ak19 |
1. TextPlugin takes care to preserve whitespace formatting when …
|
|
|
@34137
|
4 years |
ak19 |
Have only been able to incorporate one of Dr Bainbridge's improvements …
|
|
|
@34131
|
4 years |
ak19 |
Allowing input keep-urls-file to contain a comma followed by country …
|
|
|
@34130
|
4 years |
ak19 |
Some more tidying up while isMRI filtered collection rebuilding
|
|
|
@34129
|
4 years |
ak19 |
Implemented Kathy's suggestions: 1. Explicit ex prefix to ex meta …
|
|
|
@34126
|
4 years |
ak19 |
When I'd modified the code to make the keep_urls_file non-compulsory, …
|
|
|
@34125
|
4 years |
ak19 |
Commit message went awry. Cleaned up some comments to recommit with …
|
|
|
@34124
|
4 years |
ak19 |
Decoding the title and text using the encoding seemed to have turned …
|
|
|
@34123
|
4 years |
ak19 |
Some more minor changes
|
|
|
@34122
|
4 years |
ak19 |
1. After some testing of building the complete commoncrawl collection, …
|
|
|
@34121
|
4 years |
ak19 |
1. Introducing NutchTextDumpPlugin to process the records …
|
|
|
@33721
|
4 years |
ak19 |
Inactive but committing to svn: Newer Locale.pm file, and introducing …
|
|
|
@33389
|
5 years |
kjdon |
store csv field array associated with filename, because you might have …
|
|
|
@33309
|
5 years |
ak19 |
More workarounds for HTML conversion results from Word's …
|
|
|
@33301
|
5 years |
ak19 |
Incorporating Dr Bainbridge's suggested fix for dealing with Word docs …
|
|
|
@33299
|
5 years |
ak19 |
1. Committing Dr Bainbridge's fix to remove duplicated heading titles …
|
|
|
@32984
|
5 years |
davidb |
Commented out debug print statements
|
|
|
@32819
|
5 years |
kjdon |
fixed a comment
|
|
|
@32790
|
5 years |
ak19 |
Suffixing .inactive to the GreenstoneSQL plugs because if perl package …
|
|
|
@32783
|
5 years |
kjdon |
adding missing strings and tidying up some mislabelling
|
|
|
@32778
|
5 years |
ak19 |
Need to set the surrounding div width to be the same/no more than the …
|
|
|
@32777
|
5 years |
ak19 |
Minor. Correction to plugin name displayed
|
|
|
@32761
|
5 years |
kjdon |
when printing out arg values for some other thing, I noticed that site …
|
|
|
@32760
|
5 years |
kjdon |
merge_inheritance, if it finds a conflict in option values, will keep …
|
|
|
@32643
|
5 years |
ak19 |
1. Previous commit (r32640) reintroduced an earlier bug in attempting …
|
|
|
@32640
|
5 years |
ak19 |
Important changes (and commented out debugging statements) to get …
|
|
|
@32595
|
6 years |
ak19 |
Major tidying up: last remaining debug statements, lots of comments, …
|
|
|
@32592
|
6 years |
ak19 |
Renamed gssql.pm to gsmysql.pm. Not subclassing the old gssql into …
|
|
|
@32591
|
6 years |
ak19 |
1. gssql destructor DESTROY doesn't really do anything now, as DBI's …
|
|
|
@32589
|
6 years |
ak19 |
1. SQL db password is not compulsory. 2. Forgot to add the …
|
|
|
@32586
|
6 years |
ak19 |
Renaming 'site_name' parameter used by GS SQL Plugout and Plugin to …
|
|
|
@32584
|
6 years |
ak19 |
Some more tidying up of the code.
|
|
|
@32583
|
6 years |
ak19 |
1. Some tidying up of the code. 2. Removing unnecessary calls to …
|
|
|
@32582
|
6 years |
ak19 |
Now that previous commit(s) put sig handlers in place in gs_sql, have …
|
|
|
@32580
|
6 years |
ak19 |
1. support for port param when connecting to SQL DB. 2. GS SQL Plugout …
|
|
|
@32578
|
6 years |
ak19 |
Optimising. The gssql class internally has only one shared connection …
|
|
|
@32577
|
6 years |
ak19 |
Forgot to call superclass in overridden removeall(). Nothing broke so …
|
|
|
@32575
|
6 years |
ak19 |
1. gssql now does fetching all rows internally upon select. With this …
|
|
|
@32571
|
6 years |
ak19 |
Optimised the SQL DB delete operations in case there are several in …
|
|
|
@32570
|
6 years |
ak19 |
1. Bugfix for when renaming an imported doc and …
|
|
|
@32565
|
6 years |
ak19 |
I think this is a bugfix to plugin.pm::remove_some(): when processing …
|
|
|
@32563
|
6 years |
ak19 |
1. Overhaul of GreenstoneSQLPlugs to handle removeold and incremental …
|
|
|
@32562
|
6 years |
ak19 |
Before major changes to GSSQLPlugs, committing useful comments to …
|
|
|
@32560
|
6 years |
ak19 |
gssql constructor accepts a verbosity parameter
|
|
|
@32559
|
6 years |
ak19 |
Removing db_encoding as parameters to GreenstoneSQLPlugout and …
|
|
|
@32556
|
6 years |
ak19 |
Tested to find DBI connection attempt fails immediately when MySQL …
|
|
|
@32555
|
6 years |
ak19 |
1. In GreenstoneSQLPlugout, removeold is now parameterised (as are …
|
|
|
@32544
|
6 years |
ak19 |
1. GreenstoneSQLPlugin: now sub read() calls the new lazy_get_gssql() …
|
|
|
@32543
|
6 years |
ak19 |
Tidying up and adjusting TODO statements
|
|
|
@32542
|
6 years |
ak19 |
Instead of the docoid being stored in the docsql-<OID>.xml filename, …
|
|
|
@32541
|
6 years |
ak19 |
Using proper parameters to GreenstoneSQLPlugin/Plugout instead of …
|
|
|
@32539
|
6 years |
ak19 |
New plugin parameter site_name (only set for GS3) that is passed to …
|
|
|
@32538
|
6 years |
ak19 |
Previous commit message meant to be: string names of strings shared by …
|
|
|
@32537
|
6 years |
ak19 |
First commit to do with reading back in from the SQL DB. This commit …
|
|
|
@32536
|
6 years |
ak19 |
First commit to do with reading back in from the SQL DB. This commit …
|
|
|
@32535
|
6 years |
ak19 |
Fixing couple of typos before major commit.
|
|
|
@32501
|
6 years |
Georgiy Litvinov |
Workaround to set assign metadata via csv metadata plugin. "Section" …
|
|
|
@32500
|
6 years |
ak19 |
For a test case, best_encoding comes out with prefix/suffix to utf_8, …
|
|
|
@32499
|
6 years |
ak19 |
Fix for PDFv2 plugin's page buckets.
|
|
|
@32343
|
6 years |
kjdon |
quantifiers mustn't have \ before {
|
|
|
@32341
|
6 years |
ak19 |
1. Fixing up regex syntax in DirectoryPlugin for perl 5.26 that comes …
|
|
|
@32332
|
6 years |
kjdon |
removed replace_images function. this inherits from HTMLPlugin, and …
|
|
|
@32325
|
6 years |
ak19 |
Dr Bainbridge worked out the solution to HTMLPlugin not handling …
|
|
|
@32305
|
6 years |
ak19 |
1. When a plugin's built on multiple inheritance, the first n-1 …
|
|
|
@32303
|
6 years |
ak19 |
Forgot to update the plugin descriptions for the PDF plugins.
|
|
|
@32290
|
6 years |
ak19 |
1. Making paged_pretty_html the default rather than pretty_html, since …
|
|
|
@32289
|
6 years |
ak19 |
The PDFPlugin is being deprecated (since PDFv1 and PDFv2 plugins are …
|
|
|
@32287
|
6 years |
ak19 |
Cleaning up unused strings, some debug statements and recently …
|
|
|
@32286
|
6 years |
ak19 |
PDFv2Plugin will only work out of the box for GS3 now: PDFBoxConverter …
|
|
|
@32285
|
6 years |
ak19 |
Fix to sectionalising xpdftools' produced paged_pretty_html: Dr …
|
|
|
@32284
|
6 years |
ak19 |
PDFv2Plugin doesn't offer a zoom flag anymore, replaced with a dpi …
|
|
|
@32283
|
6 years |
ak19 |
More stable behaviour by PDFv2Plugin: 1. when pdfbox_conversion is on, …
|
|
|
@32281
|
6 years |
ak19 |
Undoing accidental commit
|
|
|
@32280
|
6 years |
ak19 |
Implementing PDFv2paged_text (with pdfbox)
|
|
|
@32277
|
6 years |
ak19 |
First attempt at PDFv2Plugin.pm.
|
|
|
@32275
|
6 years |
ak19 |
Moving another fixed English language string into strings.properties …
|
|
|
@32274
|
6 years |
ak19 |
Related to previous commit, forgot to commit with previous revision. A …
|
|
|
@32273
|
6 years |
ak19 |
First of the commits to do with restructuring and refactoring the …
|
|
|
@32224
|
6 years |
ak19 |
Adding PDF to text support for Windows using Xpdf's pdftotext tool. …
|
|
|
@32223
|
6 years |
ak19 |
When no output mode for PDFPlugin has been set by the user, the output …
|
|
|
@32222
|
6 years |
ak19 |
q
|
|
|
@32215
|
6 years |
ak19 |
Before reorganising our PDFPlugin in whatever way we ultimately …
|
|
|
@32210
|
6 years |
ak19 |
When PDFPlugin is set to paged_html output mode, it now finally …
|
|
|
@32206
|
6 years |
ak19 |
1. ConvertBinaryFile.pm no longer knows more than necessary about …
|
|
|
@32205
|
6 years |
ak19 |
First set of commits to do with implementing the new 'paged_html' …
|
|
|
@32192
|
6 years |
kjdon |
with new result and result_str return values from convert, need to …
|
|
|
@32186
|
6 years |
kjdon |
if the eval didn't work, all the return values might be undefined, so …
|
|
|
@32185
|
6 years |
kjdon |
use new return values from ImageConverter::convert
|
|
|
@32184
|
6 years |
kjdon |
change the return values of convert to match tmp_area_convert_file in …
|
|
|
@32183
|
6 years |
kjdon |
image height and width might be returned as 'unknown', in particular …
|
|
|
@32159
|
6 years |
ak19 |
incremental building was not being incremental when no metadata was …
|
|
|
@32131
|
6 years |
kjdon |
don't want the initial , if trying to match 41 times. this is a syntax …
|
|
|
@32129
|
6 years |
kjdon |
After () in a regex, {} signifies quantifiers. eg (xx){2,4} - 2-4 …
|
|
|
@32122
|
6 years |
kjdon |
we had [a-z]{2..} which causes an error in later versions of perl. …
|
|
|
@32096
|
6 years |
ak19 |
Marking all the uses of sysread() with a comment saying they're a …
|
|
|
@32028
|
7 years |
ak19 |
1. Bugfix to previous commit: var might not be on an …
|
|
|
@32026
|
7 years |
ak19 |
Some more placeholder strings for the UnknownConverterPlugin to …
|
|
|