source: main/trunk/greenstone2/perllib/plugins

Diff Rev Age Author Log Message
(edit) @35401   3 years anupama Committing Dr Bainbridge's improvements to the Tika-preconfigured …
(edit) @35173   3 years kjdon renamed gsConvert.pl option to verbosity (instead of verbose) - why …
(edit) @35166   3 years kjdon added code that handles utf16 surrogate pair entities.
(edit) @35164   3 years kjdon xpdf seems to output surrogate pairs into the html - these end up …
(edit) @34999   3 years davidb Indented to better align with map block
(edit) @34998   3 years davidb These changes have now been committed into SVN
(edit) @34997   3 years davidb When working with orthogonal indexes, these plugins constructors get …
(edit) @34921   3 years anupama Committing the improvements to EmbeddedMetaPlugin's processing of …
(edit) @34878   3 years davidb Further changes to work more smoothly with JSONSparqlResultsPlugin, …
(edit) @34840   3 years davidb Changed to apply extra-metadata before trying to work out doc-id. …
(edit) @34690   3 years davidb When using an orthogonal index, the constructor is run for a second …
(edit) @34643   3 years davidb Version of file that is designed to work with planned changes in GS v3.11
(edit) @34250   4 years ak19 Having tested the incorporation of Kathy's bugfix to CSVPlugin from …
(edit) @34249   4 years ak19 Dr Bainbridge in his commit 32810 had expressed that he intended to …
(edit) @34221   4 years ak19 Undid the change of converting tabstops to their entities in …
(edit) @34220   4 years ak19 1. TextPlugin takes care to preserve whitespace formatting when …
(edit) @34137   4 years ak19 Have only been able to incorporate one of Dr Bainbridge's improvements …
(edit) @34131   4 years ak19 Allowing input keep-urls-file to contain a comma followed by country …
(edit) @34130   4 years ak19 Some more tidying up while isMRI filtered collection rebuilding
(edit) @34129   4 years ak19 Implemented Kathy's suggestions: 1. Explicit ex prefix to ex meta …
(edit) @34126   4 years ak19 When I'd modified the code to make the keep_urls_file non-compulsory, …
(edit) @34125   4 years ak19 Commit message went awry. Cleaned up some comments to recommit with …
(edit) @34124   4 years ak19 Decoding the title and text using the encoding seemed to have turned …
(edit) @34123   4 years ak19 Some more minor changes
(edit) @34122   4 years ak19 1. After some testing of building the complete commoncrawl collection, …
(edit) @34121   4 years ak19 1. Introducing NutchTextDumpPlugin to process the records …
(edit) @33721   4 years ak19 Inactive but committing to svn: Newer Locale.pm file, and introducing …
(edit) @33389   5 years kjdon store csv field array associated with filename, because you might have …
(edit) @33309   5 years ak19 More workarounds for HTML conversion results from Word's …
(edit) @33301   5 years ak19 Incorporating Dr Bainbridge's suggested fix for dealing with Word docs …
(edit) @33299   5 years ak19 1. Committing Dr Bainbridge's fix to remove duplicated heading titles …
(edit) @32984   5 years davidb Commented out debug print statements
(edit) @32819   5 years kjdon fixed a comment
(edit) @32790   5 years ak19 Suffixing .inactive to the GreenstoneSQL plugs because if perl package …
(edit) @32783   5 years kjdon adding missing strings and tidying up some mislabelling
(edit) @32778   5 years ak19 Need to set the surrounding div width to be the same/no more than the …
(edit) @32777   5 years ak19 Minor. Correction to plugin name displayed
(edit) @32761   5 years kjdon when printing out arg values for some other thing, I noticed that site …
(edit) @32760   5 years kjdon merge_inheritance, if it finds a conflict in option values, will keep …
(edit) @32643   5 years ak19 1. Previous commit (r32640) reintroduced an earlier bug in attempting …
(edit) @32640   5 years ak19 Important changes (and commented out debugging statements) to get …
(edit) @32595   5 years ak19 Major tidying up: last remaining debug statements, lots of comments, …
(edit) @32592   5 years ak19 Renamed gssql.pm to gsmysql.pm. Not subclassing the old gssql into …
(edit) @32591   5 years ak19 1. gssql destructor DESTROY doesn't really do anything now, as DBI's …
(edit) @32589   5 years ak19 1. SQL db password is not compulsory. 2. Forgot to add the …
(edit) @32586   5 years ak19 Renaming 'site_name' parameter used by GS SQL Plugout and Plugin to …
(edit) @32584   5 years ak19 Some more tidying up of the code.
(edit) @32583   5 years ak19 1. Some tidying up of the code. 2. Removing unnecessary calls to …
(edit) @32582   5 years ak19 Now that previous commit(s) put sig handlers in place in gs_sql, have …
(edit) @32580   5 years ak19 1. support for port param when connecting to SQL DB. 2. GS SQL Plugout …
(edit) @32578   5 years ak19 Optimising. The gssql class internally has only one shared connection …
(edit) @32577   5 years ak19 Forgot to call superclass in overridden removeall(). Nothing broke so …
(edit) @32575   5 years ak19 1. gssql now does fetching all rows internally upon select. With this …
(edit) @32571   5 years ak19 Optimised the SQL DB delete operations in case there are several in …
(edit) @32570   5 years ak19 1. Bugfix for when renaming an imported doc and …
(edit) @32565   5 years ak19 I think this is a bugfix to plugin.pm::remove_some(): when processing …
(edit) @32563   5 years ak19 1. Overhaul of GreenstoneSQLPlugs to handle removeold and incremental …
(edit) @32562   5 years ak19 Before major changes to GSSQLPlugs, committing useful comments to …
(edit) @32560   6 years ak19 gssql constructor accepts a verbosity parameter
(edit) @32559   6 years ak19 Removing db_encoding as parameters to GreenstoneSQLPlugout and …
(edit) @32556   6 years ak19 Tested to find DBI connection attempt fails immediately when MySQL …
(edit) @32555   6 years ak19 1. In GreenstoneSQLPlugout, removeold is now parameterised (as are …
(edit) @32544   6 years ak19 1. GreenstoneSQLPlugin: now sub read() calls the new lazy_get_gssql() …
(edit) @32543   6 years ak19 Tidying up and adjusting TODO statements
(edit) @32542   6 years ak19 Instead of the docoid being stored in the docsql-<OID>.xml filename, …
(edit) @32541   6 years ak19 Using proper parameters to GreenstoneSQLPlugin/Plugout instead of …
(edit) @32539   6 years ak19 New plugin parameter site_name (only set for GS3) that is passed to …
(edit) @32538   6 years ak19 Previous commit message meant to be: string names of strings shared by …
(edit) @32537   6 years ak19 First commit to do with reading back in from the SQL DB. This commit …
(edit) @32536   6 years ak19 First commit to do with reading back in from the SQL DB. This commit …
(edit) @32535   6 years ak19 Fixing couple of typos before major commit.
(edit) @32501   6 years Georgiy Litvinov Workaround to set assign metadata via csv metadata plugin. "Section" …
(edit) @32500   6 years ak19 For a test case, best_encoding come out with prefix/suffix to utf_8, …
(edit) @32499   6 years ak19 Fix for PDFv2 plugin's page buckets.
(edit) @32343   6 years kjdon quantifiers mustn't have \ before {
(edit) @32341   6 years ak19 1. Fixing up regex syntax in DirectoryPlugin for perl 5.26 that comes …
(edit) @32332   6 years kjdon removed replace_images function. this inherits from HTMLPlugin, and …
(edit) @32325   6 years ak19 Dr Bainbridge worked out the solution to HTMLPlugin not handling …
(edit) @32305   6 years ak19 1. When a plugin's built on multiple inheritance, the first n-1 …
(edit) @32303   6 years ak19 Forgot to update the plugin descriptions for the PDF plugins.
(edit) @32290   6 years ak19 1. Making paged_pretty_html the default rather than pretty_html, since …
(edit) @32289   6 years ak19 The PDFPlugin is being deprecated (since PDFv1 and PDFv2 plugins are …
(edit) @32287   6 years ak19 Cleaning up unused strings, some debug statements and recently …
(edit) @32286   6 years ak19 PDFv2Plugin will only work out of the box for GS3 now: PDFBoxConverter …
(edit) @32285   6 years ak19 Fix to sectionalising xpdftools' produced paged_pretty_html: Dr …
(edit) @32284   6 years ak19 PDFv2Plugin doesn't offer a zoom flag anymore, replaced with a dpi …
(edit) @32283   6 years ak19 More stable behaviour by PDFv2Plugin: 1. when pdfbox_conversion is on, …
(edit) @32281   6 years ak19 Undoing accidental commit
(edit) @32280   6 years ak19 Implementing PDFv2paged_text (with pdfbox)
(edit) @32277   6 years ak19 First attempt at PDFv2Plugin.pm.
(edit) @32275   6 years ak19 Moving another fixed English language string into strings.properties …
(edit) @32274   6 years ak19 Related to previous commit, forgot to commit with previous revision. A …
(edit) @32273   6 years ak19 First of the commits to do with restructuring and refactoring the …
(edit) @32224   6 years ak19 Adding PDF to text support for Windows using Xpdf's pdftotext tool. …
(edit) @32223   6 years ak19 When no output mode for PDFPlugin has been set by the user, the output …
(edit) @32222   6 years ak19 q
(edit) @32215   6 years ak19 Before reorganising our PDFPlugin in whatever way we ultimately …
(edit) @32210   6 years ak19 When PDFPlugin is set to paged_html output mode, it now finally …
(edit) @32206   6 years ak19 1. ConvertBinaryFile.pm no longer knows more than necessary about …
(edit) @32205   6 years ak19 First set of commits to do with implementing the new 'paged_html' …