Changeset 32577 for main

Timestamp:
2018-11-06T16:26:57+13:00
Author:
ak19
Message:

Forgot to call superclass in overridden remove_all(). Nothing broke so far only because the superclass chain didn't have an actual implementation.
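
To illustrate the message above, here is a minimal sketch of an overridden method that forwards to its superclass chain. The packages DemoBasePlugin and DemoSQLPlugin are hypothetical stand-ins, not the real Greenstone classes, and the base method here records the call only so that a skipped SUPER call would be visible.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the superclass chain (assumption, not Greenstone code).
package DemoBasePlugin;

sub new {
    my ($class) = @_;
    return bless { log => [] }, $class;
}

# Records that it ran; if the subclass forgets SUPER::, 'base' never appears.
sub remove_all {
    my $self = shift;
    push @{ $self->{log} }, 'base';
    return 1;
}

# Hypothetical subclass that overrides remove_all().
package DemoSQLPlugin;
our @ISA = ('DemoBasePlugin');

sub remove_all {
    my $self = shift;
    my ($pluginfo, $base_dir, $processor, $maxdocs) = @_;

    # Forward the same arguments up the chain before doing subclass work.
    $self->SUPER::remove_all(@_);

    push @{ $self->{log} }, 'sql';
    return 1;
}

package main;

my $plugin = DemoSQLPlugin->new();
$plugin->remove_all(undef, 'import', undef, -1);
print join(',', @{ $plugin->{log} }), "\n";
```

If the SUPER::remove_all(@_) line is dropped, the sketch still runs without error and only the log changes, which is consistent with the message's point that the omission went unnoticed while the superclass method did nothing.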

File:
1 edited

  • main/trunk/greenstone2/perllib/plugins/GreenstoneSQLPlugin.pm

    r32575 r32577  
      43   43  # back in from the sql db while the remainder is to be read back in from the docsql .xml files.
      44   44
      45       # TODO: deal with incremental vs removeold. If docs removed from import folder, then import step
      46       # won't delete it from archives but buildcol step will. Need to implement this with this database plugin or wherever the actual flow is
      47
      48       # TODO Q: is "reindex" = del from db + add to db?
      49       # - is this okay for reindexing, or will it need to modify existing values (update table)
      50       # - if it's okay, what does reindex need to accomplish (and how) if the OID changes because hash id produced is different?
      51       # - delete is accomplished in GS SQL Plugin, during buildcol.pl. When should reindexing take place?
      52       # during SQL plugout/import.pl or during plugin? If adding is done by GSSQLPlugout, does it need to
      53       # be reimplemented in GSSQLPlugin to support the adding portion of reindexing.
      54
      55   45  # TODO: Add public instructions on using this plugin and its plugout: start with installing mysql binary, changing pwd, running the server (and the client against it for checking: basic cmds like create and drop). Then discuss db name, table names (per coll), db cols and col types, and how the plugout and plugin work.
      56   46  # Discuss the plugin/plugout parameters.
      57   47
           48  # TODO, test on windows and mac.
           49  # Note: if parsing fails (e.g. using wrong plugout like GS XML plugout, which chokes on args intended for SQL plugout) then SQL plugin init would have already been called and done connection, but disconnect would not have been done because SQL plugin disconnect would not have been called upon parse failure.
      58   50
      59   51  # DONE:
     
      78   70  # effect that if the db doesn't exist, gssql::use_db() fails, as it won't create db.
      79   71  #   This got fixed when GSSQLPlugin stopped connecting on init().
      80
           72  #
           73  #
           74  #+ TODO: deal with incremental vs removeold. If docs removed from import folder, then import step
           75  # won't delete it from archives but buildcol step will. Need to implement this with this database plugin or wherever the actual flow is.
           76  #
           77  # + TODO Q: is "reindex" = del from db + add to db?
           78  # - is this okay for reindexing, or will it need to modify existing values (update table)
           79  # - if it's okay, what does reindex need to accomplish (and how) if the OID changes because hash id produced is different?
           80  # - delete is accomplished in GS SQL Plugin, during buildcol.pl. When should reindexing take place?
           81  # during SQL plugout/import.pl or during plugin? If adding is done by GSSQLPlugout, does it need to
           82  # be reimplemented in GSSQLPlugin to support the adding portion of reindexing.
           83  #
           84  # INCREMENTAL REBUILDING IMPLEMENTED CORRECTLY AND WORKS:
           85  # Overriding plugins' remove_all() method covered removeold.
           86  # Overriding plugins' remove_one() method is all I needed to do for reindex and deletion
           87  # (incremental and non-incremental) to work.
           88  # but doing all this needed an overhaul of gssql.pm and its use by the GS SQL plugin and plugout.
           89  # - needed to correct plugin.pm::remove_some() to process all files
           90  # - and needed to correct GreenstoneSQLPlugin::close_document() to setOID() after all
           91  # All incremental import and buildcol worked after that:
           92  # - deleting files and running incr-import and incr-buildcol (= "incr delete"),
           93  # - deleting files and running incr-import and buildcol (="non-incr delete")
           94  # - modifying meta and doing an incr rebuild
           95  # - modifying fulltext and doing an incr rebuild
           96  # - renaming a file forces a reindex: doc is removed from db and added back in, due to remove_one()
           97  # - tested CSV file: adding some records, changing some records
           98  #    + CSVPlugin test (collection csvsql)
           99  #    + MetadataCSVPlugin test (modified collection sqltest to have metadata.csv refer to the
          100  #      filenames of sqltest's documents)
          101  #    + shared image test (collection shareimg): if 2 html files reference the same image, the docs
          102  #      are indeed both reindexed if the image is modified (e.g. I replaced the image with another
          103  #      of the same name) which in the GS SQL plugin/plugout case is that the 2 docs are deleted
          104  #      and added in again.
      81  105
      82  106  ########################################################################################
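
The notes in this hunk describe reindexing as "removed from db and added back in". A minimal sketch of that delete-then-add pattern follows, using an in-memory hash as a stand-in for a collection table. The names %meta_table, remove_one_demo and add_doc_demo, and the OID values, are hypothetical and are not taken from gssql.pm or the plugin.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for a collection's metadata table, keyed by OID.
my %meta_table = (
    'HASH01aa' => { file => 'a.html', title => 'Some title' },
);

# Delete the rows for the given OIDs; returns how many rows were removed.
sub remove_one_demo {
    my ($oids) = @_;
    my $removed = 0;
    for my $oid (@$oids) {
        $removed++ if defined delete $meta_table{$oid};
    }
    return $removed;
}

# Add a freshly imported document under its (possibly new) OID.
sub add_doc_demo {
    my ($oid, $row) = @_;
    $meta_table{$oid} = $row;
    return;
}

# Reindex after a rename: drop the old rows, then add the document again.
my $removed = remove_one_demo(['HASH01aa']);
add_doc_demo('HASH02bb', { file => 'a-renamed.html', title => 'Some title' });

print "removed=$removed keys=", join(',', sort keys %meta_table), "\n";
```

Under this assumption nothing is updated in place, so a changed OID needs no special handling: the old key disappears and the new one is inserted.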
     
     189  213      my ($pluginfo, $base_dir, $processor, $maxdocs) = @_;
     190  214
          215      $self->SUPER::remove_all(@_);
          216
     191  217      print STDERR "   Building with removeold option set, so deleting current collection's tables if they exist\n" if($self->{'verbosity'});
     192  218
     
     227  253             # SO DON'T RETURN IF CAN'T_PROCESS_THIS_FILE
     228  254
          255
          256      my $gs_sql = $self->{'gs_sql'} || return 0; # couldn't make the connection or no db etc
          257
     229  258      print STDERR "*****************************\nAsked to remove_one oid\n***********************\n";
     230
     231           my $gs_sql = $self->{'gs_sql'} || return 0; # couldn't make the connection or no db etc
          259      print STDERR "Num oids: " . scalar (@$oids) . "\n";
     232  260
     233  261      my $proc_mode = $self->{'process_mode'};
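
This hunk moves the connection guard ahead of the debug output, so nothing is printed when there is no database handle. The sketch below shows the same "value || return 0" idiom in isolation. The function remove_one_demo and the @trace array are hypothetical, and the handle is just a string here.

```perl
use strict;
use warnings;

my @trace; # stands in for the STDERR debug output in the real method

sub remove_one_demo {
    my ($self, $oids) = @_;

    # Return early, before any output, if there is no connection handle.
    my $gs_sql = $self->{'gs_sql'} || return 0;

    push @trace, "removing " . scalar(@$oids) . " oid(s)";
    return 1;
}

my $without = remove_one_demo({}, ['HASH01aa']);
my $with    = remove_one_demo({ gs_sql => 'handle' }, ['HASH01aa', 'HASH02bb']);

print "without=$without with=$with trace=@trace\n";
```

One caveat with this idiom: "||" tests truthiness, so a defined but false value such as 0 or the empty string is also treated as missing. That is harmless when the slot holds an object reference.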
     
     368  396
     369  397
     370           # TODO: only want to work with sql db if buildcol.pl. Unfortunately, also runs on import.pl.
     371           # During import, the GS SQL Plugin is called before the GS SQL Plugout with undesirable side
     372           # effect that if the db doesn't exist, gssql::use_db() fails, as it won't create db.
     373
     374  398  # GS SQL Plugin::init() (and deinit()) is called by import.pl and also by buildcol.pl
     375  399  # This means it connects and deconnects during import.pl as well. This is okay
     376           # as removeold, which should drop the collection tables, happens during the import phase
     377           # and therefore also requires a db connection.
          400  # as removeold, which should drop the collection tables, happens during the import phase,
          401  # calling GreenstoneSQLPlugin::and therefore also requires a db connection.
     378  402  # TODO: Eventually can try moving get_gssql_instance into gssql.pm? That way both GS SQL Plugin
     379  403  # and Plugout would be using one connection during import.pl phase when both plugs exist.
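
The closing TODO suggests letting the plugin and the plugout share one connection. One possible shape for that, sketched under assumptions, is a reference-counted shared instance. The package DemoGSSQL and its get_instance() and release_instance() methods are hypothetical, are not part of gssql.pm, and make no real database connection.

```perl
use strict;
use warnings;

package DemoGSSQL;

my $instance;        # the single shared object, created on first request
my $ref_count = 0;   # how many callers currently hold the instance
our $connects = 0;   # number of simulated connections, for the demo only

# Hand out the shared instance, "connecting" only on the first request.
sub get_instance {
    my ($class) = @_;
    if (!defined $instance) {
        $instance = bless { connected => 1 }, $class;
        $connects++;
    }
    $ref_count++;
    return $instance;
}

# Give the instance back; returns 1 only when the last holder releases it.
sub release_instance {
    my ($class) = @_;
    return 0 if $ref_count == 0;
    $ref_count--;
    if ($ref_count == 0) {
        $instance->{connected} = 0; # a real version would disconnect here
        undef $instance;
        return 1;
    }
    return 0;
}

package main;

my $plugin_db  = DemoGSSQL->get_instance();
my $plugout_db = DemoGSSQL->get_instance();
my $same = ($plugin_db == $plugout_db) ? 1 : 0;

my $first_release  = DemoGSSQL->release_instance();
my $second_release = DemoGSSQL->release_instance();

print "same=$same connects=$DemoGSSQL::connects closed=$first_release,$second_release\n";
```

With this design, whichever plug releases last closes the connection, so neither needs to know whether the other exists during import.pl.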