Changeset 32577

Timestamp:
06.11.2018 16:26:57
Author:
ak19
Message:

Forgot to call the superclass method in the overridden remove_all(). Nothing has broken so far only because no class in the superclass chain had an actual implementation.
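
The fix described in this message can be illustrated with a minimal Perl sketch. The package names BasePlugin and SQLPlugin and the log field are hypothetical stand-ins, not Greenstone's actual classes; the only part taken from the changeset is the `$self->SUPER::remove_all(@_)` call made after `$self` has been shifted off `@_`.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a plugin base class (not Greenstone's real class).
package BasePlugin;

sub new {
    my ($class) = @_;
    return bless { log => [] }, $class;
}

# Base implementation: effectively a no-op today, but it may gain real work later,
# which is why subclasses should still chain up to it.
sub remove_all {
    my $self = shift;
    push @{ $self->{log} }, "base remove_all(@_)";
    return 1;
}

# Hypothetical subclass that overrides remove_all().
package SQLPlugin;
our @ISA = ('BasePlugin');

sub remove_all {
    my $self = shift;
    my ($pluginfo, $base_dir, $processor, $maxdocs) = @_;

    # Chain up first. @_ no longer contains $self because of the shift above,
    # so the superclass receives exactly the caller's arguments.
    $self->SUPER::remove_all(@_);

    # Subclass-specific work would follow here (assumed: dropping tables).
    push @{ $self->{log} }, "subclass remove_all: dropping tables";
    return 1;
}

package main;

my $plugin = SQLPlugin->new();
$plugin->remove_all('pluginfo', 'import', 'processor', -1);
print "$_\n" for @{ $plugin->{log} };
```

Calling the superclass even when it currently does nothing keeps the override correct if a parent class later adds behaviour to remove_all().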

Files:
1 modified

  • main/trunk/greenstone2/perllib/plugins/GreenstoneSQLPlugin.pm

    r32575 r32577
        43     43  # back in from the sql db while the remainder is to be read back in from the docsql .xml files.
        44     44
        45         # TODO: deal with incremental vs removeold. If docs removed from import folder, then import step
        46         # won't delete it from archives but buildcol step will. Need to implement this with this database plugin or wherever the actual flow is
        47
        48         # TODO Q: is "reindex" = del from db + add to db?
        49         # - is this okay for reindexing, or will it need to modify existing values (update table)
        50         # - if it's okay, what does reindex need to accomplish (and how) if the OID changes because hash id produced is different?
        51         # - delete is accomplished in GS SQL Plugin, during buildcol.pl. When should reindexing take place?
        52         # during SQL plugout/import.pl or during plugin? If adding is done by GSSQLPlugout, does it need to
        53         # be reimplemented in GSSQLPlugin to support the adding portion of reindexing.
        54
        55     45  # TODO: Add public instructions on using this plugin and its plugout: start with installing mysql binary, changing pwd, running the server (and the client against it for checking: basic cmds like create and drop). Then discuss db name, table names (per coll), db cols and col types, and how the plugout and plugin work.
        56     46  # Discuss the plugin/plugout parameters.
        57     47
               48  # TODO, test on windows and mac.
               49  # Note: if parsing fails (e.g. using wrong plugout like GS XML plugout, which chokes on args intended for SQL plugout) then SQL plugin init would have already been called and done connection, but disconnect would not have been done because SQL plugin disconnect would not have been called upon parse failure.
        58     50
        59     51  # DONE:
     
        78     70  # effect that if the db doesn't exist, gssql::use_db() fails, as it won't create db.
        79     71  #   This got fixed when GSSQLPlugin stopped connecting on init().
        80
               72  #
               73  #
               74  #+ TODO: deal with incremental vs removeold. If docs removed from import folder, then import step
               75  # won't delete it from archives but buildcol step will. Need to implement this with this database plugin or wherever the actual flow is.
               76  #
               77  # + TODO Q: is "reindex" = del from db + add to db?
               78  # - is this okay for reindexing, or will it need to modify existing values (update table)
               79  # - if it's okay, what does reindex need to accomplish (and how) if the OID changes because hash id produced is different?
               80  # - delete is accomplished in GS SQL Plugin, during buildcol.pl. When should reindexing take place?
               81  # during SQL plugout/import.pl or during plugin? If adding is done by GSSQLPlugout, does it need to
               82  # be reimplemented in GSSQLPlugin to support the adding portion of reindexing.
               83  #
               84  # INCREMENTAL REBUILDING IMPLEMENTED CORRECTLY AND WORKS:
               85  # Overriding plugins' remove_all() method covered removeold.
               86  # Overriding plugins' remove_one() method is all I needed to do for reindex and deletion
               87  # (incremental and non-incremental) to work.
               88  # but doing all this needed an overhaul of gssql.pm and its use by the GS SQL plugin and plugout.
               89  # - needed to correct plugin.pm::remove_some() to process all files
               90  # - and needed to correct GreenstoneSQLPlugin::close_document() to setOID() after all
               91  # All incremental import and buildcol worked after that:
               92  # - deleting files and running incr-import and incr-buildcol (= "incr delete"),
               93  # - deleting files and running incr-import and buildcol (="non-incr delete")
               94  # - modifying meta and doing an incr rebuild
               95  # - modifying fulltext and doing an incr rebuild
               96  # - renaming a file forces a reindex: doc is removed from db and added back in, due to remove_one()
               97  # - tested CSV file: adding some records, changing some records
               98  #    + CSVPlugin test (collection csvsql)
               99  #    + MetadataCSVPlugin test (modified collection sqltest to have metadata.csv refer to the
              100  #      filenames of sqltest's documents)
              101  #    + shared image test (collection shareimg): if 2 html files reference the same image, the docs
              102  #      are indeed both reindexed if the image is modified (e.g. I replaced the image with another
              103  #      of the same name) which in the GS SQL plugin/plugout case is that the 2 docs are deleted
              104  #      and added in again.
        81    105
        82    106  ########################################################################################
     
       189    213      my ($pluginfo, $base_dir, $processor, $maxdocs) = @_;
       190    214
              215      $self->SUPER::remove_all(@_);
              216
       191    217      print STDERR "   Building with removeold option set, so deleting current collection's tables if they exist\n" if($self->{'verbosity'});
       192    218
     
       227    253             # SO DON'T RETURN IF CAN'T_PROCESS_THIS_FILE
       228    254
              255
              256      my $gs_sql = $self->{'gs_sql'} || return 0; # couldn't make the connection or no db etc
              257
       229    258      print STDERR "*****************************\nAsked to remove_one oid\n***********************\n";
       230
       231             my $gs_sql = $self->{'gs_sql'} || return 0; # couldn't make the connection or no db etc
              259      print STDERR "Num oids: " . scalar (@$oids) . "\n";
       232    260
       233    261      my $proc_mode = $self->{'process_mode'};
     
       368    396
       369    397
       370         # TODO: only want to work with sql db if buildcol.pl. Unfortunately, also runs on import.pl.
       371         # During import, the GS SQL Plugin is called before the GS SQL Plugout with undesirable side
       372         # effect that if the db doesn't exist, gssql::use_db() fails, as it won't create db.
       373
       374    398  # GS SQL Plugin::init() (and deinit()) is called by import.pl and also by buildcol.pl
       375    399  # This means it connects and deconnects during import.pl as well. This is okay
       376         # as removeold, which should drop the collection tables, happens during the import phase
       377         # and therefore also requires a db connection.
              400  # as removeold, which should drop the collection tables, happens during the import phase,
              401  # calling GreenstoneSQLPlugin::and therefore also requires a db connection.
       378    402  # TODO: Eventually can try moving get_gssql_instance into gssql.pm? That way both GS SQL Plugin
       379    403  # and Plugout would be using one connection during import.pl phase when both plugs exist.