Changeset 32582

Timestamp:
07.11.2018 20:44:34
Author:
ak19
Message:

Now that the previous commit(s) put signal handlers in place in gs_sql, Undo on build/import Cancel has been added for the GS SQL Plugs. This relies on DBI's AutoCommit vs transaction (rollback/commit) behaviour. On cancel, a signal handler is triggered (SIGINT) and, if AutoCommit is off, does a rollback before die(), which calls the object destructor and disconnects from the db. When the program runs to normal termination, the last finish() call on gs_sql, the one that triggers the disconnect, now first does a commit() if AutoCommit is off, before disconnecting. For now, the default for both GreenstoneSQLPlugs is to support Undo (i.e. transactions), which turns AutoCommit off.

Not sure whether this will be robust: if transactions are held in memory, we could be dealing with millions of docs of large full text.

Another issue is that the SQL DB may be out of sync with the archives and index folders on Cancel: archives and index just terminate and are left in an intermediate state depending on when Cancel was pressed, whereas the GS SQL DB is in a rolled-back state, as if the import or build never took place.

A third issue is that during buildcol (perhaps specifically during buildcol's doc processing phase), pressing Cancel does not stop buildcol: the current perl process is cancelled but the next one continues, rather than buildcol terminating in its entirety. For the GS SQL DB this means that any 'transaction' up to that point is rolled back (perhaps a transaction covering one doc, if Cancel takes effect on a per-doc basis), and the next process (next doc processing?) continues and allows further transactions, which are all committed at the end on natural termination of buildcol.

Need to decide whether Undo behaviour is really what we want. But it's available now, and we can simply change the default to not support Undo if we want the old behaviour again.
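The cancel and normal-termination paths described in the message can be sketched as below. This is a minimal illustration, not the code from gssql.pm: `MockHandle` is a hypothetical stand-in for a DBI database handle so that the sketch runs without a database, and the signal is simulated by calling the handler directly rather than by sending a real SIGINT.

```perl
use strict;
use warnings;

# MockHandle is a hypothetical stand-in for a DBI database handle, so the
# sketch runs without a database. It records which calls were made.
package MockHandle;
sub new {
    my ($class, %args) = @_;
    return bless { AutoCommit => $args{AutoCommit}, log => [] }, $class;
}
sub commit     { my $self = shift; push @{ $self->{log} }, 'commit';     return 1; }
sub rollback   { my $self = shift; push @{ $self->{log} }, 'rollback';   return 1; }
sub disconnect { my $self = shift; push @{ $self->{log} }, 'disconnect'; return 1; }

package main;

my $dbh;  # singleton-style handle, playing the role of $_dbh_instance

# On cancel: roll back if transactions are in use, then die so that
# normal Perl cleanup (destructors) still runs.
sub finish_signal_handler {
    my ($sig) = @_;
    if ($dbh && !$dbh->{AutoCommit}) {
        $dbh->rollback();
    }
    die "Caught a $sig signal\n";
}

# On normal termination: commit first (if needed), then disconnect.
sub finish {
    $dbh->commit() unless $dbh->{AutoCommit};
    $dbh->disconnect();
}

# SIGKILL cannot be trapped, so only INT and TERM are registered here.
$SIG{INT}  = \&finish_signal_handler;
$SIG{TERM} = \&finish_signal_handler;

# Simulate a cancelled run by invoking the handler directly.
$dbh = MockHandle->new(AutoCommit => 0);
eval { finish_signal_handler('INT') };
my $cancel_error = $@;
my @cancel_log   = @{ $dbh->{log} };

# Simulate a run that reaches normal termination.
$dbh = MockHandle->new(AutoCommit => 0);
finish();
my @normal_log = @{ $dbh->{log} };

print "cancel: @cancel_log\n";
print "normal: @normal_log\n";
```

With AutoCommit on, neither path would call commit() or rollback(), which corresponds to the "old behaviour" mentioned at the end of the message.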

Location:
main/trunk/greenstone2/perllib
Files:
4 modified

  • main/trunk/greenstone2/perllib/gssql.pm

    r32581 r32582  
    3333use DBI; # the central package for this module used by GreenstoneSQL Plugout and Plugin 
    3434 
    35 #$SIG{INT}  = sub { die "Caught a sigint $!" }; 
    36 #$SIG{TERM}  = sub { die "Caught a sigterm $!" }; 
    37 #$SIG{KILL}  = sub { die "Caught a sigkill $!" }; 
    38  
    39  
    40 $SIG{INT}  = \&finish_signal_handler; 
    41 $SIG{TERM}  = \&finish_signal_handler; 
    42 $SIG{KILL}  = \&finish_signal_handler; 
    43  
    44 sub finish_signal_handler { 
    45     my ($sig) = @_; # one of INT|KILL|TERM 
    46     die "Caught a $sig signal $!"; # will call destructor 
    47 } 
     35 
    4836############################## 
    4937 
     
    7563# - db_name (which is the GS3 sitename) 
    7664 
     65 
     66 
     67$SIG{INT}  = \&finish_signal_handler; 
     68$SIG{TERM}  = \&finish_signal_handler; 
     69$SIG{KILL}  = \&finish_signal_handler; # NOTE: SIGKILL cannot be trapped, so this handler never runs for it
     70 
     71sub finish_signal_handler { 
     72    my ($sig) = @_; # one of INT|KILL|TERM 
     73 
     74    if ($_dbh_instance) { # database handle (note, using singleton) still active. 
     75     
     76    # TODO: If autocommit wasn't set, then this is a cancel operation. 
     77    # If we've not disconnected from the sql db yet and if we've not committed 
     78    # transactions yet, then cancel means we do a rollback here 
     79     
     80    if($_dbh_instance->{AutoCommit} == 0) { 
     81        print STDERR "   User cancelled: rolling back SQL database transaction.\n"; 
     82        $_dbh_instance->rollback(); # will warn on failure, nothing more we can/want to do, 
     83    } 
     84    } 
     85 
     86     
     87    die "Caught a $sig signal $!"; # die() will always call destructor (sub DESTROY) 
     88} 
     89 
    7790sub new 
    7891 
     
    111124# We want to ensure we've closed the db connection in such cases. 
    112125# "It’s common to call die when handling SIGINT and SIGTERM. die is useful because it will ensure that Perl stops correctly: for example Perl will execute a destructor method if present when die is called, but the destructor method will not be called if a SIGINT or SIGTERM is received and no signal handler calls die." 
     126# 
    113127# https://perldoc.perl.org/perlobj.html#Destructors 
     128# 
     129# https://metacpan.org/pod/release/TIMB/DBI-1.634_50/DBI.pm#disconnect 
     130# "Disconnects the database from the database handle. disconnect is typically only used before exiting the program. The handle is of little use after disconnecting. 
     131# 
     132# The transaction behaviour of the disconnect method is, sadly, undefined. Some database systems (such as Oracle and Ingres) will automatically commit any outstanding changes, but others (such as Informix) will rollback any outstanding changes. Applications not using AutoCommit should explicitly call commit or rollback before calling disconnect. 
     133# 
     134# The database is automatically disconnected by the DESTROY method if still connected when there are no longer any references to the handle. The DESTROY method for each driver should implicitly call rollback to undo any uncommitted changes. This is vital behaviour to ensure that incomplete transactions don't get committed simply because Perl calls DESTROY on every object before exiting. Also, do not rely on the order of object destruction during "global destruction", as it is undefined. 
     135# 
     136# Generally, if you want your changes to be committed or rolled back when you disconnect, then you should explicitly call "commit" or "rollback" before disconnecting. 
     137# 
     138# If you disconnect from a database while you still have active statement handles (e.g., SELECT statement handles that may have more data to fetch), you will get a warning. The warning may indicate that a fetch loop terminated early, perhaps due to an uncaught error. To avoid the warning call the finish method on the active handles." 
     139# 
    114140sub DESTROY { 
    115141    my $self = shift; 
    116142 
    117143    if (${^GLOBAL_PHASE} eq 'DESTRUCT') { 
    118     if ($_dbh_instance) { 
     144     
     145    if ($_dbh_instance) { # database handle still active. Use singleton handle! 
     146 
     147        # THIS CODE HAS MOVED TO finish_signal_handler() WHERE IT BELONGS 
     148        # If autocommit wasn't set, then this is a cancel operation. 
     149        # If we've not disconnected from the sql db yet and if we've not committed 
     150        # transactions yet, then cancel means we do a rollback here 
     151 
     152        # if($_dbh_instance->{AutoCommit} == 0) { 
     153         
     154        #   $_dbh_instance->rollback(); # will warn on failure, nothing more we can/want to do, 
     155            #     # don't do a die() here: possibility of infinite loop and we still want to disconnect 
     156        # } 
     157 
     158        # Either way, we're now finally ready to disconnect as is required for premature 
     159        # termination too 
    119160        print STDERR "XXXXXXXX Global Destruct: Disconnecting from database\n"; 
    120161        $_dbh_instance->disconnect or warn $_dbh_instance->errstr; 
     
    182223    #my $self= shift (@_); # singleton method doesn't use self, but callers don't need to know that 
    183224    my ($params_map) = @_; 
    184  
     225     
     226    if($params_map->{'verbosity'}) { 
     227    if(!defined $params_map->{'autocommit'}) { 
     228        print STDERR "  Autocommit parameter not defined\n"; 
     229    } 
     230    if($params_map->{'autocommit'}) { 
     231        print STDERR "   SQL DB UNDO SUPPORT OFF.\n"; 
     232    } else { 
     233        print STDERR "   SQL DB UNDO SUPPORT ON.\n"; 
     234    } 
     235    } 
     236     
    185237    return $_dbh_instance if($_dbh_instance); 
    186238 
     
    217269    print STDERR "\nAssuming the mysql server has been started with: --character_set_server=utf8mb4\n" if $db_driver eq "mysql"; 
    218270    } 
     271 
     272    # DBI AutoCommit connection param is on/1 by default, so if a value for this is not defined 
     273    # as a method parameter to _get_connection_instance, then fallback to the default of on/1 
     274    my $autocommit = (defined $params_map->{'autocommit'}) ? $params_map->{'autocommit'} : 1; 
    219275     
    220276    my $dbh = DBI->connect("$connect_str", $db_user, $db_pwd, 
     
    223279                   PrintError => 1, # on by default, but being explicit 
    224280                   RaiseError => 0, # off by default, but being explicit 
    225                    AutoCommit => 1, # on by default, but being explicit 
     281                   AutoCommit => $autocommit, 
    226282                   mysql_enable_utf8mb4 => 1 # tells MySQL to use UTF-8 for communication and tells DBD::mysql to decode the data, see https://stackoverflow.com/questions/46727362/perl-mysql-utf8mb4-issue-possible-bug  
    227283               }); 
     
    273329    my $self= shift (@_); 
    274330 
     331    # TODO: if AutoCommit was off, meaning transactions were on/enabled, 
     332    # then here is where we commit our one long transaction. 
     333    # https://metacpan.org/pod/release/TIMB/DBI-1.634_50/DBI.pm#commit 
     334    my $rc = 1;     
     335     
    275336    $ref_count--; 
    276337    if($ref_count == 0) { 
     338    # Only commit transaction when we're about to disconnect, not before 
     339    # If autocommit was on, then we'd have committed after every db operation, so nothing to do 
     340    $rc = $self->do_commit_if_on(); 
     341     
    277342    $self->force_disconnect_from_db(); 
    278     }     
     343    } 
     344 
     345    return $rc; 
     346} 
     347 
     348sub do_commit_if_on { 
     349    my $self= shift (@_); 
     350    my $dbh = $self->{'db_handle'}; 
     351     
     352    my $rc = 1; # return code: everything went fine, regardless of whether we needed to commit 
     353                # (AutoCommit on or off) 
     354 
     355    # https://metacpan.org/pod/release/TIMB/DBI-1.634_50/DBI.pm#commit 
     356    if($dbh->{AutoCommit} == 0) { 
     357    print STDERR "   Committing transaction to SQL database now.\n" if $self->{'verbosity'}; 
     358    $rc = $dbh->commit() or warn("SQL DB COMMIT FAILED: " . $dbh->errstr); # important problem 
     359                                                            # worth embellishing error message 
     360    } 
     361    # If autocommit was on, then we'd have committed after every db operation, so nothing to do 
     362 
     363    return $rc; 
    279364} 
    280365 
     
    429514    $dbh->do("drop table $table");# || warn("@@@ Couldn't delete $table"); 
    430515    } 
     516 
     517    # TODO Q: commit here, so that future select statements work? 
     518    # See https://metacpan.org/pod/release/TIMB/DBI-1.634_50/DBI.pm#Transactions 
    431519} 
    432520 
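The finish() change in gssql.pm above commits only when the reference count drops to zero, so that the plugin and plugout sharing one connection do not commit each other's work early. A minimal sketch of that shape follows, assuming a hypothetical `SharedConn` package in place of gssql and recording events instead of talking to a real database handle.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the shared database connection.
package SharedConn;

my $ref_count = 0;   # number of users holding the shared connection
my @events;          # records what happened, in order

sub acquire { $ref_count++; return bless {}, shift; }

# Same shape as finish(): only the last user commits and disconnects.
sub finish {
    my $self = shift;
    my $rc = 1;                      # success unless a commit fails
    $ref_count--;
    if ($ref_count == 0) {
        push @events, 'commit';      # a real handle would call $dbh->commit()
        push @events, 'disconnect';  # followed by $dbh->disconnect()
    }
    return $rc;
}

sub events { return @events; }

package main;

my $plugin  = SharedConn->acquire();
my $plugout = SharedConn->acquire();

$plugout->finish();                        # ref_count 2 -> 1: nothing happens
my @after_first  = SharedConn::events();
$plugin->finish();                         # ref_count 1 -> 0: commit, disconnect
my @after_second = SharedConn::events();

print "after first finish:  ", scalar(@after_first), " events\n";
print "after second finish: @after_second\n";
```

Committing once at the end keeps the whole import or build as a single unit that can be rolled back, at the cost of holding one long transaction open, which is the robustness concern raised in the commit message.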
  • main/trunk/greenstone2/perllib/plugins/GreenstoneSQLPlugin.pm

    r32580 r32582  
    137137        'desc' => "{GreenstoneSQLPlug.process_mode.all}" } ]; 
    138138 
     139my $rollback_on_cancel_list = 
     140    [ { 'name' => "true", 
     141        'desc' => "{GreenstoneSQLPlug.rollback_on_cancel}" },       
     142      { 'name' => "false", 
     143        'desc' => "{GreenstoneSQLPlug.rollback_on_cancel}" } ]; 
     144 
    139145my $arguments = 
    140146    [ { 'name' => "process_exp", 
     
    149155    'deft' => "all", 
    150156    'reqd' => "no"}, 
     157      { 'name' => "rollback_on_cancel",  
     158    'desc' => "{GreenstoneSQLPlug.rollback_on_cancel}", 
     159    'type' => "enum", 
     160    'list' => $rollback_on_cancel_list, 
     161    'deft' => "true", # TODO Q: what's the better default? If "true", any memory concerns? 
     162    'reqd' => "no", 
     163    'hiddengli' => "no"}, 
    151164      { 'name' => "db_driver",  
    152165    'desc' => "{GreenstoneSQLPlug.db_driver}", 
     
    204217} 
    205218 
     219###### Called during import.pl 
     220 
    206221# This is called once if removeold is set with import.pl. Most plugins will do 
    207222# nothing but if a plugin does any stuff outside of creating doc obj, then  
     
    230245    $gs_sql->ensure_fulltxt_table_exists(); 
    231246    } 
    232 } 
    233  
    234 # This is called per document for docs that have been deleted from the  
     247 
     248    # UNNECESSARY 
     249    # The removeold related DB transaction (deleting collection tables) is complete 
     250    # Don't let GS SQL PlugIN interfere with GS SQL PlugOUT's database transactions 
     251    # during import.pl hereafter. Finish up. 
     252    #$gs_sql->do_commit_if_on(); 
     253} 
     254 
     255# This is called during import.pl per document for docs that have been deleted from the  
    235256# collection. Most plugins will do nothing 
    236257# but if a plugin does any stuff outside of creating doc obj, then it may need 
     
    273294} 
    274295 
     296#### Called during buildcol 
    275297 
    276298sub xml_start_tag { 
     
    395417} 
    396418 
     419#### Called during buildcol and import 
    397420 
    398421# GS SQL Plugin::init() (and deinit()) is called by import.pl and also by buildcol.pl 
     
    429452    'verbosity' => $self->{'verbosity'} || 0 
    430453               }); 
     454     
     455    # if autocommit is set, there's no rollback support 
     456    my $autocommit = ($self->{'rollback_on_cancel'} eq "false") ? 1 : 0; 
    431457 
    432458    # try connecting to the mysql db, die if that fails 
     
    435461    'db_client_user' => $self->{'db_client_user'}, 
    436462    'db_client_pwd' => $self->{'db_client_pwd'}, 
    437     'db_host' => $self->{'db_host'} 
     463    'db_host' => $self->{'db_host'}, 
     464    'autocommit' => $autocommit 
    438465                   }) 
    439466    ) 
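The two pieces above, the plugin turning its rollback_on_cancel option into an autocommit flag and gssql.pm falling back to DBI's default when no flag is passed, can be checked in isolation. The helper names `autocommit_for` and `effective_autocommit` are hypothetical; the changeset performs both steps inline.

```perl
use strict;
use warnings;

# Plugin side: the enum option arrives as the string "true" or "false".
# Hypothetical helper name; the changeset does this inline.
sub autocommit_for {
    my ($rollback_on_cancel) = @_;
    return ($rollback_on_cancel eq "false") ? 1 : 0;
}

# Connection side: fall back to DBI's default (AutoCommit on) only when
# the caller passed no value at all. A plain `||` would wrongly turn an
# explicit 0 back into 1.
sub effective_autocommit {
    my ($params_map) = @_;
    return defined $params_map->{'autocommit'} ? $params_map->{'autocommit'} : 1;
}

my $undo_on  = effective_autocommit({ autocommit => autocommit_for("true") });
my $undo_off = effective_autocommit({ autocommit => autocommit_for("false") });
my $unset    = effective_autocommit({});

print "rollback_on_cancel=true  -> AutoCommit $undo_on\n";
print "rollback_on_cancel=false -> AutoCommit $undo_off\n";
print "autocommit not passed    -> AutoCommit $unset\n";
```

The `defined` test matters because 0 is a meaningful value here: it is the setting that enables Undo support.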
  • main/trunk/greenstone2/perllib/plugouts/GreenstoneSQLPlugout.pm

    r32580 r32582  
    3838use DBI; # the central package for this plugout 
    3939 
     40 
     41# This entire class is called only during import.pl 
    4042 
    4143# TODO: SIGTERM rollback and disconnect? 
     
    6971        'desc' => "{GreenstoneSQLPlug.process_mode.all}" } ]; 
    7072 
     73my $rollback_on_cancel_list = 
     74    [ { 'name' => "true", 
     75        'desc' => "{GreenstoneSQLPlug.rollback_on_cancel}" },       
     76      { 'name' => "false", 
      77        'desc' => "{GreenstoneSQLPlug.rollback_on_cancel}" } ]; 
     78 
    7179# The following are the saveas.options: 
    7280my $arguments = [  
     
    7684      'list' => $process_mode_list, 
    7785      'deft' => "all", 
     86      'reqd' => "no", 
     87      'hiddengli' => "no"}, 
     88    { 'name' => "rollback_on_cancel",  
     89      'desc' => "{GreenstoneSQLPlug.rollback_on_cancel}", 
     90      'type' => "enum", 
     91      'list' => $rollback_on_cancel_list, 
     92      'deft' => "true", # TODO Q: what's the better default? If "true", any memory concerns? 
    7893      'reqd' => "no", 
    7994      'hiddengli' => "no"}, 
     
    146161    'collection_name' => $ENV{'GSDLCOLLECTION'}, 
    147162    'verbosity' => $self->{'verbosity'} || 0 
     163     
    148164    }; 
    149165 
    150166    my $gs_sql = new gssql($db_params); 
     167 
     168    # if autocommit is set, there's no rollback support 
     169    my $autocommit = ($self->{'rollback_on_cancel'} eq "false") ? 1 : 0; 
    151170     
    152171    # try connecting to the mysql db, die if that fails 
     
    156175    'db_client_user' => $self->{'db_client_user'}, 
    157176    'db_client_pwd' => $self->{'db_client_pwd'}, 
    158     'db_host' => $self->{'db_host'} 
     177    'db_host' => $self->{'db_host'}, 
     178    'autocommit' => $autocommit 
    159179                   }) 
    160180    ) 
     
    164184    die("Could not connect to db. Can't proceed.\n"); 
    165185    } 
     186 
     187    #die("@@@@ TEST. Connected successfully. Testing gssql::destructor.\n"); # WORKS 
    166188     
    167189    my $db_name = $self->{'site_name'} || "greenstone2"; # one database per GS3 site, for GS2 the db is called greenstone2 
  • main/trunk/greenstone2/perllib/strings.properties

    r32559 r32582  
    14601460GreenstoneSQLPlug.db_host:The hostname on which the (My)SQL database server is running, 127.0.0.1 by default. Other values to try include localhost. 
    14611461 
     1462GreenstoneSQLPlug.rollback_on_cancel:Support for undo on cancel. Set to true to support rollbacks on cancel. Transactions are then only committed to the database at the end of import and buildcol. Set to false if you do not want undo support, in which case SQL statements are autocommitted to the database. 
     1463 
    14621464# 
    14631465# Perl module strings