Changeset 32582


Timestamp:
2018-11-07T20:44:34+13:00
Author:
ak19
Message:

Now that the previous commit(s) put signal handlers in place in gs_sql, it has been possible to add Undo on build/import Cancel for the GS SQL Plugs. This relies on AutoCommit vs transaction (rollback/commit) behaviour.

On cancel, a signal handler is triggered (SIGINT) and, if AutoCommit is off, it does a rollback before die(), which calls the object destructor and disconnects from the db. On regular program execution running to normal termination, the last finish() call on gs_sql, which is the one that triggers the disconnect, now first does a commit() if AutoCommit is off, before disconnecting. For now, the default for both GreenstoneSQLPlugs is to support Undo (i.e. transactions), which turns AutoCommit off.

Not sure whether this will be robust: what if transactions take place in memory? We could be dealing with millions of docs with large full text.

Another issue is that the SQL DB may be out of sync with the archives and index folders on Cancel: archives and index just terminate and are left in an intermediate state depending on when Cancel was pressed, whereas the GS SQL DB is in a rolled-back state, as if the import or build never took place.

A third issue is that during buildcol (perhaps specifically during buildcol's doc processing phase), pressing Cancel does not stop buildcol: the current perl process is cancelled but the next one continues, rather than buildcol terminating in its entirety. For the GS SQL DB this means that any 'transaction' up to that point is rolled back (perhaps a transaction covering one doc, if Cancel takes effect on a per-doc basis), and the next process (next doc processing?) continues and allows further transactions, which are all committed at the end on natural termination of buildcol.

Still need to decide whether Undo behaviour is really what we want. But it's available now, and we can simply change the default to not support Undo if we want the old behaviour again.
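The cancel path described in the message can be sketched in isolation. This is a minimal illustration, not the gssql code: MockHandle and cancel_handler are hypothetical names, and the mock only records calls so the flow can be followed without a database. SIGKILL is left out of the sketch because a process cannot trap it, so a handler assigned to it never runs.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a DBI database handle opened with AutoCommit => 0.
# It only logs the calls made on it; no database is involved.
package MockHandle;
sub new        { return bless { AutoCommit => 0, log => [] }, shift }
sub rollback   { push @{ $_[0]{log} }, 'rollback';   return 1 }
sub disconnect { push @{ $_[0]{log} }, 'disconnect'; return 1 }

package main;

my $dbh = MockHandle->new;

# Only trappable signals are handled here.
$SIG{INT} = $SIG{TERM} = \&cancel_handler;

sub cancel_handler {
    my ($sig) = @_;
    # Undo only makes sense while a transaction is open (AutoCommit off).
    $dbh->rollback if $dbh && !$dbh->{AutoCommit};
    die "Caught a $sig signal\n";   # die() lets cleanup code run
}

# Simulate the user pressing Cancel partway through an import.
my $completed = eval {
    kill 'INT', $$;   # send SIGINT to this process
    sleep 1;          # leave time for a deferred signal to be dispatched
    1;
};
my $error = $@;

$dbh->disconnect;     # the step a destructor would perform after die()

print "died with: $error";
print "calls on handle: @{ $dbh->{log} }\n";
```

The rollback sits in the handler rather than in the cleanup step so that the cleanup step only ever has to disconnect, whichever way the program ends.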

Location:
main/trunk/greenstone2/perllib
Files:
4 edited

  • main/trunk/greenstone2/perllib/gssql.pm

    r32581 r32582  
    3333use DBI; # the central package for this module used by GreenstoneSQL Plugout and Plugin
    3434
    35 #$SIG{INT}  = sub { die "Caught a sigint $!" };
    36 #$SIG{TERM}  = sub { die "Caught a sigterm $!" };
    37 #$SIG{KILL}  = sub { die "Caught a sigkill $!" };
    38 
    39 
    40 $SIG{INT}  = \&finish_signal_handler;
    41 $SIG{TERM}  = \&finish_signal_handler;
    42 $SIG{KILL}  = \&finish_signal_handler;
    43 
    44 sub finish_signal_handler {
    45     my ($sig) = @_; # one of INT|KILL|TERM
    46     die "Caught a $sig signal $!"; # will call destructor
    47 }
     35
    4836##############################
    4937
     
    7563# - db_name (which is the GS3 sitename)
    7664
     65
     66
     67$SIG{INT}  = \&finish_signal_handler;
     68$SIG{TERM}  = \&finish_signal_handler;
     69$SIG{KILL}  = \&finish_signal_handler;
     70
     71sub finish_signal_handler {
     72    my ($sig) = @_; # one of INT|KILL|TERM
     73
     74    if ($_dbh_instance) { # database handle (note, using singleton) still active.
     75   
     76    # TODO: If autocommit wasn't set, then this is a cancel operation.
     77    # If we've not disconnected from the sql db yet and if we've not committed
     78    # transactions yet, then cancel means we do a rollback here
     79   
     80    if($_dbh_instance->{AutoCommit} == 0) {
     81        print STDERR "   User cancelled: rolling back SQL database transaction.\n";
     82        $_dbh_instance->rollback(); # will warn on failure, nothing more we can/want to do,
     83    }
     84    }
     85
     86   
     87    die "Caught a $sig signal $!"; # die() will always call destructor (sub DESTROY)
     88}
     89
    7790sub new
    7891
     
    111124# We want to ensure we've closed the db connection in such cases.
    112125# "It’s common to call die when handling SIGINT and SIGTERM. die is useful because it will ensure that Perl stops correctly: for example Perl will execute a destructor method if present when die is called, but the destructor method will not be called if a SIGINT or SIGTERM is received and no signal handler calls die."
     126#
    113127# https://perldoc.perl.org/perlobj.html#Destructors
     128#
     129# https://metacpan.org/pod/release/TIMB/DBI-1.634_50/DBI.pm#disconnect
     130# "Disconnects the database from the database handle. disconnect is typically only used before exiting the program. The handle is of little use after disconnecting.
     131#
     132# The transaction behaviour of the disconnect method is, sadly, undefined. Some database systems (such as Oracle and Ingres) will automatically commit any outstanding changes, but others (such as Informix) will rollback any outstanding changes. Applications not using AutoCommit should explicitly call commit or rollback before calling disconnect.
     133#
     134# The database is automatically disconnected by the DESTROY method if still connected when there are no longer any references to the handle. The DESTROY method for each driver should implicitly call rollback to undo any uncommitted changes. This is vital behaviour to ensure that incomplete transactions don't get committed simply because Perl calls DESTROY on every object before exiting. Also, do not rely on the order of object destruction during "global destruction", as it is undefined.
     135#
     136# Generally, if you want your changes to be committed or rolled back when you disconnect, then you should explicitly call "commit" or "rollback" before disconnecting.
     137#
     138# If you disconnect from a database while you still have active statement handles (e.g., SELECT statement handles that may have more data to fetch), you will get a warning. The warning may indicate that a fetch loop terminated early, perhaps due to an uncaught error. To avoid the warning call the finish method on the active handles."
     139#
    114140sub DESTROY {
    115141    my $self = shift;
    116142
    117143    if (${^GLOBAL_PHASE} eq 'DESTRUCT') {
    118     if ($_dbh_instance) {
     144   
     145    if ($_dbh_instance) { # database handle still active. Use singleton handle!
     146
     147        # THIS CODE HAS MOVED TO finish_signal_handler() WHERE IT BELONGS
     148        # If autocommit wasn't set, then this is a cancel operation.
     149        # If we've not disconnected from the sql db yet and if we've not committed
     150        # transactions yet, then cancel means we do a rollback here
     151
     152        # if($_dbh_instance->{AutoCommit} == 0) {
     153       
     154        #   $_dbh_instance->rollback(); # will warn on failure, nothing more we can/want to do,
     155            #     # don't do a die() here: possibility of infinite loop and we still want to disconnect
     156        # }
     157
     158        # Either way, we're now finally ready to disconnect as is required for premature
     159        # termination too
    119160        print STDERR "XXXXXXXX Global Destruct: Disconnecting from database\n";
    120161        $_dbh_instance->disconnect or warn $_dbh_instance->errstr;
     
    182223    #my $self= shift (@_); # singleton method doesn't use self, but callers don't need to know that
    183224    my ($params_map) = @_;
    184 
     225   
     226    if($params_map->{'verbosity'}) {
     227    if(!defined $params_map->{'autocommit'}) {
     228        print STDERR "  Autocommit parameter not defined\n";
     229    }
     230    if($params_map->{'autocommit'}) {
     231        print STDERR "   SQL DB UNDO SUPPORT OFF.\n";
     232    } else {
     233        print STDERR "   SQL DB UNDO SUPPORT ON.\n";
     234    }
     235    }
     236   
    185237    return $_dbh_instance if($_dbh_instance);
    186238
     
    217269    print STDERR "\nAssuming the mysql server has been started with: --character_set_server=utf8mb4\n" if $db_driver eq "mysql";
    218270    }
     271
     272    # DBI AutoCommit connection param is on/1 by default, so if a value for this is not defined
     273    # as a method parameter to _get_connection_instance, then fallback to the default of on/1
     274    my $autocommit = (defined $params_map->{'autocommit'}) ? $params_map->{'autocommit'} : 1;
    219275   
    220276    my $dbh = DBI->connect("$connect_str", $db_user, $db_pwd,
     
    223279                   PrintError => 1, # on by default, but being explicit
    224280                   RaiseError => 0, # off by default, but being explicit
    225                    AutoCommit => 1, # on by default, but being explicit
     281                   AutoCommit => $autocommit,
    226282                   mysql_enable_utf8mb4 => 1 # tells MySQL to use UTF-8 for communication and tells DBD::mysql to decode the data, see https://stackoverflow.com/questions/46727362/perl-mysql-utf8mb4-issue-possible-bug
    227283               });
     
    273329    my $self= shift (@_);
    274330
     331    # TODO: if AutoCommit was off, meaning transactions were on/enabled,
     332    # then here is where we commit our one long transaction.
     333    # https://metacpan.org/pod/release/TIMB/DBI-1.634_50/DBI.pm#commit
     334    my $rc = 1;   
     335   
    275336    $ref_count--;
    276337    if($ref_count == 0) {
     338    # Only commit transaction when we're about to disconnect, not before
     339    # If autocommit was on, then we'd have committed after every db operation, so nothing to do
     340    $rc = $self->do_commit_if_on();
     341   
    277342    $self->force_disconnect_from_db();
    278     }   
     343    }
     344
     345    return $rc;
     346}
     347
     348sub do_commit_if_on {
     349    my $self= shift (@_);
     350    my $dbh = $self->{'db_handle'};
     351   
     352    my $rc = 1; # return code: everything went fine, regardless of whether we needed to commit
     353                # (AutoCommit on or off)
     354
     355    # https://metacpan.org/pod/release/TIMB/DBI-1.634_50/DBI.pm#commit
     356    if($dbh->{AutoCommit} == 0) {
     357    print STDERR "   Committing transaction to SQL database now.\n" if $self->{'verbosity'};
     358    $rc = $dbh->commit() or warn("SQL DB COMMIT FAILED: " . $dbh->errstr); # important problem
     359                                                            # worth embellishing error message
     360    }
     361    # If autocommit was on, then we'd have committed after every db operation, so nothing to do
     362
     363    return $rc;
    279364}
    280365
     
    429514    $dbh->do("drop table $table");# || warn("@@@ Couldn't delete $table");
    430515    }
     516
     517    # TODO Q: commit here, so that future select statements work?
     518    # See https://metacpan.org/pod/release/TIMB/DBI-1.634_50/DBI.pm#Transactions
    431519}
    432520
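The commit-on-last-finish() behaviour in the gssql.pm changes above can be illustrated with a small self-contained sketch. It assumes that every user of the shared connection increments a reference count when it attaches; the names SharedConnection and FakeDbh are hypothetical, and FakeDbh merely logs calls in place of a real DBI handle.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a DBI handle with AutoCommit off; it logs calls.
package FakeDbh;
sub new        { return bless { AutoCommit => 0, log => [] }, shift }
sub commit     { push @{ $_[0]{log} }, 'commit';     return 1 }
sub disconnect { push @{ $_[0]{log} }, 'disconnect'; return 1 }

# Hypothetical wrapper, loosely modelled on the finish() logic in the diff.
package SharedConnection;

my $handle;          # singleton handle shared by every wrapper object
my $ref_count = 0;   # how many wrapper objects still need the connection

sub new {
    my ($class, $dbh) = @_;
    $handle ||= $dbh;
    $ref_count++;
    return bless {}, $class;
}

sub finish {
    my ($self) = @_;
    my $rc = 1;      # success unless a commit is attempted and fails
    $ref_count--;
    if ($ref_count == 0) {
        # Commit once, immediately before the final disconnect.
        $rc = $handle->commit unless $handle->{AutoCommit};
        $handle->disconnect;
        undef $handle;
    }
    return $rc;
}

package main;

my $dbh     = FakeDbh->new;
my $plugout = SharedConnection->new($dbh);
my $plugin  = SharedConnection->new($dbh);

$plugout->finish;
my $calls_after_first = scalar @{ $dbh->{log} };   # nothing has happened yet
$plugin->finish;
my @calls_after_last  = @{ $dbh->{log} };          # commit, then disconnect

print "after first finish: $calls_after_first call(s)\n";
print "after last finish: @calls_after_last\n";
```

Deferring the commit to the last finish() keeps the whole run as one transaction, which is what makes a later rollback equivalent to the import or build never having happened.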
  • main/trunk/greenstone2/perllib/plugins/GreenstoneSQLPlugin.pm

    r32580 r32582  
    137137        'desc' => "{GreenstoneSQLPlug.process_mode.all}" } ];
    138138
     139my $rollback_on_cancel_list =
     140    [ { 'name' => "true",
     141        'desc' => "{GreenstoneSQLPlug.rollback_on_cancel}" },     
     142      { 'name' => "false",
     143        'desc' => "{GreenstoneSQLPlug.rollback_on_cancel}" } ];
     144
    139145my $arguments =
    140146    [ { 'name' => "process_exp",
     
    149155    'deft' => "all",
    150156    'reqd' => "no"},
     157      { 'name' => "rollback_on_cancel",
     158    'desc' => "{GreenstoneSQLPlug.rollback_on_cancel}",
     159    'type' => "enum",
     160    'list' => $rollback_on_cancel_list,
     161    'deft' => "true", # TODO Q: what's the better default? If "true", any memory concerns?
     162    'reqd' => "no",
     163    'hiddengli' => "no"},
    151164      { 'name' => "db_driver",
    152165    'desc' => "{GreenstoneSQLPlug.db_driver}",
     
    204217}
    205218
     219###### Called during import.pl
     220
    206221# This is called once if removeold is set with import.pl. Most plugins will do
    207222# nothing but if a plugin does any stuff outside of creating doc obj, then
     
    230245    $gs_sql->ensure_fulltxt_table_exists();
    231246    }
    232 }
    233 
    234 # This is called per document for docs that have been deleted from the
     247
     248    # UNNECESSARY
     249    # The removeold related DB transaction (deleting collection tables) is complete
     250    # Don't let GS SQL PlugIN interfere with GS SQL PlugOUT's database transactions
     251    # during import.pl hereafter. Finish up.
     252    #$gs_sql->do_commit_if_on();
     253}
     254
     255# This is called during import.pl per document for docs that have been deleted from the
    235256# collection. Most plugins will do nothing
    236257# but if a plugin does any stuff outside of creating doc obj, then it may need
     
    273294}
    274295
     296#### Called during buildcol
    275297
    276298sub xml_start_tag {
     
    395417}
    396418
     419#### Called during buildcol and import
    397420
    398421# GS SQL Plugin::init() (and deinit()) is called by import.pl and also by buildcol.pl
     
    429452    'verbosity' => $self->{'verbosity'} || 0
    430453               });
     454   
     455    # if autocommit is set, there's no rollback support
     456    my $autocommit = ($self->{'rollback_on_cancel'} eq "false") ? 1 : 0;
    431457
    432458    # try connecting to the mysql db, die if that fails
     
    435461    'db_client_user' => $self->{'db_client_user'},
    436462    'db_client_pwd' => $self->{'db_client_pwd'},
    437     'db_host' => $self->{'db_host'}
     463    'db_host' => $self->{'db_host'},
     464    'autocommit' => $autocommit
    438465                   })
    439466    )
  • main/trunk/greenstone2/perllib/plugouts/GreenstoneSQLPlugout.pm

    r32580 r32582  
    3838use DBI; # the central package for this plugout
    3939
     40
     41# This entire class is called only during import.pl
    4042
    4143# TODO: SIGTERM rollback and disconnect?
     
    6971        'desc' => "{GreenstoneSQLPlug.process_mode.all}" } ];
    7072
     73my $rollback_on_cancel_list =
     74    [ { 'name' => "true",
     75        'desc' => "{GreenstoneSQLPlug.rollback_on_cancel}" },     
     76      { 'name' => "false",
     77        'desc' => "{GreenstoneSQLPlug.rollback_on_cancel}" } ];
     78
    7179# The following are the saveas.options:
    7280my $arguments = [
     
    7684      'list' => $process_mode_list,
    7785      'deft' => "all",
     86      'reqd' => "no",
     87      'hiddengli' => "no"},
     88    { 'name' => "rollback_on_cancel",
     89      'desc' => "{GreenstoneSQLPlug.rollback_on_cancel}",
     90      'type' => "enum",
     91      'list' => $rollback_on_cancel_list,
     92      'deft' => "true", # TODO Q: what's the better default? If "true", any memory concerns?
    7893      'reqd' => "no",
    7994      'hiddengli' => "no"},
     
    146161    'collection_name' => $ENV{'GSDLCOLLECTION'},
    147162    'verbosity' => $self->{'verbosity'} || 0
     163   
    148164    };
    149165
    150166    my $gs_sql = new gssql($db_params);
     167
     168    # if autocommit is set, there's no rollback support
     169    my $autocommit = ($self->{'rollback_on_cancel'} eq "false") ? 1 : 0;
    151170   
    152171    # try connecting to the mysql db, die if that fails
     
    156175    'db_client_user' => $self->{'db_client_user'},
    157176    'db_client_pwd' => $self->{'db_client_pwd'},
    158     'db_host' => $self->{'db_host'}
     177    'db_host' => $self->{'db_host'},
     178    'autocommit' => $autocommit
    159179                   })
    160180    )
     
    164184    die("Could not connect to db. Can't proceed.\n");
    165185    }
     186
     187    #die("@@@@ TEST. Connected successfully. Testing gssql::destructor.\n"); # WORKS
    166188   
    167189    my $db_name = $self->{'site_name'} || "greenstone2"; # one database per GS3 site, for GS2 the db is called greenstone2
  • main/trunk/greenstone2/perllib/strings.properties

    r32559 r32582  
    14601460GreenstoneSQLPlug.db_host:The hostname on which the (My)SQL database server is running, 127.0.0.1 by default. Other values to try include localhost.
    14611461
     1462GreenstoneSQLPlug.rollback_on_cancel:Support for undo on cancel. Set to true to support rollbacks on cancel. Transactions are then only committed to the database at the end of import and buildcol. Set to false if you do not want undo support, in which case SQL statements are autocommitted to the database.
     1463
    14621464#
    14631465# Perl module strings
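How the rollback_on_cancel option ends up as the DBI AutoCommit flag can be traced with the sketch below. The two helper names are hypothetical; the logic mirrors the two expressions in the diff: the plugin and plugout turn "false" into autocommit 1, and the connection code falls back to 1 when no autocommit parameter is passed at all.

```perl
use strict;
use warnings;

# Plugin/plugout side: rollback_on_cancel "false" means autocommit is on.
sub autocommit_from_option {
    my ($rollback_on_cancel) = @_;
    return ($rollback_on_cancel eq 'false') ? 1 : 0;
}

# Connection side: an undefined autocommit parameter falls back to 1,
# which is DBI's own default for AutoCommit.
sub effective_autocommit {
    my ($params_map) = @_;
    return defined $params_map->{autocommit} ? $params_map->{autocommit} : 1;
}

for my $option ('true', 'false') {
    my $autocommit = effective_autocommit(
        { autocommit => autocommit_from_option($option) });
    printf "rollback_on_cancel=%s -> AutoCommit=%d (undo %s)\n",
        $option, $autocommit, $autocommit ? 'off' : 'on';
}
```

Note the asymmetry: the plugin default of "true" yields AutoCommit 0, whereas a caller that omits the parameter entirely gets AutoCommit 1 and therefore no undo support.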