source: main/trunk/greenstone2/perllib/plugins/GreenstoneSQLPlugin.pm@32582

Last change on this file since 32582 was 32582, checked in by ak19, 5 years ago

Now that previous commit(s) put sig handlers in place in gs_sql, it has been possible to add Undo on build/import Cancel for the GS SQL Plugs. This uses AutoCommit vs Transaction (rollback/commit) behaviour. On Cancel, a sig handler is triggered (SIGINT) and, if AutoCommit is off, does a rollback before die(), which calls the object destructor and disconnects from the db. On regular program execution running to normal termination, the last finish() call on gs_sql, which triggers the disconnect, will now first do a commit() (if AutoCommit is off) before disconnecting. For now, the default for both GreenstoneSQLPlugs is to support Undo (i.e. transactions), which turns AutoCommit off. Not sure whether this will be robust: what if transactions take place in memory? We could be dealing with millions of docs of large full-txt. Another issue is that the SQL DB may be out of sync with the archives and index folders on Cancel: archives and index just terminate and are left in an intermediate state depending on when Cancel was pressed, whereas the GS SQL DB is in a rolled-back state as if the import or build never took place. A third issue is that during buildcol (perhaps specifically during buildcol's doc-processing phase), pressing Cancel still continues buildcol: the current perl process is cancelled but the next one continues, rather than terminating buildcol in its entirety. What happens with the GS SQL DB is that any 'transaction' until then is rolled back (perhaps a transaction covering one doc, if the Cancel applies on a per-doc basis), and the next process (next doc's processing?) continues and allows further transactions that are all committed at the end on natural termination of buildcol. Need to decide whether Undo behaviour is really what we want. But it's available now, and we can simply change the default to not support Undo if we want the old behaviour again.
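The rollback-versus-commit behaviour described above can be sketched with plain DBI. This is an illustrative, self-contained example only: it uses DBD::SQLite and an invented demo_metadata table in place of MySQL and the collection tables (the plugs themselves connect through gssql/DBD::mysql), but the AutoCommit/rollback/commit pattern is the same.

```perl
use strict;
use warnings;
use DBI;

# Connect with AutoCommit off, i.e. with Undo (rollback) support -- the same
# setting the plugs use when rollback_on_cancel is "true". DBD::SQLite and the
# demo_metadata table are stand-ins for illustration only.
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "",
                       { RaiseError => 1, AutoCommit => 0 });

$dbh->do("CREATE TABLE demo_metadata (did TEXT, metaname TEXT, metaval TEXT)");
$dbh->commit(); # keep the schema regardless of later rollbacks

$dbh->do("INSERT INTO demo_metadata VALUES ('HASH01', 'Title', 'Doc 1')");

# On Cancel, the SIGINT handler does the equivalent of this before die():
$dbh->rollback();
my ($after_rollback) = $dbh->selectrow_array("SELECT COUNT(*) FROM demo_metadata");
print "rows after rollback: $after_rollback\n"; # 0 -- as if the import never ran

# On normal termination, the last finish() call commits instead of rolling back:
$dbh->do("INSERT INTO demo_metadata VALUES ('HASH01', 'Title', 'Doc 1')");
$dbh->commit();
my ($after_commit) = $dbh->selectrow_array("SELECT COUNT(*) FROM demo_metadata");
print "rows after commit: $after_commit\n"; # 1
$dbh->disconnect();
```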

File size: 22.3 KB
###########################################################################
#
# GreenstoneSQLPlugin.pm -- reads into doc_obj from SQL db and docsql.xml
# Metadata and/or fulltext are stored in SQL db, the rest may be stored in
# the docsql .xml files.
# A component of the Greenstone digital library software
# from the New Zealand Digital Library Project at the
# University of Waikato, New Zealand.
#
# Copyright (C) 2001 New Zealand Digital Library Project
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
#
###########################################################################

package GreenstoneSQLPlugin;


use strict;
no strict 'refs'; # allow filehandles to be variables and vice versa

use DBI;
use docprint; # for new unescape_text() subroutine
use GreenstoneXMLPlugin;
use gssql;


# TODO:
# - Run TODOs here, in Plugout and in gssql.pm by Dr Bainbridge.
# - Have not yet tested writing out just meta or just fulltxt to the sql db and reading just that
#   back in from the sql db while the remainder is read back in from the docsql .xml files.

# TODO: Add public instructions on using this plugin and its plugout: start with installing the mysql binary, changing the pwd, running the server (and the client against it for checking: basic cmds like create and drop). Then discuss db name, table names (per coll), db cols and col types, and how the plugout and plugin work.
# Discuss the plugin/plugout parameters.

# TODO: test on Windows and Mac.
# Note: if parsing fails (e.g. using the wrong plugout, like the GS XML plugout, which chokes on args intended for the SQL plugout) then SQL plugin init would already have been called and made the connection, but disconnect would not have been done, because the SQL plugin's disconnect would not be called upon parse failure.

# DONE:
# + TODO: Incremental delete can't work until GSSQLPlugout has implemented build_mode = incremental
#   (instead of tossing away the db on every build)
# + Ask about the docsql naming convention adopted to identify OID. Better way?
#   collection names -> table names: it seems hyphens are not allowed. Changed to underscores.
# + Startup parameters (except removeold/build_mode)
# + how do we detect we're to do removeold during plugout in the import.pl phase
# + incremental building: where do we need to add code to delete rows from our sql table after
#   incrementally importing a coll with fewer docs (for instance)? What about deleted/modified meta?
# + Ask if I can assume that all SQL dbs (not just MySQL) will preserve the order of inserted nodes
#   (sections), which in this case has made it easy to reconstruct the doc_obj in memory in the correct order.
#   YES. Otherwise, for later db types (drivers), can order by the primary key column and then by the did column.
# + NOTTODO: when the db is not running, GLI is paralyzed -> can we set a timeout on the DBI connection attempt?
#   NOT A PROBLEM: Tested to find the DBI connection attempt fails immediately when the MySQL server is not
#   running. The GLI "paralyzing" incident last time was not because of the gs sql connection code,
#   but because my computer was freezing on-and-off.
# + "Courier" demo documents in the lucene-sql collection: character (degree symbol) not preserved in title. Is this because we encode in utf8 when putting into the db and reading back in?
#   Test doc with meta and text like macrons in Maori text.
# + TODO Q: During import, the GS SQL Plugin is called before the GS SQL Plugout, with the undesirable side
#   effect that if the db doesn't exist, gssql::use_db() fails, as it won't create the db.
#   This got fixed when GSSQLPlugin stopped connecting on init().
#
#
# + TODO: deal with incremental vs removeold. If docs are removed from the import folder, then the import step
#   won't delete them from archives but the buildcol step will. Need to implement this with this database plugin, or wherever the actual flow is.
#
# + TODO Q: is "reindex" = del from db + add to db?
#   - is this okay for reindexing, or will it need to modify existing values (update table)?
#   - if it's okay, what does reindex need to accomplish (and how) if the OID changes because the hash id produced is different?
#   - delete is accomplished in the GS SQL Plugin, during buildcol.pl. When should reindexing take place?
#     during the SQL plugout/import.pl, or during the plugin? If adding is done by GSSQLPlugout, does it need to
#     be reimplemented in GSSQLPlugin to support the adding portion of reindexing?
#
# INCREMENTAL REBUILDING IMPLEMENTED CORRECTLY AND WORKS:
# Overriding the plugins' remove_all() method covered removeold.
# Overriding the plugins' remove_one() method is all I needed to do for reindex and deletion
# (incremental and non-incremental) to work,
# but doing all this needed an overhaul of gssql.pm and its use by the GS SQL plugin and plugout.
# - needed to correct plugin.pm::remove_some() to process all files
# - and needed to correct GreenstoneSQLPlugin::close_document() to setOID() after all
# All incremental import and buildcol worked after that:
# - deleting files and running incr-import and incr-buildcol (= "incr delete"),
# - deleting files and running incr-import and buildcol (= "non-incr delete"),
# - modifying meta and doing an incr rebuild,
# - modifying fulltext and doing an incr rebuild,
# - renaming a file forces a reindex: the doc is removed from the db and added back in, due to remove_one(),
# - tested CSV file: adding some records, changing some records
# + CSVPlugin test (collection csvsql)
# + MetadataCSVPlugin test (modified collection sqltest to have metadata.csv refer to the
#   filenames of sqltest's documents)
# + shared image test (collection shareimg): if 2 html files reference the same image, the docs
#   are indeed both reindexed if the image is modified (e.g. I replaced the image with another
#   of the same name), which in the GS SQL plugin/plugout case means that the 2 docs are deleted
#   and added in again.
########################################################################################

# GreenstoneSQLPlugin inherits from GreenstoneXMLPlugin so that if meta or fulltext
# is still written out to doc.xml (docsql .xml), that will be processed as usual,
# whereas GreenstoneSQLPlugin will process all the rest (full text and/or meta, whichever
# is written out by GreenstoneSQLPlugout into the SQL db).


sub BEGIN {
    @GreenstoneSQLPlugin::ISA = ('GreenstoneXMLPlugin');
}

# This plugin must be in the document plugins pipeline IN PLACE OF GreenstoneXMLPlugin,
# so we won't have a process_exp conflict here.
# The structure of docsql.xml files is identical to doc.xml and the contents are similar, except:
# - since metadata and/or fulltxt are stored in the mysql db instead, just XML comments indicating
#   this are left inside docsql.xml within the <Description> (for meta) and/or <Content> (for txt)
# - the root element Archive now has a docoid attribute: <Archive docoid="OID">
sub get_default_process_exp {
    my $self = shift (@_);

    return q^(?i)docsql(-\d+)?\.xml$^; # regex based on this method in GreenstoneXMLPlugin
    #return q^(?i)docsql(-.+)?\.xml$^; # no longer storing the OID embedded in the docsql .xml filename
}
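Based on the comments above, a docsql.xml file might look roughly like the following sketch. This is illustrative only: the OID value, the comment wording, and the single-section layout are invented here; only the <Archive docoid="..."> attribute and the stubbed-out <Description>/<Content> elements come from the description above.

```xml
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<Archive docoid="HASH0123456789abcdef">
  <Section>
    <Description>
      <!-- metadata stored in SQL db -->
    </Description>
    <Content>
      <!-- full text stored in SQL db -->
    </Content>
  </Section>
</Archive>
```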

my $process_mode_list =
    [ { 'name' => "meta_only",
        'desc' => "{GreenstoneSQLPlug.process_mode.meta_only}" },
      { 'name' => "text_only",
        'desc' => "{GreenstoneSQLPlug.process_mode.text_only}" },
      { 'name' => "all",
        'desc' => "{GreenstoneSQLPlug.process_mode.all}" } ];

my $rollback_on_cancel_list =
    [ { 'name' => "true",
        'desc' => "{GreenstoneSQLPlug.rollback_on_cancel}" },
      { 'name' => "false",
        'desc' => "{GreenstoneSQLPlug.rollback_on_cancel}" } ];

my $arguments =
    [ { 'name' => "process_exp",
        'desc' => "{BaseImporter.process_exp}",
        'type' => "regexp",
        'deft' => &get_default_process_exp(),
        'reqd' => "no" },
      { 'name' => "process_mode",
        'desc' => "{GreenstoneSQLPlug.process_mode}",
        'type' => "enum",
        'list' => $process_mode_list,
        'deft' => "all",
        'reqd' => "no"},
      { 'name' => "rollback_on_cancel",
        'desc' => "{GreenstoneSQLPlug.rollback_on_cancel}",
        'type' => "enum",
        'list' => $rollback_on_cancel_list,
        'deft' => "true", # TODO Q: what's the better default? If "true", any memory concerns?
        'reqd' => "no",
        'hiddengli' => "no"},
      { 'name' => "db_driver",
        'desc' => "{GreenstoneSQLPlug.db_driver}",
        'type' => "string",
        'deft' => "mysql",
        'reqd' => "yes"},
      { 'name' => "db_client_user",
        'desc' => "{GreenstoneSQLPlug.db_client_user}",
        'type' => "string",
        'deft' => "root",
        'reqd' => "yes"},
      { 'name' => "db_client_pwd",
        'desc' => "{GreenstoneSQLPlug.db_client_pwd}",
        'type' => "string",
        'deft' => "",
        'reqd' => "yes"}, # pwd required?
      { 'name' => "db_host",
        'desc' => "{GreenstoneSQLPlug.db_host}",
        'type' => "string",
        'deft' => "127.0.0.1",
        'reqd' => "yes"},
    ];

my $options = { 'name' => "GreenstoneSQLPlugin",
                'desc' => "{GreenstoneSQLPlugin.desc}",
                'abstract' => "no",
                'inherits' => "yes",
                'args' => $arguments };

# TODO: For Cancel, add a SIGTERM handler or similar to call end(),
# or to explicitly call gs_sql->close_connection if $gs_sql is defined

sub new {
    my ($class) = shift (@_);
    my ($pluginlist,$inputargs,$hashArgOptLists) = @_;
    push(@$pluginlist, $class);

    push(@{$hashArgOptLists->{"ArgList"}},@{$arguments});
    push(@{$hashArgOptLists->{"OptList"}},$options);

    my $self = new GreenstoneXMLPlugin($pluginlist, $inputargs, $hashArgOptLists);


    #return bless $self, $class;
    $self = bless $self, $class;
    if ($self->{'info_only'}) {
        # If running pluginfo, we don't need to go further.
        return $self;
    }

    # do anything else that needs to be done here when not pluginfo

    return $self;
}

###### Called during import.pl

# This is called once if removeold is set with import.pl. Most plugins will do
# nothing, but if a plugin does any stuff outside of creating the doc obj, then
# it may need to clear something.
# In the case of the GreenstoneSQL plugs: this is the first time we have a chance
# to purge the current collection's tables from the current site's database.
sub remove_all {
    my $self = shift (@_);
    my ($pluginfo, $base_dir, $processor, $maxdocs) = @_;

    $self->SUPER::remove_all(@_);

    print STDERR " Building with removeold option set, so deleting the current collection's tables if they exist\n" if($self->{'verbosity'});

    # if we're in here, we'd already have run 'use database <site_name>;' during sub init(),
    # so we can go ahead and delete the collection's tables
    my $gs_sql = $self->{'gs_sql'};
    $gs_sql->delete_collection_tables(); # will delete them if they exist

    # and recreate tables? No. Tables' existence is ensured in GreenstoneSQLPlugout::begin()
    my $proc_mode = $self->{'process_mode'};
    if($proc_mode ne "text_only") {
        $gs_sql->ensure_meta_table_exists();
    }
    if($proc_mode ne "meta_only") {
        $gs_sql->ensure_fulltxt_table_exists();
    }

    # UNNECESSARY:
    # The removeold-related DB transaction (deleting collection tables) is complete.
    # Don't let the GS SQL PlugIN interfere with the GS SQL PlugOUT's database transactions
    # during import.pl hereafter. Finish up.
    #$gs_sql->do_commit_if_on();
}

# This is called during import.pl per document, for docs that have been deleted from the
# collection. Most plugins will do nothing,
# but if a plugin does any stuff outside of creating the doc obj, then it may need
# to clear something.
# Remove the doc(s) denoted by oids from the GS SQL db.
# This takes care of incremental deletes (docs marked D by ArchivesInfPlugin when building
# incrementally) as well as cases of "Non-incremental Delete"; see ArchivesInfPlugin.pm
sub remove_one {
    my $self = shift (@_);

    my ($file, $oids, $archivedir) = @_;

    my $rv = $self->SUPER::remove_one(@_);

    print STDERR "@@@ IN SQLPLUG::REMOVE_ONE: $file\n";

    #return undef unless $self->can_process_this_file($file); # NO, DON'T DO THIS (inherited remove_one behaviour) HERE:
    # WE DON'T CARE IF IT'S AN IMAGE FILE THAT WAS DELETED.
    # WE CARE ABOUT REMOVING THE DOC_OID OF THAT IMAGE FILE FROM THE SQL DB,
    # SO DON'T RETURN IF CAN'T_PROCESS_THIS_FILE


    my $gs_sql = $self->{'gs_sql'} || return 0; # couldn't make the connection, or no db, etc.

    print STDERR "*****************************\nAsked to remove_one oid\n***********************\n";
    print STDERR "Num oids: " . scalar (@$oids) . "\n";

    my $proc_mode = $self->{'process_mode'};
    foreach my $oid (@$oids) {
        if($proc_mode eq "all" || $proc_mode eq "meta_only") {
            print STDERR "@@@@@@@@ Deleting $oid from meta table\n" if $self->{'verbosity'} > 2;
            $gs_sql->delete_recs_from_metatable_with_docid($oid);
        }
        if($proc_mode eq "all" || $proc_mode eq "text_only") {
            print STDERR "@@@@@@@@ Deleting $oid from fulltxt table\n" if $self->{'verbosity'} > 2;
            $gs_sql->delete_recs_from_texttable_with_docid($oid);
        }
    }
    return $rv;
}

#### Called during buildcol

sub xml_start_tag {
    my $self = shift(@_);
    my ($expat, $element) = @_;

    my $outhandle = $self->{'outhandle'};

    $self->{'element'} = $element;
    if ($element eq "Archive") { # docsql.xml files contain an OID attribute on the Archive element
        # the element's attributes are in %_ as per ReadXMLFile::xml_start_tag() (while $_
        # contains the tag)

        # Don't access %_{'docoid'} directly: kept getting a warning message to
        # use $_{'docoid'} for scalar contexts, but %_ is the element's attr hashmap
        # whereas $_ has the tag info. So we don't want to do $_{'docoid'}.
        my %attr_hash = %_; # right way, see OAIPlugin.pm
        $self->{'doc_oid'} = $attr_hash{'docoid'};
        ##print STDERR "XXXXXXXXXXXXXX in SQLPlugin::xml_start_tag()\n";
        print $outhandle "Extracted OID from docsql.xml: ".$self->{'doc_oid'}."\n"
            if $self->{'verbosity'} > 2;

    }
    else { # let superclass GreenstoneXMLPlugin continue to process <Section> and <Metadata> elements
        $self->SUPER::xml_start_tag(@_);
    }
}

# TODO Q: Why are there 4 passes when we're only indexing at doc and section level (2 passes)? What's the dummy pass, and why is there a pass for infodb?

# We should only ever get here during the buildcol.pl phase.
# At the end of superclass GreenstoneXMLPlugin.pm's close_document() method,
# the doc_obj in memory is processed (indexed) and then made undef.
# So we have to work with doc_obj before superclass close_document() is finished.
sub close_document {
    my $self = shift(@_);

    ##print STDERR "XXXXXXXXX in SQLPlugin::close_doc()\n";

    my $gs_sql = $self->{'gs_sql'};

    my $outhandle = $self->{'outhandle'};
    my $doc_obj = $self->{'doc_obj'};

    my $oid = $self->{'doc_oid'}; # we stored the current doc's OID during sub xml_start_tag()
    my $proc_mode = $self->{'process_mode'};

    # For now, we have access to doc_obj (until just before super::close_document() terminates)

    # The OID parsed out of the docsql.xml file does need to be set on $doc_obj, as noticed in this case:
    # when a doc in import is renamed and you do an incremental import, it is marked for reindexing
    # (reindexing is implemented by this plugin as a delete followed by an add into the sql db).
    # In that case, UNLESS you set the OID at this stage, the old deleted doc id (for the old doc
    # name) continues to exist in the index at the end of incremental rebuilding if you were to
    # browse the rebuilt collection by files/titles. So unless you set the OID here, the deleted
    # doc oids will still be listed in the index.
    $self->{'doc_obj'}->set_OID($oid);

    print STDERR " GreenstoneSQLPlugin processing doc $oid (reading into docobj from SQL db)\n"
        if $self->{'verbosity'} > 0;

    if($proc_mode eq "all" || $proc_mode eq "meta_only") {
        # read in meta for the collection (i.e. select * from <col>_metadata table)

        my $records = $gs_sql->select_from_metatable_matching_docid($oid, $outhandle);

        print $outhandle "----------SQL DB contains meta-----------\n" if $self->{'verbosity'} > 2;
        # https://www.effectiveperlprogramming.com/2010/07/set-custom-dbi-error-handlers/

        foreach my $row (@$records) {
            #print $outhandle "row: @$row\n";
            my ($primary_key, $did, $sid, $metaname, $metaval) = @$row;

            # get rid of the artificial "root" introduced in the section id when saving to the sql db
            $sid =~ s@^root@@;
            $sid = $doc_obj->get_top_section() unless $sid;
            print $outhandle "### did: $did, sid: |$sid|, meta: $metaname, val: $metaval\n"
                if $self->{'verbosity'} > 2;

            # TODO: we accessed the db in utf8 mode, so we can call doc_obj->add_utf8_metadata directly:
            $doc_obj->add_utf8_metadata($sid, $metaname, &docprint::unescape_text($metaval));
        }
        print $outhandle "----------FIN READING DOC's META FROM SQL DB------------\n"
            if $self->{'verbosity'} > 2;
    }

    if($proc_mode eq "all" || $proc_mode eq "text_only") {
        # read in fulltxt for the collection (i.e. select * from <col>_fulltxt table)

        my $fulltxt_table = $gs_sql->get_fulltext_table_name();


        my $records = $gs_sql->select_from_texttable_matching_docid($oid, $outhandle);


        print $outhandle "----------\nSQL DB contains txt entries for-----------\n"
            if $self->{'verbosity'} > 2;

        foreach my $row (@$records) {
            my ($primary_key, $did, $sid, $text) = @$row;

            # get rid of the artificial "root" introduced in the section id when saving to the sql db
            #$sid =~ s@^root@@;
            $sid = $doc_obj->get_top_section() if ($sid eq "root");
            print $outhandle "### did: $did, sid: |$sid|, fulltext: <TXT>\n"
                if $self->{'verbosity'} > 2;

            # TODO - pass by ref?
            # TODO: we accessed the db in utf8 mode, so we can call doc_obj->add_utf8_text directly:
            my $textref = &docprint::unescape_textref(\$text);
            $doc_obj->add_utf8_text($sid, $$textref);
        }
        print $outhandle "----------FIN READING DOC's TXT FROM SQL DB------------\n"
            if $self->{'verbosity'} > 2;
    }

    # done reading into docobj from SQL db

    # don't forget to clean up on close() in superclass:
    # it will get the doc_obj indexed, then make it undef
    $self->SUPER::close_document(@_);
}

#### Called during buildcol and import

# GS SQL Plugin::init() (and deinit()) is called by import.pl and also by buildcol.pl.
# This means it connects and disconnects during import.pl as well. This is okay,
# as removeold, which should drop the collection tables, happens during the import phase,
# calling GreenstoneSQLPlugin::remove_all(), and therefore also requires a db connection.
# TODO: Eventually can try moving get_gssql_instance into gssql.pm? That way both GS SQL Plugin
# and Plugout would be using one connection during the import.pl phase when both plugs exist.

# Call init() not begin() because there can be multiple plugin passes and begin() is called for
# each pass (one for doc level and another for section level indexing), whereas init() should
# be called before any and all passes.
# This way, we can connect to the SQL database once per buildcol run.
sub init {
    my ($self) = shift (@_);
    ##print STDERR "@@@@@@@@@@ INIT CALLED\n";

    $self->SUPER::init(@_); # super (GreenstoneXMLPlugin) will not yet be trying to read from doc.xml (docsql .xml) files in init().

    ####################
#    print "@@@ SITE NAME: ". $self->{'site_name'} . "\n" if defined $self->{'site_name'};
#    print "@@@ COLL NAME: ". $ENV{'GSDLCOLLECTION'} . "\n";

#    print STDERR "@@@@ db_pwd: " . $self->{'db_client_pwd'} . "\n";
#    print STDERR "@@@@ user: " . $self->{'db_client_user'} . "\n";
#    print STDERR "@@@@ db_host: " . $self->{'db_host'} . "\n";
#    print STDERR "@@@@ db_driver: " . $self->{'db_driver'} . "\n";
    ####################

    # create the gssql object.
    # The collection name will be used for naming tables (the site name will be used for naming the database)
    my $gs_sql = new gssql({
        'collection_name' => $ENV{'GSDLCOLLECTION'},
        'verbosity' => $self->{'verbosity'} || 0
    });

    # if autocommit is set, there's no rollback support
    my $autocommit = ($self->{'rollback_on_cancel'} eq "false") ? 1 : 0;

    # try connecting to the mysql db; die if that fails
    if(!$gs_sql->connect_to_db({
        'db_driver' => $self->{'db_driver'},
        'db_client_user' => $self->{'db_client_user'},
        'db_client_pwd' => $self->{'db_client_pwd'},
        'db_host' => $self->{'db_host'},
        'autocommit' => $autocommit
       })
      )
    {
        # This is fatal for the plugin; let's terminate here.
        # PrintError would already have displayed the warning message on connection failure.
        die("Could not connect to db. Can't proceed.\n");
    }

    my $db_name = $self->{'site_name'} || "greenstone2"; # one database per GS3 site; for GS2 the db is called greenstone2

    # Attempt to use the db, creating it if it doesn't exist (but don't create the tables yet).
    # Bail if we can't use the database.
    if(!$gs_sql->use_db($db_name)) {

        # This is fatal for the plugin; let's terminate here after disconnecting again.
        # PrintError would already have displayed the warning message on load failure.
        $gs_sql->force_disconnect_from_db();
        die("Could not use db $db_name. Can't proceed.\n");
    }


    # store the db handle now that we're connected
    $self->{'gs_sql'} = $gs_sql;
}


# This method also runs on import.pl if gs_sql has a value. But we just want to run it on buildcol.
# Call deinit() not end() because there can be multiple plugin passes:
# one for doc level and another for section level indexing,
# and deinit() should be called after any and all passes.
# This way, we can close the SQL database once per buildcol run.
sub deinit {
    my ($self) = shift (@_);

    ##print STDERR "@@@@@@@@@@ GreenstoneSQLPlugin::DEINIT CALLED\n";

    if($self->{'gs_sql'}) { # only want to work with the sql db if buildcol.pl; gs_sql won't have
                            # a value except during buildcol, so when processor =~ m/buildproc$/.
        $self->{'gs_sql'}->finished();

        # Clear gs_sql (setting the key to undef has a different meaning from deleting it:
        # undef makes the key still exist but its value is undef, whereas delete deletes the key).
        # So all future use has to make the connection again.
        delete $self->{'gs_sql'};
    }

    $self->SUPER::deinit(@_);
}


