Opened 4 years ago

Closed 4 years ago

Last modified 3 years ago

#922 closed enhancement (fixed)

OAI deletion policy

Reported by: ak19 Owned by: ak19
Priority: very high Milestone:
Component: Greenstone2&3 Severity: major
Keywords: OAI Cc:

Description

Change History (6)

comment:2 by ak19, 4 years ago

No longer outputting [oai] and [oai.#] fake classifiers into index db, as etc/oai-inf.db now contains the oai data that needs to be stored for each OID:

http://trac.greenstone.org/changeset/31412

Minor commits:

comment:3 by ak19, 4 years ago

Resolution: fixed
Status: new → closed

comment:4 by ak19, 4 years ago

Kathy undid http://trac.greenstone.org/changeset/31412 , as it was not the best location to stop outputting the OAI "classifier" information that used to go into the index db, and in case we wanted to use the method for something in future.

Instead, Kathy made the necessary changes to greenstone2/perllib/classify.pm :

http://trac.greenstone.org/changeset/31673

comment:5 by ak19, 3 years ago

The oai-inf db now stores an extra record with internal ID "_earliesttimestamp". Its time and datestamp fields contain info on the collection's earliest timestamp, which is the time its oai-inf db was first created.

Previously, the earliestDatestamp field of the build config file was used to denote the earliest timestamp of a collection, and used in determining the earliest timestamp of the OAI repository.

From now on, the _earliesttimestamp record in the oai-inf db should be used as a collection's earliest timestamp. The earliest timestamp of the OAI repository is then the earliest of these values across all the collections in the repository.
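The repository-level rule amounts to taking a minimum over the per-collection values. A minimal sketch in Java, assuming each collection's earliest timestamp has already been read from its oai-inf db as a plain number; the class name, method name, collection names and values are hypothetical and not taken from the Greenstone source:

```java
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical sketch, not the actual Greenstone code.
class RepositoryEarliestTimestamp {

    // Input: collection name -> earliest timestamp read from that collection's
    // oai-inf db (assumed here to be a plain number; the real unit is not
    // stated in the ticket).
    static OptionalLong earliestOf(Map<String, Long> earliestPerCollection) {
        // Empty result if the repository has no collections.
        return earliestPerCollection.values().stream()
                .mapToLong(Long::longValue)
                .min();
    }

    public static void main(String[] args) {
        // Made-up values for illustration only.
        Map<String, Long> perCollection = Map.of(
                "demo", 1500000000L,
                "other", 1400000000L);
        System.out.println(earliestOf(perCollection));
    }
}
```

Returning an OptionalLong keeps the "no collections" case explicit rather than inventing a default value.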

  1. Changes to Perl code

http://trac.greenstone.org/changeset/31900 - http://trac.greenstone.org/changeset/31903

  2. Needed to modify the demo collection's existing oai-inf db that had been committed to SVN, to now contain the new _earliesttimestamp record:

http://trac.greenstone.org/changeset/31901 and updated again in http://trac.greenstone.org/changeset/31904

(Revision 31901 still had the record called "earliesttimestamp", while 31904 has the record under the entry for "_earliesttimestamp" denoting an internal key ID.)

  3. The GS2 changes to use the _earliesttimestamp field (yet skip this field when getting actual docoids from oai-inf db) are in

http://trac.greenstone.org/changeset/31903 and http://trac.greenstone.org/changeset/31904

  4. Corresponding GS3 changes are in commits

http://trac.greenstone.org/changeset/31911 and http://trac.greenstone.org/changeset/31912 (and http://trac.greenstone.org/changeset/31913 )
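Skipping the internal record when listing document OIDs can be pictured as a simple key filter. This is only a sketch under stated assumptions: an in-memory Map stands in for the oai-inf db, the record values and document OIDs are placeholders, and the class and method names are hypothetical rather than the ones used in the commits above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch, not the actual Greenstone code.
class OaiInfKeys {

    // Key of the internal record, as named in this ticket.
    static final String EARLIEST_KEY = "_earliesttimestamp";

    // Returns only the document OIDs, leaving out the internal record.
    static List<String> docOids(Map<String, ?> oaiInfRecords) {
        return oaiInfRecords.keySet().stream()
                .filter(key -> !key.equals(EARLIEST_KEY))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Placeholder records; real oai-inf entries hold more than a string.
        Map<String, String> records = new LinkedHashMap<>();
        records.put(EARLIEST_KEY, "internal record");
        records.put("DOC1", "document record");
        records.put("DOC2", "document record");
        System.out.println(docOids(records));
    }
}
```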

comment:6 by ak19, 3 years ago

The GS3 server side changes committed previously (documented just above) would still use the earliestDatestamp found in buildconfig as fallback value for that collection.

However, Dr Bainbridge considered this and concluded that, since a collection will always have an oai-inf db from now on, the earliest datestamp of a collection should not fall back to buildconfig's earliestDatestamp field or to buildconfig's lastmodified. These buildconfig values are still used as the publishing date by the RSS service, and so are still stored as Collection.java's earliestDatestamp. OAICollection now has an additional field, earliestOAIDatestamp, which contains the earliest timestamp in the oai-inf db. The OAIReceptionist now determines the earliestDatestamp of the entire OAIRepository based solely on the earliestOAIDatestamp values across all OAICollections, with no fallback to the Collections' earliestDatestamp or lastModified fields.

GS3 server side commits for this additional modification:

http://trac.greenstone.org/changeset/31915 and http://trac.greenstone.org/changeset/31916
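To illustrate the separation of the two datestamps, here is a hedged sketch in Java. The two field names follow this ticket; everything else (the class names, the constructor, the numeric type) is a hypothetical simplification and not the code in the commits above.

```java
import java.util.List;
import java.util.OptionalLong;

// Hypothetical sketch, not the actual Greenstone code.
class DatestampSketch {

    // Simplified stand-in for a collection that keeps both values.
    static final class CollectionInfo {
        final long earliestDatestamp;    // from buildconfig; publishing date for RSS
        final long earliestOAIDatestamp; // from oai-inf db; used for OAI only

        CollectionInfo(long earliestDatestamp, long earliestOAIDatestamp) {
            this.earliestDatestamp = earliestDatestamp;
            this.earliestOAIDatestamp = earliestOAIDatestamp;
        }
    }

    // The repository value looks only at earliestOAIDatestamp and never
    // falls back to the buildconfig-derived earliestDatestamp.
    static OptionalLong repositoryEarliest(List<CollectionInfo> collections) {
        return collections.stream()
                .mapToLong(c -> c.earliestOAIDatestamp)
                .min();
    }

    public static void main(String[] args) {
        // Made-up values: the buildconfig dates are deliberately older.
        List<CollectionInfo> collections = List.of(
                new CollectionInfo(1L, 50L),
                new CollectionInfo(100L, 40L));
        System.out.println(repositoryEarliest(collections));
    }
}
```

In this sketch an older buildconfig date on one collection has no effect on the repository value, which is the behaviour the comment describes.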
