Timestamp:
28.11.2012 11:59:17
Author:
davidb
Message:

Introduction of two new OIDtype values (hash_on_full_filename and full_filename) designed to help provide more stable document IDs for collections that are rebuilt over time, including rebuilt after the Greenstone install has been upgraded

Files:
1 modified

  • main/trunk/greenstone2/perllib/strings.properties

    r26268  r26536
    279     279     import.OIDtype.hash:Hash the contents of the file. Document identifiers will be the same every time the collection is imported.
    280     280     import.OIDtype.hash_on_ga_xml:Hash the contents of the Greenstone Archive XML file. Document identifiers will be the same every time the collection is imported as long as the metadata does not change.
            281     import.OIDtype.hash_on_full_filename:Hash on the full filename to the document within the 'import' folder (and not its contents).  Helps make document identifiers more stable across upgrades of the software, although it means that duplicate documents contained in the collection are no longer detected automatically.
    281     282
    282     283     import.OIDtype.incremental:Use a simple document count. Significantly faster than "hash", but does not necessarily assign the same identifier to the same document content if the collection is reimported.

    285     286
    286     287     import.OIDtype.dirname:Use the parent directory name (preceded by 'J'). There should only be one document per directory, and directory names should be unique. E.g. import/b13as/h15ef/page.html will get an identifier of Jh15ef.
            288
            289     import.OIDtype.filename:Use the tail file name.  Requires every filename across all the folders within 'import' to be unique.
            290
            291     import.OIDtype.full_filename:Use the full file name within the 'import' folder as the identifier for the document (with _ and - substitutions made for symbols such as directory separators and the fullstop in a filename extension)
    287     292
    288     293     import.OIDmetadata:Specifies the metadata element that hold's the document's unique identifier, for use with -OIDtype=assigned.
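
The following Python sketch illustrates how the path-based OIDtype values described in the diff differ from content hashing. It is not Greenstone's implementation: every function name is hypothetical, MD5 stands in for whichever hash Greenstone uses, and the mapping of directory separators to '-' and full stops to '_' is an assumption, since the description only says that "_ and -" substitutions are made. Only the `dirname` example (`import/b13as/h15ef/page.html` giving `Jh15ef`) comes from the strings themselves.

```python
import hashlib


def dirname_oid(rel_path: str) -> str:
    """dirname-style ID: parent directory name preceded by 'J'."""
    parent = rel_path.rsplit("/", 2)[-2]
    return "J" + parent


def filename_oid(rel_path: str) -> str:
    """filename-style ID: the tail file name only."""
    return rel_path.rsplit("/", 1)[-1]


def full_filename_oid(rel_path: str) -> str:
    """full_filename-style ID built from the path inside 'import'."""
    # Assumed mapping: '/' becomes '-', '.' becomes '_'.
    return rel_path.replace("/", "-").replace(".", "_")


def hash_on_full_filename_oid(rel_path: str) -> str:
    """hash_on_full_filename-style ID: hash of the path, not the contents."""
    # MD5 is a stand-in; the changeset does not name the hash used.
    return hashlib.md5(rel_path.encode("utf-8")).hexdigest()


def content_hash_oid(content: bytes) -> str:
    """hash-style ID: depends only on the file contents."""
    return hashlib.md5(content).hexdigest()


# Two files with identical contents at different paths inside 'import'.
docs = {
    "b13as/h15ef/page.html": b"<html>same</html>",
    "b13as/copy/page.html": b"<html>same</html>",
}

for path, content in docs.items():
    print(path)
    print("  dirname:              ", dirname_oid(path))
    print("  filename:             ", filename_oid(path))
    print("  full_filename:        ", full_filename_oid(path))
    print("  hash_on_full_filename:", hash_on_full_filename_oid(path)[:8])
    print("  hash (contents):      ", content_hash_oid(content)[:8])
```

In this sketch the two files share a content hash, which is how duplicates can be detected under `hash`, but they get different path-based identifiers. This matches the trade-off noted for `hash_on_full_filename`. Both files also share the tail name `page.html`, which shows why `filename` requires unique file names across all folders.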