Changeset 8796


Timestamp: 2004-12-14T14:07:09+13:00
Author: kjdon
Message:

added new oidtypes 'assigned' (from stephen de gabrielle) and 'dirname' (from emanuel dejanu)

Location: trunk/gsdl
Files: 2 edited

  • trunk/gsdl/bin/script/import.pl

    r8603 r8796
    54    54            'desc' => "{import.OIDtype.hash}" },
    55    55          { 'name' => "incremental",
    56                  'desc' => "{import.OIDtype.incremental}" } ];
          56            'desc' => "{import.OIDtype.incremental}" },
          57          { 'name' => "assigned",
          58            'desc' => "{import.OIDtype.assigned}" },
          59          { 'name' => "dirname",
          60            'desc' => "{import.OIDtype.dirname}" } ];
    57    61
    58    62    #** define to use the original GA format or METS format

    207   211                'gzip', \$gzip,
    208   212                'groupsize/\d+/1', \$groupsize,
    209                      'OIDtype/^(hash|incremental)$/', \$OIDtype,
          213                'OIDtype/^(hash|incremental|assigned|dirname)$/', \$OIDtype,
    210   214                'sortmeta/.*/', \$sortmeta,
    211   215                'debug', \$debug,

    334   338           }
    335   339       }
    336             if ($OIDtype !~ /^(hash|incremental)$/) {
    337                 if (defined $collectcfg->{'OIDtype'} && $collectcfg->{'OIDtype'} =~ /^(hash|incremental)$/) {
          340       if ($OIDtype !~ /^(hash|incremental|assigned|dirname)$/) {
          341           if (defined $collectcfg->{'OIDtype'} && $collectcfg->{'OIDtype'} =~ /^(hash|incremental|assigned|dirname)$/) {
    338   342           $OIDtype = $collectcfg->{'OIDtype'};
    339   343           } else {
  • trunk/gsdl/perllib/strings.rb

    r8789 r8796
    181   181   import.OIDtype:The method to use when generating unique identifiers for each document.
    182   182   import.OIDtype.hash:Hashes the contents of the file. Document identifier will be the same every time the collection is imported.
          183
    183   184   import.OIDtype.incremental:A simple document count that is significantly faster than "hash". It is not guaranteed to always assign the same identifier to a given document though and does not allow further documents to be added to existing xml archives.
          185
          186   import.OIDtype.assigned:Uses 'D' plus the value of dc.Identifier as the document identifier. dc.Identifiers should be unique. If no dc.Identifier is assigned to the document, a hash id will be used instead.
          187
          188   import.OIDtype.dirname:Uses 'J' plus the parent directory name as the identifier. This relies on there being only one document per directory, and all directory names being unique. E.g. import/b13as/h15ef/page.html will get an identifier of Jh15ef.
    184   189
    185   190   import.saveas:This is to decide the archives format to be generated. The default setting is to GA.
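
The behaviour described by the new strings and by the option check in import.pl can be sketched as follows. This is an illustration only, not code from the changeset: the helper names resolve_oidtype and make_oid are hypothetical, the final fallback to "hash" is an assumption (the diff is cut off at the "} else {" branch), and the MD5 digest with a 'HASH' prefix is a placeholder for whatever hashing the real importer uses. The 'incremental' type is not modelled.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(basename dirname);
use Digest::MD5 qw(md5_hex);

# Same pattern the changeset introduces in import.pl.
my $VALID_OIDTYPE = qr/^(hash|incremental|assigned|dirname)$/;

# Hypothetical helper: prefer a valid command-line value, then a valid
# collect.cfg value. The last-resort "hash" is an assumption; the diff
# does not show what the else branch does.
sub resolve_oidtype {
    my ($cmdline, $cfg) = @_;
    return $cmdline if defined $cmdline && $cmdline =~ $VALID_OIDTYPE;
    return $cfg     if defined $cfg     && $cfg     =~ $VALID_OIDTYPE;
    return 'hash';
}

# Hypothetical helper: derive an identifier for a single document.
sub make_oid {
    my ($oidtype, $path, $dc_identifier, $content) = @_;

    # 'dirname': 'J' plus the name of the directory holding the file.
    if ($oidtype eq 'dirname') {
        return 'J' . basename(dirname($path));
    }

    # 'assigned': 'D' plus dc.Identifier, when the document has one.
    if ($oidtype eq 'assigned'
        && defined $dc_identifier && length $dc_identifier) {
        return 'D' . $dc_identifier;
    }

    # 'hash', and 'assigned' without a dc.Identifier: hash the content.
    # Placeholder digest and prefix, not the importer's actual scheme.
    if ($oidtype eq 'hash' || $oidtype eq 'assigned') {
        return 'HASH' . md5_hex(defined $content ? $content : '');
    }

    die "OIDtype '$oidtype' is not covered by this sketch\n";
}

print make_oid('dirname', 'import/b13as/h15ef/page.html'), "\n";          # prints Jh15ef
print make_oid('assigned', 'import/a/doc.html', 'report-42', ''), "\n";   # prints Dreport-42
print resolve_oidtype('', 'dirname'), "\n";                               # prints dirname
```

As the strings.rb descriptions note, 'dirname' is only safe when every directory holds one document and directory names are unique, and 'assigned' relies on dc.Identifier values being unique; the sketch does not check either condition.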