Changeset 23547 for documentation

Timestamp:
11.01.2011 14:53:10
Author:
kjdon
Message:

Added a bit extra to removesuffix for titles so that [sound recording] is removed. Greenstone makes [ an entity, hence the use of &#091; rather than \\[ in the regex.

Files:
1 modified

  • documentation/trunk/tutorial_sample_files/beatles/advbeat_large/etc/collect.cfg

    --- collect.cfg (r22947)
    +++ collect.cfg (r23547)
    @@ -27,5 +27,6 @@
     plugin  DirectoryPlugin
     
    -classify    AZCompactList -mingroup 1 -metadata dc.Title,Title -minnesting 20 -firstvalueonly -removesuffix "(?i)(\\s+\\d+)|(\\s*[[:punct:]]\\s+.*)|(\\s*by the beatles\\s*)" -buttonname Title -removeprefix (?i)\\s*beatles\\s+\\-\\s+
    +# (\\s+&#091;.*) in removesuffix is to remove eg [sound recording] from the Title. Greenstone escapes [] as they are used to represent metadata format elements, hence the use of &#091; instead of \\[ in the regex.
    +classify    AZCompactList -mingroup 1 -metadata dc.Title,Title -minnesting 20 -firstvalueonly -removesuffix "(?i)(\\s+\\d+)|(\\s*[[:punct:]]\\s+.*)|(\\s+&#091;.*)|(\\s*by the beatles\\s*)" -buttonname Title -removeprefix (?i)\\s*beatles\\s+\\-\\s+
     classify    AZCompactList -metadata dc.Format -buttonname Browse -sort Title
     # classify  Phind
    @@ -65,5 +66,4 @@
     collectionmeta  .document:Source [l=en] "filenames"
     collectionmeta  collectionname [l=en] "Advanced Beatles -- large"
    -collectionmeta  collectionextra [l=en] "Demonstration collection illustrating the use of heterogeneous documents. Source document are about
    -The Beatles pop group in the following formats: HTML, TXT, JPEG, Word, PDF, MIDI, MP3, and MARC file formats."
    +collectionmeta  collectionextra [l=en] "Demonstration collection illustrating the use of heterogeneous documents. Source documents are about The Beatles pop group in the following formats: HTML, TXT, JPEG, Word, PDF, MIDI, MP3, and MARC file formats."
     collectionmeta  iconcollection [l=en] "_httpprefix_/collect/advbeat_large/images/beatlesmm.png"
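
The effect of the removesuffix alternative added in r23547 can be sketched outside Greenstone. The Python below is only an illustration under stated assumptions, not Greenstone's own (Perl) code: it assumes the classifier anchors the pattern at the end of the title, that the stored title carries square brackets in an escaped entity form (taken here to be `&#091;` and `&#093;`), and that ASCII punctuation is a fair stand-in for `[[:punct:]]`, which Python's `re` module does not support. The names `compile_suffix` and `strip_suffix` are hypothetical helpers.

```python
import re
import string

# Python's re module has no POSIX [[:punct:]] class, so ASCII punctuation
# stands in for it here (an approximation).
PUNCT = "[" + re.escape(string.punctuation) + "]"

# Alternatives from the r22947 -removesuffix value, with the config file's
# doubled backslashes reduced to single ones.
OLD_ALTERNATIVES = [
    r"\s+\d+",
    r"\s*" + PUNCT + r"\s+.*",
    r"\s*by the beatles\s*",
]

# r23547 inserts one more alternative before the last one.
# Assumption: "&#091;" is the escaped form of "[" in the stored title.
NEW_ALTERNATIVES = OLD_ALTERNATIVES[:2] + [r"\s+&#091;.*"] + OLD_ALTERNATIVES[2:]


def compile_suffix(alternatives):
    """Join the alternatives into one case-insensitive pattern.

    Assumption: the pattern is applied anchored at the end of the title.
    """
    return re.compile("(?:" + "|".join(alternatives) + ")$", re.IGNORECASE)


def strip_suffix(title, pattern):
    """Remove the first (leftmost) suffix match from the title."""
    return pattern.sub("", title, count=1)


OLD_RE = compile_suffix(OLD_ALTERNATIVES)
NEW_RE = compile_suffix(NEW_ALTERNATIVES)

if __name__ == "__main__":
    title = "Abbey Road &#091;sound recording&#093;"
    print(repr(strip_suffix(title, OLD_RE)))
    print(repr(strip_suffix(title, NEW_RE)))
```

Under these assumptions the old pattern leaves the sample title untouched, because no punctuation character in the escaped bracket is followed by whitespace, while the new pattern trims it to "Abbey Road". A literal `\[` in the regex would not match a title stored in escaped form, which is the reason the commit gives for using the entity.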