source: collections/documented-examples/trunk/pagedimg-e/etc/collect.cfg@ 19196

Last change on this file since 19196 was 19196, checked in by kjdon, 15 years ago

added OIDtype and OIDmetadata options to PagedImagePlugin to get persistent OIDs across platforms.

  • Property svn:executable set to *
File size: 6.7 KB
creator [email protected]
maintainer [email protected]
public true

indexes section:text
defaultindex section:text

plugin GreenstoneXMLPlugin
# We want the two types of paged documents (paged and hierarchical) to be
# treated differently, so include two PagedImagePlugin plugins and restrict
# the first with -process_exp to item files under the xml folder.
plugin PagedImagePlugin -create_screenview true -minimumsize 100 -documenttype hierarchy -process_exp xml.*\.item$ -OIDtype assigned -OIDmetadata ItemOID
plugin PagedImagePlugin -create_screenview true -minimumsize 100 -documenttype paged -OIDtype assigned -OIDmetadata ItemOID
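# Illustration of the routing above, using item files from this collection
# and assuming process_exp is applied as a regular expression to each file's
# path, with plugins tried in the order listed:
#   import/xml/23/23__2.item   matches xml.*\.item$   -> first plugin (hierarchy)
#   import/09/09_1_1.item      no match               -> second plugin (paged)
#   import/10/10_1_3.item      no match               -> second plugin (paged)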
plugin MetadataXMLPlugin
plugin ArchivesInfPlugin
plugin DirectoryPlugin

classify AZCompactList -metadata Series -sort Date
classify DateList

# Format statements to display Series, Volume, Number and Date information

format DocumentVList "<td valign=top>[link][icon][/link]</td>
<td valign=top>{If}{[Series],[Series] {If}{[Volume],Vol. [Volume]} {If}{[Number],No. [Number]},[highlight]{Or}{[Title],[PageNum]}[/highlight]}</td>"

format CL1VList "<td valign=top>[link][icon][/link]</td>
<td valign=top>{If}{[numleafdocs],[Title],{If}{[Volume],Vol. [Volume]} {If}{[Number],No. [Number]} ([format:Date])}</td>"

format SearchVList "<td valign=top>[link][icon][/link]</td>
<td valign=top>[parent(Top):Series] {If}{[parent(Top):Volume],Vol. [parent(Top):Volume]} {If}{[parent(Top):Number],No. [parent(Top):Number]} Page [Title]</td>"

format DateList "<td valign=top>[link][icon][/link]</td>
<td valign=top>[Series] {If}{[Volume],Vol. [Volume]} {If}{[Number],No. [Number]}</td>"

format HList "[link][highlight][ex.Title][/highlight][/link]"

# We customise the document display, so use the extended options
format AllowExtendedOptions true

# We want to add in fullsize/preview/text buttons to switch between the
# different versions of each page

format DocumentHeading "<center><table width=_pagewidth_>
<tr valign=top><td>{Or}{[parent(Top):Series],[Series]}</td></tr>
<tr valign=top><td><table><tr><td>
[DocumentButtonDetach][DocumentButtonHighlight]
{If}{_cgiargp_ eq 'fullsize',{If}{[screenicon],_document:viewpreview_}
{If}{[NoText] eq \'1\',,_document:viewtext_},
{If}{_cgiargp_ eq 'preview',{If}{[srcicon],_document:viewfullsize_}
{If}{[NoText] eq \'1\',,_document:viewtext_},
{If}{[srcicon],_document:viewfullsize_}
{If}{[screenicon],_document:viewpreview_}}}
</td></tr></table></td>
<td>[DocTOC]</td></tr></table></center>"
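
# Summary of the buttons DocumentHeading emits for each value of the p
# argument, as read from the nested {If} statements above:
#   p=fullsize   preview button (if [screenicon] is set), text button
#   p=preview    fullsize button (if [srcicon] is set), text button
#   otherwise    fullsize button (if [srcicon] is set), preview button
#                (if [screenicon] is set)
# The text button is left out when [NoText] equals 1.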

# Document text display changes based on the p argument - this is not
# normally used for document display, so we can use it here to switch between
# fullsize/preview/text versions.
format DocumentText "<center><table width=_pagewidth_><tr><td>
{If}{_cgiargp_ eq \'fullsize\',[srcicon],
{If}{_cgiargp_ eq \'preview\',[screenicon],{If}{[NoText] eq \'1\',,[Text]}}}
</td></tr></table></center>"
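
# Summary of what DocumentText shows for each value of the p argument, as
# read from the nested {If} statements above:
#   p=fullsize   [srcicon]      (the full-size image)
#   p=preview    [screenicon]   (the preview image)
#   otherwise    [Text]         (nothing when [NoText] equals 1)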


# -- English strings --------------------
collectionmeta collectionname [l=en] "Paged Image example"
collectionmeta .section:text [l=en] "newspaper pages"

# -- English text -----------------------

collectionmeta collectionextra [l=en] "This collection contains a few newspapers from the
<a href='http://www.nzdl.org/cgi-bin/library?a=p&amp;p=about&amp;c=niupepa'>
Niupepa</a> collection of Maori newspapers.

<h3>How the collection works</h3>
<p>Each newspaper issue consists of a set of images, one per page, and a set
of text files for the OCR'd text. An item file links the set of pages into a
single newspaper document. PagedImagePlugin is used to process the item files.
<p>There are two styles of item files, and this collection demonstrates both.
The first uses a text-based format, and consists of a list of metadata for the
document, and a list of pages. Here are some examples:
<a href='_httpcollection_/import/09/09\_1\_1.item'>Te Waka o Te Iwi, Vol. 1, No. 1</a>,
<a href='_httpcollection_/import/10/10\_1\_3.item'>Te Whetu o Te Tau, Vol. 1, No. 3</a>.
This format allows specification of document level metadata, and a single list of pages.
<p>The second style is an extended format, and uses XML. It allows a hierarchy
of pages, and metadata specification at the page level as well as at the
document level. An example is <a href='_httpcollection_/import/xml/23/23\_\_2.item'>Matariki 1881, No. 2</a>.
This newspaper also has an abstract associated with it. The contents have been
grouped into two sections: Supplementary Material, which contains the Abstract,
and Newspaper Pages, which contains the page images.
<p>Paged documents can be presented with a hierarchical table of contents
(e.g. <a href='?a=d&amp;c=_cgiargc_&amp;d=23\_\_1.2.1&p=text'>this one</a>),
or with next and previous page arrows, and a goto page box
(e.g. <a href='?a=d&amp;c=_cgiargc_&amp;d=10\_1\_2&p=preview'>this one</a>).
This is specified by the <tt>-documenttype (hierarchy|paged)</tt> option to PagedImagePlugin.
The next and previous arrows suit the linear sequence documents, while the table of contents
suits the hierarchically organised document. Ordinarily, a Greenstone collection
would have one plugin per document type, and all documents of that type get
the same processing. In this case, we want to treat the XML-based item files
differently from the text-based item files. We can achieve this by adding two
PagedImagePlugin plugins to the collection, and configuring them differently.
<p><tt>plugin PagedImagePlugin -documenttype hierarchy -process_exp xml.*\.item$ <br/>
plugin PagedImagePlugin -documenttype paged </tt>

<p>XML-based newspapers have been grouped into a folder called <tt>xml</tt>.
This enables us to process these files differently, by utilising the
<tt>process_exp</tt> option which all plugins support. The first PagedImagePlugin
in the list looks for item files underneath the xml folder. These documents
will be processed as hierarchical documents. Item files that don't match the
process expression (i.e. aren't underneath the xml folder) will be passed on to
the second PagedImagePlugin, and these are treated as paged documents.

<p><b>Formatting</b>
<p>We have modified the document formatting to display full-size images,
preview images or text, with buttons to switch between them. This involves
modifications to the DocumentHeading and DocumentText format statements in the
<a href='_httpcollection_/etc/collect.cfg' target=\'collect.cfg\'>collection configuration file</a>,
and some macro definitions in the <a href='_httpcollection_/macros/extra.dm' target=\'extra.dm\'>extra.dm macro file</a>.
The extra.dm macro file provides definitions for the buttons (\_viewfullsize\_,
\_viewpreview\_, \_viewtext\_) which are used by the format statement in the
collect.cfg file. The format statement switches the document display and sets
the buttons to be displayed based on the p argument, which is also set by the
format statement.
"
