source: collections/documented-examples/trunk/pagedimg-e/etc/collect.cfg@ 19359

Last change on this file since 19359 was 19359, checked in by anna, 15 years ago
Spanish translations of the documented example collections. Many thanks to Diego Spano.
Property svn:executable set to *
File size: 11.0 KB

Line
1	creator [email protected]
2	maintainer [email protected]
3	public true
4
5	indexes section:text
6	defaultindex section:text
7
8	plugin GreenstoneXMLPlugin
9	# We want the two types of paged documents to be treated differently: paged
10	# and hierarchical. So include two PagedImagePlugin plugins and modify the
11	# process_exp.
12	plugin PagedImagePlugin -create_screenview true -minimumsize 100 -documenttype hierarchy -process_exp xml.*\.item$ -OIDtype assigned -OIDmetadata ItemOID
13	plugin PagedImagePlugin -create_screenview true -minimumsize 100 -documenttype paged -OIDtype assigned -OIDmetadata ItemOID
14	plugin MetadataXMLPlugin
15	plugin ArchivesInfPlugin
16	plugin DirectoryPlugin
17
18	classify AZCompactList -metadata Series -sort Date
19	classify DateList
20
21	# Format statements to display Series, Volume, Number and Date information
22
23	format DocumentVList "<td valign=top>[link][icon][/link]</td>
24	<td valign=top>{If}{[Series],[Series] {If}{[Volume],Vol. [Volume]} {If}{[Number],No. [Number]},[highlight]{Or}{[Title],[PageNum]}[/highlight]}</td>"
25
26	format CL1VList "<td valign=top>[link][icon][/link]</td>
27	<td valign=top>{If}{[numleafdocs],[Title],{If}{[Volume],Vol. [Volume]} {If}{[Number],No. [Number]} ([format:Date])}</td>"
28
29	format SearchVList "<td valign=top>[link][icon][/link]</td>
30	<td valign=top>[parent(Top):Series] {If}{[parent(Top):Volume],Vol. [parent(Top):Volume]} {If}{[parent(Top):Number],No. [parent(Top):Number]} Page [Title]</td>"
31
32	format DateList "<td valign=top>[link][icon][/link]</td>
33	<td valign=top>[Series] {If}{[Volume],Vol. [Volume]} {If}{[Number],No. [Number]}</td>"
34
35	format HList "[link][highlight][ex.Title][/highlight][/link]"
36
37	# We customise the document display, so use the extended options
38	format AllowExtendedOptions true
39
40	# We want to add in fullsize/preview/text buttons to switch between the
41	# different versions of each page
42
43	format DocumentHeading "<center><table width=_pagewidth_>
44	<tr valign=top><td>{Or}{[parent(Top):Series],[Series]}</td></tr>
45	<tr valign=top><td><table><tr><td>
46	[DocumentButtonDetach][DocumentButtonHighlight]
47	{If}{_cgiargp_ eq 'fullsize',{If}{[screenicon],_document:viewpreview_}
48	{If}{[NoText] eq \'1\',,_document:viewtext_},
49	{If}{_cgiargp_ eq 'preview',{If}{[srcicon],_document:viewfullsize_}
50	{If}{[NoText] eq \'1\',,_document:viewtext_},
51	{If}{[srcicon],_document:viewfullsize_}
52	{If}{[screenicon],_document:viewpreview_}}}
53	</td></tr></table></td>
54	<td>[DocTOC]</td></tr></table></center>"
55
56	# Document text display changes based on the p argument. This argument is not
57	# normally used for document display, so we can use it here to switch between
58	# fullsize/preview/text versions.
59	format DocumentText "<center><table width=_pagewidth_><tr><td>
60	{If}{_cgiargp_ eq \'fullsize\',[srcicon],
61	{If}{_cgiargp_ eq \'preview\',[screenicon],{If}{[NoText] eq \'1\',,[Text]}}}
62	</td></tr></table></center>"
63
64
65	# -- English strings --------------------
66	collectionmeta collectionname [l=en] "Paged Image example"
67	collectionmeta .section:text [l=en] "newspaper pages"
68
69	# -- Spanish strings --------------------
70	collectionmeta collectionname [l=es] "Ejemplo de imágenes paginadas"
71	collectionmeta .section:text [l=es] "páginas de diario"
72
73	# -- English text -----------------------
74
75	collectionmeta collectionextra [l=en] "This collection contains a few newspapers from the
76	<a href='http://www.nzdl.org/cgi-bin/library?a=p&p=about&c=niupepa'>
77	Niupepa</a> collection of Maori newspapers.
78
79	<h3>How the collection works</h3>
80	<p>Each newspaper issue consists of a set of images, one per page, and a set
81	of text files for the OCR'd text. An item file links the set of pages into a
82	single newspaper document. PagedImagePlugin is used to process the item files.
83	<p>There are two styles of item files, and this collection demonstrates both.
84	The first uses a text-based format, and consists of a list of metadata for the
85	document, and a list of pages. Here are some examples:
86	<a href='_httpcollection_/import/09/09\_1\_1.item'>Te Waka o Te Iwi, Vol. 1, No. 1</a>,
87	<a href='_httpcollection_/import/10/10\_1\_3.item'>Te Whetu o Te Tau, Vol. 1, No. 3</a>.
88	This format allows specification of document level metadata, and a single list of pages.
89	<p>The second style is an extended format, and uses XML. It allows a hierarchy
90	of pages, and metadata specification at the page level as well as at the
91	document level. An example is <a href='_httpcollection_/import/xml/23/23\_\_2.item'>Matariki 1881, No. 2</a>.
92	This newspaper also has an abstract associated with it. The contents have been
93	grouped into two sections: Supplementary Material, which contains the Abstract,
94	and Newspaper Pages, which contains the page images.
95	<p>Paged documents can be presented with a hierarchical table of contents
96	(e.g. <a href='?a=d&c=_cgiargc_&d=23\_\_1.2.1&p=text'>this one</a>),
97	or with next and previous page arrows, and a goto page box
98	(e.g. <a href='?a=d&c=_cgiargc_&d=10\_1\_2&p=preview'>this one</a>).
99	This is specified by the <tt>-documenttype (hierarchy\|paged)</tt> option to PagedImagePlugin.
100	The next and previous arrows suit the linear sequence documents, while the table of contents
101	suits the hierarchically organised document. Ordinarily, a Greenstone collection
102	would have one plugin per document type, and all documents of that type get
103	the same processing. In this case, we want to treat the XML-based item files
104	differently from the text-based item files. We can achieve this by adding two
105	PagedImagePlugin plugins to the collection, and configuring them differently.
106	<p><tt>plugin PagedImagePlugin -documenttype hierarchy -process_exp xml.*\.item$ <br/>
107	plugin PagedImagePlugin -documenttype paged </tt>
108
109	<p>XML-based newspapers have been grouped into a folder called <tt>xml</tt>.
110	This enables us to process these files differently, by utilising the
111	<tt>process_exp</tt> option which all plugins support. The first PagedImagePlugin
112	in the list looks for item files underneath the xml folder. These documents
113	will be processed as hierarchical documents. Item files that don't match the
114	process expression (i.e. aren't underneath the xml folder) will be passed on to
115	the second PagedImagePlugin, and these are treated as paged documents.
116
117	<p><b>Formatting</b>
118	<p>We have modified the document formatting to display fullsized images,
119	preview images or text, with buttons to switch between them. This involves
120	modifications to the DocumentHeading and DocumentText format statements in the
121	<a href='_httpcollection_/etc/collect.cfg' target=\'collect.cfg\'>collection configuration file</a>,
122	and some macro definitions in the <a href='_httpcollection_/macros/extra.dm' target=\'extra.dm\'>extra.dm macro file</a>.
123	The extra.dm macro file provides definitions for the buttons (\_viewfullsize\_,
124	\_viewpreview\_, \_viewtext\_) which are used by the format statement in the
125	collect.cfg file. The format statement switches the document display and sets
126	the buttons to be displayed based on the p argument, which is also set by the
127	format statement.
128	"
129
130	# -- Spanish text -----------------------
131	collectionmeta collectionextra [l=es] "Esta colección contiene algunos diarios de la colección
132	<a href='http://www.nzdl.org/cgi-bin/library?a=p&p=about&c=niupepa'>
133	Niupepa</a> de periódicos maoríes.
134
135	<h3>Cómo funciona la colección</h3>
136	<p>Cada diario consiste en un conjunto de imágenes, una por página, y un conjunto de archivos de texto provenientes del OCR. Un archivo .item relaciona al conjunto de páginas en un único documento de diario. PagedImagePlugin se utiliza para procesar esos archivos .item.
137	<p>Hay dos estilos para escribir esos archivos .item, y esta colección demuestra ambos.
138	El primero usa un formato básico de texto, y consiste en una lista de metadatos para el documento, y una lista de páginas. Aquí hay algunos ejemplos:
139	<a href='_httpcollection_/import/09/09\_1\_1.item'>Te Waka o Te Iwi, Vol. 1, No. 1</a>,
140	<a href='_httpcollection_/import/10/10\_1\_3.item'>Te Whetu o Te Tau, Vol. 1, No. 3</a>.
141	Este formato permite la especificación de metadatos a nivel de documento, y una lista simple de páginas.
142	<p>El segundo estilo es un formato extendido y usa XML. Permite una jerarquía de páginas, y una especificación de metadatos a nivel de documento como también de páginas. Un ejemplo es <a href='_httpcollection_/import/xml/23/23\_\_2.item'>Matariki 1881, No. 2</a>.
143	Este diario también tiene un resumen asociado a él. Los contenidos han sido agrupados en 2 secciones: Material Suplementario, la cual contiene el resumen, y Páginas del Diario, que contiene las imágenes de las páginas.
144	<p>Los documentos paginados pueden presentarse con una tabla de contenidos jerárquica
145	(por ej. <a href='?a=d&c=_cgiargc_&d=23\_\_1.2.1&p=text'>esta</a>),
146	o con flechas \'Siguiente\' y \'Anterior\' y un recuadro \'Ir a la página...\'
147	(por ej. <a href='?a=d&c=_cgiargc_&d=10\_1\_2&p=preview'>esta</a>).
148	Esto es definido por la opción <tt>-documenttype (hierarchy\|paged)</tt> asignada al plugin PagedImagePlugin.
149	Las flechas Siguiente y Anterior permiten seguir el documento de manera lineal, mientras que la tabla de contenidos muestra al documento organizado jerárquicamente. Generalmente, una colección de Greenstone tendría un plugin por cada tipo de documento y todos los documentos de ese mismo tipo tendrían el mismo procesamiento. En este caso, queremos tratar los archivos .item con formato XML de manera diferente a aquellos con formato de texto plano. Esto puede lograrse agregando dos plugins PagedImagePlugin a la colección, y configurándolos de manera diferente.
150	<p><tt>plugin PagedImagePlugin -documenttype hierarchy -process_exp xml.*\.item$ <br/>
151	plugin PagedImagePlugin -documenttype paged </tt>
152
153	<p>Los diarios basados en XML han sido agrupados en una carpeta llamada <tt>xml</tt>.
154	Esto nos permite procesar esos archivos de una manera diferente, utilizando la opción <tt>process_exp</tt> que es soportada por todos los plugins. El primer plugin PagedImagePlugin
155	en la lista busca archivos .item que se encuentren en la carpeta xml. Estos documentos se procesarán como documentos jerárquicos. Los archivos .item que no coincidan con la expresión de procesamiento (es decir, los que no estén dentro de la carpeta xml) serán pasados al segundo plugin PagedImagePlugin, y se tratarán como documentos paginados.
156
157	<p><b>Formateo</b>
158	<p>Hemos modificado el formateo del documento para mostrar imágenes a tamaño completo, imágenes de vista previa o texto, con botones para cambiar entre estas opciones. Esto involucra modificaciones a las cadenas de formateo del DocumentHeading y el DocumentText en el
159	<a href='_httpcollection_/etc/collect.cfg' target=\'collect.cfg\'>archivo de configuración de la colección</a>,
160	y algunas definiciones de macros en el <a href='_httpcollection_/macros/extra.dm' target=\'extra.dm\'>archivo de macros extra.dm</a>.
161	El archivo extra.dm provee definiciones para los botones (\_viewfullsize\_,
162	\_viewpreview\_, \_viewtext\_) los cuales son usados por la sentencia de formateo en el archivo collect.cfg. La sentencia de formateo cambia la visualización del documento y setea los botones que deben mostrarse basándose en el argumento p, el cual es configurado también en la misma sentencia.
163	"
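
The collection description above explains that item files under the xml folder are claimed by the first PagedImagePlugin, and that everything else is passed on to the second. The following Python sketch models that first-match routing. It is an illustration only, not Greenstone's own code: the `route` function and the `PLUGINS` list are hypothetical names, and the second pattern (`\.item$`) is an assumed stand-in for whatever default the plugin uses when no -process_exp is given.

```python
import re

# Hypothetical model of plugin ordering: each entry is (label, process_exp).
# The first pattern is copied from the collect.cfg listing; the second is an
# ASSUMED default for a PagedImagePlugin that has no -process_exp option.
PLUGINS = [
    ("PagedImagePlugin (hierarchy)", re.compile(r"xml.*\.item$")),
    ("PagedImagePlugin (paged)", re.compile(r"\.item$")),
]


def route(path):
    """Return the label of the first plugin whose pattern matches, else None."""
    for label, pattern in PLUGINS:
        if pattern.search(path):
            return label
    return None


if __name__ == "__main__":
    # Two of the paths come from the collection description; the .txt path
    # is a made-up example of a file neither plugin would claim.
    for path in [
        "import/xml/23/23__2.item",
        "import/09/09_1_1.item",
        "import/09/notes.txt",
    ]:
        print(path, "->", route(path))
```

Note that the expression `xml.*\.item$` matches any path containing the letters "xml" somewhere before the `.item` suffix, not only files inside a folder named xml, so under this sketch a text-based item file with "xml" in its name would also be routed to the hierarchy plugin.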

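
The DocumentText format statement in the listing chooses between the fullsize image, the preview image and the OCR text according to the p argument. The nested {If} tests can be paraphrased in Python as below. This is a reading aid under stated assumptions, not how Greenstone evaluates format statements: `document_text` is a hypothetical function, the metadata is modelled as a plain dictionary, and the image tags in the example are invented.

```python
def document_text(p, meta):
    """Pick what to show for one page, mirroring the nested {If} tests."""
    if p == "fullsize":
        return meta.get("srcicon", "")
    if p == "preview":
        return meta.get("screenicon", "")
    # Any other value of p falls through to the OCR text,
    # unless the NoText metadata is set to '1'.
    if meta.get("NoText") == "1":
        return ""
    return meta.get("Text", "")


if __name__ == "__main__":
    # Invented metadata values, for illustration only.
    page = {
        "srcicon": "<img src='full.png'>",
        "screenicon": "<img src='screen.png'>",
        "Text": "OCR text",
    }
    for mode in ["fullsize", "preview", "text"]:
        print(mode, "->", document_text(mode, page))
```

In this reading, p values other than 'fullsize' and 'preview' all behave like the text view, which is why the links in the collection description can use p=text without the format statement testing for that value explicitly.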