source: trunk/gsdl3/TODO_michael@ 3742

Last change on this file since 3742 was 3742, checked in by mdewsnip, 21 years ago

Done MG wrapper (JNI).

  • Property svn:keywords set to Author Date Id Revision
File size: 11.0 KB
MICHAEL'S LIST :-)

/home/kjdon/public_html/michael (www.cs.waikato.ac.nz/~kjdon/michael) has some helpful docs - actually only one - the JNI book.
Also see hints.tex and manual.tex in docs/manual - printed out too.

* GS2MGPPSearch - currently, for field searching, it uses a hard-coded list of fields - this should be read in from the config file and be collection dependent. See createParameter, for the FIELD_FIELD_PARAM.
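
A minimal sketch of reading the field list from a config file instead of hard coding it. The <fieldList>/<field> element names and the shortname attribute are made up for illustration; the real config format still has to be decided.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FieldListReader {

    /** Returns the shortname of every <field> element found in the config xml. */
    static List<String> readFields(String configXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(configXml)));
            // "field" and "shortname" are hypothetical names, not the gs3 format
            NodeList fields = doc.getElementsByTagName("field");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < fields.getLength(); i++) {
                names.add(((Element) fields.item(i)).getAttribute("shortname"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("could not read field list", e);
        }
    }

    public static void main(String[] args) {
        String config = "<buildConfig><fieldList>"
            + "<field shortname='TX'/><field shortname='TI'/>"
            + "</fieldList></buildConfig>";
        System.out.println(readFields(config));
    }
}
```

createParameter could then build the field parameter options from the returned list, so each collection gets its own set.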

* Images from the server. Currently we expect that they are on the same machine - the xslt puts in links to them. Need a better way to get images that works remotely. For now, do this for the collection icons. Once document display is worked out better, do it for document images too.
Then there is the generic associated resource retrieval. This will depend on the document DTD and what xlinks we use for associated docs/resources. Maybe the id of a resource is its http address if there is one, otherwise you get it through soap or in the xml.
There are two routines in GSFile - base64EncodeFromFile() and base64DecodeToFile() - which go file <-> base64 string. Can use these to put images etc into the xml.
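
A rough sketch of the file <-> base64 round trip, for anyone who wants to try it outside gs3. These are stand-ins written against java.util.Base64 (Java 8 and later), not the GSFile routines themselves, and the class and method names here are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Base64;

class Base64FileSketch {

    /** Reads the whole file and returns its bytes as a base64 string. */
    static String encodeFromFile(Path file) {
        try {
            return Base64.getEncoder().encodeToString(Files.readAllBytes(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Decodes the base64 string and writes the bytes to the given file. */
    static void decodeToFile(String base64, Path file) {
        try {
            Files.write(file, Base64.getDecoder().decode(base64));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path in = Files.createTempFile("icon", ".bin");
        Path out = Files.createTempFile("icon-copy", ".bin");
        Files.write(in, new byte[] {1, 2, 3});
        String text = encodeFromFile(in); // this string can go in an xml text node
        decodeToFile(text, out);
        System.out.println(Arrays.equals(Files.readAllBytes(in), Files.readAllBytes(out)));
    }
}
```

The encoded string only uses characters that are safe inside an xml text node, which is why it suits sending icons in a response.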

* Document display for document action. At the moment, a query gives a document id, which will be a Section id. A call is made to DocumentRetrieve with that id, and the text is returned. It would be nice if the document display looked more like old greenstone - with a table of contents for hierarchical docs, or prev and next buttons for paged documents. Some of this can be done using xslt. But you will need to get more information from the service, such as the document type, the table of contents hierarchy etc.
Look at gs2 to see what info is needed, then work out a message format to retrieve that - perhaps parameters to the document retrieve service could specify whether you want the table of contents etc. Or is that a new service with a separate request?

* Actual text output. The document text inside mgpp is html. This is added into an xml document as a text node. When you output the contents of the node, any < or > symbols are escaped by default - this means that all the html tags are escaped and appear on the web page - nice :-). You can turn this off using the disable-output-escaping attribute, eg

to get the value of a node normally:
<xsl:value-of select="content/document"/>

to disable the escaping:
<xsl:value-of disable-output-escaping="yes" select="content/document"/>

This puts the processing instructions

<?javax.xml.transform.disable-output-escaping ?>

<?javax.xml.transform.enable-output-escaping ?>

around the text.

If you transform some xml with xslt, and output it to a String directly (via the transformation process), and these instructions are present, the html inside the text node will come out properly, unescaped.

However, what I have done in the action code is transform the page xml to xhtml in DOM tree form - this is then output to a String later by the XMLConverter object. This does not take any notice of the disable-blah instructions, and the html is escaped. Internally it uses a TextWriter object from the xindice package. You need to either edit this class to take note of the instructions, or find something else to use instead.
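
A small sketch of the "output to a String directly" case, assuming the XSLT processor that ships with the JDK. The page xml and the stylesheet are cut-down examples, not the real gs3 ones, and the sketch does not reproduce the DOM + XMLConverter route where the escaping problem shows up.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class OutputEscapingSketch {

    // The html document text sits inside the xml as an (escaped) text node.
    static final String PAGE =
        "<content><document>&lt;b&gt;bold&lt;/b&gt; text</document></content>";

    /** Builds a tiny stylesheet; the flag toggles disable-output-escaping. */
    static String stylesheet(boolean disableEscaping) {
        String doe = disableEscaping ? " disable-output-escaping=\"yes\"" : "";
        return "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
            + "<xsl:template match=\"/\"><div>"
            + "<xsl:value-of" + doe + " select=\"content/document\"/>"
            + "</div></xsl:template></xsl:stylesheet>";
    }

    /** Transforms PAGE straight to a String, so the serializer sees the hint. */
    static String transformToString(boolean disableEscaping) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer(
                new StreamSource(new StringReader(stylesheet(disableEscaping))));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(PAGE)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transformToString(false)); // tags come out escaped
        System.out.println(transformToString(true));  // tags come out as markup
    }
}
```

One option for the action code may be to keep the DOM tree but do the final serialisation with an identity Transformer into a StreamResult rather than with TextWriter; whether that honours the processing instructions in our setup would need testing.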

* Add the Gatherer code to cvs.
I've been meaning to do this for ages, but never got around to it. The code is in ...

* Dynamic xslt, ie old format statements.
In gs2 the collection builder can add format statements for search results, doc text and classifiers to the collect.cfg file. gs3 uses xslt for formatting pages.
Collection builders should be able to add some xslt to the config file, eg a single template for a document in a search result list, to be passed to the actions and incorporated into the stylesheet before processing the page.

What I have done so far:
In the browse action, the code checks for the existence of a <stylesheet> element in the response:

// look for an xslt fragment sent back with the response
Node new_style = GSXML.getChildByTagName(response, GSXML.STYLESHEET_ELEM);
if (new_style != null) {
    // add its templates to the page stylesheet, then drop it from the response
    GSXSLT.mergeStylesheets(style_doc, (Element) new_style);
    response.removeChild(new_style);
}

If it finds one, it adds it in to the stylesheet which will be used for transforming the page (mergeStylesheets), and then removes it from the response.

If a template is added with a higher priority, it will be used instead of the default one

eg
<xsl:template match="document" priority="3">

I have tried this out by hardcoding the service to return a static xslt along with a classifier response and it works fine.
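
For reference, a sketch of what a merge like this can look like with plain DOM calls. It is not the GSXSLT.mergeStylesheets code, just one way to do it: every element child of the fragment is copied onto the end of the main stylesheet. The class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class StylesheetMergeSketch {

    static final String XSL_NS = "http://www.w3.org/1999/XSL/Transform";

    /** Parses a string into a namespace-aware DOM document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            return f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Copies every element child of extra into the main stylesheet's root. */
    static void merge(Document mainStyle, Element extra) {
        Element root = mainStyle.getDocumentElement();
        for (Node n = extra.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                // importNode makes a copy owned by mainStyle; extra is untouched
                root.appendChild(mainStyle.importNode(n, true));
            }
        }
    }

    /** Counts the xsl:template elements in a stylesheet. */
    static int templateCount(Document style) {
        return style.getElementsByTagNameNS(XSL_NS, "template").getLength();
    }

    public static void main(String[] args) {
        Document style = parse("<xsl:stylesheet version='1.0' xmlns:xsl='" + XSL_NS + "'>"
            + "<xsl:template match='document'/></xsl:stylesheet>");
        Document extra = parse("<stylesheet xmlns:xsl='" + XSL_NS + "'>"
            + "<xsl:template match='document' priority='3'/></stylesheet>");
        merge(style, extra.getDocumentElement());
        System.out.println(templateCount(style));
    }
}
```

Because the default template stays in place, the added one only wins if it carries a higher priority, as in the example above.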

TODO:

For classifiers: edit the GS2Browse service agent to look in a config file for xslt fragments - these should probably go in the collectionConfig.xml, which isn't used yet, but for now they can go in buildConfig.xml.

Then it needs to pass the appropriate one back with a response - they are classifier specific.
Have a look at classifier.xsl (in interfaces/default/transforms) to see how the templates are used. The ones you'd want to return are document and maybe node, but node is really complicated.

You need to decide how to add it into the config file - element names etc and where to put it - inside the classifier element? In a <stylesheet> node or something different?

Once that is working for classifiers, you can do the same thing for query search results and document text. You will need to do the action side as well for those, but that should just be a matter of cutting and pasting code.

Another thing to think about: the browse action just looks for the stylesheet element in a normal response. But perhaps it should be a separate request sent to a service, ie asking "do you have some xslt which you would like me to use?"

If it just comes back in a response to a different request, eg a query, when does it come back? With the original query result? I guess you'd send a request to a query service, and along with the document list it could send back a stylesheet element if it wanted to.

It may be cleaner to have a separate request, I don't know.

* Display stuff vs metadata

When you get a service description, you also get a <display> element which has text strings for the service name, submit button, and any text needed for the parameters (see query_messages).

When you do a browse thingy, along with any classifier info you also get some metadata - Title (see browse_messages).

Where do you draw the line between a display element and a metadata element?

The full name of the service, which is currently in the display, could arguably be a metadata element.

We need a consistent view of what is metadata and what is just a display element.

All agents respond to a describe yourself message, which retrieves metadata as part of the description.

When the about page is produced for a collection or service cluster, a list of services is obtained, and currently their names are just displayed. We need their real names to be displayed. Should this be got through a describe yourself message to each service, picking the name out of the display? Or do a describe yourself just for metadata, and have the name returned as metadata instead?

Note, describe requests have another attribute called info (I think), so you can narrow down the request, eg
<request type='describe' info='metadataList'>

This works for the message router - is it implemented in all agents (collections, service clusters and services)? I think it should be.

And I think I prefer that the service name, eg "Query a collection" for TextQuery, should be a metadata element.

Anyway, have a think about it.

The main thing I would like out of this is for the service names, eg TextQuery, not to appear on the about pages, but to be replaced with the metadata Title, eg "Query a collection", which can then be changed depending on the user's language.

* Add document for building

The addDocument service (in GS2Construct) has not been implemented. It should take a file name and add the document to the import directory of the collection.
There are problems with just transmitting a file name - the service may live remotely and therefore the document is not there. You should probably send it attached to the html form - therefore need to work out:
how to get the document attached to the form (the form needs method="post" and enctype="multipart/form-data", otherwise the browser just sends the filename),
and then where to get it in the servlet - is it a parameter? or something else?

and then it needs to be added into the xml request to be passed to the service.

If the program is running locally it's much simpler just to send a filename - can we somehow check for this?
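
One possible way to do the "is it local?" check, as a sketch only. It assumes the action can find out the host name or address the service lives on, which is not something the current code necessarily exposes; the class and method names are hypothetical.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;

class LocalServiceCheck {

    /** True if host is a loopback address or an address bound on this machine. */
    static boolean isLocalHost(String host) {
        try {
            InetAddress addr = InetAddress.getByName(host);
            if (addr.isAnyLocalAddress() || addr.isLoopbackAddress()) {
                return true;
            }
            // non-null means one of our own network interfaces has this address
            return NetworkInterface.getByInetAddress(addr) != null;
        } catch (UnknownHostException | SocketException e) {
            // cannot tell, so treat it as remote and send the file contents
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isLocalHost("127.0.0.1"));
    }
}
```

Falling back to "remote" when the check fails is the safe choice here: sending the document contents always works, sending just a filename only works locally.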

* Also to do with building, and a little harder, is the ConfigureCollection service. There is no stub for it yet, but it is easy enough to add one - need to add this service to the service description xml stuff, and write a processConfigureCollection() method. It would be easy enough to display the config file in a big text box, and have the user edit it like the collector does.

The hard bit is that when you click ConfigureCollection, you don't know what collection you are going to be dealing with - for all the building services, you select the collection on the service page. With the configure stuff, you need to select the collection, and then the config file needs to be retrieved. So it's really a two step process to configure the coll - first select the coll, submit that, then edit the config file, and submit that.

All the services currently are one step - need to think about how this type of service fits into the model.

Maybe it needs a hidden arg, to tell the service if you're at stage 1 or 2?
When the action does the request, it then asks for the service description again to redisplay it for the user. Maybe if the service knows that it has done the first half, it sends the second type of description?
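
A sketch of the hidden arg idea, to make the two stages concrete. The parameter names ("stage", "coll") and the two description kinds are invented for the example; nothing like this exists in the service yet.

```java
import java.util.HashMap;
import java.util.Map;

class ConfigureCollectionSketch {

    static final String STAGE_PARAM = "stage";     // hypothetical hidden arg
    static final String COLLECTION_PARAM = "coll"; // hypothetical collection arg

    /** The two kinds of description the service could send back. */
    enum Description { SELECT_COLLECTION, EDIT_CONFIG }

    /**
     * Picks the description to redisplay after a request. If stage 1 has just
     * been submitted with a collection, the config editing form comes next;
     * in every other case the user starts again at the collection selection.
     */
    static Description descriptionFor(Map<String, String> params) {
        boolean stageOneDone = "1".equals(params.get(STAGE_PARAM))
            && params.get(COLLECTION_PARAM) != null;
        return stageOneDone ? Description.EDIT_CONFIG : Description.SELECT_COLLECTION;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        System.out.println(descriptionFor(params)); // nothing submitted yet
        params.put(STAGE_PARAM, "1");
        params.put(COLLECTION_PARAM, "demo");
        System.out.println(descriptionFor(params)); // stage 1 submitted
    }
}
```

The form for each stage would carry its own stage value as a hidden input, so the service never has to remember anything between requests.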

* Sequence of services

Some service clusters have services that you are supposed to carry out in sequence, such as building, but there may be others.

Can we do a generic action or xslt or something that sends the user to the next service once they've completed the first one?

Maybe the service cluster/serviceRack class specifies the sequence of services, and they are all handled individually like at present, except that some xslt puts a next button on each page with a link to the next service in the list.
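
The lookup behind such a next button could be as small as the sketch below. The service names in the list are placeholders; where the sequence is declared (service cluster or serviceRack class) is still open.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class ServiceSequenceSketch {

    /**
     * Returns the service that follows current in the sequence, or empty if
     * current is the last one or is not part of the sequence at all.
     */
    static Optional<String> nextService(List<String> sequence, String current) {
        int i = sequence.indexOf(current);
        if (i < 0 || i + 1 >= sequence.size()) {
            return Optional.empty();
        }
        return Optional.of(sequence.get(i + 1));
    }

    public static void main(String[] args) {
        // placeholder names for a building sequence
        List<String> building = Arrays.asList("StepOne", "StepTwo", "StepThree");
        System.out.println(nextService(building, "StepOne").orElse("(none)"));
        System.out.println(nextService(building, "StepThree").orElse("(none)"));
    }
}
```

The xslt would only draw the button when a next service comes back, so the last page in a sequence simply has none.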

* Tidy up the setup.bash, install.bash stuff. See David. There are some things that need to be done once on install, and if you do them again your system craps out. But if what needs to be done changes in cvs, the user can't update their system without great knowledge of what needs to be done. Other stuff, like make, needs to be done after every cvs update. There are two scripts, install and update, which attempt to solve part of this. setup.bash needs to be run multiple times and ends up adding stuff to the path more than once.

Anyway, all this install stuff could be much nicer. If you really want to make it nice, you could look at doing a proper configure, and using --prefix etc so that it's a proper package (see John also for help about this) - bin, lib etc would get put in their proper places, not necessarily into gsdl3/bin etc.

* Lucene - Svetlana has done a project comparing mg, mgpp and lucene. Investigate further whether lucene would be good for us to use. Incremental update?
Report on my desk.