source: other-projects/the-macronizer/trunk/README.md

Last change on this file was 35791, checked in by cstephen, 2 years ago

Add updated macroniser code. This is a significant change to the codebase:

  • Servlets now send JSON responses that are easier to consume from other services.
  • Error responses are better conveyed and more informative.
  • Monogram components have been touched up. They now bubble errors up and, where applicable, implement relevant interfaces.
  • The JSP interface has been removed.
  • The SQL logging functionality has been deleted. It wasn't used before.
  • Dependencies updated.

# Macroniser API Setup

1. Edit the HTML content of the `unauthorised` page so that the button redirects to your preferred location.

    ```sh
    nano src/main/webapp/webContent/unauthorised.html
    ```

2. Compile and install the WAR file.

    ```sh
    ant install
    ```

3. Update the Apache2 config with the relevant `ProxyPass` rules.

    ```sh
    sudo nano /etc/apache2/sites-enabled/000-default-le-ssl.conf
    sudo nano /etc/apache2/sites-enabled/000-default.conf
    sudo /etc/init.d/apache2 reload

    # Add the following directives to the config files above:
    ProxyPass /gs3-macroniser http://localhost:8383/gs3-macroniser
    ProxyPassReverse /gs3-macroniser http://localhost:8383/gs3-macroniser
    ```

4. If `403 Forbidden` errors are observed, update the CORS filter in `web.xml` to include your root domains, then re-install.

    ```sh
    nano src/main/webapp/WEB-INF/web.xml

    <filter>
      <filter-name>CorsFilter</filter-name>
      <filter-class>org.apache.catalina.filters.CorsFilter</filter-class>
      <init-param>
        <param-name>cors.allowed.origins</param-name>
    --  <param-value>http://localhost:8080</param-value> <!-- Separate values by a comma -->
    ++  <param-value>http://localhost:8080,http://atea.space,https://atea.space</param-value>
      </init-param>
    </filter>
    ```

### Usage outside of Tomcat/Greenstone3

The Macroniser API was designed with Greenstone3 in mind. Hence, it expects an Apache Tomcat server, and the Ant installation task is specific to Greenstone3.
If you need to use this API in a different environment, there are two steps you'll need to take:

1. If you won't be hosting the servlet in Tomcat, update the class of the CORS filter in `web.xml`. An example for a Jetty server is provided.

2. Manually install the WAR file to your server. It can be generated with the Ant `package` task, which outputs it to `./build/gs3-macroniser.war`.

    ```sh
    ant package
    ```

If you'd like a quickstart in a non-Greenstone environment, the `dev-deploy.tar.gz` archive contains standalone bundles of Ant and Jetty that you can use.

# Consuming the API

## Direct Macronisation

- Endpoint: `/direct`
- Method: `POST`
- Request Content Type: `multipart/form-data`
- Response Content Type: `application/json`

The `direct` endpoint macronises and returns raw text.

### Expected Form Parts

Name | Type | Optional | Description
--|--|--|--
`fragment` | `string` | No | The fragment to macronise.
`preserveExistingMacrons` | `boolean` | Yes (default: `false`) | A value indicating whether or not existing macrons on the input text will be preserved.

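As a rough illustration of the form parts above, the following Python sketch encodes them as a `multipart/form-data` body using only the standard library. The helper names `encode_multipart` and `build_direct_form` are hypothetical, and sending the boolean as the text `true`/`false` is an assumption; check what your deployment actually accepts.

```python
import uuid


def encode_multipart(fields: dict) -> tuple:
    """Encode simple text fields as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")  # blank line separates part headers from the value
        lines.append(value)
    lines.append(f"--{boundary}--")  # closing boundary
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def build_direct_form(fragment: str, preserve_existing_macrons: bool = False) -> dict:
    """Build the form parts documented for the `direct` endpoint."""
    return {
        "fragment": fragment,
        # Assumption: the boolean is transmitted as the text "true" or "false".
        "preserveExistingMacrons": "true" if preserve_existing_macrons else "false",
    }
```

The resulting pair could then be passed to `urllib.request.Request(url, data=body, headers={"Content-Type": content_type}, method="POST")`, where `url` is wherever your deployment exposes `/direct`.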
### Response Fields

Name | Type | Optional | Description
--|--|--|--
`w` | `string` | Yes | A processed word.
`macronised` | `boolean` | Yes | A value indicating whether the associated `w` value was macronised.
`linebreaks` | `number` | Yes | Indicates a number of linebreaks encountered in the original text.

#### Example Response

```json
[
  {
    "w": "Whakaora"
  },
  {
    "w": "Tohutō",
    "macronised": true
  },
  {
    "w": "Māori",
    "macronised": true
  },
  {
    "linebreaks": 3
  }
]
```

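A consumer will usually want to turn this array back into text. The sketch below shows one possible way to do that in Python. The function names are hypothetical, and joining words with a single space is an assumption: the response format shown here does not say how the original whitespace or punctuation should be restored.

```python
import json

# The example response from above, parsed into Python objects.
sample = json.loads("""
[
  {"w": "Whakaora"},
  {"w": "Tohutō", "macronised": true},
  {"w": "Māori", "macronised": true},
  {"linebreaks": 3}
]
""")


def rebuild_text(entries: list) -> str:
    """Join `w` values and expand `linebreaks` entries into newlines."""
    parts = []
    for entry in entries:
        if "linebreaks" in entry:
            parts.append("\n" * entry["linebreaks"])
        elif "w" in entry:
            # Assumption: words on the same line are separated by one space.
            if parts and not parts[-1].endswith("\n"):
                parts.append(" ")
            parts.append(entry["w"])
    return "".join(parts)


def macronised_words(entries: list) -> list:
    """Return only the words that the service reports as macronised."""
    return [e["w"] for e in entries if e.get("macronised")]
```

Because `macronised` is optional, `macronised_words` treats a missing field the same as `false`.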
## File Macronisation

- Endpoint: `/file`
- Method: `POST`
- Request Content Type: `multipart/form-data`
- Response Content Type: `application/json`

The `file` endpoint macronises a single file. The resulting macronised file is stored temporarily and can be retrieved using the `download` endpoint.
The endpoint can be used multiple times to macronise more than one file.

Supported formats:

- Plain Text (.txt)
- Microsoft PowerPoint (.pptx)
- Microsoft Word (.docx)
- OpenDocument Text (.odt)

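A client may want to check a file against this list before uploading it. The following is a minimal client-side sketch; `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical names, and the case-insensitive comparison is an assumption rather than documented server behaviour.

```python
from pathlib import PurePath

# Extensions taken from the supported formats list above.
SUPPORTED_EXTENSIONS = {".txt", ".pptx", ".docx", ".odt"}


def is_supported(file_name: str) -> bool:
    """Return True if the file name ends in one of the listed extensions."""
    # Assumption: extensions are compared case-insensitively on the client side.
    return PurePath(file_name).suffix.lower() in SUPPORTED_EXTENSIONS
```

This only inspects the file name; the server remains the authority on whether a file can actually be processed.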
### Expected Form Parts

Name | Type | Optional | Description
--|--|--|--
`charsetEncoding` | `string` | Yes (default: `utf8`) | The character set used by the file.
`fileType` | `string` | Yes (default: extension of the submitted file) | The type of the submitted file (e.g. `.pptx`, `.txt`).
`preserveExistingMacrons` | `boolean` | Yes (default: `false`) | A value indicating whether or not existing macrons on the input text will be preserved.
`file` | File data | No | The file to macronise.

### Response Fields

Name | Type | Optional | Description
--|--|--|--
`fileName` | `string` | No | The name of the macronised file.
`filePath` | `string` | No | The path to submit to the `download` endpoint in order to retrieve the file.
`fileType` | `string` | No | The type of the macronised file (e.g. `.pptx`, `.txt`).

#### Example Response

```json
{
  "fileName": "macron-text.txt",
  "filePath": "mi-tmp-7501722420086381909.txt",
  "fileType": ".txt"
}
```

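The `filePath` value in this response is what the `download` endpoint, described in the next section, expects. As a hedged sketch, the Python helper below builds a download URL from such a response. The function name and the `example.org` host in the usage note are hypothetical, and passing the response's `fileType` straight through as the `fileType` query parameter is an assumption.

```python
from urllib.parse import urlencode


def build_download_url(base_url: str, file_response: dict) -> str:
    """Build a `download` URL from a parsed `file` endpoint response."""
    query = urlencode({
        # Parameter names follow the `download` endpoint's parameter table.
        "filepath": file_response["filePath"],
        # Assumption: the `fileType` from the response is passed through unchanged.
        "fileType": file_response["fileType"],
    })
    return f"{base_url.rstrip('/')}/download?{query}"
```

For the example response above and a base URL of `https://example.org/gs3-macroniser`, this yields `https://example.org/gs3-macroniser/download?filepath=mi-tmp-7501722420086381909.txt&fileType=.txt`.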
## File Download

- Endpoint: `/download`
- Method: `GET`
- Response Content Type: `application/x-download`

The `download` endpoint provides access to files macronised via the `file` endpoint.

### Expected Parameters

Name | Type | Optional | Description
--|--|--|--
`filepath` | `string` | No | The path to retrieve the file at. Sent in the response from a call to the `file` endpoint.
`fileType` | `string` | No | The name to use in the download response.