root/main/trunk/gli/src/org/greenstone/gatherer/util/Readme_Using_SafeProcess.txt @ 31711

Revision 31711, 23.4 KB (checked in by ak19, 2 years ago)

Preliminary stages of a README for developers wanting to use SafeProcess.java

1SAFEPROCESS README
2
3GS Java developers should use SafeProcess.java in place of directly using Java's Process class, unless any issues are discovered with SafeProcess.java hereafter.
4
5
6A tailored version of SafeProcess is found both in GLI and GS3's Java src code:
7- gli/src/org/greenstone/gatherer/util/SafeProcess.java
8- greenstone3/src/java/org/greenstone/util/SafeProcess.java
9
10_______________________________________________________________________________
11WHY WE SHOULD GO THROUGH SAFEPROCESS INSTEAD OF USING JAVA'S PROCESS DIRECTLY
12_______________________________________________________________________________
13
14It's easy to misuse Java's Process class, and this can result in unexpected behaviour and random errors that are hard to track down. Further, the older GLI and GS3 Java src code used to use Java's Process class in different and inconsistent ways, some worse than others. Going through one class, SafeProcess, provides consistency. So if it's buggy, you can fix it in one place.
15
16SafeProcess handles the internal Process' iostreams (input, output and error) properly using separate worker Threads, following http://www.javaworld.com/article/2071275/core-java/when-runtime-exec---won-t.html?page=2.
17
18SafeProcess handles this itself so that the code that uses SafeProcess doesn't need to deal with it, unless customising the processing of the Process' iostreams.
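
The worker-thread approach described above can be sketched with nothing but Java's own Process class. This is a minimal illustration of the general technique, not SafeProcess' actual code: the class names StreamDrainSketch and Drainer are made up for this example, and it assumes the running JVM's own launcher can be found under the java.home system property.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.file.Paths;

class StreamDrainSketch {

    /** Reads one stream to the end on its own thread, keeping what it read. */
    static class Drainer extends Thread {
        private final InputStream in;
        private final StringBuilder text = new StringBuilder();

        Drainer(InputStream in, String name) {
            super(name);
            this.in = in;
        }

        @Override
        public void run() {
            try (BufferedReader reader = new BufferedReader(new InputStreamReader(in))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    text.append(line).append('\n');
                }
            } catch (IOException e) {
                System.err.println(getName() + ": " + e);
            }
        }

        /** Only call this after join(), so that the thread's writes are visible. */
        String text() {
            return text.toString();
        }
    }

    /** Runs cmd, drains stdout and stderr concurrently, returns {exit, out, err}. */
    static String[] run(String... cmd) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(cmd).start();
        process.getOutputStream().close(); // nothing to send to the child's stdin

        Drainer out = new Drainer(process.getInputStream(), "stdout-drainer");
        Drainer err = new Drainer(process.getErrorStream(), "stderr-drainer");
        out.start();
        err.start();

        int exit = process.waitFor(); // safe to block: both pipes are being emptied
        out.join();
        err.join();
        return new String[] { String.valueOf(exit), out.text(), err.text() };
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the running JVM's launcher lives under java.home/bin.
        String java = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        String[] result = run(java, "-version");
        System.out.println("exit value: " + result[0]);
        System.out.println("stderr:\n" + result[2]);
    }
}
```

If neither stream were read, a process that writes enough output could fill its pipe and stall, leaving waitFor() blocked. Giving each stream its own thread avoids that.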
19
20
21
22_____________________________________________________
23MODEL OF SAFEPROCESS THREADS AND INTERRUPT BEHAVIOUR
24_____________________________________________________
25
This section is especially IMPORTANT TO READ IF you're thinking of customising SafeProcess' handling of the internal Process' iostreams, since each of them is always handled in its own distinct worker thread, and you need to remember to deal with ThreadSafety issues.
27
28The primary process Thread:
29The main process internal to a SafeProcess instance is run in whatever thread the SafeProcess instance's runProcess() (or runBasicProcess()) methods are called.
30The SafeProcess instance internally keeps track of this thread, so that any cancel operation can send the InterruptedException to this primary thread.
31
SafeProcess does NOT inherit from Thread, so it is not itself a Thread. It just runs its internal process in whatever Thread called SafeProcess's runProcess/runBasicProcess methods.
33
34A SafeProcess thread may further launch 0 or 3 additional worker threads, depending on whether runBasicProcess() or a runProcess() variant was called.
35
36In total, using SafeProcess involves
37- 1 thread (the primary thread) when runBasicProcess() is called
- or 4 threads (the primary thread + 3 worker threads) when any variant of runProcess() is called. SafeProcess then spawns one worker thread for each iostream of the Process: inputstream, outputstream and errorstream. Note that the SafeProcess class will read from the internal Process' outputStream and ErrorStream, and can send/write a string to the Process' inputstream.
39
40
                         SafeProcess thread
                                  |
                                  |
          ________________________|_________________________
          |               |               |                |
          |               |               |                |
          |               |               |                |
    Worker Thread    SafeProcess    Worker Thread    Worker Thread
    for Process'       Thread       for Process'     for Process'
    InputStream       (primary)     OutputStream     ErrorStream
51
52
53__________________________________________________
54USAGE: DEBUGGING
55__________________________________________________
56
57There's a lot of info that a SafeProcess instance will log during its execution and termination phases, including Exceptions but also very basic things, such as the command that the SafeProcess instance is running.
58
By default, such debugging-related logging is turned off. You can turn it on by setting the static SafeProcess.DEBUG = 1 and recompiling.
60
61For GS3, the log output uses log4j.
62For GLI, it goes to System.err at present. But if you want it to go to GLI's DebugStream, uncomment the relevant lines in the 3 variants of the static log() functions in SafeProcess.java. Then recompile.
63
64If you're using SafeProcess yourself and wish to log debug statements that are related to how well you're using SafeProcess, you can call one of the static SafeProcess.log() functions yourself. Remember to set the static SafeProcess.DEBUG = 1.
65
66__________________________________________________
67USAGE: INSTANTIATION & DEFAULT RUNPROCESS VARIANTS
68__________________________________________________
69
70Usage of the SafeProcess class generally follows the following sequence:
71
721. instantiate a SafeProcess object using one of the constructors
73    - public SafeProcess(String[] cmd_args)
74    - public SafeProcess(String cmdStr)
75    - public SafeProcess(String[] cmd_args, String[] envparams, File launchDir)
76    Either or both of envparams and launchdir can be null.
77
78
792. optionally configure your SafeProcess instance.
80Use an appropriate setter method to set some additional fields:
81
82    - public void setInputString(String sendStr)
83    Call this if you wish to write any string to the process' inputstream
84
85    - public void setSplitStdOutputNewLines(boolean split)
86    - public void setSplitStdErrorNewLines(boolean split)
87    Pass in true to either of the above methods if you want to preserve individual lines in the content retrieved from the internal Process' stderr and stdoutput.
88    By default, lines are not split, so you get a String of one single line for the entire output from the Process' stdoutput, and a String of one single line from the Process' stderr output.
89   
90
91    - public void setExceptionHandler(SafeProcess.ExceptionHandler exception_handler)
92    Use this to register a SafeProcess ExceptionHandler whose gotException() method will get called for each exception encountered during the *primary* thread's execution: the thread that runs the Process. If you wish to handle exceptions that could happen when reading from the Process' outputStream/errorStream, you will need to do that separately. See the section CUSTOMISING. By default, exceptions when reading from Process' output and errorStreams are logged.
93   
94    - public void setMainHandler(SafeProcess.MainProcessHandler handler)
    Use this to register a handler for the primary (SafeProcess) thread. Your handler will need to implement the hooks that get called during the internal Process' life cycle, such as before and after process.destroy() is called upon a cancel operation.
96
97
983. call one of the following runProcess() variants on your SafeProcess instance:
99
100    a. public int runBasicProcess()
101
102This variant will not handle the Process' iostreams at all, so it won't launch any worker threads. It merely runs the internal process in the thread from which it is called (the primary thread) and waits until it's done.
103Use this variant if you know that the external process you wish SafeProcess to run will NOT be expecting any input nor be producing any output (in stdout or stderror).
104
NOTE: Do not call runBasicProcess() if you merely wish to ignore the process' iostreams, because in Java 6 and earlier this can block indefinitely if any of the Process' iostreams contain anything or expect anything. If you wish to ignore a process' iostreams, or if you're not sure if they may need handling, call the zero-argument runProcess() variant described below.
106
107    b. public int runProcess()
108
109This zero argument variant will handle all the Process' iostreams in the DEFAULT manner.
110
111- If you wanted to send something to the Process' inputstream, you ought to have configured this by calling setInputString(sendStr) before calling runProcess(). The DEFAULT behaviour is for sendStr to be written to the internal Process' inputstream as soon as the process runs (happens in its own thread).
112
- once runProcess() finishes, you can inspect what came out of the Process' output and error streams by calling getStdOutput() and getStdError(); see point 4 below.
114The DEFAULT behaviour of SafeProcess' processing of the stdout and stderr streams is for both streams to be read one line at a time until they have nothing left. Note that both streams are always dealt with in their separate worker threads.
115
Resources are safely released at the end of the worker threads dealing with each of the internal Process' iostreams, regardless of whether the Process is allowed to terminate naturally or whether SafeProcess is instructed to prematurely terminate it.
117
118    c. public int runProcess(SafeProcess.LineByLineHandler outLineByLineHandler, SafeProcess.LineByLineHandler errLineByLineHandler)
119Use this variant if you want to do anything specific when each line comes in from the internal Process' stderr and stdout streams. Passing in null for either parameter will return to the default behaviour for that stream.
120You'll want to read further details in the section on CUSTOMISING.
121
122    d. public int runProcess(CustomProcessHandler procInHandler, CustomProcessHandler procOutHandler, CustomProcessHandler procErrHandler)
123Use this variant if you want to completely override the way the internal Process' iostreams are handled. Passing in null for any of the parameters will return to the default behaviour for that stream. You'll want to read further details in the section on CUSTOMISING.
124
125
126
127If you want to completely override the default behaviour of any of SafeProcess' iostream related worker threads (such as if you want to read a char at a time from the stderr stream and do something, instead of the default behaviour of reading a line at a time from it), then call this method.
128
129You need to pass in an *instance* of SafeProcess.CustomProcessHandler for *each* of the 3 iostreams, since they're running in separate threads and so can't share the same instance.
130
131You can pass in null for any of these, if you want the default SafeProcess handling to apply for that stream's worker thread. For any iostream's default handling that you want to override, however, you'd extend SafeProcess.CustomProcessHandler.
132
133You would need to take care of ThreadSafety yourself if you're extending a SafeProcess.CustomProcessHandler. You can maintain concurrency by using synchronization, for instance. See the section USEFUL SYNCHRONISATION NOTES.
134
135For EXAMPLES OF USE, see GS3's GS2PerlConstructor.java and GLI's DownloadJob.java.
136
137
138Further NOTES:
139All of the variants of the runProcess methods return the internal Process' exit value *after* the internal Process has completed. The exit value will be -1 if the process was prematurely terminated such as by a cancel operation, which would have interrupted the primary and subsidiary threads.
140
141Running a process blocks the primary thread until complete. So all the runProcess variants above *block*. (Unless and until an external thread calls the cancelRunningProcess() method on a reference to the SafeProcess instance, discussed in the section on CANCELLING).
142
143
1444. if necessary, once the process has *terminated*,
145
146- you can read whatever came out from the Process' stderr or stdout by calling whichever is required of:
147public String getStdOutput()
148public String getStdError()
149 
150If you want to do anything special when handling any of the Process' iostreams, see the CUSTOMISING section.
151
152- You can also inspect the exit value of running the process by calling
153public int getExitValue()
154
155The exit value will be -1 if the process was prematurely terminated such as by a cancel operation, which would have interrupted the primary and subsidiary threads.
156The exit value will also be -1 if you call getExitValue() before setting off the SafeProcess via any of the runProcess() variants.
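
Steps 1 to 4 above can be sketched as follows. Since SafeProcess.java itself is not reproduced in this README, the sketch runs against StandInSafeProcess, a HYPOTHETICAL placeholder written only for this example. It borrows a few of the method names listed above (runProcess(), getStdOutput(), getStdError(), getExitValue()), but its internals are an assumption and say nothing about how the real class is implemented. It needs Java 9 or later for InputStream.readAllBytes(), and assumes the JVM's launcher can be found under java.home.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.util.concurrent.CompletableFuture;

class SafeProcessUsageSketch {

    /** HYPOTHETICAL placeholder exposing a few of the method names listed above. */
    static class StandInSafeProcess {
        private final String[] cmdArgs;
        private String stdOutput = "";
        private String stdError = "";
        private int exitValue = -1; // -1 until a run has completed, as documented above

        StandInSafeProcess(String[] cmd_args) {
            this.cmdArgs = cmd_args.clone();
        }

        int runProcess() throws IOException, InterruptedException {
            Process process = new ProcessBuilder(cmdArgs).start();
            process.getOutputStream().close(); // this sketch sends no input
            // Read both output streams concurrently so neither pipe can fill up.
            CompletableFuture<String> out =
                CompletableFuture.supplyAsync(() -> slurp(process.getInputStream()));
            CompletableFuture<String> err =
                CompletableFuture.supplyAsync(() -> slurp(process.getErrorStream()));
            int exit = process.waitFor();
            stdOutput = out.join();
            stdError = err.join();
            exitValue = exit;
            return exit;
        }

        String getStdOutput() { return stdOutput; }
        String getStdError() { return stdError; }
        int getExitValue() { return exitValue; }

        private static String slurp(InputStream in) {
            try (InputStream stream = in) {
                return new String(stream.readAllBytes(), StandardCharsets.UTF_8);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String java = Paths.get(System.getProperty("java.home"), "bin", "java").toString();

        // 1. instantiate with the command to run
        StandInSafeProcess process = new StandInSafeProcess(new String[] { java, "-version" });
        // 2. optional configuration, such as setInputString(), would go here
        // 3. run: this blocks the calling (primary) thread until the process ends
        int exit = process.runProcess();
        // 4. inspect the results once the process has terminated
        System.out.println("exit value: " + exit + " (getExitValue(): " + process.getExitValue() + ")");
        System.out.println("stdout: " + process.getStdOutput());
        System.out.println("stderr: " + process.getStdError());
    }
}
```

When using the real class, only the four numbered steps in main() matter. Replace StandInSafeProcess with the SafeProcess from GLI or GS3.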
157
158_______________________________________________________________________
159INTERNAL IMPLEMENTATION DETAILS - USEFUL IF CUSTOMISING OR CANCELLING
160_______________________________________________________________________
161
162doRuntimeExec
163waitForStreams()
164joins() discussion
165
166__________________________________________________
167USAGE: CUSTOMISING
168__________________________________________________
169
This section covers providing customised behaviour for any of the worker threads that deal with the internal Process object's iostreams, and doing some extra work at any major milestone in the outer SafeProcess instance's life cycle.
171
172
173CUSTOMISING THE BEHAVIOUR OF THE WORKER THREADS THAT HANDLE THE INTERNAL PROCESS' IOSTREAMS:
174
175Always bear in mind that each of the streams is handled in its own separate worker thread!
176So you'll need to take care of ThreadSafety yourself, if you're implementing a SafeProcess.LineByLineHandler or SafeProcess.CustomProcessHandler. You can maintain concurrency by using synchronization, for instance. See the section USEFUL SYNCHRONISATION NOTES.
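
As a rough illustration of the synchronization mentioned above, the sketch below lets two threads write into one shared object. The threads here only stand in for the stdout and stderr worker threads. SharedLog and the other names are made up for this example and are not part of SafeProcess.

```java
import java.util.ArrayList;
import java.util.List;

class SharedLogSketch {

    /** Shared state touched by two worker threads; every access is synchronized. */
    static class SharedLog {
        private final List<String> entries = new ArrayList<>();

        synchronized void add(String source, String line) {
            entries.add(source + ": " + line);
        }

        /** Returns a copy, so callers never iterate over the live list. */
        synchronized List<String> snapshot() {
            return new ArrayList<>(entries);
        }
    }

    /** Starts two threads that each add linesPerThread entries, then waits for both. */
    static List<String> fillFromTwoThreads(int linesPerThread) throws InterruptedException {
        SharedLog log = new SharedLog();
        Thread out = new Thread(() -> addLines(log, "stdout", linesPerThread), "stdout-worker");
        Thread err = new Thread(() -> addLines(log, "stderr", linesPerThread), "stderr-worker");
        out.start();
        err.start();
        out.join();
        err.join();
        return log.snapshot();
    }

    private static void addLines(SharedLog log, String source, int count) {
        for (int i = 0; i < count; i++) {
            log.add(source, "line " + i);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> entries = fillFromTwoThreads(1000);
        System.out.println(entries.size() + " entries recorded");
    }
}
```

Without the synchronized keyword on add(), the two threads could modify the ArrayList at the same time and entries could be lost. With it, the interleaving of stdout and stderr entries still varies from run to run, but none go missing.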
177
178Customising the worker threads can be done in one of two ways:
179- you extend SafeProcess.LineByLineHandler (for the Process' stderr/stdout streams you're interested in) and then call the matching runProcess() method
180- if you want to do something drastically different, you extend SafeProcess.CustomProcessHandler (for any of the internal Process' iostreams you're interested in) and then call the matching runProcess() method
181
182    a. public int runProcess(SafeProcess.LineByLineHandler outLineByLineHandler, SafeProcess.LineByLineHandler errLineByLineHandler)
183
184If you want to deal with EACH LINE coming out of the internal Process' stdout and/or stderr streams yourself, implement SafeProcess.LineByLineHandler. You may want a different implementation if you want to deal with the Process' stdout and stderr streams differently, or the same implementation if you want to deal with them identically. Either way, you need a *separate* SafeProcess.LineByLineHandler instance for each of these two streams, since they're running in separate threads. Pass the distinct instances in to the runProcess method as parameters.
185
186You can pass null for either parameter, in which case the default behaviour for that iostream takes place, as described for the zero-argument runProcess() variant.
187
188
189Writing your own SafeProcess.LineByLineHandler:
190- extend SafeProcess.LineByLineHandler. You can make it a static inner class where you need it.
191- make the constructor call the super constructor, passing in one of SafeProcess.STDERR|STDOUT|STDIN, to indicate the specific iostream for any instantiated instance.
192- For debugging, in order to identify what stream (stderr or stdout) you're working with, you can print the thread's name by calling the CustomProcessHandler's method
193    public String getThreadNamePrefix()
194- Provide an implementation for the LineByLineHandler methods
195    public void gotLine(String line)
196    public void gotException(Exception e);
197
Whenever the worker thread gets a line from that outputstream (stderr or stdout) of the internal process, it will call that stream's associated gotLine(line) implementation.
Whenever the worker thread encounters an exception during the processing of that outputstream (stderr or stdout), it will call that stream's associated gotException(exception) implementation.
200
201Remember, you can use the same implementation (LineByLineHandler subclass) for both stderr and stdout outputstreams, but you must use different *instances* of the class for each stream!
202
203For EXAMPLES OF USE, see GLI's GShell.java.
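
A handler written along the lines described above might look like the sketch below. To keep the example compilable on its own, it declares a HYPOTHETICAL stand-in for SafeProcess.LineByLineHandler containing only the constructor and the two methods named above. The constant values and the feed() driver are assumptions made for this illustration. With the real class you would extend SafeProcess.LineByLineHandler and pass the two instances to runProcess().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class LineHandlerSketch {

    /** HYPOTHETICAL stand-in for SafeProcess.LineByLineHandler; constant values are made up. */
    abstract static class LineByLineHandler {
        static final int STDERR = 0;
        static final int STDOUT = 1;
        protected final int source;

        protected LineByLineHandler(int source) {
            this.source = source;
        }

        public abstract void gotLine(String line);
        public abstract void gotException(Exception e);
    }

    /** Keeps every line it is given. Use one instance per stream. */
    static class CollectingHandler extends LineByLineHandler {
        private final List<String> lines = new ArrayList<>();

        CollectingHandler(int source) {
            super(source); // tell the superclass which stream this instance serves
        }

        @Override
        public synchronized void gotLine(String line) {
            lines.add(line);
        }

        @Override
        public void gotException(Exception e) {
            System.err.println((source == STDERR ? "stderr" : "stdout") + " handler: " + e);
        }

        /** Synchronized copy, since another thread may read while a worker writes. */
        synchronized List<String> lines() {
            return new ArrayList<>(lines);
        }
    }

    /** Plays the worker thread's part: one gotLine() call per line read. */
    static void feed(String text, LineByLineHandler handler) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                handler.gotLine(line);
            }
        } catch (IOException e) {
            handler.gotException(e);
        }
    }

    public static void main(String[] args) {
        // Same implementation for both streams, but a separate instance for each.
        CollectingHandler outHandler = new CollectingHandler(LineByLineHandler.STDOUT);
        CollectingHandler errHandler = new CollectingHandler(LineByLineHandler.STDERR);
        // With the real class this would be: safeProcess.runProcess(outHandler, errHandler);
        feed("importing\nbuilding\n", outHandler);
        feed("warning: missing metadata\n", errHandler);
        System.out.println("stdout lines: " + outHandler.lines());
        System.out.println("stderr lines: " + errHandler.lines());
    }
}
```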
204
205
206    b. public int runProcess(CustomProcessHandler procInHandler, CustomProcessHandler procOutHandler, CustomProcessHandler procErrHandler)
207
208If you want to completely override the default behaviour of any of SafeProcess' iostream related worker threads (such as if you want to read a char at a time from the stderr stream and do something, instead of the default behaviour of reading a line at a time from it), then call this method.
209
210You need to pass in an *instance* of SafeProcess.CustomProcessHandler for *each* of the 3 iostreams, since they're running in separate threads and so can't share the same instance.
211
212You can pass in null for any of these, if you want the default SafeProcess handling to apply for that stream's worker thread. For any iostream's default handling that you want to override, however, you'd implement SafeProcess.CustomProcessHandler.
213
214
215Writing your own SafeProcess.CustomProcessHandler:
216- extend SafeProcess.CustomProcessHandler. You can make it a static inner class where you need it.
217- make the constructor call the super constructor, passing in one of SafeProcess.STDERR|STDOUT|STDIN, to indicate the specific iostream for any instantiated instance.
218- implement the following method of CustomProcessHandler
219    public abstract void run(Closeable stream);
220
Start by first casting the stream to InputStream if the CustomProcessHandler is to read from the internal Process' stderr or stdout streams, or to OutputStream if it is to write to the internal Process' stdin stream.
223- Since you're implementing the "run()" method of the worker Thread for that iostream, you will write the code to handle any exceptions yourself
224- For debugging, in order to identify what stream you're working with, you can print the thread's name by calling the CustomProcessHandler's method
225    public String getThreadNamePrefix()
226
227Remember, you can use the same implementation (CustomProcessHandler subclass) for each of the internal process' iostreams, but you must use different *instances* of the class for each stream!
228
229For EXAMPLES OF USE, see GS3's GS2PerlConstructor.java and GLI's DownloadJob.java.
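
The casting described above is sketched below with one reading handler and one writing handler. The abstract class here is a HYPOTHETICAL stand-in for SafeProcess.CustomProcessHandler that holds only the constructor and the run(Closeable) method named above. The constant values, the DotCounter and Sender classes, and the in-memory streams used in main() are all assumptions made for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

class CustomHandlerSketch {

    /** HYPOTHETICAL stand-in for SafeProcess.CustomProcessHandler; constant values are made up. */
    abstract static class CustomProcessHandler {
        static final int STDERR = 0;
        static final int STDOUT = 1;
        static final int STDIN = 2;
        protected final int source;

        protected CustomProcessHandler(int source) {
            this.source = source;
        }

        public abstract void run(Closeable stream);
    }

    /** Reads a char at a time and counts the '.' characters it sees. */
    static class DotCounter extends CustomProcessHandler {
        private final AtomicInteger dots = new AtomicInteger();

        DotCounter(int source) {
            super(source);
        }

        @Override
        public void run(Closeable stream) {
            InputStream in = (InputStream) stream; // reading side: stdout or stderr
            try {
                Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
                int c;
                while ((c = reader.read()) != -1) {
                    if (c == '.') {
                        dots.incrementAndGet();
                    }
                }
            } catch (IOException e) {
                System.err.println("DotCounter: " + e); // run() handles its own exceptions
            }
        }

        int dots() {
            return dots.get();
        }
    }

    /** Writes a fixed string to the stream it is given. */
    static class Sender extends CustomProcessHandler {
        private final String text;

        Sender(String text) {
            super(STDIN);
            this.text = text;
        }

        @Override
        public void run(Closeable stream) {
            OutputStream out = (OutputStream) stream; // writing side: stdin
            try {
                out.write(text.getBytes(StandardCharsets.UTF_8));
                out.flush();
            } catch (IOException e) {
                System.err.println("Sender: " + e);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        DotCounter errHandler = new DotCounter(CustomProcessHandler.STDERR);
        Sender inHandler = new Sender("quit\n");
        // With the real class this would be: safeProcess.runProcess(inHandler, null, errHandler);
        try (InputStream fakeStderr = new ByteArrayInputStream("..x.".getBytes(StandardCharsets.UTF_8));
             ByteArrayOutputStream fakeStdin = new ByteArrayOutputStream()) {
            errHandler.run(fakeStderr);
            inHandler.run(fakeStdin);
            System.out.println("dots seen: " + errHandler.dots());
            System.out.println("bytes sent: " + fakeStdin.size());
        }
    }
}
```

The sketch leaves closing the streams to its caller. Whether the real worker threads close a stream after run() returns is not covered here, so check SafeProcess.java before relying on either behaviour.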
230
231
232ADDING CUSTOM HANDLERS TO HOOK INTO KEY MOMENTS OF THE EXECUTION OF THE PRIMARY THREAD:
233
Remember that the primary thread is the thread in which the internal Process is executed, which is whatever Thread called the runProcess() method on your SafeProcess instance.
235
236If you just want to handle exceptions that may occur at any stage during the execution of the primary thread
237- provide an implementation of SafeProcess.ExceptionHandler:
238    public static interface ExceptionHandler {
239
240        public void gotException(Exception e);
241    }
242- configure your SafeProcess instance by calling setExceptionHandler(SafeProcess.ExceptionHandler eh) on it *before* calling a runProcess() method on that instance.
243- For EXAMPLES OF USE, see GLI's GShell.java.
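
A small ExceptionHandler implementation could look like the sketch below. The interface is re-declared from the listing above so that the example compiles without SafeProcess.java. RecordingExceptionHandler and the simulated exception in main() are made up for this illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ExceptionHandlerSketch {

    /** Re-declared from the listing above; the real one is SafeProcess.ExceptionHandler. */
    interface ExceptionHandler {
        void gotException(Exception e);
    }

    /** Logs each exception and remembers it for later inspection. */
    static class RecordingExceptionHandler implements ExceptionHandler {
        private final List<Exception> seen = Collections.synchronizedList(new ArrayList<>());

        @Override
        public void gotException(Exception e) {
            seen.add(e);
            System.err.println("primary thread reported: " + e);
        }

        List<Exception> seen() {
            return new ArrayList<>(seen);
        }
    }

    public static void main(String[] args) {
        RecordingExceptionHandler handler = new RecordingExceptionHandler();
        // With the real class: safeProcess.setExceptionHandler(handler); safeProcess.runProcess();
        // Here the callback is invoked by hand to show what the handler does.
        handler.gotException(new IOException("simulated failure to launch the process"));
        System.out.println("exceptions recorded: " + handler.seen().size());
    }
}
```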
244
If, during the execution of the primary thread, you want to do some complex things during a cancel operation, such as before and after a process is destroyed, then
246- implement SafeProcess.MainProcessHandler (see below)
247- configure your SafeProcess instance by calling setMainHandler(MainProcessHandler mph) on it *before* calling a runProcess() method on that instance.
248- For EXAMPLES OF USE, see GLI's DownloadJob.java.
249
250    public static interface MainProcessHandler {
251
252        /**
253         * Called before the streamgobbler join()s.
254         * If not overriding, the default implementation should be:
255         * public boolean beforeWaitingForStreamsToEnd(boolean forciblyTerminating) { return forciblyTerminating; }
256         * When overriding:
257         * @param forciblyTerminating is true if currently it's been decided that the process needs to be
258         * forcibly terminated. Return false if you don't want it to be. For a basic implementation,
259         * return the parameter.
260         * @return true if the process is still running and therefore still needs to be destroyed, or if
261         * you can't determine whether it's still running or not. Process.destroy() will then be called.
262         * @return false if the process has already naturally terminated by this stage. Process.destroy()
263         * won't be called, and neither will the before- and after- processDestroy methods of this class.
264        */
265        public boolean beforeWaitingForStreamsToEnd(boolean forciblyTerminating);
266
267        /**
268         * Called after the streamgobbler join()s have finished.
269         * If not overriding, the default implementation should be:
270         * public boolean afterStreamsEnded(boolean forciblyTerminating) { return forciblyTerminating; }
271         * When overriding:
272         * @param forciblyTerminating is true if currently it's been decided that the process needs to be
273         * forcibly terminated. Return false if you don't want it to be. For a basic implementation,
274         * return the parameter (usual case).
275         * @return true if the process is still running and therefore still needs to be destroyed, or if
276         * can't determine whether it's still running or not. Process.destroy() will then be called.
277         * @return false if the process has already naturally terminated by this stage. Process.destroy()
278         * won't be called, and neither will the before- and after- processDestroy methods of this class.
279        */
280        public boolean afterStreamsEnded(boolean forciblyTerminating);
281
282        /**
283         * called after join()s and before process.destroy()/destroyProcess(Process), iff forciblyTerminating
284         */
285        public void beforeProcessDestroy();
286 
287        /**
288         * Called after process.destroy()/destroyProcess(Process), iff forciblyTerminating
289         */
290        public void afterProcessDestroy();
291
292        /**
293         * Always called after process ended: whether it got destroyed or not
294         */
295        public void doneCleanup(boolean wasForciblyTerminated);
296    }
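
The sketch below shows a basic MainProcessHandler that returns its parameter from the two boolean hooks, as the comments above suggest, and records the order in which its hooks are called. The interface is re-declared from the listing above so the example compiles on its own. ASSUMPTION: the simulate() driver's call order is inferred only from the Javadoc comments in that listing, not from SafeProcess.java, and the real class may behave differently.

```java
import java.util.ArrayList;
import java.util.List;

class MainHandlerSketch {

    /** Re-declared from the listing above; the real one is SafeProcess.MainProcessHandler. */
    interface MainProcessHandler {
        boolean beforeWaitingForStreamsToEnd(boolean forciblyTerminating);
        boolean afterStreamsEnded(boolean forciblyTerminating);
        void beforeProcessDestroy();
        void afterProcessDestroy();
        void doneCleanup(boolean wasForciblyTerminated);
    }

    /** Basic implementation: passes the flag through and records every call. */
    static class TracingHandler implements MainProcessHandler {
        final List<String> calls = new ArrayList<>();

        @Override
        public boolean beforeWaitingForStreamsToEnd(boolean forciblyTerminating) {
            calls.add("beforeWaitingForStreamsToEnd(" + forciblyTerminating + ")");
            return forciblyTerminating;
        }

        @Override
        public boolean afterStreamsEnded(boolean forciblyTerminating) {
            calls.add("afterStreamsEnded(" + forciblyTerminating + ")");
            return forciblyTerminating;
        }

        @Override
        public void beforeProcessDestroy() {
            calls.add("beforeProcessDestroy()");
        }

        @Override
        public void afterProcessDestroy() {
            calls.add("afterProcessDestroy()");
        }

        @Override
        public void doneCleanup(boolean wasForciblyTerminated) {
            calls.add("doneCleanup(" + wasForciblyTerminated + ")");
        }
    }

    /** ASSUMED call order, pieced together from the Javadoc comments in the listing. */
    static List<String> simulate(boolean cancelled) {
        TracingHandler handler = new TracingHandler();
        boolean destroy = handler.beforeWaitingForStreamsToEnd(cancelled);
        // ... the real class would join() its worker threads here ...
        destroy = handler.afterStreamsEnded(destroy);
        if (destroy) {
            handler.beforeProcessDestroy();
            // ... the real class would destroy the process here ...
            handler.afterProcessDestroy();
        }
        handler.doneCleanup(destroy);
        return handler.calls;
    }

    public static void main(String[] args) {
        System.out.println("natural end: " + simulate(false));
        System.out.println("cancelled:   " + simulate(true));
    }
}
```

With the real class, register the handler by calling setMainHandler() on the SafeProcess instance before calling a runProcess() method.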
297
298
299__________________________________________________
300USAGE: CANCELLING
301__________________________________________________
302
303For EXAMPLES OF USE, see GLI's GShell.java and DownloadJob.java.
304
305__________________________________________________
306USAGE: IMPLEMENTING HOOKS AROUND CANCELLING
307__________________________________________________
308
309For EXAMPLES OF USE, see GLI's DownloadJob.java.
310
311__________________________________________________
312USEFUL STATIC METHODS AND STATIC INNER CLASSES
313__________________________________________________
314
315public static boolean isAvailable(String programName)
316    - Runs the `which` cmd over the program and returns true or false
317    `which` is included in winbin for Windows, and is part of unix systems
318
319static public boolean processRunning(Process process)
320    - returns true if the Process is currently running and hasn't come to an end.
321    - It does not
322
323public static long getProcessID(Process p)
324    - uses java native access (JNA, version 4.1.0 from 2013). The JNA jar files have been included into gli's lib and GS3's lib.
325    - For more details, see gli/lib/README.txt, section "B. THE jna.jar and jna-platform.jar FILES"
326
327public static void destroyProcess(Process p)
328    - will terminate any subprocesses launched by the Process p
329    - uses java native access to get processid
330    - uses OS system calls to terminate the Process and any subprocesses
331
332public static boolean closeProcess(Process prcs)
333    - will attempt to close your Process iostreams and destroy() the Process object at the end.
334
335public static boolean closeResource(Closeable resourceHandle)
336    - will attempt to cleanly close your resource (file/stream handle), logging on Exception
337
338public static boolean closeSocket(Socket resourceHandle)
339    - will attempt to cleanly close your Socket, logging on Exception. In Java 6, Sockets didn't yet implement Closeable,
340    so this method is to ensure backwards compatibility
341
342public static class OutputStreamGobbler extends Thread
343    - A class that can write to a Process' inputstream from its own Thread.
344
345public static class InputStreamGobbler extends Thread
346    - Class that can read from a Process' output or error stream in its own Thread.
347    - A separate instance should be created for each stream, so that each stream has its own Thread.
348
349
350
351Package access:
352static methods to prematurely terminate any process denoted by processID and any subprocesses it may have launched:
353    static boolean killUnixProcessWithID(long processID)
354    static void killWinProcessWithID(long processID)
355
356
357__________________________________________________
358OTHER USEFUL INSTANCE METHODS
359__________________________________________________
360
361public synchronized boolean processRunning()
362    - returns true if the SafeProcess instance has started its internal Process or is still cleaning up on its termination
363
364public boolean cancelRunningProcess()
365    - cancel the SafeProcess instance
    - ultimately synchronized, so threadsafe

367__________________________________________________
368USEFUL SYNCHRONISATION NOTES
369__________________________________________________
370