SAFEPROCESS README

GS Java developers should use SafeProcess.java in place of directly using
Java's Process class, unless any issues are discovered with SafeProcess.java
hereafter.

A tailored version of SafeProcess is found in both GLI's and GS3's Java source code:
- gli/src/org/greenstone/gatherer/util/SafeProcess.java
- greenstone3/src/java/org/greenstone/util/SafeProcess.java

_______________________________________________________________________________
WHY WE SHOULD GO THROUGH SAFEPROCESS INSTEAD OF USING JAVA'S PROCESS DIRECTLY
_______________________________________________________________________________

It's easy to misuse Java's Process class, and this can result in unexpected
behaviour and random errors that are hard to track down. Further, the older GLI
and GS3 Java source code used Java's Process class in different and
inconsistent ways, some worse than others. Going through one class,
SafeProcess, provides consistency. So if it's buggy, you can fix it in one place.

SafeProcess handles the internal Process' iostreams (input, output and error)
properly using separate worker Threads, following
http://www.javaworld.com/article/2071275/core-java/when-runtime-exec---won-t.html?page=2.

SafeProcess handles this itself so that the code that uses SafeProcess doesn't
need to deal with it, unless customising the processing of the Process' iostreams.

_____________________________________________________
MODEL OF SAFEPROCESS THREADS AND INTERRUPT BEHAVIOUR
_____________________________________________________

This section is especially IMPORTANT TO READ IF you're thinking of customising
SafeProcess' handling of the internal Process' iostreams, since each of them is
always handled in its own distinct worker thread, and you need to remember to
deal with thread-safety issues.

The primary process Thread:
The main process internal to a SafeProcess instance is run in whatever thread
the SafeProcess instance's runProcess() (or runBasicProcess()) methods are called.
The SafeProcess instance internally keeps track of this thread, so that any
cancel operation can send the InterruptedException to this primary thread.

SafeProcess does NOT inherit from Thread; it is therefore not a Thread. It just
runs its internal process in whatever Thread called SafeProcess's
runProcess/runBasicProcess methods.

A SafeProcess thread may further launch 0 or 3 additional worker threads,
depending on whether runBasicProcess() or a runProcess() variant was called.

In total, using SafeProcess involves
- 1 thread (the primary thread) when runBasicProcess() is called
- or 4 threads (the primary thread + 3 worker threads) when any variant of
  runProcess() is called.

If a variant of runProcess() was called, then 3 additional worker threads are
spawned by SafeProcess, one for each iostream of the Process: inputstream,
outputstream and errorstream. Note that the SafeProcess class will read from
the internal Process' outputStream and ErrorStream, and can send/write a string
to the Process' inputstream.

                      SafeProcess thread
                              |
          ____________________|____________________________
         |               |                |                |
   Worker Thread    SafeProcess     Worker Thread    Worker Thread
   for Process'       Thread        for Process'     for Process'
   InputStream       (primary)      OutputStream     ErrorStream

__________________________________________________
USAGE: DEBUGGING
__________________________________________________

There's a lot of info that a SafeProcess instance will log during its execution
and termination phases, including Exceptions but also very basic things, such
as the command that the SafeProcess instance is running.

By default, such debugging-related logging is turned off. You can turn it on by
setting the static SafeProcess.DEBUG = 1 and recompiling.

For GS3, the log output uses log4j.
For GLI, it goes to System.err at present. But if you want it to go to GLI's
DebugStream, uncomment the relevant lines in the 3 variants of the static log()
functions in SafeProcess.java. Then recompile.

If you're using SafeProcess yourself and wish to log debug statements that are
related to how well you're using SafeProcess, you can call one of the static
SafeProcess.log() functions yourself. Remember to set the static
SafeProcess.DEBUG = 1.

__________________________________________________
USAGE: INSTANTIATION & DEFAULT RUNPROCESS VARIANTS
__________________________________________________

Usage of the SafeProcess class generally follows this sequence:

1. instantiate a SafeProcess object using one of the constructors
   - public SafeProcess(String[] cmd_args)
   - public SafeProcess(String cmdStr)
   - public SafeProcess(String[] cmd_args, String[] envparams, File launchDir)
     Either or both of envparams and launchDir can be null.

2. optionally configure your SafeProcess instance.
   Use an appropriate setter method to set some additional fields:

   - public void setInputString(String sendStr)
     Call this if you wish to write any string to the process' inputstream

   - public void setSplitStdOutputNewLines(boolean split)
   - public void setSplitStdErrorNewLines(boolean split)
     Pass in true to either of the above methods if you want to preserve
     individual lines in the content retrieved from the internal Process'
     stderr and stdoutput.
     By default, lines are not split, so you get a String of one single line
     for the entire output from the Process' stdoutput, and a String of one
     single line from the Process' stderr output.

   - public void setExceptionHandler(SafeProcess.ExceptionHandler exception_handler)
     Use this to register a SafeProcess ExceptionHandler whose gotException()
     method will get called for each exception encountered during the *primary*
     thread's execution: the thread that runs the Process. If you wish to
     handle exceptions that could happen when reading from the Process'
     outputStream/errorStream, you will need to do that separately. See the
     section CUSTOMISING. By default, exceptions when reading from Process'
     output and errorStreams are logged.

   - public void setMainHandler(SafeProcess.MainProcessHandler handler)
     Use this to set a handler for the primary (SafeProcess) thread. Your
     handler will need to implement the hooks that get called during the
     internal Process' life cycle, such as before and after process.destroy()
     is called upon a cancel operation.

3. call one of the following runProcess() variants on your SafeProcess instance:

   a. public int runBasicProcess()

      This variant will not handle the Process' iostreams at all, so it won't
      launch any worker threads. It merely runs the internal process in the
      thread from which it is called (the primary thread) and waits until it's
      done. Use this variant if you know that the external process you wish
      SafeProcess to run will NOT be expecting any input nor be producing any
      output (in stdout or stderr).

      NOTE: Do not call runBasicProcess() if you merely wish to ignore the
      process' iostreams, because, in Java 6 and earlier, this can block
      indefinitely if any of the Process' iostreams contain anything or expect
      anything. If you wish to ignore a process' iostreams, or if you're not
      sure if they may need handling, call the zero-argument runProcess()
      variant described below.

   b. public int runProcess()

      This zero-argument variant will handle all the Process' iostreams in the
      DEFAULT manner.

      - If you want to send something to the Process' inputstream, you need to
        have configured this by calling setInputString(sendStr) before calling
        runProcess(). The DEFAULT behaviour is for sendStr to be written to the
        internal Process' inputstream as soon as the process runs (this happens
        in its own thread).

      - Once runProcess() finishes, you can inspect what came out of the
        Process' output and error streams by calling the getters described in
        point 4 below. The DEFAULT behaviour of SafeProcess' processing of the
        stdout and stderr streams is for both streams to be read one line at a
        time until they have nothing left.

      Note that both streams are always dealt with in their separate worker threads.
      Resources are safely released at the end of the worker threads dealing
      with each of the internal Process' iostreams, regardless of whether the
      Process is allowed to terminate naturally or whether SafeProcess is
      instructed to prematurely terminate it.

   c. public int runProcess(SafeProcess.LineByLineHandler outLineByLineHandler,
                            SafeProcess.LineByLineHandler errLineByLineHandler)

      Use this variant if you want to do anything specific when each line comes
      in from the internal Process' stderr and stdout streams. Passing in null
      for either parameter will return to the default behaviour for that
      stream. You'll want to read further details in the section on CUSTOMISING.

   d. public int runProcess(CustomProcessHandler procInHandler,
                            CustomProcessHandler procOutHandler,
                            CustomProcessHandler procErrHandler)

      Use this variant if you want to completely override the way the internal
      Process' iostreams are handled. Passing in null for any of the parameters
      will return to the default behaviour for that stream. You'll want to read
      further details in the section on CUSTOMISING.

      If you want to completely override the default behaviour of any of
      SafeProcess' iostream related worker threads (such as if you want to read
      a char at a time from the stderr stream and do something, instead of the
      default behaviour of reading a line at a time from it), then call this
      method. You need to pass in an *instance* of
      SafeProcess.CustomProcessHandler for *each* of the 3 iostreams, since
      they're running in separate threads and so can't share the same instance.

      You can pass in null for any of these, if you want the default
      SafeProcess handling to apply for that stream's worker thread. For any
      iostream's default handling that you want to override, however, you'd
      extend SafeProcess.CustomProcessHandler.

      You need to take care of thread safety yourself if you're extending
      SafeProcess.CustomProcessHandler. You can maintain concurrency by using
      synchronization, for instance.
      See the section USEFUL SYNCHRONISATION NOTES.

      For EXAMPLES OF USE, see GS3's GS2PerlConstructor.java and GLI's DownloadJob.java.

   Further NOTES:
   All of the variants of the runProcess methods return the internal Process'
   exit value *after* the internal Process has completed. The exit value will
   be -1 if the process was prematurely terminated such as by a cancel
   operation, which would have interrupted the primary and subsidiary threads.

   Running a process blocks the primary thread until complete. So all the
   runProcess variants above *block*. (Unless and until an external thread
   calls the cancelRunningProcess() method on a reference to the SafeProcess
   instance, discussed in the section on CANCELLING).

4. if necessary, once the process has *terminated*,

   - you can read whatever came out from the Process' stderr or stdout by
     calling whichever is required of:
       public String getStdOutput()
       public String getStdError()

     If you want to do anything special when handling any of the Process'
     iostreams, see the CUSTOMISING section.

   - You can also inspect the exit value of running the process by calling
       public int getExitValue()

     The exit value will be -1 if the process was prematurely terminated such
     as by a cancel operation, which would have interrupted the primary and
     subsidiary threads.
     The exit value will also be -1 if you call getExitValue() before setting
     off the SafeProcess via any of the runProcess() variants.
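
To illustrate why the iostreams are drained in separate worker threads while
the primary thread blocks, here is a minimal sketch using only standard JDK
classes. It is NOT SafeProcess' actual implementation: the names GobblerSketch,
Gobbler and run() are hypothetical, and the sketch assumes Java 7 or later
(try-with-resources) and that a java launcher exists under the running JVM's
java.home.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;

class GobblerSketch {

    /** Hypothetical helper: reads one stream to exhaustion in its own thread. */
    static class Gobbler extends Thread {
        private final InputStream in;
        private final StringBuilder text = new StringBuilder();

        Gobbler(InputStream in, String name) {
            super(name);
            this.in = in;
        }

        @Override
        public void run() {
            // try-with-resources closes the stream even if reading fails
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    text.append(line).append('\n');
                }
            } catch (IOException e) {
                System.err.println(getName() + ": " + e);
            }
        }

        /** Only call this after join() has returned on this thread. */
        String text() {
            return text.toString();
        }
    }

    /** Runs cmd, drains stdout and stderr concurrently, returns {exit, out, err}. */
    static String[] run(String... cmd) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(cmd).start();
        process.getOutputStream().close(); // nothing to send to the child's stdin

        Gobbler out = new Gobbler(process.getInputStream(), "stdout-gobbler");
        Gobbler err = new Gobbler(process.getErrorStream(), "stderr-gobbler");
        out.start();
        err.start();

        int exit = process.waitFor(); // the calling (primary) thread blocks here
        out.join();                   // wait for both worker threads to finish
        err.join();
        return new String[] { String.valueOf(exit), out.text(), err.text() };
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the JVM running this sketch ships a launcher in java.home/bin
        String javaBin = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        String[] result = run(javaBin, "-version");
        System.out.println("exit value: " + result[0]);
        System.out.println("stderr was:\n" + result[2]);
    }
}
```

If neither stream were drained, a child process that fills its output buffer
could stall while the parent waits for it to exit, which is the kind of
blocking that the NOTE under runBasicProcess() warns about.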

_______________________________________________________________________
INTERNAL IMPLEMENTATION DETAILS - USEFUL IF CUSTOMISING OR CANCELLING
_______________________________________________________________________

doRuntimeExec waitForStreams() joins() discussion

__________________________________________________
USAGE: CUSTOMISING
__________________________________________________

This section covers how to customise the behaviour of any of the worker threads
that deal with the internal Process object's iostreams, and how to do some
extra work at any major milestone in the outer SafeProcess instance's life cycle.

CUSTOMISING THE BEHAVIOUR OF THE WORKER THREADS THAT HANDLE THE INTERNAL PROCESS' IOSTREAMS:

Always bear in mind that each of the streams is handled in its own separate
worker thread! So you'll need to take care of thread safety yourself, if you're
implementing a SafeProcess.LineByLineHandler or SafeProcess.CustomProcessHandler.
You can maintain concurrency by using synchronization, for instance. See the
section USEFUL SYNCHRONISATION NOTES.

Customising the worker threads can be done in one of two ways:
- you extend SafeProcess.LineByLineHandler (for the Process' stderr/stdout
  streams you're interested in) and then call the matching runProcess() method
- if you want to do something drastically different, you extend
  SafeProcess.CustomProcessHandler (for any of the internal Process' iostreams
  you're interested in) and then call the matching runProcess() method

a. public int runProcess(SafeProcess.LineByLineHandler outLineByLineHandler,
                         SafeProcess.LineByLineHandler errLineByLineHandler)

   If you want to deal with EACH LINE coming out of the internal Process'
   stdout and/or stderr streams yourself, implement SafeProcess.LineByLineHandler.
   You may want a different implementation if you want to deal with the
   Process' stdout and stderr streams differently, or the same implementation
   if you want to deal with them identically.
   Either way, you need a *separate* SafeProcess.LineByLineHandler instance for
   each of these two streams, since they're running in separate threads. Pass
   the distinct instances in to the runProcess method as parameters.

   You can pass null for either parameter, in which case the default behaviour
   for that iostream takes place, as described for the zero-argument
   runProcess() variant.

   Writing your own SafeProcess.LineByLineHandler:
   - extend SafeProcess.LineByLineHandler. You can make it a static inner class
     where you need it.
   - make the constructor call the super constructor, passing in one of
     SafeProcess.STDERR|STDOUT|STDIN, to indicate the specific iostream for any
     instantiated instance.
   - For debugging, in order to identify what stream (stderr or stdout) you're
     working with, you can print the thread's name by calling the
     CustomProcessHandler's method
       public String getThreadNamePrefix()
   - Provide an implementation for the LineByLineHandler methods
       public void gotLine(String line)
       public void gotException(Exception e)

     Whenever the worker thread gets a line from that outputstream (stderr or
     stdout) of the internal process, it will call that stream's associated
     gotLine(line) implementation.
     Whenever the worker thread encounters an exception during the processing
     of that outputstream (stderr or stdout), it will call that stream's
     associated gotException(exception) implementation.

   Remember, you can use the same implementation (LineByLineHandler subclass)
   for both stderr and stdout outputstreams, but you must use different
   *instances* of the class for each stream!

   For EXAMPLES OF USE, see GLI's GShell.java.

b.
   public int runProcess(CustomProcessHandler procInHandler,
                         CustomProcessHandler procOutHandler,
                         CustomProcessHandler procErrHandler)

   If you want to completely override the default behaviour of any of
   SafeProcess' iostream related worker threads (such as if you want to read a
   char at a time from the stderr stream and do something, instead of the
   default behaviour of reading a line at a time from it), then call this
   method.

   You need to pass in an *instance* of SafeProcess.CustomProcessHandler for
   *each* of the 3 iostreams, since they're running in separate threads and so
   can't share the same instance.

   You can pass in null for any of these, if you want the default SafeProcess
   handling to apply for that stream's worker thread. For any iostream's
   default handling that you want to override, however, you'd implement
   SafeProcess.CustomProcessHandler.

   Writing your own SafeProcess.CustomProcessHandler:
   - extend SafeProcess.CustomProcessHandler. You can make it a static inner
     class where you need it.
   - make the constructor call the super constructor, passing in one of
     SafeProcess.STDERR|STDOUT|STDIN, to indicate the specific iostream for any
     instantiated instance.
   - implement the following method of CustomProcessHandler
       public abstract void run(Closeable stream);

     Start by casting the stream to InputStream if the CustomProcessHandler is
     to read from the internal Process' stderr or stdout streams, or to
     OutputStream if the CustomProcessHandler is to write to the internal
     Process' stdin stream.
   - Since you're implementing the "run()" method of the worker Thread for that
     iostream, you will write the code to handle any exceptions yourself
   - For debugging, in order to identify what stream you're working with, you
     can print the thread's name by calling the CustomProcessHandler's method
       public String getThreadNamePrefix()

   Remember, you can use the same implementation (CustomProcessHandler
   subclass) for each of the internal process' iostreams, but you must use
   different *instances* of the class for each stream!

   For EXAMPLES OF USE, see GS3's GS2PerlConstructor.java and GLI's DownloadJob.java.

ADDING CUSTOM HANDLERS TO HOOK INTO KEY MOMENTS OF THE EXECUTION OF THE PRIMARY THREAD:

Remember that the primary thread is the thread in which the internal Process is
executed, which is whatever Thread calls the runProcess() method on your
SafeProcess instance.

If you just want to handle exceptions that may occur at any stage during the
execution of the primary thread
- provide an implementation of SafeProcess.ExceptionHandler:

    public static interface ExceptionHandler {
        public void gotException(Exception e);
    }

- configure your SafeProcess instance by calling
  setExceptionHandler(SafeProcess.ExceptionHandler eh) on it *before* calling a
  runProcess() method on that instance.
- For EXAMPLES OF USE, see GLI's GShell.java.

If, during the execution of the primary thread, you want to do some complex
things during a cancel operation, such as before and after a process is
destroyed, then
- implement SafeProcess.MainProcessHandler (see below)
- configure your SafeProcess instance by calling
  setMainHandler(MainProcessHandler mph) on it *before* calling a runProcess()
  method on that instance.
- For EXAMPLES OF USE, see GLI's DownloadJob.java.

public static interface MainProcessHandler {

    /**
     * Called before the streamgobbler join()s.
     * If not overriding, the default implementation should be:
     * public boolean beforeWaitingForStreamsToEnd(boolean forciblyTerminating) { return forciblyTerminating; }
     * When overriding:
     * @param forciblyTerminating is true if currently it's been decided that the process needs to be
     * forcibly terminated. Return false if you don't want it to be. For a basic implementation,
     * return the parameter.
     * @return true if the process is still running and therefore still needs to be destroyed, or if
     * you can't determine whether it's still running or not. Process.destroy() will then be called.
     * @return false if the process has already naturally terminated by this stage. Process.destroy()
     * won't be called, and neither will the before- and after- processDestroy methods of this class.
     */
    public boolean beforeWaitingForStreamsToEnd(boolean forciblyTerminating);

    /**
     * Called after the streamgobbler join()s have finished.
     * If not overriding, the default implementation should be:
     * public boolean afterStreamsEnded(boolean forciblyTerminating) { return forciblyTerminating; }
     * When overriding:
     * @param forciblyTerminating is true if currently it's been decided that the process needs to be
     * forcibly terminated. Return false if you don't want it to be. For a basic implementation,
     * return the parameter (usual case).
     * @return true if the process is still running and therefore still needs to be destroyed, or if
     * you can't determine whether it's still running or not. Process.destroy() will then be called.
     * @return false if the process has already naturally terminated by this stage. Process.destroy()
     * won't be called, and neither will the before- and after- processDestroy methods of this class.
     */
    public boolean afterStreamsEnded(boolean forciblyTerminating);

    /**
     * Called after join()s and before process.destroy()/destroyProcess(Process), iff forciblyTerminating
     */
    public void beforeProcessDestroy();

    /**
     * Called after process.destroy()/destroyProcess(Process), iff forciblyTerminating
     */
    public void afterProcessDestroy();

    /**
     * Always called after process ended: whether it got destroyed or not
     */
    public void doneCleanup(boolean wasForciblyTerminated);
}

__________________________________________________
USAGE: CANCELLING
__________________________________________________

For EXAMPLES OF USE, see GLI's GShell.java and DownloadJob.java.

__________________________________________________
USAGE: IMPLEMENTING HOOKS AROUND CANCELLING
__________________________________________________

For EXAMPLES OF USE, see GLI's DownloadJob.java.

__________________________________________________
USEFUL STATIC METHODS AND STATIC INNER CLASSES
__________________________________________________

public static boolean isAvailable(String programName)
- Runs the `which` cmd over the program and returns true or false.
  `which` is included in winbin for Windows, and is part of unix systems

static public boolean processRunning(Process process)
- returns true if the Process is currently running and hasn't come to an end.
- It does not

public static long getProcessID(Process p)
- uses java native access (JNA, version 4.1.0 from 2013). The JNA jar files
  have been included into gli's lib and GS3's lib.
- For more details, see gli/lib/README.txt, section
  "B. THE jna.jar and jna-platform.jar FILES"

public static void destroyProcess(Process p)
- will terminate any subprocesses launched by the Process p
- uses java native access to get processid
- uses OS system calls to terminate the Process and any subprocesses

public static boolean closeProcess(Process prcs)
- will attempt to close your Process iostreams and destroy() the Process object
  at the end.

public static boolean closeResource(Closeable resourceHandle)
- will attempt to cleanly close your resource (file/stream handle), logging on
  Exception

public static boolean closeSocket(Socket resourceHandle)
- will attempt to cleanly close your Socket, logging on Exception. In Java 6,
  Sockets didn't yet implement Closeable, so this method is to ensure backwards
  compatibility

public static class OutputStreamGobbler extends Thread
- A class that can write to a Process' inputstream from its own Thread.

public static class InputStreamGobbler extends Thread
- Class that can read from a Process' output or error stream in its own Thread.
- A separate instance should be created for each stream, so that each stream
  has its own Thread.

Package access: static methods to prematurely terminate any process denoted by
processID and any subprocesses it may have launched:
  static boolean killUnixProcessWithID(long processID)
  static void killWinProcessWithID(long processID)

__________________________________________________
OTHER USEFUL INSTANCE METHODS
__________________________________________________

public synchronized boolean processRunning()
- returns true if the SafeProcess instance has started its internal Process or
  is still cleaning up on its termination

public boolean cancelRunningProcess()
- cancel the SafeProcess instance
- ultimately synchronized, so threadsafe

__________________________________________________
USEFUL SYNCHRONISATION NOTES
__________________________________________________
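
As a minimal sketch of the kind of synchronisation the handler sections ask
for: when two separate handler instances (one per stream, each running in its
own worker thread) write to the same object, that shared object must guard its
state. All names below (SharedLogSketch, SharedLog, LineHandler, simulate) are
hypothetical stand-ins for illustration, and LineHandler is NOT the real
SafeProcess.LineByLineHandler. The sketch assumes Java 8 or later for the
lambdas.

```java
import java.util.ArrayList;
import java.util.List;

class SharedLogSketch {

    /** Hypothetical sink shared by both handler instances; all access is synchronized. */
    static class SharedLog {
        private final List<String> lines = new ArrayList<String>();

        synchronized void add(String source, String line) {
            lines.add(source + ": " + line);
        }

        synchronized int size() {
            return lines.size();
        }
    }

    /** Stand-in for a per-stream line handler: one instance per stream. */
    static class LineHandler {
        private final String source;
        private final SharedLog log;

        LineHandler(String source, SharedLog log) {
            this.source = source;
            this.log = log;
        }

        void gotLine(String line) {
            log.add(source, line); // safe: SharedLog serialises concurrent callers
        }
    }

    /** Feeds n lines to each of two handler instances from two threads. */
    static int simulate(final int n) throws InterruptedException {
        SharedLog log = new SharedLog();
        final LineHandler outHandler = new LineHandler("stdout", log);
        final LineHandler errHandler = new LineHandler("stderr", log);

        // Two threads play the role of the stdout and stderr worker threads
        Thread outWorker = new Thread(() -> {
            for (int i = 0; i < n; i++) outHandler.gotLine("line " + i);
        });
        Thread errWorker = new Thread(() -> {
            for (int i = 0; i < n; i++) errHandler.gotLine("line " + i);
        });
        outWorker.start();
        errWorker.start();
        outWorker.join();
        errWorker.join();
        return log.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("lines logged: " + simulate(1000));
    }
}
```

Without the synchronized keyword on add(), concurrent calls on the unguarded
ArrayList could lose lines or throw, which is why state shared between the
distinct handler instances needs this kind of protection.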