Changeset 32322


Timestamp:
2018-08-06T19:08:14+12:00 (6 years ago)
Author:
ak19
Message:

Further updates to SafeProcess Readme, even though Dr Bainbridge has fixed the OpenOffice extension that led to all these Readme updates.

File:
1 edited

  • main/trunk/gli/src/org/greenstone/gatherer/util/Readme_Using_SafeProcess.txt

    r32321 r32322  
    3030    N. GLI vs GS3 CODE's SAFEPROCESS.JAVA - THE DIFFERENCES
    3131    O. FUTURE WORK, IMPROVEMENTS
     32    P. ALTERNATIVE FUTURE WORK: A case for rewriting SafeProcess to use IO::Select (Selector in Java) instead of multi-threading
     33    Q. PUBLIC API OF SAFEPROCESS, ITS SUB CLASSES AND INTERFACES
     34
    3235
    3336_______________________________________________________________________________
     
    926929_____________________________________________________________________________________________________________________
    927930
    928 P. STREAM GOBBLER **THREADS** ARE ALLOWED TO BLOCK (JVM WILL TIMESLICE). CAUSE INTERRUPTEDEXCEPTIONS TO END BLOCKS
     931P. ALTERNATIVE FUTURE WORK: A case for rewriting SafeProcess to use IO::Select (Selector in Java) instead of multi-threading
    929932_____________________________________________________________________________________________________________________
    930933
     934(a) STREAM GOBBLER **THREADS** ARE ALLOWED TO BLOCK (JVM WILL TIMESLICE). CAUSE INTERRUPTEDEXCEPTIONS TO END BLOCKS
    931935In the InputStreamGobbler Thread classes for handling the proc.errstream and proc.outstream, the run() methods have a loop on readLine():
    932936
     
    942946     // So must check BufferedReader.ready() before attempting readLine() on it
    943947     // See https://stackoverflow.com/questions/15521352/bufferedreader-readline-blocks
     948     // (also http://www.java-gaming.org/index.php?topic=37191.0)
    944949     while ( !this.isInterrupted() && br.ready() && (line = br.readLine()) != null ) { ...
    945950
     
    981986
    982987
    983 ANOTHER IMPORTANT LESSON FROM ANDREW:
     988(b) ANOTHER IMPORTANT LESSON FROM ANDREW:
    984989Andrew Mackintosh further said that the read() block calls in worker Threads are far better in terms of not wasting processor power than a constantly active while loop in the worker Thread.
    985 1. So doing this:
     9901. So doing something like this:
    986991
    987992    while(!thisthread.isInterrupted() && br.read() ) // br.read() possibly blocks
     
    10031008Other links:
    10041009- EOF/EOS (End of Stream) is indicated by a return value of -1 for read and null for readLine().
    1005 https://stackoverflow.com/questions/36569875/how-does-bufferedreader-readline-handle-eof-or-slow-input
    1006 https://stackoverflow.com/questions/3714090/how-to-see-if-a-reader-is-at-eof
     1010- https://stackoverflow.com/questions/36569875/how-does-bufferedreader-readline-handle-eof-or-slow-input
     1011- https://stackoverflow.com/questions/3714090/how-to-see-if-a-reader-is-at-eof
     1012- https://arstechnica.com/civis/viewtopic.php?t=259678
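The EOF/EOS convention noted above can be illustrated with a minimal sketch. It uses an in-memory StringReader (an assumption made purely so the example is self-contained), so unlike a process stream it never blocks. The class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class EofDemo {
    /** Counts lines until readLine() returns null, which signals end of stream. */
    static int countLines(String text) {
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            int lines = 0;
            while (br.readLine() != null) {
                lines++;
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Counts chars until read() returns -1, which signals end of stream. */
    static int countChars(String text) {
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            int chars = 0;
            while (br.read() != -1) {
                chars++;
            }
            return chars;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a process stream, both loops can block inside readLine()/read() while waiting for either data or EOF, which is the situation described in this section.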
     1013
     1014(c) I next tried reading a char at a time by calling BufferedReader.read(), but it blocked forever. With further testing, I confirmed that for the indefinitely blocking process stream (stdout in the use case where things went wrong), the BufferedReader and its underlying stream are never ready().
     1015
     1016Andrew then pointed me to the API for the BufferedReader.read(cbuf) variant at https://docs.oracle.com/javase/7/docs/api/java/io/BufferedReader.html#read(char[],%20int,%20int), which like the read() variant would return -1 on eof/eos. The API for read(cbuf) claimed that it would return the number of characters read, return -1 on eof/eos, or return when ready() returns false. The latter was not true: this overloaded method behaved the same as the read() variant that reads one char at a time. The buffer and underlying stream were never ready() and hence all variants of BufferedReader's reading methods always blocked.
     1017
     1018Following advice at https://www.experts-exchange.com/questions/20083639/BufferedReader-not-ready.html and elsewhere, I tried reading directly from the InputStream, as opposed to going through BufferedReader, and tested InputStream.available() (the equivalent of ready()) before attempting to read() from the stream:
     1019
     1020- https://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html#available()
     1021- https://stackoverflow.com/questions/3695372/what-does-inputstream-available-do-in-java
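A minimal sketch of the available()-before-read() approach, combined with a deadline. This is not the code that was actually tried in SafeProcess: the class name, the TIMED_OUT marker and the 10 ms poll interval are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class AvailablePollDemo {
    /** Hypothetical marker, distinct from the -1 that read() uses for EOF. */
    static final int TIMED_OUT = -2;

    /**
     * Polls available() until data shows up or timeoutMs elapses.
     * Returns the byte read (0-255), or TIMED_OUT if nothing became available.
     */
    static int readWithTimeout(InputStream in, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        try {
            while (System.currentTimeMillis() < deadline) {
                if (in.available() > 0) {
                    return in.read(); // data is already available, so this should not block
                }
                Thread.sleep(10); // brief pause so the loop does not spin at full speed
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return TIMED_OUT;
    }
}
```

One limitation of this approach: available() returning 0 does not distinguish "no data yet" from "end of stream", so a polling loop like this cannot detect EOF by itself.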
     1022
     1023(Further, the example of InterruptibleReadline at https://stackoverflow.com/questions/3595926/how-to-interrupt-bufferedreaders-readline is buggy and doesn't work when corrected: it's the same problem.)
     1024
     1025
     1026I finally came to the realisation our threaded SafeProcess would never work, and the references on the web to using java.nio when you want to do non-blocking IO started to make sense.
     1027I had considered and attempted setting a timeout to help detect when a stream might block forever, versus when a stream would eventually get data. The classes of java.nio were again recommended when you want to work with timeouts during IO operations.
     1028
     1029
     1030(d) java.nio - stands for New IO
     1031Thinking it stood for non-blocking IO, I googled for:
     1032   java nio non blocking io
     1033to confirm whether this was so, and not that nio stood for "native IO". The first search result was:
     1034
     1035"Blocking vs. Non-blocking IO. Java IO's various streams are blocking. That means, that when a thread invokes a read() or write() , that thread is blocked until there is some data to read, or the data is fully written. The thread can do nothing else in the meantime. Jun 23, 2014"
     1036Java NIO vs. IO - Jenkov Tutorials
     1037tutorials.jenkov.com/java-nio/nio-vs-io.html
     1038
     1039If only I had found information like this earlier. However, I may not have understood it or its significance at the time (before working on NetStinky).
     1040- https://www.experts-exchange.com/questions/20083639/BufferedReader-not-ready.html
     1041- https://www.quora.com/How-can-I-get-BufferedReader-to-only-wait-for-an-input-from-System-in-for-a-given-number-of-milliseconds
     1042
     1043There is a Selector class in java.nio.channels which works like Perl's IO::Select and the select() call in C/C++.
     1044
     1045
     1046Tutorials and further pages to read:
     1047- http://tutorials.jenkov.com/java-nio/nio-vs-io.html (general comparison and when to use java.io vs java.nio)
     1048- http://www.baeldung.com/java-nio-selector (the most useful for us)
     1049
     1050
     1051Bookmarked:
     1052- https://stackoverflow.com/questions/6619516/using-filechannel-to-write-any-inputstream
     1053says that to get a channel for an inputstream, you do:
     1054    Channels.newChannel(InputStream in)
     1055    http://docs.oracle.com/javase/7/docs/api/java/nio/channels/Channels.html
     1056- More information on channels: http://www.java2s.com/Tutorials/Java/Java_io/0930__Java_nio_Channels.htm   
     1057- Different kind of buffers: http://www.ntu.edu.sg/home/ehchua/programming/java/J5b_IO_advanced.html
     1058- https://docs.oracle.com/javase/7/docs/api/java/nio/channels/InterruptibleChannel.html (we want to use interruptible channels, so our channel must implement this)
     1059- https://docs.oracle.com/javase/7/docs/api/java/nio/channels/spi/AbstractSelectableChannel.html (we want to add our channel to a select, so our channel must implement this)
     1060- https://docs.oracle.com/javase/7/docs/api/java/nio/channels/Channels.html (factory for obtaining a channel)
     1061- https://docs.oracle.com/javase/7/docs/api/java/nio/channels/ReadableByteChannel.html (object returned by doing Channels.newChannel(InputStream in) is of type ReadableByteChannel)
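A minimal sketch of wrapping an InputStream with Channels.newChannel(), as the bookmarked links describe. The class and method names are hypothetical, and a ByteArrayInputStream stands in for a process stream. Note that for an ordinary stream like this one, the returned ReadableByteChannel is not a SelectableChannel, so it cannot be registered with a Selector as is; a rewrite would need to account for that.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectableChannel;
import java.nio.charset.StandardCharsets;

final class ChannelWrapDemo {
    /** Reads the whole stream through a channel and decodes it as UTF-8. */
    static String readAll(InputStream in) {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        try (ReadableByteChannel ch = Channels.newChannel(in)) {
            ByteBuffer buf = ByteBuffer.allocate(64);
            while (ch.read(buf) != -1) {            // -1 signals end of stream
                collected.write(buf.array(), 0, buf.position());
                buf.clear();                        // reuse the buffer for the next read
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(collected.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Reports whether the wrapped channel could be registered with a Selector. */
    static boolean isSelectable(InputStream in) {
        return Channels.newChannel(in) instanceof SelectableChannel;
    }
}
```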
     1062
     1063I think Selector is the way to reimplement SafeProcess so that it does non-blocking IO (with built-in timeouts) in place of multi-threading. Therefore, read through http://www.baeldung.com/java-nio-selector to refresh your memory of how IO::Select behaves in general and to learn how to do it in Java in particular.
     1064
     1065The idea is to move the multi-threaded SafeProcess to SafeProcess2.java and rewrite SafeProcess.java to use Java's equivalent of IO::Select (Selector). Then the classes that use SafeProcess would make the same calls as before. If we ever do go this route, see Section Q below for the public API of SafeProcess that needs to be reimplemented with java.nio/Selector.
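A minimal sketch of Selector with a timeout. It uses a java.nio.channels.Pipe as a stand-in for a process stream, because a Pipe's source channel is selectable; how a real process stream would be made selectable is left open here. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

final class SelectorTimeoutDemo {
    /**
     * Waits up to timeoutMs (must be > 0, since select(0) blocks indefinitely)
     * for data on a pipe. If message is non-null it is written to the pipe first.
     * Returns the text read, or null if the wait timed out.
     */
    static String selectOnce(String message, long timeoutMs) {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                source.configureBlocking(false); // required before register()
                source.register(selector, SelectionKey.OP_READ);

                if (message != null) {
                    sink.write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));
                }

                if (selector.select(timeoutMs) == 0) {
                    return null; // nothing became readable within the timeout
                }
                ByteBuffer buf = ByteBuffer.allocate(256);
                source.read(buf);
                buf.flip(); // switch the buffer from writing to reading
                return StandardCharsets.UTF_8.decode(buf).toString();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling selectOnce(null, 100) returns after the timeout instead of blocking forever, which is the behaviour the threaded read() and readLine() calls could not provide.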
     1066
     1067
     1068General information:
     1069https://stackoverflow.com/questions/355089/difference-between-stringbuilder-and-stringbuffer
     1070
     1071
     1072(e) Dr Bainbridge fixed the problem at the source: OpenOfficeConverter.pm was launching soffice in headless mode with the wrong command. When 2>&1 is used and the output is further redirected to a file (or somewhere else like /dev/null), the file redirect must come first, followed by 2>&1.
     1073
     1074So, "cmd >file 2>&1" is the right way, whereas "cmd 2>&1 >file" is the wrong way around. OpenOfficeConverter.pm was launching soffice the wrong way around and Dr Bainbridge fixed it.
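
The effect of the redirect order can be observed from Java with a minimal sketch. It assumes a POSIX sh is on the PATH and uses a stand-in command that prints "out" to stdout and "err" to stderr, not the actual soffice command. The class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

final class RedirectOrderDemo {
    /**
     * Runs the stand-in command with either ">file 2>&1" or "2>&1 >file".
     * Returns { what arrived on the process's stdout pipe, what landed in the file }.
     */
    static String[] run(boolean fileRedirectFirst) {
        String redirects = fileRedirectFirst ? ">\"$1\" 2>&1" : "2>&1 >\"$1\"";
        String script = "{ echo out; echo err 1>&2; } " + redirects;
        try {
            File file = File.createTempFile("redirect", ".txt");
            try {
                // The temp file path is passed as $1 to avoid quoting it into the script
                Process p = new ProcessBuilder("sh", "-c", script, "sh",
                                               file.getAbsolutePath()).start();
                String piped;
                try (BufferedReader br = new BufferedReader(
                        new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
                    String line = br.readLine(); // at most one line reaches the pipe
                    piped = (line == null) ? "" : line;
                }
                p.waitFor();
                String inFile = new String(Files.readAllBytes(file.toPath()),
                                           StandardCharsets.UTF_8).trim();
                return new String[] { piped, inFile };
            } finally {
                file.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

With the file redirect first, both streams end up in the file and nothing reaches the pipe. With 2>&1 first, stderr is duplicated onto the original stdout before stdout is pointed at the file, so stderr stays attached to the pipe that the launching process reads.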
     1075
     1076He says we don't need to pursue rewriting SafeProcess to use Selector, as the issue in GLI was ultimately due to an error in the command that was launched in perl. However, Dr Bainbridge said it was OK to commit the links I've found on using Selector and a description of the issue that led to considering Selector.
     1077
     1078
     1079
     1080_____________________________________________________________
     1081Q. PUBLIC API OF SAFEPROCESS, ITS SUB CLASSES AND INTERFACES
     1082_____________________________________________________________
     1083WE ONLY CARE ABOUT NON-GOBBLER CLASSES (SINCE STREAMGOBBLERS ARE THREAD SPECIFIC)
     1084
     1085public class SafeProcess
     1086    public static int DEBUG = 1;
     1087
     1088    public static final int STDERR = 0;
     1089    public static final int STDOUT = 1;
     1090    public static final int STDIN = 2;
     1091    public static String WIN_KILL_CMD;
     1092    public Boolean interruptible = Boolean.TRUE;
     1093
     1094
     1095 public SafeProcess(String[] cmd_args)
     1096 public SafeProcess(String cmdStr)
     1097 public SafeProcess(String[] cmd_args, String[] envparams, File launchDir)
     1098 
     1099
     1100    public String getStdOutput() { return outputStr; }
     1101    public String getStdError() { return errorStr; }
     1102    public int getExitValue() { return exitValue; }
     1103
     1104 
     1105    public void setInputString(String sendStr)
     1106    public void setExceptionHandler(ExceptionHandler exception_handler)
     1107    public void setMainHandler(MainProcessHandler handler)
     1108
     1109   
     1110    public void setSplitStdOutputNewLines(boolean split)
     1111    public void setSplitStdErrorNewLines(boolean split)
     1112
     1113    public boolean cancelRunningProcess() // calls cancelRunningProcess(!forceWaitUntilInterruptible);
     1114    public synchronized boolean cancelRunningProcess(boolean forceWaitUntilInterruptible)
     1115
     1116
     1117    public synchronized boolean processRunning()
     1118
     1119    public int runBasicProcess()
     1120    public int runProcess()
     1121    public int runProcess(CustomProcessHandler procInHandler,
     1122               CustomProcessHandler procOutHandler,
     1123               CustomProcessHandler procErrHandler)
     1124    public int runProcess(LineByLineHandler outLineByLineHandler, LineByLineHandler errLineByLineHandler)
     1125
     1126
     1127
     1128// KEEP AS IS:
     1129    public static long getProcessID(Process p)
     1130    static void killWinProcessWithID(long processID)
     1131    static boolean killUnixProcessWithID(long processID, boolean force, boolean killEntireTree)
     1132    public static void destroyProcess(Process p)
     1133
     1134    // LOGGING - KEEP
     1135    public static void log(String msg)
     1136    public static void log(String msg, Exception e)
     1137    public static void log(Exception e)
     1138    public static void log(String msg, Exception e, boolean printStackTrace)
     1139    public static String streamToString(int src)
     1140
     1141    // UTILITY FUNCTIONS - MOVE OUT OF CLASS INTO PACKAGE WHEN WE HAVE SAFEPROCESS AND SAFEPROCESS2
     1142public static boolean closeResource(Closeable resourceHandle)
     1143public static boolean closeSocket(Socket resourceHandle)
     1144public static boolean closeProcess(Process prcs)
     1145static public boolean processRunning(Process process)
     1146
     1147// Uses SafeProcess, but once that is working the new way can KEEP AS IS:
     1148    public static boolean isAvailable(String program)
     1149   
     1150
     1151INTERNAL INTERFACES AND ABSTRACT CLASSES:
     1152
     1153public static interface ExceptionHandler
     1154    public void gotException(Exception e)
     1155
     1156public static interface MainProcessHandler
     1157    public boolean beforeWaitingForStreamsToEnd(boolean forciblyTerminating);
     1158    public boolean afterStreamsEnded(boolean forciblyTerminating);
     1159    public void beforeProcessDestroy();
     1160    public void afterProcessDestroy();
     1161    public void doneCleanup(boolean wasForciblyTerminated);
     1162
     1163public static abstract class CustomProcessHandler
     1164    public String getThreadNamePrefix()
     1165    public abstract void run(Closeable stream); //InputStream or OutputStream
     1166
     1167
     1168public static abstract class LineByLineHandler
     1169    public String getThreadNamePrefix()
     1170    public abstract void gotLine(String line); // first non-null line
     1171    public abstract void gotException(Exception e); // for when an exception occurs instead of getting a line
     1172
     1173