Changeset 32322

Timestamp:
06.08.2018 19:08:14 (13 months ago)
Author:
ak19
Message:

Further updates to SafeProcess Readme, even though Dr Bainbridge has fixed the OpenOffice extension that led to all these Readme updates.

Files:
1 modified

  • main/trunk/gli/src/org/greenstone/gatherer/util/Readme_Using_SafeProcess.txt

    r32321 r32322  
    3030    N. GLI vs GS3 CODE's SAFEPROCESS.JAVA - THE DIFFERENCES 
    3131    O. FUTURE WORK, IMPROVEMENTS 
     32    P. ALTERNATIVE FUTURE WORK: A case for rewriting SafeProcess to use IO::Select (Selector in Java) instead of multi-threading 
     33    Q. PUBLIC API OF SAFEPROCESS, ITS SUB CLASSES AND INTERFACES 
     34 
    3235 
    3336_______________________________________________________________________________ 
     
    926929_____________________________________________________________________________________________________________________ 
    927930 
    928 P. STREAM GOBBLER **THREADS** ARE ALLOWED TO BLOCK (JVM WILL TIMESLICE). CAUSE INTERRUPTEDEXCEPTIONS TO END BLOCKS 
     931P. ALTERNATIVE FUTURE WORK: A case for rewriting SafeProcess to use IO::Select (Selector in Java) instead of multi-threading 
    929932_____________________________________________________________________________________________________________________ 
    930933 
     934(a) STREAM GOBBLER **THREADS** ARE ALLOWED TO BLOCK (JVM WILL TIMESLICE). CAUSE INTERRUPTEDEXCEPTIONS TO END BLOCKS 
    931935In the InputStreamGobbler Thread classes for handling the proc.errstream and proc.outstream, the run() methods have a loop on readLine(): 
    932936 
     
    942946     // So must check BufferedReader.ready() before attempting readLine() on it 
    943947     // See https://stackoverflow.com/questions/15521352/bufferedreader-readline-blocks 
     948     // (also http://www.java-gaming.org/index.php?topic=37191.0) 
    944949     while ( !this.isInterrupted() && br.ready() && (line = br.readLine()) != null ) { ... 
    945950 
     
    981986 
    982987 
    983 ANOTHER IMPORTANT LESSON FROM ANDREW: 
     988(b) ANOTHER IMPORTANT LESSON FROM ANDREW: 
    984989Andrew Mackintosh further said that the read() block calls in worker Threads are far better in terms of not wasting processor power than a constantly active while loop in the worker Thread. 
    985 1. So doing this: 
     9901. So doing something like this: 
    986991 
    987992    while(!thisthread.isInterrupted() && br.read() ) // br.read() possibly blocks 
     
    10031008Other links: 
    10041009- EOF/EOS (End of Stream) is indicated by a return value of -1 for read and null for readLine(). 
    1005 https://stackoverflow.com/questions/36569875/how-does-bufferedreader-readline-handle-eof-or-slow-input 
    1006 https://stackoverflow.com/questions/3714090/how-to-see-if-a-reader-is-at-eof 
     1010- https://stackoverflow.com/questions/36569875/how-does-bufferedreader-readline-handle-eof-or-slow-input 
     1011- https://stackoverflow.com/questions/3714090/how-to-see-if-a-reader-is-at-eof 
     1012- https://arstechnica.com/civis/viewtopic.php?t=259678 
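
The EOF/EOS return values noted above can be checked with a small sketch. It assumes an in-memory StringReader rather than a process stream, so nothing here can block; the class and method names (EofDemo, countLines, countChars) are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class EofDemo {
    // Reads line by line; readLine() returns null once the end of the stream is reached.
    static int countLines(String text) throws IOException {
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            int lines = 0;
            while (br.readLine() != null) {
                lines++;
            }
            return lines;
        }
    }

    // Reads char by char; read() returns -1 once the end of the stream is reached.
    static int countChars(String text) throws IOException {
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            int chars = 0;
            while (br.read() != -1) {
                chars++;
            }
            return chars;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countLines("a\nb\n"));
        System.out.println(countChars("a\nb\n"));
    }
}
```

On a process stream, both calls may block until data or EOF arrives, which is the situation described in this section.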
     1013 
     1014(c) I next tried reading a char at a time by calling BufferedReader.read(), but it blocked forever. With further testing, I confirmed that for the indefinitely blocking process stream (stdout in the use case where things went wrong), the BufferedReader and its underlying stream are never ready(). 
     1015 
     1016Andrew then pointed me to the API for the BufferedReader.read(cbuf) variant at https://docs.oracle.com/javase/7/docs/api/java/io/BufferedReader.html#read(char[],%20int,%20int), which like the read() variant would return -1 on eof/eos. The API for read(cbuf) claimed that it would return the number of characters read, return -1 on eof/eos, or return when ready() returns false. The latter was not true in my testing, and this overloaded method behaved the same as the read() variant that reads one char at a time: the buffer and underlying stream were never ready() and hence all variants of BufferedReader's reading methods always blocked. 
     1017 
     1018Following advice at https://www.experts-exchange.com/questions/20083639/BufferedReader-not-ready.html and elsewhere, I tried reading directly from the InputStream, as opposed to going through BufferedReader, and tested InputStream.available() (the equivalent of ready()) before attempting to read() from the stream. 
     1019 
     1020- https://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html#available() 
     1021- https://stackoverflow.com/questions/3695372/what-does-inputstream-available-do-in-java 
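
A minimal sketch of the available()-before-read() approach, assuming a ByteArrayInputStream stands in for the process stream and that the data is plain ASCII. AvailableDemo and readAvailable are hypothetical names, not part of SafeProcess. Note that available() returning 0 only means nothing can be read without blocking right now; it does not signal end of stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class AvailableDemo {
    // Reads only the bytes that available() reports as readable without blocking.
    // Stops as soon as available() returns 0, which may be before end of stream.
    static String readAvailable(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int ready;
        while ((ready = in.available()) > 0) {
            byte[] buf = new byte[ready];
            int got = in.read(buf, 0, ready);
            if (got == -1) {
                break; // end of stream
            }
            // Assumes single-byte (ASCII) data, so a chunk never splits a character.
            sb.append(new String(buf, 0, got, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAvailable(in));
    }
}
```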
     1022 
     1023(Further, the example of InterruptibleReadline at https://stackoverflow.com/questions/3595926/how-to-interrupt-bufferedreaders-readline is buggy and doesn't work even when corrected: it's the same problem.) 
     1024 
     1025 
     1026I finally came to the realisation that our threaded SafeProcess would never work, and the references on the web to using java.nio when you want to do non-blocking IO started to make sense. 
     1027I had considered and attempted setting a timeout to help detect when a stream might block forever, versus when a stream would eventually get data. The classes of java.nio were again recommended when you want to work with timeouts during IO operations. 
     1028 
     1029 
     1030(d) java.nio - stands for New IO 
     1031Thinking it stood for non-blocking IO, I googled for: 
     1032   java nio non blocking io 
     1033to confirm whether this was so, and not that nio stood for "native IO". The first search result was: 
     1034 
     1035"Blocking vs. Non-blocking IO. Java IO's various streams are blocking. That means, that when a thread invokes a read() or write(), that thread is blocked until there is some data to read, or the data is fully written. The thread can do nothing else in the meantime." (Jun 23, 2014) 
     1036Java NIO vs. IO - Jenkov Tutorials 
     1037tutorials.jenkov.com/java-nio/nio-vs-io.html 
     1038 
     1039If only I had found information like this earlier. However, I may not have understood it or its significance at the time (before working on NetStinky). 
     1040- https://www.experts-exchange.com/questions/20083639/BufferedReader-not-ready.html 
     1041- https://www.quora.com/How-can-I-get-BufferedReader-to-only-wait-for-an-input-from-System-in-for-a-given-number-of-milliseconds 
     1042 
     1043There is a Selector class in java.nio.channels which works like Perl's IO::Select (and select() in C/C++). 
     1044 
     1045 
     1046Tutorials and further pages to read: 
     1047- http://tutorials.jenkov.com/java-nio/nio-vs-io.html (general comparison and when to use java.io vs java.nio) 
     1048- http://www.baeldung.com/java-nio-selector (the most useful for us) 
     1049 
     1050 
     1051Bookmarked: 
     1052- https://stackoverflow.com/questions/6619516/using-filechannel-to-write-any-inputstream 
     1053says that to get a channel for an inputstream, you do: 
     1054    Channels.newChannel(InputStream in) 
     1055    http://docs.oracle.com/javase/7/docs/api/java/nio/channels/Channels.html 
     1056- More information on channels: http://www.java2s.com/Tutorials/Java/Java_io/0930__Java_nio_Channels.htm     
     1057- Different kind of buffers: http://www.ntu.edu.sg/home/ehchua/programming/java/J5b_IO_advanced.html 
     1058- https://docs.oracle.com/javase/7/docs/api/java/nio/channels/InterruptibleChannel.html (we want to use interruptible channels, so our channel must implement this) 
     1059- https://docs.oracle.com/javase/7/docs/api/java/nio/channels/spi/AbstractSelectableChannel.html (we want to add our channel to a select, so our channel must be a selectable channel, e.g. one that extends this abstract class) 
     1060- https://docs.oracle.com/javase/7/docs/api/java/nio/channels/Channels.html (factory for obtaining a channel) 
     1061- https://docs.oracle.com/javase/7/docs/api/java/nio/channels/ReadableByteChannel.html (object returned by doing Channels.newChannel(InputStream in) is of type ReadableByteChannel) 
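
A minimal sketch of the Channels.newChannel(InputStream) call bookmarked above, assuming an in-memory stream in place of a process stream. ChannelDemo, readAll and isSelectable are made-up names. As far as I can tell, the ReadableByteChannel returned for a plain InputStream is not a SelectableChannel, so it could not be registered with a Selector directly; the isSelectable() check below is there to make that visible.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectableChannel;
import java.nio.charset.StandardCharsets;

class ChannelDemo {
    // Wraps the stream in a channel and drains it through a ByteBuffer.
    static String readAll(InputStream in) throws IOException {
        try (ReadableByteChannel ch = Channels.newChannel(in)) {
            ByteBuffer buf = ByteBuffer.allocate(64);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            while (ch.read(buf) != -1) {       // -1 signals end of stream
                buf.flip();                    // switch buffer from writing to reading
                out.write(buf.array(), 0, buf.limit());
                buf.clear();                   // make the buffer ready for the next read
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    // Reports whether the wrapped channel could be registered with a Selector.
    static boolean isSelectable(InputStream in) {
        return Channels.newChannel(in) instanceof SelectableChannel;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "some process output".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(new ByteArrayInputStream(data)));
        System.out.println(isSelectable(new ByteArrayInputStream(data)));
    }
}
```

If this holds for the streams returned by java.lang.Process as well, a Selector-based rewrite would need some other source of selectable channels, which is worth confirming before starting.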
     1062 
     1063I think Selector is the way to go to reimplement SafeProcess, using select to do non-blocking IO (with built-in timeouts) in place of multi-threading. Therefore read through http://www.baeldung.com/java-nio-selector to refresh memory on the behaviour of IO::Select in general and to learn how to do it in Java in particular. 
     1064 
     1065The idea is to move the multi-threaded SafeProcess to SafeProcess2.java and re-write SafeProcess.java to use Java's equivalent of IO::Select (Selector). Then the classes that use SafeProcess would make the same calls as before. If we ever do go this route, see Section Q below for the public API of SafeProcess that needs to be reimplemented with java.nio/Selector. 
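
A minimal sketch of select-with-timeout in Java, not a SafeProcess rewrite. It assumes a java.nio.channels.Pipe as the data source, because its source end is a selectable channel. SelectorDemo and readWithTimeout are hypothetical names, and timeoutMs is assumed to be greater than 0 (select(0) waits indefinitely).

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

class SelectorDemo {
    // Waits up to timeoutMs for data on the pipe's source end.
    // Returns the text read, or null if nothing became readable in time.
    static String readWithTimeout(Pipe.SourceChannel source, long timeoutMs) throws IOException {
        source.configureBlocking(false);               // required before registering with a Selector
        try (Selector selector = Selector.open()) {
            source.register(selector, SelectionKey.OP_READ);
            if (selector.select(timeoutMs) == 0) {
                return null;                           // timed out: no channel became ready
            }
            ByteBuffer buf = ByteBuffer.allocate(256);
            int n = source.read(buf);                  // non-blocking read of what is ready
            if (n <= 0) {
                return "";                             // end of stream or nothing read
            }
            return new String(buf.array(), 0, n, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Pipe pipe = Pipe.open();
        // Nothing has been written yet, so this call times out instead of blocking forever.
        System.out.println(readWithTimeout(pipe.source(), 200));
        pipe.sink().write(ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8)));
        System.out.println(readWithTimeout(pipe.source(), 200));
    }
}
```

The point of the sketch is the first call: a read that would block forever with java.io returns after the timeout instead.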
     1066 
     1067 
     1068General information: 
     1069https://stackoverflow.com/questions/355089/difference-between-stringbuilder-and-stringbuffer 
     1070 
     1071 
     1072(e) Dr Bainbridge fixed the problem at the source: OpenOfficeConverter.pm was launching soffice in headless mode with the wrong command. When 2>&1 is used and the output is further redirected to a file (or somewhere else like /dev/null), the file redirect must come first and 2>&1 after it. 
     1073 
     1074So, "cmd >file 2>&1" is the right way, whereas "cmd 2>&1 >file" is the wrong way around. OpenOfficeConverter.pm was launching soffice the wrong way around and Dr Bainbridge fixed it. 
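
The effect of the redirect order on the launching Java process can be sketched as follows. This assumes a POSIX sh is on the PATH and uses a throwaway shell snippet in place of soffice; RedirectOrderDemo and pipeOutput are made-up names, and FILE is just a placeholder that the code substitutes with a temp file path.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class RedirectOrderDemo {
    // Runs a snippet that writes "out" to stdout and "err" to stderr using the given
    // redirection, and returns whatever still arrives on the child's stdout pipe.
    static String pipeOutput(String redirectTemplate) throws IOException, InterruptedException {
        Path file = Files.createTempFile("redirect", ".log");
        try {
            String redirect = redirectTemplate.replace("FILE", "'" + file + "'");
            String cmd = "{ echo out; echo err >&2; } " + redirect;
            Process p = new ProcessBuilder("sh", "-c", cmd).start();
            ByteArrayOutputStream seen = new ByteArrayOutputStream();
            try (InputStream in = p.getInputStream()) {
                byte[] buf = new byte[1024];
                int n;
                while ((n = in.read(buf)) != -1) {   // read until the child closes the pipe
                    seen.write(buf, 0, n);
                }
            }
            p.waitFor();
            return new String(seen.toByteArray(), StandardCharsets.UTF_8).trim();
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws Exception {
        // Right order: stdout goes to the file, then stderr follows it there.
        System.out.println("[" + pipeOutput(">FILE 2>&1") + "]");
        // Wrong order: stderr is pointed at the original stdout (the pipe) before stdout moves to the file.
        System.out.println("[" + pipeOutput("2>&1 >FILE") + "]");
    }
}
```

With the wrong order, the child's stderr stays attached to the pipe that the Java side is reading, so the launched program keeps that pipe open for as long as it runs.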
     1075 
     1076He says we don't need to pursue rewriting SafeProcess to use Selector, as the issue in GLI was ultimately due to an error in the command that was launched in perl. However, Dr Bainbridge said it was OK to commit the links I've found on using Selector and a description of the issue that led to considering Selector. 
     1077 
     1078 
     1079 
     1080_____________________________________________________________ 
     1081Q. PUBLIC API OF SAFEPROCESS, ITS SUB CLASSES AND INTERFACES 
     1082_____________________________________________________________ 
     1083WE ONLY CARE ABOUT NON-GOBBLER CLASSES (SINCE STREAMGOBBLERS ARE THREAD SPECIFIC) 
     1084 
     1085public class SafeProcess 
     1086    public static int DEBUG = 1; 
     1087 
     1088    public static final int STDERR = 0; 
     1089    public static final int STDOUT = 1; 
     1090    public static final int STDIN = 2; 
     1091    public static String WIN_KILL_CMD; 
     1092    public Boolean interruptible = Boolean.TRUE;  
     1093 
     1094 
     1095    public SafeProcess(String[] cmd_args) 
     1096    public SafeProcess(String cmdStr) 
     1097    public SafeProcess(String[] cmd_args, String[] envparams, File launchDir) 
     1098   
     1099 
     1100    public String getStdOutput() { return outputStr; } 
     1101    public String getStdError() { return errorStr; } 
     1102    public int getExitValue() { return exitValue; } 
     1103 
     1104  
     1105    public void setInputString(String sendStr)  
     1106    public void setExceptionHandler(ExceptionHandler exception_handler) 
     1107    public void setMainHandler(MainProcessHandler handler) 
     1108 
     1109     
     1110    public void setSplitStdOutputNewLines(boolean split) 
     1111    public void setSplitStdErrorNewLines(boolean split) 
     1112 
     1113    public boolean cancelRunningProcess() // calls cancelRunningProcess(!forceWaitUntilInterruptible); 
     1114    public synchronized boolean cancelRunningProcess(boolean forceWaitUntilInterruptible) 
     1115 
     1116 
     1117    public synchronized boolean processRunning() 
     1118 
     1119    public int runBasicProcess() 
     1120    public int runProcess() 
     1121    public int runProcess(CustomProcessHandler procInHandler, 
     1122               CustomProcessHandler procOutHandler, 
     1123               CustomProcessHandler procErrHandler) 
     1124    public int runProcess(LineByLineHandler outLineByLineHandler, LineByLineHandler errLineByLineHandler) 
     1125 
     1126 
     1127 
     1128// KEEP AS IS: 
     1129    public static long getProcessID(Process p) 
     1130    static void killWinProcessWithID(long processID) 
     1131    static boolean killUnixProcessWithID(long processID, boolean force, boolean killEntireTree) 
     1132    public static void destroyProcess(Process p) 
     1133 
     1134    // LOGGING - KEEP  
     1135    public static void log(String msg) 
     1136    public static void log(String msg, Exception e) 
     1137    public static void log(Exception e) 
     1138    public static void log(String msg, Exception e, boolean printStackTrace) 
     1139    public static String streamToString(int src) 
     1140 
     1141    // UTILITY FUNCTIONS - MOVE OUT OF CLASS INTO PACKAGE WHEN WE HAVE SAFEPROCESS AND SAFEPROCESS2 
     1142    public static boolean closeResource(Closeable resourceHandle) 
     1143    public static boolean closeSocket(Socket resourceHandle) 
     1144    public static boolean closeProcess(Process prcs) 
     1145    static public boolean processRunning(Process process) 
     1146 
     1147// Uses SafeProcess, but once that is working the new way can KEEP AS IS: 
     1148    public static boolean isAvailable(String program) 
     1149     
     1150 
     1151INTERNAL INTERFACES AND ABSTRACT CLASSES: 
     1152 
     1153public static interface ExceptionHandler 
     1154    public void gotException(Exception e) 
     1155 
     1156public static interface MainProcessHandler 
     1157    public boolean beforeWaitingForStreamsToEnd(boolean forciblyTerminating); 
     1158    public boolean afterStreamsEnded(boolean forciblyTerminating); 
     1159    public void beforeProcessDestroy();  
     1160    public void afterProcessDestroy(); 
     1161    public void doneCleanup(boolean wasForciblyTerminated); 
     1162 
     1163public static abstract class CustomProcessHandler 
     1164    public String getThreadNamePrefix() 
     1165    public abstract void run(Closeable stream); //InputStream or OutputStream 
     1166 
     1167 
     1168public static abstract class LineByLineHandler 
     1169    public String getThreadNamePrefix() 
     1170    public abstract void gotLine(String line); // first non-null line 
     1171    public abstract void gotException(Exception e); // for when an exception occurs instead of getting a line 
     1172 
     1173