Timestamp:
13.03.2017 19:48:56 (3 years ago)
Author:
ak19
Message:

BUGFIX to servercontrol.pm. servercontrol::config() was merging the stderr and stdout of the wget command so that it could parse the ping response from a single stream: the response code and response message both go to stderr, while the HTML page's text goes to stdout. This worked fine every time I'd tested it before, such as some months back when I tested the incremental build tutorial. But the merge failed today and showed how bad an idea it was: the very line of the HTML from STDOUT that is parsed and compared against an expected value was interspersed with output from stderr. So the regex didn't match, and ultimately the collection was assumed deactivated when it was activated and vice versa. I attempted two fixes and am committing the one that worked: the wget command now stores the downloaded HTML in a file named by timestamp, which is deleted as soon as it has been read. The failed attempt used open3, but the online Perl manual warns about the danger of blocking when reading from both the stderr and stdout streams. I'm not sure whether that is what I encountered, but I decided against open3 and returned to the file-based version of the fix.
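
The failure mode described in this message can be reproduced in miniature. The sketch below is only an illustration: the page line, the expected pattern and the stderr fragment are all hypothetical stand-ins, not the actual ping response or wget output handled by servercontrol.pm.

```perl
use strict;
use warnings;

# Hypothetical page line and expected pattern, for illustration only.
my $page_line   = '<div id="status">collection activated</div>';
my $expected_re = qr/collection activated/;

# Clean stdout: the pattern matches the line as the server wrote it.
my $clean_matches = ($page_line =~ $expected_re) ? 1 : 0;

# Merged stream: a (hypothetical) stderr fragment lands in the middle of
# the very line being tested, splitting the text the regex looks for.
my $stderr_fragment = "100%[====>] 45 --.-K/s";
my $cut             = index($page_line, 'activated');
my $merged_line     = substr($page_line, 0, $cut) . $stderr_fragment . substr($page_line, $cut);
my $merged_matches  = ($merged_line =~ $expected_re) ? 1 : 0;

print "clean: $clean_matches, merged: $merged_matches\n";   # prints "clean: 1, merged: 0"
```

Because the two streams are written independently, nothing guarantees where stderr text ends up once they share a pipe, which is why a test that passed for months could suddenly fail.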

Files:
1 modified

  • main/trunk/greenstone2/perllib/servercontrol.pm

--- main/trunk/greenstone2/perllib/servercontrol.pm (r31488)
+++ main/trunk/greenstone2/perllib/servercontrol.pm (r31507)
@@ -140,5 +140,6 @@
 
     my $wget_file_path = &FileUtils::filenameConcatenate($ENV{'GSDLHOME'}, "bin", $ENV{'GSDLOS'}, "wget");
-
+    my $tmpfilename = time . ".html"; # random name for file wherein we'll store the HTML page retrieved by wget
+
     # https://www.gnu.org/software/wget/manual/wget.html
     # output-document set to - (STDOUT), so page is streamed to STDOUT
@@ -147,6 +148,8 @@
     # Searching for "perl backtick operator redirect stderr to stdout":
     # http://www.perlmonks.org/?node=How%20can%20I%20capture%20STDERR%20from%20an%20external%20command%3F
-    $wgetCommand = "\"$wget_file_path\" --output-document=- -T 5 -t 1 \"$library_url$wgetCommand\" 2>&1";
-    #$wgetCommand = "\"$wget_file_path\" --spider -T 5 -t 1 \"$library_url$wgetCommand\" 2>&1"; # won't save page
+    ##$wgetCommand = "\"$wget_file_path\" --spider -T 5 -t 1 \"$library_url$wgetCommand\" 2>&1"; # won't save page
+    #$wgetCommand = "\"$wget_file_path\" --output-document=- -T 5 -t 1 \"$library_url$wgetCommand\" 2>&1"; # THIS CAN MIX UP STDERR WITH STDOUT IN THE VERY LINE WE REGEX TEST AGAINST EXPECTED OUTPUT!!
+    $wgetCommand = "\"$wget_file_path\" --output-document=$tmpfilename -T 5 -t 1 \"$library_url$wgetCommand\" 2>&1"; # keep stderr (response code, response_content) separate from html page content
+
     ##print STDERR "@@@@ $wgetCommand\n";
 
@@ -185,5 +188,13 @@
 
         # check the page content is as expected
-        my $resultstr = $response_content;
+        #my $resultstr = $response_content;
+
+        open(FIN,"<$tmpfilename") or die "servercontrol.pm: Unable to open $tmpfilename to read ping response page...ERROR: $!\n";
+        my $resultstr;
+        # Read in the entire contents of the file in one hit
+        sysread(FIN, $resultstr, -s FIN);
+        close(FIN);
+        &FileUtils::removeFiles("$tmpfilename");
+
         #$resultstr =~ s@.*gs_content\"\>@@s;   ## only true for default library servlet
         #$resultstr =~ s@</div>.*@@s;
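
The committed change names the temporary file after the current time, so two runs started within the same second would share a name. The following is a minimal sketch of the same write-to-file, read, delete idea using the core File::Temp module to pick a unique name instead. This is a swapped-in technique, not what the commit does. The helper name fetch_via_tempfile and the stand-in command are hypothetical, and a Perl one-liner replaces wget so the example runs offline.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper, not part of servercontrol.pm.
# $build_cmd is a code ref that receives the temp file path and returns the
# shell command to run; that command is expected to write the page to the path.
sub fetch_via_tempfile {
    my ($build_cmd) = @_;

    # tempfile() creates a uniquely named file in the system temp directory.
    my ($tmp_fh, $tmp_path) = tempfile("ping_XXXXXX", SUFFIX => ".html", TMPDIR => 1);
    close($tmp_fh);

    # With the page going to the file, the merged pipe carries diagnostics only.
    my $cmd         = $build_cmd->($tmp_path) . " 2>&1";
    my $diagnostics = `$cmd`;

    # Read the whole page back in one go, then remove the file.
    open(my $in, '<', $tmp_path) or die "Unable to open $tmp_path: $!\n";
    my $page = do { local $/; <$in> };
    close($in);
    unlink($tmp_path);

    return ($diagnostics, $page);
}

# Stand-in for wget (an assumption made so the sketch needs no network):
# writes a page to the given path and a status line to stderr.
my $perl = $^X;
my ($diagnostics, $page) = fetch_via_tempfile(sub {
    my ($path) = @_;
    return qq{"$perl" -e "open(F,'>',shift);print F 'PAGE BODY';close(F);print STDERR '200 OK'" "$path"};
});
print "stderr: $diagnostics\npage: $page\n";
```

With wget, the callback would instead return a command along the lines of the one in the diff, passing the temp path to --output-document.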