Changeset 31507

2017-03-13T19:48:56+13:00

BUGFIX: servercontrol::config() was merging the stderr and stdout of the wget command so that it could read the response code and response message (both written to stderr) along with the HTML page text (written to stdout) and parse the ping response. This worked every time I had tested it before, for example some months back when I tested the incremental build tutorial. Today the merge failed and showed how bad an idea merging the two was: the very line of HTML from stdout that gets parsed and compared against an expected value was interspersed with output from stderr. The regex therefore didn't match, and the collection was ultimately assumed deactivated when it was activated and vice versa. I attempted two fixes and am committing the one that worked: the wget command now stores the downloaded HTML in a file named by timestamp, which is deleted as soon as it has been read. The failed attempt used open3. The online Perl documentation warns about the danger of blocking when reading from both the stderr and stdout streams; I'm not sure whether that is what I encountered, but I decided against open3 and returned to the file-based version of the fix, which worked.
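For comparison, here is a minimal sketch of how the abandoned open3 route could keep the two streams apart without blocking. It is not the code from this changeset. The helper name run_capture is hypothetical, the child process is a stand-in for wget, and the sketch assumes a Unix-like system where select() works on pipes.

```perl
use strict;
use warnings;
use IPC::Open3;
use IO::Select;
use Symbol qw(gensym);

# Hypothetical helper: run a command and return (stdout, stderr, exit code).
sub run_capture {
    my @cmd = @_;

    # open3 only keeps STDERR separate if it is handed a ready-made handle;
    # with a false value it would send STDERR to the STDOUT handle instead.
    my $err = gensym;
    my $pid = open3(my $in, my $out, $err, @cmd);
    close($in);    # the child gets no input from us

    my %buf  = (out => '', err => '');
    my %name = (fileno($out) => 'out', fileno($err) => 'err');

    # Read whichever stream has data ready, so a full pipe on one stream
    # cannot stall the child while we wait on the other.
    my $sel = IO::Select->new($out, $err);
    while ($sel->count) {
        for my $fh ($sel->can_read) {
            my $n = sysread($fh, my $chunk, 4096);
            if (!$n) {    # EOF (0) or read error (undef): stop watching it
                $sel->remove($fh);
                close($fh);
                next;
            }
            $buf{ $name{ fileno($fh) } } .= $chunk;
        }
    }

    waitpid($pid, 0);
    return ($buf{out}, $buf{err}, $? >> 8);
}

# Stand-in child: the "page" goes to STDOUT, the "status" to STDERR.
my ($page, $status, $rc) =
    run_capture($^X, '-e', 'print "page\n"; print STDERR "status\n";');
print "page=$page", "status=$status", "exit=$rc\n";
```

The temporary-file fix that was committed avoids this machinery entirely, which may be a reasonable trade for a one-off ping request.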

1 edited


  • main/trunk/greenstone2/perllib/

    r31488  r31507
    141     141       my $wget_file_path = &FileUtils::filenameConcatenate($ENV{'GSDLHOME'}, "bin", $ENV{'GSDLOS'}, "wget");
            142     + my $tmpfilename = time . ".html"; # random name for file wherein we'll store the HTML page retrieved by wget
    143     144       #
    144     145       # output-document set to - (STDOUT), so page is streamed to STDOUT
    147     148       # Searching for "perl backtick operator redirect stderr to stdout":
    148     149       #
    149             - $wgetCommand = "\"$wget_file_path\" --output-document=- -T 5 -t 1 \"$library_url$wgetCommand\" 2>&1";
    150             - #$wgetCommand = "\"$wget_file_path\" --spider -T 5 -t 1 \"$library_url$wgetCommand\" 2>&1"; # won't save page
            150     + ##$wgetCommand = "\"$wget_file_path\" --spider -T 5 -t 1 \"$library_url$wgetCommand\" 2>&1"; # won't save page
            151     + #$wgetCommand = "\"$wget_file_path\" --output-document=- -T 5 -t 1 \"$library_url$wgetCommand\" 2>&1"; # THIS CAN MIX UP STDERR WITH STDOUT IN THE VERY LINE WE REGEX TEST AGAINST EXPECTED OUTPUT!!
            152     + $wgetCommand = "\"$wget_file_path\" --output-document=$tmpfilename -T 5 -t 1 \"$library_url$wgetCommand\" 2>&1"; # keep stderr (response code, response_content) separate from html page content
    151     154       ##print STDERR "@@@@ $wgetCommand\n";
    186     189           # check the page content is as expected
    187             -     my $resultstr = $response_content;
            190     +     #my $resultstr = $response_content;
            192     +     open(FIN,"<$tmpfilename") or die " Unable to open $tmpfilename to read ping response page...ERROR: $!\n";
            193     +     my $resultstr;
            194     +     # Read in the entire contents of the file in one hit
            195     +     sysread(FIN, $resultstr, -s FIN);
            196     +     close(FIN);
            197     +     &FileUtils::removeFiles("$tmpfilename");
    188     199           #$resultstr =~ s@.*gs_content\"\>@@s;   ## only true for default library servlet
    189     200           #$resultstr =~ s@</div>.*@@s;
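The idea behind this diff can be tried in isolation with the sketch below, which is an illustration and not the changeset's code. The page body goes to a temporary file, so the text captured through 2>&1 holds only the status output. A small child Perl script stands in for wget (an assumption, so the sketch runs without a server). File::Temp from core Perl picks the file name here instead of the timestamp used in the commit, and the file is slurped through a lexical filehandle.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for wget: writes the "page" to the file named in its
# first argument and a status line to STDERR.
my ($script_fh, $script_path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print $script_fh <<'CHILD';
open(my $out, '>', $ARGV[0]) or die "cannot write $ARGV[0]: $!";
print $out "<div>ping ok</div>\n";
close($out);
print STDERR "HTTP request sent, awaiting response... 200 OK\n";
CHILD
close($script_fh);

# Temporary file for the page body; File::Temp chooses a unique name.
my ($page_fh, $page_path) = tempfile(SUFFIX => '.html');
close($page_fh);

# Backticks capture STDOUT and 2>&1 folds STDERR into it. Because the page
# goes to the file, the captured text contains only the status line.
my $status = `"$^X" "$script_path" "$page_path" 2>&1`;

# Read the whole page in one go, then delete the file.
open(my $in, '<', $page_path) or die "Unable to open $page_path: $!";
my $page = do { local $/; <$in> };
close($in);
unlink($page_path);

print "status: $status";
print "page:   $page";
```

Compared with a name built from time, a File::Temp name avoids collisions when two requests start within the same second.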