Changeset 31975 for main/trunk

Timestamp:
19.09.2017 18:59:51 (2 years ago)
Author:
ak19
Message:

Another bugfix to downloading. Downloading over OAI wasn't working. Dr Bainbridge discovered that this was because OAIDownload.pm was still double-quoting filepaths and URLs, whereas open3(), which launches the wget cmd from an array of arguments, can't handle quotes in those arguments. WgetDownload used split to convert the cmd string into a cmd array. Passing the WgetDownload::useWget() methods an array of cmd parameters was rejected as a solution (too involved and error-prone to change all the calling code that constructs the parameter cmd string). The clean solution was to use the quotewords() subroutine in place of split. This preserves spaces inside double-quoted params in the cmd string while splitting on spaces outside quoted strings. It also removes the double quotes (and unescapes double backslashes). Tested on Mac: both OAI and Web downloading now work.
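The difference between the two splitting approaches described in the message can be sketched as below. This is a minimal illustration, not code from the changeset: the helper name `split_wget_args`, the sample path and the sample URL are hypothetical. Only `split` and `quotewords('\s+', 0, ...)` from the core Text::ParseWords module are taken from the commit.

```perl
use strict;
use warnings;
use Text::ParseWords;    # core module, exports quotewords()

# Hypothetical helper: trim the cmd string, then split it into arguments,
# keeping spaces inside double-quoted params and dropping the quotes.
sub split_wget_args {
    my ($cmd) = @_;
    $cmd =~ s/^\s+//;    # remove leading whitespace
    $cmd =~ s/\s+$//;    # remove trailing whitespace
    return quotewords('\s+', 0, $cmd);
}

# Made-up parameter string with spaces inside quoted values.
my $cmd = ' -O "/tmp/my downloads/out.html" "http://example.org/a b" ';

# Old approach: split breaks quoted params apart and keeps the quote characters.
my @old_args = split(' ', $cmd);          # 5 elements
# New approach: quoted params stay whole and lose their quotes.
my @new_args = split_wget_args($cmd);     # 3 elements

print "split:      ", join(" | ", @old_args), "\n";
print "quotewords: ", join(" | ", @new_args), "\n";
```

With the array form, each element reaches the launched program as one argument, so a filepath containing spaces survives only if it is a single element of the array.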

Files:
1 modified

  • main/trunk/greenstone2/perllib/downloaders/WgetDownload.pm

--- main/trunk/greenstone2/perllib/downloaders/WgetDownload.pm (r31957)
+++ main/trunk/greenstone2/perllib/downloaders/WgetDownload.pm (r31975)
@@ -40,4 +40,5 @@
 use IO::Select;
 use IO::Socket;
+use Text::ParseWords; # part of Core modules. Needed to use quotewords() subroutine
 
 #use IO::Select qw( );
@@ -368,9 +369,17 @@
     # http://www.perlmonks.org/?node_id=394709 and that ends up causing problems in terminating wget, as 2 processes
     # got launched then which don't have parent-child pid relationship (so that terminating one doesn't terminate the other).
-    my @commandargs = split(' ', $cmdWget);
-    unshift(@commandargs, $wget_file_path);
-    $command = "$wget_file_path $cmdWget";
-#    print STDOUT "Command is: $command\n"; # displayed in GLI output
-#    print STDERR "Command is: $command\n"; # goes into ServerInfoDialog
+
+    # remove leading and trailing spaces, https://stackoverflow.com/questions/4597937/perl-function-to-trim-string-leading-and-trailing-whitespace
+    $cmdWget =~ s/^\s+//;
+    $cmdWget =~ s/\s+$//;
+
+    # split on "words"
+    #my @commandargs = split(' ', $cmdWget);
+    # quotewords: to split on spaces except within quotes, then removes quotes and unescapes double backslash too
+      # https://stackoverflow.com/questions/19762412/regex-to-split-key-value-pairs-ignoring-space-in-double-quotes
+      # https://docstore.mik.ua/orelly/perl/perlnut/c08_389.htm
+    my @commandargs = quotewords('\s+', 0, $cmdWget);
+    unshift(@commandargs, $wget_file_path); # prepend the wget cmd
+    #print STDERR "Command is: ".join(",", @commandargs) . "\n"; # goes into ServerInfoDialog
 
     # Wget's output needs to be monitored to find out when it has naturally terminated.
@@ -639,8 +648,10 @@
     my $wget_file_path = &FileUtils::filenameConcatenate($ENV{'GSDLHOME'}, "bin", $ENV{'GSDLOS'}, "wget");
     # compose the command as an array for open3, to preserve spaces in any filepath
-    my @commandargs = split(' ', $cmdWget);
-    unshift(@commandargs, $wget_file_path);
-    my $command = "$wget_file_path $cmdWget";
-    #print STDOUT "Command is: $command\n";
+    # Do so by removing leading and trailing spaces, then splitting on "words" (preserving spaces in quoted words and removing quotes)
+    $cmdWget =~ s/^\s+//;
+    $cmdWget =~ s/\s+$//;
+    my @commandargs = quotewords('\s+', 0, $cmdWget);
+    unshift(@commandargs, $wget_file_path); # prepend wget cmd to the command array
+    #print STDOUT "Command is: ".join(",", @commandargs) . "\n";
 
     eval {     # see p.568 of Perl Cookbook