09/19/17 18:59:51 (3 years ago)

Another bugfix to downloading. Downloading over OAI wasn't working. Dr Bainbridge discovered that this was because OAIDownload.pm was still double-quoting filepaths and URLs, whereas open3(), which launches the wget cmd, can't handle quotes in its arguments. WgetDownload used split to convert the cmd string into a cmd array. Passing the WgetDownload::useWget() methods an array of cmd parameters would have been too involved and error prone, since all the calling code that constructs the parameter cmd string would need changing. The clean solution was instead to use quotewords() in place of split. This preserves spaces in double-quoted params in the cmd string while splitting on spaces outside quoted strings. It then also removes the double quotes (and unescapes double backslashes). Tested on Mac: both OAI and Web downloading now work.
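The difference between split and quotewords() described above can be seen in a small standalone sketch. The argument string below (the URL and the download directory) is made up for illustration and is not taken from OAIDownload.pm; only the trimming and the quotewords('\s+', 0, ...) call mirror what the changeset does.

```perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);

# Hypothetical wget argument string with double-quoted params (illustrative values only).
my $cmdWget = ' -P "/tmp/my downloads" "http://example.org/oai?verb=Identify" ';

# Trim leading and trailing whitespace first, as the changeset does.
$cmdWget =~ s/^\s+//;
$cmdWget =~ s/\s+$//;

# Old behaviour: splits inside the quoted path and leaves the quotes in place.
my @naive = split(' ', $cmdWget);

# New behaviour: splits on whitespace outside quotes; keep=0 strips the quotes.
my @quoted = quotewords('\s+', 0, $cmdWget);

print "split:      ", join('|', @naive), "\n";
print "quotewords: ", join('|', @quoted), "\n";
```

With split, the quoted path is broken into two elements that still carry their quote characters; with quotewords() it stays a single unquoted element, which is the form a list-style open3() call needs.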

1 edited


  • main/trunk/greenstone2/perllib/downloaders/WgetDownload.pm

 r31957 r31975
     40     40  use IO::Select;
     41     41  use IO::Socket;
            42  use Text::ParseWords; # part of Core modules. Needed to use quotewords() subroutine
     43     44  #use IO::Select qw( );
      …      …
    368    369      # http://www.perlmonks.org/?node_id=394709 and that ends up causing problems in terminating wget, as 2 processes
    369    370      # got launched then which don't have parent-child pid relationship (so that terminating one doesn't terminate the other).
    370              my @commandargs = split(' ', $cmdWget);
    371              unshift(@commandargs, $wget_file_path);
    372              $command = "$wget_file_path $cmdWget";
    373          #    print STDOUT "Command is: $command\n"; # displayed in GLI output
    374          #    print STDERR "Command is: $command\n"; # goes into ServerInfoDialog
           372      # remove leading and trailing spaces, https://stackoverflow.com/questions/4597937/perl-function-to-trim-string-leading-and-trailing-whitespace
           373      $cmdWget =~ s/^\s+//;
           374      $cmdWget =~ s/\s+$//;
           376      # split on "words"
           377      #my @commandargs = split(' ', $cmdWget);
           378      # quotewords: to split on spaces except within quotes, then removes quotes and unescapes double backslash too
           379        # https://stackoverflow.com/questions/19762412/regex-to-split-key-value-pairs-ignoring-space-in-double-quotes
           380        # https://docstore.mik.ua/orelly/perl/perlnut/c08_389.htm
           381      my @commandargs = quotewords('\s+', 0, $cmdWget);
           382      unshift(@commandargs, $wget_file_path); # prepend the wget cmd
           383      #print STDERR "Command is: ".join(",", @commandargs) . "\n"; # goes into ServerInfoDialog
    376    385      # Wget's output needs to be monitored to find out when it has naturally terminated.
      …      …
    639    648      my $wget_file_path = &FileUtils::filenameConcatenate($ENV{'GSDLHOME'}, "bin", $ENV{'GSDLOS'}, "wget");
    640    649      # compose the command as an array for open3, to preserve spaces in any filepath
    641              my @commandargs = split(' ', $cmdWget);
    642              unshift(@commandargs, $wget_file_path);
    643              my $command = "$wget_file_path $cmdWget";
    644              #print STDOUT "Command is: $command\n";
           650      # Do so by removing leading and trailing spaces, then splitting on "words" (preserving spaces in quoted words and removing quotes)
           651      $cmdWget =~ s/^\s+//;
           652      $cmdWget =~ s/\s+$//;
           653      my @commandargs = quotewords('\s+', 0, $cmdWget);
           654      unshift(@commandargs, $wget_file_path); # prepend wget cmd to the command array
           655      #print STDOUT "Command is: ".join(",", @commandargs) . "\n";
    646    657      eval {     # see p.568 of Perl Cookbook