Changeset 37047


Timestamp: 2022-12-23T10:24:09+13:00
Author: davidb
Message:

Introduce 'metadata_separate_fields', a plugin option that controls which fields the metadata value separator is applied to. By default, all fields are split when the value separator character is specified. However, there are situations where you want to split on (say) ',' in a Keyword field but not in an Abstract field that happens to contain commas.

Location: main/trunk/greenstone2/perllib
Files: 3 edited

  • main/trunk/greenstone2/perllib/plugins/CSVFieldSeparator.pm

    r34249 r37047
    @@ -46,4 +46,9 @@
           { 'name' => "metadata_value_separator",
             'desc' => "{CSVFieldSeparator.metadata_value_separator}",
    +        'type' => "string",
    +        'deft' => "",
    +        'reqd' => "no" },
    +      { 'name' => "metadata_separate_fields",
    +        'desc' => "{CSVFieldSeparator.metadata_separate_fields}",
             'type' => "string",
             'deft' => "",
  • main/trunk/greenstone2/perllib/plugins/CSVPlugin.pm

    r36587 r37047
    @@ -193,4 +193,18 @@
         }
     
    +    my $md_sep_fields = $self->{'metadata_separate_fields'};
    +    undef $md_sep_fields if ($md_sep_fields eq "");
    +
    +    my $md_sep_fields_lookup = undef;
    +    if (defined $md_sep_fields) {
    +        $md_sep_fields_lookup = {};
    +
    +        my @md_fields = split(/\s*,\s*/,$md_sep_fields);
    +
    +        for my $md_field (@md_fields) {
    +            $md_sep_fields_lookup->{$md_field} = 1;
    +        }
    +    }
    +
         my $csv = Text::CSV->new();
         $csv->sep_char($separate_char);
    @@ -247,6 +261,26 @@
             my $md_name = $csv_file_fields[$i];
             $csv_line_metadata{$md_name} = [];
    -        if (defined $md_val_sep) {
    +
    +        my $needs_md_val_sep = 0;
    +        if (defined $md_val_sep) {
    +            # Default coming in is 'no' (0)
    +            # => Check to see if any conditions met to turn this into a 'yes' (1)
     
    +            # check to see if md_sep_fields is in play, and if it is
    +            # => determine if this $md_name is one of the ones in $md_sep_fields_lookup
    +
    +            if (defined $md_sep_fields_lookup) {
    +                if ($md_sep_fields_lookup->{$md_name}) {
    +                    $needs_md_val_sep = 1;
    +                }
    +            }
    +            else {
    +                # if not set, then we apply the md_val_sep to all metadata fields
    +                $needs_md_val_sep = 1;
    +            }
    +        }
    +
    +        if ($needs_md_val_sep) {
    +
                 my @within_md_vals = split(/${md_val_sep}/,$md_val);
     
  • main/trunk/greenstone2/perllib/strings.properties

    r36912 r37047
    @@ -879,7 +879,9 @@
     ConvertToRogPlugin.desc:A plugin that inherits from RogPlugin.
     
    -CSVFieldSeparator.csv_field_separator: The character you've consistently used to seperate each cell of a row in your csv spreadsheet file. CSV stands for comma separated values, however you can specify the csv_field_separator character you used in your csv files here. If you leave this option on auto, the Plugin will try to autodetect your csv field separator character.
    -
    -CSVFieldSeparator.metadata_value_separator: The character you've consistently used to separate multiple metadata values for a single metadata field within a cell of the csv spreadsheet. If you used the vertical bar as the separator character, then set metadata_value_separator to \| (backslash vertical bar).
    +CSVFieldSeparator.csv_field_separator:The character you've consistently used to seperate each cell of a row in your csv spreadsheet file. CSV stands for comma separated values, however you can specify the csv_field_separator character you used in your csv files here. If you leave this option on auto, the Plugin will try to autodetect your csv field separator character.
    +
    +CSVFieldSeparator.metadata_value_separator:The character you've consistently used to separate multiple metadata values for a single metadata field within a cell of the csv spreadsheet. If you used the vertical bar as the separator character, then set metadata_value_separator to \| (backslash vertical bar).
    +
    +CSVFieldSeparator.metadata_separate_fields:A comma separated list of metadata fields that the metadata_value_separator is to be applied to.  If left blank then metadata_value_separator is applied to all the metadata fields in the CSV file.
     
     CSVPlugin.desc:A plugin for files in comma-separated value format. Metadata can be assigned to source documents (specified in the Filename field), or new documents created for each line of the file.
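
The combined effect of metadata_value_separator and metadata_separate_fields can be illustrated with a small standalone sketch. This is not code from CSVPlugin.pm: the helper names build_sep_fields_lookup and split_md_value, and the sample field names and values, are hypothetical and exist only to mirror the decision logic in the diff above. The sketch also treats an empty separator as "not set", which is an assumption about how the plugin prepares $md_val_sep.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CSVPlugin.pm): turn a
# metadata_separate_fields-style string into a lookup hash.
# Returns undef when the option is unset or empty, which is taken
# to mean "apply the value separator to every field".
sub build_sep_fields_lookup {
    my ($md_sep_fields) = @_;
    return undef if !defined $md_sep_fields || $md_sep_fields eq "";

    my %lookup = map { $_ => 1 } split(/\s*,\s*/, $md_sep_fields);
    return \%lookup;
}

# Hypothetical helper: split one cell value according to both options.
sub split_md_value {
    my ($md_name, $md_val, $md_val_sep, $lookup) = @_;

    # Assumption: an undefined or empty separator means "do not split".
    return ($md_val) if !defined $md_val_sep || $md_val_sep eq "";

    # A field list is in play and this field is not on it: keep it whole.
    return ($md_val) if defined $lookup && !$lookup->{$md_name};

    # The separator is interpolated as a regex, as in the changeset.
    return split(/${md_val_sep}/, $md_val);
}

my $lookup = build_sep_fields_lookup("Keyword");

my @keywords = split_md_value("Keyword",  "maps,history",        ",", $lookup);
my @abstract = split_md_value("Abstract", "Short, but complete", ",", $lookup);

print scalar(@keywords), " keyword values\n";   # prints "2 keyword values"
print scalar(@abstract), " abstract value\n";   # prints "1 abstract value"
```

Note that split(/\s*,\s*/, ...) trims whitespace around the commas but not at the very start or end of the option string, so a value such as " Keyword" would be stored with its leading space and would not match a column named "Keyword".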