Changeset 37048


Timestamp:
2022-12-23T10:28:25+13:00
Author:
davidb
Message:

Added a support routine, add_dummy_text_if_empty, that marks a document section as having no text only if its text is actually empty. Plugins such as ImagePlugin never used to produce any text and therefore always set the 'no text' dummy text. Newer work such as GoogleVisionImagePlugin inherits from ImagePlugin and, by calling the Google Vision API, can now find text in an image. The new routine adds the dummy text only when needed: if 'doc_obj' already holds text for the section, no dummy text is set. Also added a debugging statement (commented out) that prints which inheriting plugin is checking the process expression.

File:
1 edited

  • main/trunk/greenstone2/perllib/plugins/BaseImporter.pm

    r36910  r37048
    488     488         }
    489     489
            490         # print STDERR "**** BaseImport::can_process_this_file(): ", ref($self), " checking $filename =~ /$self->{'process_exp'}/\n";
            491
    490     492         if ($self->{'process_exp'} ne "" && $filename =~ /$self->{'process_exp'}/) {
    491     493             return 1;

    703     705         $doc_obj->add_utf8_text($section, &gsprintf::lookup_string("{BaseImporter.dummy_text}",1));
    704     706         #$doc_obj->add_text($section, &gsprintf::lookup_string("{BaseImporter.dummy_text}",1));
    705
    706
            707     }
            708
            709     sub add_dummy_text_if_empty {
            710         my $self = shift(@_);
            711         my ($doc_obj, $section) = @_;
            712
            713         my $section_text_len = $doc_obj->get_text_length($section);
            714
            715         if ($section_text_len == 0) {
            716             $self->add_dummy_text($doc_obj,$section);
            717         }
    707     718     }
    708     719
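
The behaviour described in the commit message can be illustrated with a small standalone sketch. It does not use Greenstone itself: MockDoc and MockPlugin are hypothetical stand-ins, the section name 'top' and the placeholder string are assumptions, and the real add_dummy_text looks its string up via gsprintf rather than hard-coding it. Only add_dummy_text_if_empty mirrors the routine added in this changeset.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: placeholder for the string that Greenstone looks up
# under {BaseImporter.dummy_text}.
my $DUMMY_TEXT = 'This document has no text.';

# Hypothetical stand-in for the Greenstone doc object; it implements
# only the two methods this sketch needs.
package MockDoc;

sub new { my $class = shift; return bless { text => {} }, $class; }

sub add_utf8_text {
    my ($self, $section, $text) = @_;
    $self->{text}{$section} = '' unless defined $self->{text}{$section};
    $self->{text}{$section} .= $text;
}

sub get_text_length {
    my ($self, $section) = @_;
    my $text = $self->{text}{$section};
    return defined $text ? length($text) : 0;
}

# Hypothetical stand-in for a plugin inheriting from BaseImporter.
package MockPlugin;

sub add_dummy_text {
    my ($self, $doc_obj, $section) = @_;
    $doc_obj->add_utf8_text($section, $DUMMY_TEXT);
}

# Mirrors the routine added in r37048: only add the dummy text
# when the section holds no text yet.
sub add_dummy_text_if_empty {
    my $self = shift(@_);
    my ($doc_obj, $section) = @_;

    if ($doc_obj->get_text_length($section) == 0) {
        $self->add_dummy_text($doc_obj, $section);
    }
}

package main;

my $plugin = bless {}, 'MockPlugin';

# Case 1: an image with no detected text gets the dummy text.
my $no_text = MockDoc->new();
$plugin->add_dummy_text_if_empty($no_text, 'top');

# Case 2: text was already found (e.g. by an OCR step), so the
# section is left untouched.
my $with_text = MockDoc->new();
$with_text->add_utf8_text('top', 'STOP');
$plugin->add_dummy_text_if_empty($with_text, 'top');

print "empty section now holds: $no_text->{text}{top}\n";
print "non-empty section still holds: $with_text->{text}{top}\n";
```

A plugin that may or may not find text would call add_dummy_text_if_empty where it previously called add_dummy_text unconditionally, so that detected text is never mixed with the placeholder.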