Changeset 34221 for main/trunk/greenstone2/perllib/docprint.pm
- Timestamp:
- 2020-06-30T00:19:32+12:00 (4 years ago)
- File:
- 1 edited
main/trunk/greenstone2/perllib/docprint.pm
--- docprint.pm (r34220)
+++ docprint.pm (r34221)
@@ -103,20 +103,8 @@
     # (XML::Parser will barf on anything it doesn't consider to be
     # valid UTF-8 text, including things like \c@, \cC etc.)
-    # Will treat tab chars, \x09, as a special case right after this
-    $all_text =~ s/[\x00-\x08\x0B\x0C\x0E-\x1F]//g;
+    # and the tab character too (x09)
 
-    # $all_text gets written out into an xml context and represents the html version of a doc,
-    # allowing the use of html entities for the tab character (	)
-    # Tabs (ASCII \x09) may be meaningful spacing in such cases whether the html emanated from a
-    # text file, original html or other doc. Particularly when tabs are nested in <pre> tags.
-    # Instead of removing tabs, replacing tabs with their entity reference will allow <pre> tags
-    # to continue preserving any tabs in the final html display.
-    # Hopefully with this, XML::Parser will not choke on tabs, and we get tab stop spaces preserved
-    # in the html output.
-    # This may be the best location to do this replacement and not in TextPlugin, because an html
-    # source doc may contain <pre> elements with tab stops, so then HTMLPlugin would have to do the
-    # replacement too.
-    $all_text =~ s/\x09/	/g;
-
+    $all_text =~ s/[\x00-\x09\x0B\x0C\x0E-\x1F]//g;
+
     return $all_text;
 }
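The difference between the two revisions can be tried out in isolation. The following is a minimal sketch, not code from docprint.pm: the subroutine names `strip_keep_tabs` and `strip_all` are hypothetical, and the entity `&#9;` is an assumption, since the changeset page shows a literal tab where the entity reference presumably stood. Only the two character classes are taken from the diff.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration; neither name exists in docprint.pm.

# r34220 behaviour per the diff: strip control characters except tab (\x09),
# LF (\x0A) and CR (\x0D), then replace each tab with an entity reference.
# Assumption: the entity was the numeric reference "&#9;".
sub strip_keep_tabs {
    my ($text) = @_;
    $text =~ s/[\x00-\x08\x0B\x0C\x0E-\x1F]//g;
    $text =~ s/\x09/&#9;/g;
    return $text;
}

# r34221 behaviour per the diff: the range is widened to \x00-\x09, so tabs
# are removed together with the other control characters.
sub strip_all {
    my ($text) = @_;
    $text =~ s/[\x00-\x09\x0B\x0C\x0E-\x1F]//g;
    return $text;
}

# Sample input with a tab and a BEL (\x07) control character.
my $sample = "<pre>col1\tcol2\x07</pre>\n";
print strip_keep_tabs($sample);   # tab becomes the entity, BEL is dropped
print strip_all($sample);         # both tab and BEL are dropped
```

Both variants leave `\x0A` and `\x0D` untouched, so line breaks survive either way. The only behavioural difference is whether a tab reaches the XML output as an entity or is removed.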