|
|
@32224
|
6 years |
ak19 |
Adding PDF to text support for Windows using Xpdf's pdftotext tool. …
|
|
|
@32223
|
6 years |
ak19 |
When no output mode for PDFPlugin has been set by the user, the output …
|
|
|
@32222
|
6 years |
ak19 |
q
|
|
|
@32215
|
6 years |
ak19 |
Before reorganising our PDFPlugin in whatever way we ultimately …
|
|
|
@32210
|
6 years |
ak19 |
When PDFPlugin is set to paged_html output mode, it now finally …
|
|
|
@32206
|
6 years |
ak19 |
1. ConvertBinaryFile.pm no longer knows more than necessary about …
|
|
|
@32205
|
6 years |
ak19 |
First set of commits to do with implementing the new 'paged_html' …
|
|
|
@32192
|
6 years |
kjdon |
with new result and result_str return values from convert, need to …
|
|
|
@32186
|
6 years |
kjdon |
if the eval didn't work, all the return values might be undefined, so …
|
|
|
@32185
|
6 years |
kjdon |
use new return values from ImageConverter::convert
|
|
|
@32184
|
6 years |
kjdon |
change the return values of convert to match tmp_area_convert_file in …
|
|
|
@32183
|
6 years |
kjdon |
image height and width might be returned as 'unknown', in particular …
|
|
|
@32159
|
6 years |
ak19 |
incremental building was not being incremental when no metadata was …
|
|
|
@32131
|
6 years |
kjdon |
don't want the initial , if trying to match 41 times. this is a syntax …
|
|
|
@32129
|
6 years |
kjdon |
After () in a regex, {} signifies quantifiers. eg (xx){2,4} - 2-4 …
|
|
|
@32122
|
6 years |
kjdon |
we had [a-z]{2..} which causes an error in later versions of perl. …
|
|
|
@32096
|
6 years |
ak19 |
Marking all the uses of sysread() with a comment saying they're a …
|
|
|
@32028
|
7 years |
ak19 |
1. Bugfix to previous commit: var might not be on an …
|
|
|
@32026
|
7 years |
ak19 |
Some more placeholder strings for the UnknownConverterPlugin to …
|
|
|
@31958
|
7 years |
kjdon |
use identify to work out the filetype of the original file, rather …
|
|
|
@31955
|
7 years |
Georgiy Litvinov |
Prevent page from reloading when opening same-section links
|
|
|
@31926
|
7 years |
kjdon |
added a new option 'store_original_image'. If this is set, and the …
|
|
|
@31780
|
7 years |
ak19 |
When testing GS3.08's GLI on Ubuntu v 16.04, found its perl v 5.22.1 …
|
|
|
@31766
|
7 years |
ak19 |
1. Refactored ConvertBinaryFile::tmp_area_convert_file() to do the …
|
|
|
@31765
|
7 years |
ak19 |
Cosmetic change to error message
|
|
|
@31764
|
7 years |
ak19 |
Should replace INPUT_FILE placeholder with softlink path tmp_filename …
|
|
|
@31763
|
7 years |
ak19 |
Fixing some things before attempting to refactor tmp_area_convert_file
|
|
|
@31762
|
7 years |
ak19 |
Changed the placeholder names to what Dr Bainbridge suggested, which …
|
|
|
@31761
|
7 years |
ak19 |
Moved function generate_item_file that's shared between …
|
|
|
@31760
|
7 years |
ak19 |
Making the plugin active. It's rudimentary, but works when I pass in …
|
|
|
@31759
|
7 years |
ak19 |
The previous commit put text into doc.xml, but no text was visible in …
|
|
|
@31757
|
7 years |
ak19 |
Fixed the earlier problems, which, it turned out, had to do with the …
|
|
|
@31745
|
7 years |
ak19 |
Another change that's needed, this time to add the plugin.
|
|
|
@31744
|
7 years |
ak19 |
Further changes to new UnknownConverterPlugin that's still in …
|
|
|
@31743
|
7 years |
ak19 |
Committing first attempt at new UnknownConverterPlugin, which hasn't …
|
|
|
@31742
|
7 years |
ak19 |
No need to hardcode the plugin name
|
|
|
@31690
|
7 years |
kjdon |
removing debug statements
|
|
|
@31689
|
7 years |
kjdon |
removing debug statements
|
|
|
@31688
|
7 years |
kjdon |
removing debug statements
|
|
|
@31497
|
7 years |
kjdon |
oops, had commented out a line which meant normal ascii images weren't …
|
|
|
@31494
|
7 years |
kjdon |
updated text string keys based on new plugin names
|
|
|
@31493
|
7 years |
kjdon |
removed smart_block option. It's been deprecated for long enough
|
|
|
@31492
|
7 years |
kjdon |
renamed EncodingUtil to CommonUtil, BasePlugin to BaseImporter. The …
|
|
|
@31491
|
7 years |
kjdon |
need to normalize the name when we look up in the block hash too, for macos
|
|
|
@31487
|
7 years |
ak19 |
Important import statement for the recent commits related to encoding.
|
|
|
@31480
|
7 years |
kjdon |
util::block_file moved to EncodingUtil::block_raw_filename
|
|
|
@31479
|
7 years |
kjdon |
inherit from EncodingUtil instead of PrintInfo
|
|
|
@31478
|
7 years |
kjdon |
blocking stuff moved to here
|
|
|
@31477
|
7 years |
kjdon |
blocking moved to EncodingUtil. debug stuff still in here. needs tidying up
|
|
|
@31476
|
7 years |
kjdon |
blocking moved to EncodingUtil
|
|
|
@31474
|
7 years |
kjdon |
encoding_list is in EncodingUtil now
|
|
|
@31459
|
7 years |
kjdon |
now inherits from EncodingUtil. When using local directory in metadata …
|
|
|
@31458
|
7 years |
kjdon |
encoding list now comes from EncodingUtil, not BasePlugin
|
|
|
@31457
|
7 years |
kjdon |
baseplugin now inherits from EncodingUtil, and all its encoding …
|
|
|
@31456
|
7 years |
kjdon |
new base plugin for directories and files. DirectoryPlugin needs …
|
|
|
@31446
|
7 years |
ak19 |
use guess_filesystem_encoding instead of utf8 hard coded. hope it …
|
|
|
@31445
|
7 years |
ak19 |
added a method guessing_filesystem_encoding. use this to try and work …
|
|
|
@31444
|
7 years |
ak19 |
block hash filenames should be windows long names
|
|
|
@31440
|
7 years |
kjdon |
nearly there for handling russian etc subfolders in import. need to …
|
|
|
@31439
|
7 years |
kjdon |
changed a comment
|
|
|
@31438
|
7 years |
kjdon |
added a couple of comments
|
|
|
@31420
|
7 years |
kjdon |
lookup_string with extra '1' arg returns perl internal unicode aware …
|
|
|
@31415
|
7 years |
Georgiy Litvinov |
Modified HTML links that point to a different section in the same document.
|
|
|
@31284
|
7 years |
davidb |
Initial cut at plugin for processing HathiTrust METS files
|
|
|
@31113
|
7 years |
ak19 |
Text item files now handle UTF-8 properly by reading in the file correctly.
|
|
|
@30857
|
8 years |
ak19 |
Unless new line endings (particularly carriage return characters …
|
|
|
@30742
|
8 years |
kjdon |
paged docs without images look weird in gs3. need to make a new type, …
|
|
|
@30681
|
8 years |
ak19 |
3 new strings introduced by Kathy contained the :, which is used as a …
|
|
|
@30600
|
8 years |
ak19 |
An empty metadata.xml was unrecognised by MetadataXMLPlugin because …
|
|
|
@30492
|
8 years |
Georgiy Litvinov |
Fix for previous commit.
|
|
|
@30491
|
8 years |
Georgiy Litvinov |
Removed high and low surrogates from converted html
|
|
|
@30427
|
8 years |
davidb |
Technique for working out cached-dir name for file updated to allow it …
|
|
|
@30358
|
8 years |
Georgiy Litvinov |
Fix for -associate_tail_re option. Files with extensions that could be …
|
|
|
@30022
|
9 years |
ak19 |
Finally committing Dr Bainbridge's suggested fix (tested) to handle …
|
|
|
@29820
|
9 years |
kjdon |
EmbeddedMetadataPlugin needs to make raw filenames into unicode for …
|
|
|
@29818
|
9 years |
kjdon |
removing debug and old test code
|
|
|
@29817
|
9 years |
kjdon |
removing debug statements
|
|
|
@29796
|
9 years |
kjdon |
don't need use Win32 and anyway, can't have it when not running on windows
|
|
|
@29795
|
9 years |
kjdon |
change to using util method raw_filename_to_unicode. got this working …
|
|
|
@29763
|
9 years |
ak19 |
on macos, accented chars in filenames are in decomposed form, eg the …
|
|
|
@29762
|
9 years |
ak19 |
check if the filenames are url encoded - this happens for eg accented …
|
|
|
@29760
|
9 years |
kjdon |
try decoding against locale rather than utf8. will this work on …
|
|
|
@29745
|
9 years |
kjdon |
using Encode::decode to make the filenames 'unicode aware'. For …
|
|
|
@29476
|
9 years |
sjs49 |
First of 2 commits to get diffcol on the 64 bit Ubuntu that has perl …
|
|
|
@29102
|
10 years |
kjdon |
added the string for PDFPlugin.use_realistic_book option
|
|
|
@29101
|
10 years |
kjdon |
added -use_realistic_book option. this makes sure you are converting …
|
|
|
@28836
|
10 years |
ak19 |
A question on the mailing list involved accented characters in custom …
|
|
|
@28803
|
10 years |
ak19 |
Testing with accented characters in MARC data showed up problems in …
|
|
|
@28783
|
10 years |
ak19 |
Treatment of 'and' in the MARC*Plugin.pm an issue for Greenstone …
|
|
|
@28782
|
10 years |
ak19 |
Routine for reading in text files failed to 'decode' from UTF-8 to …
|
|
|
@28669
|
10 years |
ak19 |
This plugin is similar to CSVPlugin, but for tab-separated metadata files
|
|
|
@28638
|
10 years |
kjdon |
don't process a doc.xml entry if the group-position > 1: we have …
|
|
|
@28603
|
10 years |
ak19 |
Found some issues when wanting to add in the CDS-ISIS tutorial …
|
|
|
@28563
|
10 years |
kjdon |
changing some util:: methods to FileUtils:: methods
|
|
|
@28560
|
10 years |
ak19 |
1. New subroutine util::set_gnomelib_env that sets the environment for …
|
|
|
@28489
|
11 years |
davidb |
Support for Cygwin added
|
|
|
@28381
|
11 years |
ak19 |
Bugfix. When dealing with filenames with special characters that are …
|
|
|
@28375
|
11 years |
davidb |
A set of changes to help Greenstone building code (perl) run under …
|
|
|
@28355
|
11 years |
ak19 |
1. Now gsConvert.pl calls the new pptextract.vbs VBScript (which …
|
|
|
@28319
|
11 years |
ak19 |
The replace-with-src-doc feature had stopped working. It needed …
|
|
|