|
|
@31487
|
7 years |
ak19 |
Important import statement for the recent commits related to encoding.
|
|
|
@31480
|
7 years |
kjdon |
util::block_file moved to EncodingUtil::block_raw_filename
|
|
|
@31479
|
7 years |
kjdon |
inherit from EncodingUtil instead of PrintInfo
|
|
|
@31478
|
7 years |
kjdon |
blocking stuff moved to here
|
|
|
@31477
|
7 years |
kjdon |
blocking moved to EncodingUtil. debug stuff still in here. needs tidying up
|
|
|
@31476
|
7 years |
kjdon |
blocking moved to EncodingUtil
|
|
|
@31474
|
7 years |
kjdon |
encoding_list is in EncodingUtil now
|
|
|
@31459
|
7 years |
kjdon |
now inherits from EncodingUtil. When using local directory in metadata …
|
|
|
@31458
|
7 years |
kjdon |
encoding list now comes from EncodingUtil, not BasePlugin
|
|
|
@31457
|
7 years |
kjdon |
baseplugin now inherits from EncodingUtil, and all its encoding …
|
|
|
@31456
|
7 years |
kjdon |
new base plugin for directories and files. DirectoryPlugin needs …
|
|
|
@31446
|
7 years |
ak19 |
use guess_filesystem_encoding instead of utf8 hard coded. hope it …
|
|
|
@31445
|
7 years |
ak19 |
added a method guessing_filesystem_encoding. use this to try and work …
|
|
|
@31444
|
7 years |
ak19 |
block hash filenames should be windows long names
|
|
|
@31440
|
7 years |
kjdon |
nearly there for handling Russian etc subfolders in import. need to …
|
|
|
@31439
|
7 years |
kjdon |
changed a comment
|
|
|
@31438
|
7 years |
kjdon |
added a couple of comments
|
|
|
@31420
|
7 years |
kjdon |
lookup_string with extra '1' arg returns perl internal unicode aware …
|
|
|
@31415
|
7 years |
Georgiy Litvinov |
Modified html links pointed to different section in the same document.
|
|
|
@31284
|
7 years |
davidb |
Initial cut at plugin for processing HathiTrust METS files
|
|
|
@31113
|
7 years |
ak19 |
Text item files now handle UTF-8 properly by reading in the file correctly.
|
|
|
@30857
|
8 years |
ak19 |
Unless new line endings (particularly carriage return characters …
|
|
|
@30742
|
8 years |
kjdon |
paged docs without images look weird in gs3. need to make a new type, …
|
|
|
@30681
|
8 years |
ak19 |
3 new strings introduced by Kathy contained the :, which is used as a …
|
|
|
@30600
|
8 years |
ak19 |
An empty metadata.xml was unrecognised by MetadataXMLPlugin because …
|
|
|
@30492
|
8 years |
Georgiy Litvinov |
Fix for previous commit.
|
|
|
@30491
|
8 years |
Georgiy Litvinov |
Removed high and low surrogates from converted html
|
|
|
@30427
|
8 years |
davidb |
Technique for working out cached-dir name for file updated to allow it …
|
|
|
@30358
|
8 years |
Georgiy Litvinov |
Fix for -associate_tail_re option. Files with extensions that could be …
|
|
|
@30022
|
9 years |
ak19 |
Finally committing Dr Bainbridge's suggested fix (tested) to handle …
|
|
|
@29820
|
9 years |
kjdon |
EmbeddedMetadataPlugin needs to make raw filenames into unicode for …
|
|
|
@29818
|
9 years |
kjdon |
removing debug and old test code
|
|
|
@29817
|
9 years |
kjdon |
removing debug statements
|
|
|
@29796
|
9 years |
kjdon |
don't need use Win32 and anyway, can't have it when not running on windows
|
|
|
@29795
|
9 years |
kjdon |
change to using util method raw_filename_to_unicode. got this working …
|
|
|
@29763
|
9 years |
ak19 |
on macos, accented chars in filenames are in decomposed form, eg the …
|
|
|
@29762
|
9 years |
ak19 |
check if the filenames are url encoded - this happens for eg accented …
|
|
|
@29760
|
9 years |
kjdon |
try decoding against locale rather than utf8. will this work on …
|
|
|
@29745
|
9 years |
kjdon |
using Encode::decode to make the filenames 'unicode aware'. For …
|
|
|
@29476
|
9 years |
sjs49 |
First of 2 commits to get diffcol on the 64 bit Ubuntu that has perl …
|
|
|
@29102
|
10 years |
kjdon |
added the string for PDFPlugin.use_realistic_book option
|
|
|
@29101
|
10 years |
kjdon |
added -use_realistic_book option. this makes sure you are converting …
|
|
|
@28836
|
10 years |
ak19 |
A question on the mailing list involved accented characters in custom …
|
|
|
@28803
|
10 years |
ak19 |
Testing with accented characters in MARC data showed up problems in …
|
|
|
@28783
|
10 years |
ak19 |
Treatment of 'and' in the MARC*Plugin.pm an issue for Greenstone …
|
|
|
@28782
|
10 years |
ak19 |
Routine for reading in text files failed to 'decode' from UTF-8 to …
|
|
|
@28669
|
10 years |
ak19 |
This plugin is similar to CSVPlugin, but for tab-separated metadata files
|
|
|
@28638
|
10 years |
kjdon |
don't process a doc.xml entry if the group-position > 1: we have …
|
|
|
@28603
|
10 years |
ak19 |
Found some issues when wanting to add in the CDS-ISIS tutorial …
|
|
|
@28563
|
10 years |
kjdon |
changing some util:: methods to FileUtils:: methods
|
|
|
@28560
|
10 years |
ak19 |
1. New subroutine util::set_gnomelib_env that sets the environment for …
|
|
|
@28489
|
11 years |
davidb |
Support for Cygwin added
|
|
|
@28381
|
11 years |
ak19 |
Bugfix. When dealing with filenames with special characters that are …
|
|
|
@28375
|
11 years |
davidb |
A set of changes to help Greenstone building code (perl) run under …
|
|
|
@28355
|
11 years |
ak19 |
1. Now gsConvert.pl calls the new pptextract.vbs VBScript (which …
|
|
|
@28319
|
11 years |
ak19 |
The replace-with-src-doc feature had stopped working. It needed …
|
|
|
@28285
|
11 years |
ak19 |
Deprecated util:: subroutines replaced with their FileUtils equivalents
|
|
|
@28267
|
11 years |
davidb |
Code change to allow doc.xml files that do not have a DOCTYPE line
|
|
|
@28265
|
11 years |
davidb |
Revised RE for accepting doc.xml files to allow for time-stamped ones
|
|
|
@28196
|
11 years |
kjdon |
util::mk_dir should be FileUtils::makeDirectory
|
|
|
@27982
|
11 years |
sjm84 |
Fixed an error that was occurring on Windows due to backslashes
|
|
|
@27973
|
11 years |
ak19 |
Reinstating Dr Bainbridge's fix to getting the extra meta in sorted …
|
|
|
@27957
|
11 years |
ak19 |
For now, undoing the change made to BasePlugin for the diffcol nightly …
|
|
|
@27949
|
11 years |
ak19 |
Need to sort extra metadata (e.g. ex.PDF.* and ex.File.* meta …
|
|
|
@27927
|
11 years |
ak19 |
Correcting error introduced in earlier commit.
|
|
|
@27916
|
11 years |
ak19 |
1. Reference to undefined variable. 2. Using FileUtils:: subroutines …
|
|
|
@27787
|
11 years |
kjdon |
making the thumbicon img tag valid HTML - adding alt att, and putting …
|
|
|
@27742
|
11 years |
ak19 |
Remove Windows carriage returns when Greenstone assigns titles, where …
|
|
|
@27703
|
11 years |
ak19 |
Dr Bainbridge fixed the final diffcol issue with Small-HTML on windows …
|
|
|
@27697
|
11 years |
ak19 |
Dr Bainbridge fixed it so that the gdb files generated on Windows for …
|
|
|
@27578
|
11 years |
ak19 |
Doing a sort on all occurrences of readdir, so readdir lists dir …
|
|
|
@27519
|
11 years |
ak19 |
Using the recommended FileUtils.pm equivalents for util.pm subroutines.
|
|
|
@27509
|
11 years |
ak19 |
Using the recommended FileUtils.pm methods in place of the deprecated …
|
|
|
@27503
|
11 years |
kjdon |
modified to handle files with just a single record. So no collection …
|
|
|
@27502
|
11 years |
kjdon |
trying to fix double encoding issue for isis files. not sure that I …
|
|
|
@27354
|
11 years |
kjdon |
changed some deprecated util methods for FileUtils methods
|
|
|
@27321
|
11 years |
ak19 |
Two bugfixes: 1. Handling of quotes not just the CSV fields containing …
|
|
|
@27306
|
11 years |
jmt12 |
Moving the critical file-related functions (copy, rm, etc) out of …
|
|
|
@27283
|
11 years |
ak19 |
1. Fixed an encoding bug that Diego helpfully discovered. Metadata …
|
|
|
@27141
|
11 years |
kjdon |
fixed extract_metadata so that it will get all occurrences of a …
|
|
|
@27106
|
11 years |
kjdon |
need to do the same utf8 decode step that is used in ReadTextFile on …
|
|
|
@26893
|
11 years |
kjdon |
ConvertBinaryFile needs to reset the doc OID after all the processing …
|
|
|
@26867
|
11 years |
davidb |
Added a block for pesky .DS_Store files, generated by Macs
|
|
|
@26866
|
11 years |
davidb |
Introduction of -aspectpad... options. Useful when working with …
|
|
|
@26536
|
11 years |
davidb |
Introduction of two new OIDtype values (hash_on_full_filename and …
|
|
|
@26222
|
12 years |
kjdon |
added plugin_specific_process method - inheriting plugins can use this …
|
|
|
@26221
|
12 years |
kjdon |
new OIDtype, filename, will use the file name without any folders or …
|
|
|
@26146
|
12 years |
davidb |
Refinement to EmbeddedMetadataPlugin that allows it to operate with …
|
|
|
@25971
|
12 years |
kjdon |
removed two forgotten debug statements
|
|
|
@25961
|
12 years |
kjdon |
more cunning document types. gs3 has a new one, pagedhierarchy - for …
|
|
|
@25957
|
12 years |
kjdon |
adding in support for plugins knowing what version of greenstone (2/3) …
|
|
|
@25797
|
12 years |
kjdon |
need to define gsprintf in order to use it
|
|
|
@25787
|
12 years |
ak19 |
Forgot to commit the use of the perl function to find the filesize and …
|
|
|
@25778
|
12 years |
ak19 |
ex.ImageSize and ex.FileSize metadata were being set to the string …
|
|
|
@25743
|
12 years |
kjdon |
if we happen to have a file and matching process expression that …
|
|
|
@25742
|
12 years |
kjdon |
change to use can_process_this_file instead of metadata_read to test …
|
|
|
@25741
|
12 years |
kjdon |
modified process expression to handle .mbox
|
|
|
@25673
|
12 years |
sjm84 |
Links that only contain # values now have a macro added to the front …
|
|
|
@25555
|
12 years |
sjm84 |
We want to be able to get associated files from CSS files
|
|
|
@25508
|
12 years |
ak19 |
Since epub files are zip files with a differently named extension, …
|
|
|