|
|
@33437
|
5 years |
cpb16 |
made progress with morphology. Need to have a better area dimension …
|
|
|
@33436
|
5 years |
ak19 |
3 important changes for 2 separate bugfixes where one bugfix is …
|
|
|
@33435
|
5 years |
ak19 |
Georgian language translations for the language's new glihelp module …
|
|
|
@33434
|
5 years |
ak19 |
Correcting syntax errors in this bash script.
|
|
|
@33433
|
5 years |
ak19 |
New Georgian language translation for perlmodules module of the GS …
|
|
|
@33432
|
5 years |
ak19 |
New Georgian language translation for glidict module of the GS …
|
|
|
@33431
|
5 years |
ak19 |
Corrections of automated processing, noticed when processing Georgian …
|
|
|
@33430
|
5 years |
ak19 |
Undo call to to_utf8() on the query_string argument (arg[q]) to …
|
|
|
@33429
|
5 years |
kjdon |
fixed a bug in get_or_create_shortname where it wasn't storing the new …
|
|
|
@33428
|
5 years |
ak19 |
Working commoncrawl cc-warc-examples' WET wordcount example using …
|
|
|
@33427
|
5 years |
davidb |
Some initial files on how to get going
|
|
|
@33426
|
5 years |
davidb |
Folder for details on how to stand up the HTRC DevEnv locally
|
|
|
@33425
|
5 years |
ak19 |
A few more links now that I got past getting the vagrant VM with spark …
|
|
|
@33424
|
5 years |
ak19 |
Georgian (code ka) language translations for the gs3interface module …
|
|
|
@33423
|
5 years |
ak19 |
Adding in the link to the vagrant VM with Hadoop, Spark for cluster …
|
|
|
@33422
|
5 years |
ak19 |
Some more links.
|
|
|
@33421
|
5 years |
ak19 |
Forgot to fix up svn externals property for the Georgian …
|
|
|
@33420
|
5 years |
ak19 |
Update to svnproperty externals for the Georgian (code: ka) …
|
|
|
@33419
|
5 years |
ak19 |
Last evening, I had found some links about how language-detection is …
|
|
|
@33418
|
5 years |
cpb16 |
made progress with morphology, based on one image, need to refine …
|
|
|
@33417
|
5 years |
ak19 |
Georgian language translations for the coredm for GS2, gsinstaller …
|
|
|
@33416
|
5 years |
ak19 |
DEC collections weren't getting built on 32 bit linux VM after trying …
|
|
|
@33415
|
5 years |
cpb16 |
updated, after unable to commit due to setup.bash being out of date. …
|
|
|
@33414
|
5 years |
ak19 |
Adding important links
|
|
|
@33413
|
5 years |
ak19 |
Splitting the get_commoncrawl_nz_urls.sh script back into 2 scripts, …
|
|
|
@33412
|
5 years |
ak19 |
config command for wgetting a single file
|
|
|
@33411
|
5 years |
ak19 |
Newer version now doesn't mirror sites with wget but gets WET files …
|
|
|
@33410
|
5 years |
ak19 |
Committing some variable name changes before I replace this file with …
|
|
|
@33409
|
5 years |
ak19 |
Forgot to commit 2 files with links and shuffling some links around …
|
|
|
@33408
|
5 years |
ak19 |
Some rough notes. Will move into appropriate file later.
|
|
|
@33407
|
5 years |
ak19 |
gutil.jar was rebuilt yesterday in GS3 after a bugfix. Recommitting …
|
|
|
@33406
|
5 years |
kjdon |
if there is a semicolon after the file name, it ends up in the URL …
|
|
|
@33405
|
5 years |
ak19 |
Even though we're probably not going to use this code after all, will …
|
|
|
@33404
|
5 years |
ak19 |
1. Links to other Java ways of extracting text from web content. 2. …
|
|
|
@33403
|
5 years |
ak19 |
Mistake to do with launchdir in SafeProcess: if the environment for …
|
|
|
@33402
|
5 years |
ak19 |
Beginnings of the Java class to wget sites and process its pages to …
|
|
|
@33401
|
5 years |
ak19 |
MaoriTextDetector.class file now generated inside its package folder …
|
|
|
@33400
|
5 years |
ak19 |
1. Setting up log4j.properties based on the macronizer's basic one …
|
|
|
@33399
|
5 years |
ak19 |
Putting properties files into the conf folder and keeping the lib …
|
|
|
@33398
|
5 years |
ak19 |
Committing the actual package structure and the updated README after …
|
|
|
@33397
|
5 years |
ak19 |
1. Changing package structure and instructions on compiling/running as …
|
|
|
@33396
|
5 years |
ak19 |
Georgian language gs3colcfg module of GS interface. Many thanks to …
|
|
|
@33395
|
5 years |
ak19 |
Georgian language translation work for the gs3interface module of the …
|
|
|
@33394
|
5 years |
ak19 |
1. Started a file on feasibility with the data now available and some …
|
|
|
@33393
|
5 years |
ak19 |
Modified the get_commoncrawl_nz_urls.sh to also create a reduced urls …
|
|
|
@33392
|
5 years |
ak19 |
Kathy found a problem whereby she wanted to run consecutive buildcols …
|
|
|
@33391
|
5 years |
ak19 |
Some rough bash scripting lines that work but aren't complete.
|
|
|
@33390
|
5 years |
ak19 |
Minor message telling the user to wait for a task that takes some time.
|
|
|
@33389
|
5 years |
kjdon |
store csv field array associated with filename, because you might have …
|
|
|
@33388
|
5 years |
kjdon |
tidied up some debug statements
|
|
|
@33387
|
5 years |
kjdon |
removed all my debug statements
|
|
|
@33386
|
5 years |
kjdon |
modified the test for whether this is the selected node or not. can't …
|
|
|
@33385
|
5 years |
kjdon |
need to import response node as it is not part of same document
|
|
|
@33384
|
5 years |
cpb16 |
backup before intellij working
|
|
|
@33383
|
5 years |
kjdon |
some more work on the help page
|
|
|
@33382
|
5 years |
kjdon |
don't add collection/collname to pref and help link if collname is empty
|
|
|
@33381
|
5 years |
kjdon |
use nice /page/gsdl url for about greenstone page
|
|
|
@33380
|
5 years |
kjdon |
some more mods and strings for collection help page
|
|
|
@33379
|
5 years |
ak19 |
New script to automate getting a file listing of the common crawl URL …
|
|
|
@33378
|
5 years |
ak19 |
New bin/script folder and relocating gen_SentenceDetection_model.sh to …
|
|
|
@33377
|
5 years |
ak19 |
Changes to get gen_SentenceDetection_model.sh to run still from the …
|
|
|
@33376
|
5 years |
ak19 |
Links and extracts I've read so far on the Web Curator Tool (WCT), …
|
|
|
@33375
|
5 years |
cpb16 |
Full backup after running first successful highres classifier run
|
|
|
@33374
|
5 years |
davidb |
added in opt-doc-args-link variable otherwise the transform fails with …
|
|
|
@33373
|
5 years |
kjdon |
need to check for null result from getTextString - otherwise get a …
|
|
|
@33372
|
5 years |
kjdon |
when writing out facets in buildConfig, need to get them from …
|
|
|
@33371
|
5 years |
kjdon |
separate sort and facet fields as the former needs to be single valued …
|
|
|
@33370
|
5 years |
kjdon |
use the new get_or_create_shortname instead of create_shortname
|
|
|
@33369
|
5 years |
kjdon |
instead of create_shortname, now have get_or_create_shortname. this …
|
|
|
@33368
|
5 years |
kjdon |
sort fields cannot be multivalued. Facet fields need to be. So have …
|
|
|
@33367
|
5 years |
cpb16 |
Pre-hires classification w/o MU
|
|
|
@33366
|
5 years |
davidb |
Formatting refactoring to reduce code duplication
|
|
|
@33365
|
5 years |
davidb |
Exported version of spreadsheet for public download
|
|
|
@33364
|
5 years |
davidb |
Requested word changes to About page
|
|
|
@33363
|
5 years |
davidb |
Customization of help text
|
|
|
@33362
|
5 years |
davidb |
Changes to the wording and formatting of Terms and Conditions
|
|
|
@33361
|
5 years |
davidb |
Change of headings that are exported
|
|
|
@33360
|
5 years |
davidb |
Code tidy-up and change of input/output filename
|
|
|
@33359
|
5 years |
davidb |
solr needs to add shortnames to the fieldnamemap otherwise it won't …
|
|
|
@33358
|
5 years |
ak19 |
More minor changes to README
|
|
|
@33357
|
5 years |
ak19 |
Minor changes
|
|
|
@33356
|
5 years |
ak19 |
Updating script. Correction to a filepath different in the svn folder …
|
|
|
@33355
|
5 years |
ak19 |
Changes for adding in the new gen_SentenceDetection_model.sh script, …
|
|
|
@33354
|
5 years |
davidb |
Template file for producing OpenOffice spreadsheet format
|
|
|
@33353
|
5 years |
davidb |
Initial set of files to page scrape and turn in the OpenOffice …
|
|
|
@33352
|
5 years |
davidb |
Top-level folder for code to page-scrape BookStumper site
|
|
|
@33351
|
5 years |
davidb |
Top-level folder for code to page-scrape BookStumper site
|
|
|
@33350
|
5 years |
ak19 |
Better comments. Tested macronised vs unmacronised Māori language test …
|
|
|
@33349
|
5 years |
ak19 |
Minor changes to the README for map demo solr-haminfo collection …
|
|
|
@33348
|
5 years |
ak19 |
2 major changes. 1. Forgot to commit Dr Bainbridge's bugfix for why …
|
|
|
@33347
|
5 years |
kjdon |
made it optional whether the user gets shown the terms and conditions …
|
|
|
@33346
|
5 years |
kjdon |
check for empty child_id, and null DBInfo before using them
|
|
|
@33345
|
5 years |
kjdon |
got rid of hard coded empty basket text
|
|
|
@33344
|
5 years |
kjdon |
added favourites empty text
|
|
|
@33343
|
5 years |
kjdon |
add in favourites langfrags (not just berry ones). Change the title …
|
|
|
@33342
|
5 years |
kjdon |
change the empty basket message depending on whether it is a berry …
|
|
|
@33341
|
5 years |
kjdon |
tidied up relational metadata retrieval. implemented descendants and …
|
|
|
@33340
|
5 years |
cpb16 |
transferred backup of low res images. Classifiers work as expected. …
|
|
|
@33339
|
5 years |
ak19 |
Updated README.
|
|
|
@33338
|
5 years |
ak19 |
1. After renaming the java class, changed all occurrences of the old …
|
|
|