|
|
@34033
|
4 years |
davidb |
Solr schema.xml changes that have flowed through from changes in the …
|
|
|
@34032
|
4 years |
davidb |
Files to ignore
|
|
|
@34031
|
4 years |
davidb |
Dedicated scripts for this collection to build and activate it
|
|
|
@34030
|
4 years |
davidb |
Ignore archives and index dirs
|
|
|
@34029
|
4 years |
davidb |
Moved to atea site
|
|
|
@34028
|
4 years |
davidb |
Tweaks to overall interface look-and-feel
|
|
|
@34027
|
4 years |
davidb |
Tweaks to overall interface look-and-feel
|
|
|
@34026
|
4 years |
davidb |
Used to provide the gray jquery-ui theme to Atea
|
|
|
@34025
|
4 years |
davidb |
Icon to glTF 3D model/zip files
|
|
|
@34024
|
4 years |
davidb |
Couple of overlooked files for the initial set of files for Global …
|
|
|
@34023
|
4 years |
davidb |
Initial set of files for Global Digital Heritage glTF demonstration …
|
|
|
@34022
|
4 years |
davidb |
Collection for demonstration VR model artefacts
|
|
|
@34021
|
4 years |
davidb |
Tidy up on help/usage message
|
|
|
@34020
|
4 years |
davidb |
Changed to using newer version (8.5.51) of Tomcat
|
|
|
@34019
|
4 years |
kjdon |
replaced a couple of text strings
|
|
|
@34018
|
4 years |
kjdon |
check for error element in response - add that in if present, instead …
|
|
|
@34017
|
4 years |
kjdon |
add error element, don't just print a message to log, if we have …
|
|
|
@34016
|
4 years |
kjdon |
added cpan folder to @INC, as something is expecting to find JSON.pm - …
|
|
|
@34015
|
4 years |
davidb |
Further elimination of PJ related HTML/templates
|
|
|
@34014
|
4 years |
davidb |
Added in video player template; removed PJ templates
|
|
|
@34013
|
4 years |
davidb |
Added hr line to break up sections
|
|
|
@34012
|
4 years |
davidb |
Images for Atea alt interface
|
|
|
@34011
|
4 years |
ak19 |
Piechart data for sites prepared for crawling and the piecharts for these
|
|
|
@34010
|
4 years |
davidb |
icon image for MP4 video
|
|
|
@34009
|
4 years |
davidb |
PJ based alternative interface for Atea
|
|
|
@34008
|
4 years |
davidb |
Alternative interface look-and-feel for the Atea project
|
|
|
@34007
|
4 years |
ak19 |
Prepared more data for the piecharts. This time for empty web pages vs …
|
|
|
@34006
|
4 years |
ak19 |
Committing more data I've collected for generating pie charts and the …
|
|
|
@34005
|
4 years |
ak19 |
InfoOnEmptyPagesNotInMongoDB.txt is now written out to a file, instead …
|
|
|
@34004
|
4 years |
ak19 |
Renaming csv file to have csv extension
|
|
|
@34003
|
4 years |
ak19 |
Redid the file with info on empty URL web pages as a csv file with …
|
|
|
@34002
|
4 years |
davidb |
Comment-based changes resulting from: (i) merging in differences from …
|
|
|
@34001
|
4 years |
ak19 |
Tentative total URLs from Common Crawl 12-month crawl data.
|
|
|
@34000
|
4 years |
ak19 |
Some debugging and other minor changes
|
|
|
@33999
|
4 years |
ak19 |
Common crawl 12 month urls and CC provided stats
|
|
|
@33998
|
4 years |
davidb |
Removed import statement that is no longer used, and was stopping …
|
|
|
@33997
|
4 years |
davidb |
Top-level folder for MARS related Greenstone3 code
|
|
|
@33996
|
4 years |
ak19 |
Accidentally committed the wrong thing in previous commit. Attempting …
|
|
|
@33995
|
4 years |
ak19 |
There was no Expat.so for perl 5.18 so am recompiling and committing that
|
|
|
@33994
|
4 years |
davidb |
The introduction of UTF8Control class means we can now work directly …
|
|
|
@33993
|
4 years |
kjdon |
when downloading a pdf, browsers seem to make more than one request - …
|
|
|
@33992
|
4 years |
davidb |
Notes at start of file updated
|
|
|
@33991
|
4 years |
davidb |
A version of the tomcat/conf/server.xml file that is better aligned …
|
|
|
@33990
|
4 years |
davidb |
Some white-space changes for consistency with newer …
|
|
|
@33989
|
4 years |
davidb |
In a default setup, AJP is not used => so not needed. Commented out to …
|
|
|
@33988
|
4 years |
ak19 |
1. Print out which web pages of which web site's dump.txt were empty. …
|
|
|
@33987
|
4 years |
ak19 |
Output of re-running NutchTextDumpToMongoDB to print out which web …
|
|
|
@33986
|
4 years |
ak19 |
Dr Bainbridge investigated the original data set more
|
|
|
@33985
|
4 years |
ak19 |
Data to back the piechart I need to make that will illustrate how we …
|
|
|
@33984
|
4 years |
ak19 |
Simple class to summarise some basic counts of the input common crawl data
|
|
|
@33983
|
4 years |
ak19 |
More sensible name for method which had too long kept its old name …
|
|
|
@33982
|
4 years |
ak19 |
SummaryTool.java now processes the handcrafted UNIQUE domains counts …
|
|
|
@33981
|
4 years |
ak19 |
As Dr Bainbridge suggested, code now opens a new firefox tab with a …
|
|
|
@33980
|
4 years |
ak19 |
Additional comments
|
|
|
@33979
|
4 years |
ak19 |
Clearly stating that counts are of unique domains
|
|
|
@33978
|
4 years |
ak19 |
Opens all geoJSON maps in new tabs instead of waiting for user to have …
|
|
|
@33977
|
4 years |
ak19 |
Added something on precision vs recall being applicable to our …
|
|
|
@33976
|
4 years |
ak19 |
Adding in what I could remember of Dr Bainbridge's statement about the …
|
|
|
@33975
|
4 years |
kjdon |
some mods to do with allowing multiple oaiservers. need …
|
|
|
@33974
|
4 years |
kjdon |
added in new oai.servlets field - if you want to run two oaiservlets, …
|
|
|
@33973
|
4 years |
kjdon |
tidied up the file a bit. added new servlet_url param to oaiserver - …
|
|
|
@33972
|
4 years |
kjdon |
fixed a typo in a comment
|
|
|
@33971
|
4 years |
kjdon |
get servlet_url param and pass to getOAIConfigXML, as now the files …
|
|
|
@33970
|
4 years |
kjdon |
changed OAIConfig naming to OAIConfig-oaiserver.xml - so multiple …
|
|
|
@33969
|
4 years |
kjdon |
we no longer use OAIConfig.xml as the filename, now we use eg …
|
|
|
@33968
|
4 years |
kjdon |
pass in oai_config from server, rather than reading it in itself
|
|
|
@33967
|
4 years |
kjdon |
you might want to change the oaiserver url, eg if you have 2 oai …
|
|
|
@33966
|
4 years |
ak19 |
Added the origSequence and basicDomain columns to the random 260 web …
|
|
|
@33965
|
4 years |
ak19 |
1. Adding a basicDomain column (stripped of http/https and www prefix) …
|
|
|
@33964
|
4 years |
ak19 |
2 records were missing a value for the qualityLevel column.
|
|
|
@33963
|
4 years |
ak19 |
Added a new helper method to MongoDBQueryer.java to add numPagesInMRI …
|
|
|
@33962
|
4 years |
ak19 |
2 fields changed, as one was missed out and the other incorrectly …
|
|
|
@33961
|
4 years |
ak19 |
New category, LINK_TEXT, introduced for the random web page URL samples.
|
|
|
@33960
|
4 years |
ak19 |
Reviewed all the random sample web page URLs marked …
|
|
|
@33959
|
4 years |
ak19 |
URIEncoding the mapData makes it unparseable by geojson.io
|
|
|
@33958
|
4 years |
ak19 |
There were other xsl files using the original depositorTitleAndLink …
|
|
|
@33957
|
4 years |
ak19 |
1. depositor related interface display modified to work with recent …
|
|
|
@33956
|
4 years |
ak19 |
Related to commit 33953: made lots of accidental commits in rev 33953, …
|
|
|
@33955
|
4 years |
ak19 |
Undoing accidental commit of unintended files.
|
|
|
@33954
|
4 years |
ak19 |
Accidentally committed with other files. Undoing.
|
|
|
@33953
|
4 years |
ak19 |
Depositor link not used
|
|
|
@33952
|
4 years |
ak19 |
Minor changes for processing
|
|
|
@33951
|
4 years |
ak19 |
Reviewed the qualityLevel column where LITTLE_TEXT was assigned.
|
|
|
@33950
|
4 years |
ak19 |
Reviewed the qualityLevel column where MIXED_TEXT was assigned.
|
|
|
@33949
|
4 years |
ak19 |
Reviewed the qualityLevel column where NAV was assigned.
|
|
|
@33948
|
4 years |
ak19 |
Reviewed the random sampled web page URLs marked as …
|
|
|
@33947
|
4 years |
ak19 |
Some more questionmarked field values assigned.
|
|
|
@33946
|
4 years |
ak19 |
1. New function to handle user input assigning the newly introduced …
|
|
|
@33945
|
4 years |
ak19 |
Added a 4th column for all 260 sample web page URLs and have used the …
|
|
|
@33944
|
4 years |
ak19 |
Added the isReallyInMRI column after manually inspecting the remaining …
|
|
|
@33943
|
4 years |
davidb |
Further tweaking of javah check after it failed to work on Bedrock LSB
|
|
|
@33942
|
4 years |
davidb |
Further tweaking of javah check after it failed to work on Bedrock LSB
|
|
|
@33941
|
4 years |
ak19 |
1. Uppercase 3rd field (Y/N/? field) read back in from file before …
|
|
|
@33940
|
4 years |
ak19 |
1. In order to make it easier to do the manual work of inspecting 260 …
|
|
|
@33939
|
4 years |
ak19 |
1. Old random samples file doesn't apply as we're not sampling by …
|
|
|
@33938
|
4 years |
ak19 |
1. Don't regenerate random sample of web page urls and full web page …
|
|
|
@33937
|
4 years |
ak19 |
New counts of manual sites after reingesting into MongoDB. Forgot to …
|
|
|
@33936
|
4 years |
ak19 |
Renaming old file to place with new counts after reingesting into MongoDB.
|
|
|
@33935
|
4 years |
davidb |
Additional check added into get-isis target
|
|
|
@33934
|
4 years |
davidb |
Removal of static code block calling ancient/deprecated static …
|
|
|