Diff Rev Age Author Log Message
(edit) @34031   4 years davidb Dedicated scripts for this collection to build and activate it
(edit) @34030   4 years davidb Ignore archives and index dirs
(edit) @34029   4 years davidb Moved to atea site
(edit) @34028   4 years davidb Tweaks to overall interface look-and-feel
(edit) @34027   4 years davidb Tweaks to overall interface look-and-feel
(edit) @34026   4 years davidb Used to provide the gray jquery-ui theme to Atea
(edit) @34025   4 years davidb Icon to glTF 3D model/zip files
(edit) @34024   4 years davidb Couple of overlooked files for the initial set of files for Global …
(edit) @34023   4 years davidb Initial set of files for Global Digital Heritage glTF demonstration …
(edit) @34022   4 years davidb Collection for demonstration VR model artefacts
(edit) @34021   4 years davidb Tidy up on help/usage message
(edit) @34020   4 years davidb Changed to using newer version (8.5.51) of Tomcat
(edit) @34019   4 years kjdon replaced a couple of text strings
(edit) @34018   4 years kjdon check for error element in response - add that in if present, instead …
(edit) @34017   4 years kjdon add error element, don't just print a message to log, if we have …
(edit) @34016   4 years kjdon added cpan folder to @INC, as something is expecting to find JSON.pm - …
(edit) @34015   4 years davidb Further elimination of PJ related HTML/templates
(edit) @34014   4 years davidb Added in video player template; remove PJ templates
(edit) @34013   4 years davidb Added hr line to break up sections
(edit) @34012   4 years davidb Images for Atea alt interface
(edit) @34011   4 years ak19 Piechart data for sites prepared for crawling and the piecharts for these
(edit) @34010   4 years davidb icon image for MP4 video
(edit) @34009   4 years davidb PJ based alternative interface for Atea
(edit) @34008   4 years davidb Alternative interface look-and-feel for the Atea project
(edit) @34007   4 years ak19 Prepared more data for the piecharts. This time for empty web pages vs …
(edit) @34006   4 years ak19 Committing more data I've collected for generating pie charts and the …
(edit) @34005   4 years ak19 InfoOnEmptyPagesNotInMongoDB.txt is now written out to a file, instead …
(edit) @34004   4 years ak19 Renaming csv file to have csv extension
(edit) @34003   4 years ak19 Redid the file with info on empty URL web pages as a csv file with …
(edit) @34002   4 years davidb Comment-based changes resulting from: (i) merging in differences from …
(edit) @34001   4 years ak19 Tentative total urls from common crawl 12 month crawl data.
(edit) @34000   4 years ak19 Some debugging and other minor changes
(edit) @33999   4 years ak19 Common crawl 12 month urls and CC provided stats
(edit) @33998   4 years davidb Removed import statement that is no longer used, and was stopping …
(edit) @33997   4 years davidb Top-level folder for MARS related Greenstone3 code
(edit) @33996   4 years ak19 Accidentally committed the wrong thing in previous commit. Attempting …
(edit) @33995   4 years ak19 There was no Expat.so for perl 5.18 so am recompiling and committing that
(edit) @33994   4 years davidb The introduction of UTF8Control class means we can now work directly …
(edit) @33993   4 years kjdon when downloading a pdf, browsers seem to make more than one request - …
(edit) @33992   4 years davidb Notes at start of file updated
(edit) @33991   4 years davidb A version of the tomcat/conf/server.xml file that is better aligned …
(edit) @33990   4 years davidb Some white-space changes for consistency with newer …
(edit) @33989   4 years davidb In a default setup, AJP is not used => so not needed. Commented out to …
(edit) @33988   4 years ak19 1. Print out which web pages of which web site's dump.txt were empty. …
(edit) @33987   4 years ak19 Output of re-running NutchTextDumpToMongoDB to print out which web …
(edit) @33986   4 years ak19 Dr Bainbridge investigated the original data set more
(edit) @33985   4 years ak19 Data to back the piechart I need to make that will illustrate how we …
(edit) @33984   4 years ak19 Simple class to summarise some basic counts of the input common crawl data
(edit) @33983   4 years ak19 More sensible name for method which had too long kept its old name …
(edit) @33982   4 years ak19 SummaryTool.java now processed the handcrafted UNIQUE domains counts …
(edit) @33981   4 years ak19 As Dr Bainbridge suggested, code now opens a new firefox tab with a …
(edit) @33980   4 years ak19 Additional comments
(edit) @33979   4 years ak19 Clearly stating that counts are of unique domains
(edit) @33978   4 years ak19 Opens all geoJSON maps in new tabs instead of waiting for user to have …
(edit) @33977   4 years ak19 Added something on precision vs recall being applicable to our …
(edit) @33976   4 years ak19 Adding in what I could remember of Dr Bainbridge's statement about the …
(edit) @33975   4 years kjdon some mods to do with allowing multiple oaiservers. need …
(edit) @33974   4 years kjdon added in new oai.servlets field - if you want to run two oaiservlets, …
(edit) @33973   4 years kjdon tidied up the file a bit. added new servlet_url param to oaiserver - …
(edit) @33972   4 years kjdon fixed a typo in a comment
(edit) @33971   4 years kjdon get servlet_url param and pass to getOAIConfigXML, as now the files …
(edit) @33970   4 years kjdon changed OAIConfig naming to OAIConfig-oaiserver.xml - so multiple …
(edit) @33969   4 years kjdon we no longer use OAIConfig.xml as the filename, now we use eg …
(edit) @33968   4 years kjdon pass in oai_config from server, rather than reading it in itself
(edit) @33967   4 years kjdon you might want to change the oaiserver url, eg if you have 2 oai …
(edit) @33966   4 years ak19 Added the origSequence and basicDomain columns to the random 260 web …
(edit) @33965   4 years ak19 1. Adding a basicDomain column (stripped of http/https and www prefix) …
(edit) @33964   4 years ak19 2 records were missing a value for the qualityLevel column.
(edit) @33963   4 years ak19 Added a new helper method to MongoDBQueryer.java to add numPagesInMRI …
(edit) @33962   4 years ak19 2 fields changed, as one was missed out and the other incorrectly …
(edit) @33961   4 years ak19 New category, LINK_TEXT, introduced for the random web page URL samples.
(edit) @33960   4 years ak19 Reviewed all the random sample web page URLs marked …
(edit) @33959   4 years ak19 URIEncoding the mapData makes it unparseable by geojson.io
(edit) @33958   4 years ak19 There were other xsl files using the original depositorTitleAndLink …
(edit) @33957   4 years ak19 1. depositor related interface display modified to work with recent …
(edit) @33956   4 years ak19 Related to commit 33953: made lots of accidental commits in rev 33953, …
(edit) @33955   4 years ak19 Undoing accidental commit of unintended files.
(edit) @33954   4 years ak19 Accidentally committed with other files. Undoing.
(edit) @33953   4 years ak19 Depositor link not used
(edit) @33952   4 years ak19 Minor changes for processing
(edit) @33951   4 years ak19 Reviewed the qualityLevel column where LITTLE_TEXT was assigned.
(edit) @33950   4 years ak19 Reviewed the qualityLevel column where MIXED_TEXT was assigned.
(edit) @33949   4 years ak19 Reviewed the qualityLevel column where NAV was assigned.
(edit) @33948   4 years ak19 Reviewed the random sampled web page URLs marked as …
(edit) @33947   4 years ak19 Some more questionmarked field values assigned.
(edit) @33946   4 years ak19 1. New function to handle user input assigning the newly introduced …
(edit) @33945   4 years ak19 Added a 4th column for all 260 sample web page URLs and have used the …
(edit) @33944   4 years ak19 Added the isReallyInMRI column after manually inspecting the remaining …
(edit) @33943   4 years davidb Further tweaking of javah check after it failed to work on Bedrock LSB
(edit) @33942   4 years davidb Further tweaking of javah check after it failed to work on Bedrock LSB
(edit) @33941   4 years ak19 1. Uppercase 3rd field (Y/N/? field) read back in from file before …
(edit) @33940   4 years ak19 1. In order to make it easier to do the manual work of inspecting 260 …
(edit) @33939   4 years ak19 1. Old random samples file doesn't apply as we're not sampling by …
(edit) @33938   4 years ak19 1. Don't regenerate random sample of web page urls and full web page …
(edit) @33937   4 years ak19 New counts of manual sites after reingesting into MongoDB. Forgot to …
(edit) @33936   4 years ak19 Renaming old file to place with new counts after reingesting into MongoDB.
(edit) @33935   4 years davidb Additional check added into get-isis target
(edit) @33934   4 years davidb Removal of static code block calling ancient/deprecated static …
(edit) @33933   4 years davidb Changed 8-spaces to tab chars in Makefile.in. Original problem caused …
(edit) @33932   4 years davidb Commented out Java version warning message, as it presents as …