Timeline



2020-03-18:

18:24 Changeset [34088] by ak19
Cher (Chai Lin): Adding the Opotiki site, with its 3 collections. The …
18:20 Changeset [34087] by ak19
Cher (Chai Lin): Adding the Opotiki interface.

2020-03-15:

17:47 Changeset [34086] by davidb
Refactoring code to have generateRedirect method needed some …
17:41 Changeset [34085] by davidb
Reload collection after formatting/collectionConfig changes
17:40 Changeset [34084] by davidb
upgraded and renamed
17:40 Changeset [34083] by davidb
upgraded and renamed
17:39 Changeset [34082] by davidb
upgraded and renamed
17:37 Changeset [34081] by davidb
Collection desc text improvements; single doc security example added …
17:36 Changeset [34080] by davidb
Collections now structured as groups
17:35 Changeset [34079] by davidb
Change to protocol neutral way of retrieving jquery-ui JS; removal of …
17:33 Changeset [34078] by davidb
Some CSS adjustments for group and collection main icons
14:59 Changeset [34077] by davidb
Neatened version of script, renamed, and moved to be in the collect dir
14:57 Changeset [34076] by davidb
Changed to using WaveSurfer to play audio, rather than raw <audio> tag
13:32 Changeset [34075] by davidb
No longer needed: use the col specific reconfigure now
13:31 Changeset [34074] by davidb
Does just the collection

2020-03-14:

23:30 Changeset [34073] by davidb
cwd and site need to be set earlier on
23:21 Changeset [34072] by davidb
Updated to add download link
23:20 Changeset [34071] by davidb
Script changed to work out the site-name from the pwd (in addition to …
18:24 Changeset [34070] by davidb
Don't want this file under svn control for a collection that starts …
18:18 Changeset [34069] by davidb
Update on files to ignore
18:14 Changeset [34068] by davidb
Standardizing on new name for this type of script
18:13 Changeset [34067] by davidb
Fixed typo, added pw hint
18:12 Changeset [34066] by davidb
Update on files to ignore
18:10 Changeset [34065] by davidb
Do not want these under svn control for a collection with an empty …
18:09 Changeset [34064] by davidb
Edited to include pw hint
18:06 Changeset [34063] by davidb
Script rename
18:05 Changeset [34062] by davidb
Fixed typo in script
17:53 Changeset [34061] by davidb
Promoting the Alternative interface to now be the main Atea interface
17:51 Changeset [34060] by davidb
Deprecating this version of the Atea interface
17:48 Changeset [34059] by davidb
Moving the original localhost _LL.properties files out of the way for …
17:47 Changeset [34058] by davidb
To be used to hold the original localhost _LL.properties files
17:41 Changeset [34057] by davidb
Updated reference to glTF.png icon
17:40 Changeset [34056] by davidb
Moved from interface location, as referenced in the siteConfig.xml file
17:39 Changeset [34055] by davidb
files to ignore
17:38 Changeset [34054] by davidb
Directories to ignore
17:36 Changeset [34053] by davidb
Don't want these under SVN control in a collection that starts with no …
17:34 Changeset [34052] by davidb
Not working with GLIL, so having such a file will only increasingly …
17:26 Changeset [34051] by davidb
Update on files to ignore
17:23 Changeset [34050] by davidb
Once the zip files are added in, do not want svn to report the files …
17:19 Changeset [34049] by davidb
files to ignore
17:18 Changeset [34048] by davidb
directories to ignore
17:15 Changeset [34047] by davidb
Further directory to ignore
17:10 Changeset [34046] by davidb
Some directories to ignore
17:09 Changeset [34045] by davidb
Some files to ignore
17:08 Changeset [34044] by davidb
Some directories to ignore
17:07 Changeset [34043] by davidb
Some files to ignore
17:06 Changeset [34042] by davidb
Initial set of files
17:06 Changeset [34041] by davidb
Initial set of files
17:00 Changeset [34040] by davidb
Top-level folder for MP3s collection sourced from hemi-dl
17:00 Changeset [34039] by davidb
Top-level folder for PDF collection sourced from hemi-dl
16:54 Changeset [34038] by davidb
Run this script to populate the import folder
16:52 Changeset [34037] by davidb
No longer needed as will be formed by untarring import.tar.gz
16:52 Changeset [34036] by davidb
import folder with the result of running the NZ Digital API searching …
16:49 Changeset [34035] by davidb
Directories to ignore
16:32 Changeset [34034] by davidb
Some files to ignore
16:32 Changeset [34033] by davidb
Solr schema.xml changes that have flowed through from changes in the …
16:25 Changeset [34032] by davidb
Files to ignore
16:24 Changeset [34031] by davidb
Dedicated scripts for this collection to build and activate it
16:23 Changeset [34030] by davidb
Ignore archives and index dirs
16:15 Changeset [34029] by davidb
Moved to atea site

2020-03-13:

23:19 Changeset [34028] by davidb
Tweaks to overall interface look-and-feel
23:19 Changeset [34027] by davidb
Tweaks to overall interface look-and-feel
23:18 Changeset [34026] by davidb
Used to provide the gray jquery-ui theme to Atea
23:16 Changeset [34025] by davidb
Icon to glTF 3D model/zip files
23:15 Changeset [34024] by davidb
Couple of over-looked files for the initial set of files for Global …
23:14 Changeset [34023] by davidb
Initial set of files for Global Digital Heritage glTF demonstration …
23:09 Changeset [34022] by davidb
Collection for demonstration VR model artefacts

2020-03-12:

17:22 Changeset [34021] by davidb
Tidy up on help/usage message
17:20 Changeset [34020] by davidb
Changed to using newer version (8.5.51) of Tomcat
15:04 Changeset [34019] by kjdon
replaced a couple of text strings
13:42 Changeset [34018] by kjdon
check for error element in response - add that in if present, instead …
13:41 Changeset [34017] by kjdon
add error element, don't just print a message to log, if we have …
13:32 Changeset [34016] by kjdon
added cpan folder to @INC, as something is expecting to find JSON.pm - …

2020-03-10:

21:03 Changeset [34015] by davidb
Further elimination of PJ related HTML/templates
21:02 Changeset [34014] by davidb
Added in video player template; remove PJ templates
21:01 Changeset [34013] by davidb
Added hr line to break up sections
20:53 Changeset [34012] by davidb
Images for Atea alt interface
20:45 Changeset [34011] by ak19
Piechart data for sites prepared for crawling and the piecharts for these
20:45 Changeset [34010] by davidb
icon image for MP4 video
20:25 Changeset [34009] by davidb
PJ based alternative interface for Atea
20:17 Changeset [34008] by davidb
Alternative interface look-and-feel for the Atea project
19:56 Changeset [34007] by ak19
Prepared more data for the piecharts. This time for empty web pages vs …
18:51 Changeset [34006] by ak19
Committing more data I've collected for generating pie charts and the …
17:33 Changeset [34005] by ak19
InfoOnEmptyPagesNotInMongoDB.txt is now written out to a file, instead …
17:27 Changeset [34004] by ak19
Renaming csv file to have csv extension
17:26 Changeset [34003] by ak19
Redid the file with info on empty URL web pages as a csv file with …
12:09 Changeset [34002] by davidb
Comment-based changes resulting from: (i) merging in differences from …

2020-03-09:

18:56 Changeset [34001] by ak19
Tentative total urls from common crawl 12 month crawl data.
18:55 Changeset [34000] by ak19
Some debugging and other minor changes
17:34 Changeset [33999] by ak19
Common crawl 12 month urls and CC provided stats

2020-03-06:

17:49 Changeset [33998] by davidb
Removed import statement that is no longer used, and was stopping …
15:55 Changeset [33997] by davidb
Top-level folder for MARS related Greenstone3 code
15:18 Changeset [33996] by ak19
Accidentally committed the wrong thing in previous commit. Attempting …
15:14 Changeset [33995] by ak19
There was no Expat.so for perl 5.18 so am recompiling and committing that

2020-03-03:

14:42 Changeset [33994] by davidb
The introduction of UTF8Control class means we can now work directly …

2020-03-02:

14:10 Changeset [33993] by kjdon
when downloading a pdf, browsers seem to make more than one request - …

2020-03-01:

16:41 Changeset [33992] by davidb
Notes at start of file updated
16:35 Changeset [33991] by davidb
A version of the tomcat/conf/server.xml file that is better aligned …
16:29 Changeset [33990] by davidb
Some white-space changes for consistency with newer …
15:16 Changeset [33989] by davidb
In a default setup, AJP is not used => so not needed. Commented out to …

2020-02-28:

22:09 Changeset [33988] by ak19
1. Print out which web pages of which web site's dump.txt were empty. …
22:08 Changeset [33987] by ak19
Output of re-running NutchTextDumpToMongoDB to print out which web …
22:07 Changeset [33986] by ak19
Dr Bainbridge investigated the original data set more

2020-02-27:

21:49 Changeset [33985] by ak19
Data to back the piechart I need to make that will illustrate how we …
21:44 Changeset [33984] by ak19
Simple class to summarise some basic counts of the input common crawl data
20:26 Changeset [33983] by ak19
More sensible name for method which had too long kept its old name …

2020-02-26:

21:59 Changeset [33982] by ak19
SummaryTool.java now processed the handcrafted UNIQUE domains counts …
21:19 Changeset [33981] by ak19
As Dr Bainbridge suggested, code now opens a new firefox tab with a …
21:11 Changeset [33980] by ak19
Additional comments
21:00 Changeset [33979] by ak19
Clearly stating that counts are of unique domains
19:57 Changeset [33978] by ak19
Opens all geoJSON maps in new tabs instead of waiting for user to have …
18:37 Changeset [33977] by ak19
Added something on precision vs recall being applicable to our …
18:28 Changeset [33976] by ak19
Adding in what I could remember of Dr Bainbridge's statement about the …

2020-02-25:

14:46 Changeset [33975] by kjdon
some mods to do with allowing multiple oaiservers. need …
14:14 Changeset [33974] by kjdon
added in new oai.servlets field - if you want to run two oaiservlets, …
14:01 Changeset [33973] by kjdon
tidied up the file a bit. added new servlet_url param to oaiserver - …
13:47 Changeset [33972] by kjdon
fixed a typo in a comment
13:47 Changeset [33971] by kjdon
get servlet_url param and pass to getOAIConfigXML, as now the files …
13:46 Changeset [33970] by kjdon
changed OAIConfig naming to OAIConfig-oaiserver.xml - so multiple …
13:39 Changeset [33969] by kjdon
we no longer use OAIConfig.xml as the filename, now we use eg …
13:37 Changeset [33968] by kjdon
pass in oai_config from server, rather than reading it in itself
13:36 Changeset [33967] by kjdon
you might want to change the oaiserver url, eg if you have 2 oai …

2020-02-21:

21:00 Changeset [33966] by ak19
Added the origSequence and basicDomain columns to the random 260 web …
20:59 Changeset [33965] by ak19
1. Adding a basicDomain column (stripped of http/https and www prefix) …
19:57 Changeset [33964] by ak19
2 records were missing a value for the qualityLevel column.

2020-02-20:

22:12 Changeset [33963] by ak19
Added a new helper method to MongoDBQueryer.java to add numPagesInMRI …
22:07 Changeset [33962] by ak19
2 fields changed, as one was missed out and the other incorrectly …
20:24 Changeset [33961] by ak19
New category, LINK_TEXT, introduced for the random web page URL samples.
20:22 Changeset [33960] by ak19
Reviewed all the random sample web page URLs marked …
20:06 Changeset [33959] by ak19
URIEncoding the mapData makes it unparseable by geojson.io
19:32 Changeset [33958] by ak19
There were other xsl files using the original depositorTitleAndLink …
19:24 Changeset [33957] by ak19
1. depositor related interface display modified to work with recent …
18:28 Changeset [33956] by ak19
Related to commit 33953: made lots of accidental commits in rev 33953, …
18:26 Changeset [33955] by ak19
Undoing accidental commit of unintended files.
18:21 Changeset [33954] by ak19
Accidentally committed with other files. Undoing.
18:19 Changeset [33953] by ak19
Depositor link not used

2020-02-18:

23:35 Changeset [33952] by ak19
Minor changes for processing
23:33 Changeset [33951] by ak19
Reviewed the qualityLevel column where LITTLE_TEXT was assigned.
23:28 Changeset [33950] by ak19
Reviewed the qualityLevel column where MIXED_TEXT was assigned.
23:22 Changeset [33949] by ak19
Reviewed the qualityLevel column where NAV was assigned.
22:56 Changeset [33948] by ak19
Reviewed the random sampled web page URLs marked as …
22:07 Changeset [33947] by ak19
Some more questionmarked field values assigned.
21:58 Changeset [33946] by ak19
1. New function to handle user input assigning the newly introduced …
21:48 Changeset [33945] by ak19
Added a 4th column for all 260 sample web page URLs and have used the …
16:44 Changeset [33944] by ak19
Added the isReallyInMRI column after manually inspecting the remaining …
15:56 Changeset [33943] by davidb
Further tweaking of javah check after it failed to work on Bedrock LSB
15:55 Changeset [33942] by davidb
Further tweaking of javah check after it failed to work on Bedrock LSB
15:18 Changeset [33941] by ak19
1. Uppercase 3rd field (Y/N/? field) read back in from file before …

2020-02-17:

22:16 Changeset [33940] by ak19
1. In order to make it easier to do the manual work of inspecting 260 …
16:22 Changeset [33939] by ak19
1. Old random samples file doesn't apply as we're not sampling by …
16:10 Changeset [33938] by ak19
1. Don't regenerate random sample of web page urls and full web page …
16:06 Changeset [33937] by ak19
New counts of manual sites after reingesting into MongoDB. Forgot to …
16:05 Changeset [33936] by ak19
Renaming old file to replace with new counts after reingesting into MongoDB.