Timeline
2020-03-18:
- 18:24 Changeset [34088] by
- Cher (Chai Lin): Adding the Opotiki site, with its 3 collections. The …
- 18:20 Changeset [34087] by
- Cher (Chai Lin): Adding the Opotiki interface.
2020-03-15:
- 17:47 Changeset [34086] by
- Refactoring code to have generateRedirect method needed some …
- 17:41 Changeset [34085] by
- Reload collection after formatting/collectionConfig changes
- 17:40 Changeset [34084] by
- upgraded and renamed
- 17:40 Changeset [34083] by
- upgraded and renamed
- 17:39 Changeset [34082] by
- upgraded and renamed
- 17:37 Changeset [34081] by
- Collection desc text improvements; single doc security example added …
- 17:36 Changeset [34080] by
- Collections now structured as groups
- 17:35 Changeset [34079] by
- Change to protocol neutral way of retrieving jquery-ui JS; removal of …
- 17:33 Changeset [34078] by
- Some CSS adjustments for group and collection main icons
- 14:59 Changeset [34077] by
- Neatened version of script, renamed, and moved to be in the collect dir
- 14:57 Changeset [34076] by
- Changed to using WaveSurfer to play audio, rather than raw <audio> tag
- 13:32 Changeset [34075] by
- No longer needed: use the col specific reconfigure now
- 13:31 Changeset [34074] by
- Does just the collection
2020-03-14:
- 23:30 Changeset [34073] by
- cwd and site need to be set earlier on
- 23:21 Changeset [34072] by
- Updated to add download link
- 23:20 Changeset [34071] by
- Script changed to work out the site-name from the pwd (in addition to …
- 18:24 Changeset [34070] by
- Don't want this file under svn control for a collection that starts …
- 18:18 Changeset [34069] by
- Update on files to ignore
- 18:14 Changeset [34068] by
- Standardizing on new name for this type of script
- 18:13 Changeset [34067] by
- Fixed typo, added pw hint
- 18:12 Changeset [34066] by
- Update on files to ignore
- 18:10 Changeset [34065] by
- Do not want these under svn control for a collection with an empty …
- 18:09 Changeset [34064] by
- Edited to include pw hint
- 18:06 Changeset [34063] by
- Script rename
- 18:05 Changeset [34062] by
- Fixed typo in script
- 17:53 Changeset [34061] by
- Promoting the Alternative interface to now be the main Atea interface
- 17:51 Changeset [34060] by
- Deprecating this version of the Atea interface
- 17:48 Changeset [34059] by
- Moving the original localhost _LL.properties files out of the way for …
- 17:47 Changeset [34058] by
- To be used to hold the original localhost _LL.properties files
- 17:41 Changeset [34057] by
- Updated reference to glTF.png icon
- 17:40 Changeset [34056] by
- Moved from interface location, as referenced in the siteConfig.xml file
- 17:39 Changeset [34055] by
- files to ignore
- 17:38 Changeset [34054] by
- Directories to ignore
- 17:36 Changeset [34053] by
- Don't want these under SVN control in a collection that starts with no …
- 17:34 Changeset [34052] by
- Not working with GLIL, so having such a file will only increasingly …
- 17:26 Changeset [34051] by
- Update on files to ignore
- 17:23 Changeset [34050] by
- Once the zip files are added in, do not want svn to report the files …
- 17:19 Changeset [34049] by
- files to ignore
- 17:18 Changeset [34048] by
- directories to ignore
- 17:15 Changeset [34047] by
- Further directory to ignore
- 17:10 Changeset [34046] by
- Some directories to ignore
- 17:09 Changeset [34045] by
- Some files to ignore
- 17:08 Changeset [34044] by
- Some directories to ignore
- 17:07 Changeset [34043] by
- Some files to ignore
- 17:06 Changeset [34042] by
- Initial set of files
- 17:06 Changeset [34041] by
- Initial set of files
- 17:00 Changeset [34040] by
- Top-level folder for MP3s collection sourced from hemi-dl
- 17:00 Changeset [34039] by
- Top-level folder for PDF collection sourced from hemi-dl
- 16:54 Changeset [34038] by
- Run this script to populate the import folder
- 16:52 Changeset [34037] by
- No longer needed as will be formed by untarring import.tar.gz
- 16:52 Changeset [34036] by
- import folder with the result of running the NZ Digital API searching …
- 16:49 Changeset [34035] by
- Directories to ignore
- 16:32 Changeset [34034] by
- Some files to ignore
- 16:32 Changeset [34033] by
- Solr schema.xml changes that have flowed through from changes in the …
- 16:25 Changeset [34032] by
- Files to ignore
- 16:24 Changeset [34031] by
- Dedicated scripts for this collection to build and activate it
- 16:23 Changeset [34030] by
- Ignore archives and index dirs
- 16:15 Changeset [34029] by
- Moved to atea site
2020-03-13:
- 23:19 Changeset [34028] by
- Tweaks to overall interface look-and-feel
- 23:19 Changeset [34027] by
- Tweaks to overall interface look-and-feel
- 23:18 Changeset [34026] by
- Used to provide the gray jquery-ui theme to Atea
- 23:16 Changeset [34025] by
- Icon for glTF 3D model/zip files
- 23:15 Changeset [34024] by
- Couple of over-looked files for the initial set of files for Global …
- 23:14 Changeset [34023] by
- Initial set of files for Global Digital Heritage glTF demonstration …
- 23:09 Changeset [34022] by
- Collection for demonstration VR model artefacts
2020-03-12:
- 17:22 Changeset [34021] by
- Tidy up on help/usage message
- 17:20 Changeset [34020] by
- Changed to using newer version (8.5.51) of Tomcat
- 15:04 Changeset [34019] by
- replaced a couple of text strings
- 13:42 Changeset [34018] by
- check for error element in response - add that in if present, instead …
- 13:41 Changeset [34017] by
- add error element, don't just print a message to log, if we have …
- 13:32 Changeset [34016] by
- added cpan folder to @INC, as something is expecting to find JSON.pm - …
2020-03-10:
- 21:03 Changeset [34015] by
- Further elimination of PJ related HTML/templates
- 21:02 Changeset [34014] by
- Added in video player template; remove PJ templates
- 21:01 Changeset [34013] by
- Added hr line to break up sections
- 20:53 Changeset [34012] by
- Images for Atea alt interface
- 20:45 Changeset [34011] by
- Piechart data for sites prepared for crawling and the piecharts for these
- 20:45 Changeset [34010] by
- icon image for MP4 video
- 20:25 Changeset [34009] by
- PJ based alternative interface for Atea
- 20:17 Changeset [34008] by
- Alternative interface look-and-feel for the Atea project
- 19:56 Changeset [34007] by
- Prepared more data for the piecharts. This time for empty web pages vs …
- 18:51 Changeset [34006] by
- Committing more data I've collected for generating pie charts and the …
- 17:33 Changeset [34005] by
- InfoOnEmptyPagesNotInMongoDB.txt is now written out to a file, instead …
- 17:27 Changeset [34004] by
- Renaming csv file to have csv extension
- 17:26 Changeset [34003] by
- Redid the file with info on empty URL web pages as a csv file with …
- 12:09 Changeset [34002] by
- Comment-based changes resulting from: (i) merging in differences from …
2020-03-09:
- 18:56 Changeset [34001] by
- Tentative total urls from common crawl 12 month crawl data.
- 18:55 Changeset [34000] by
- Some debugging and other minor changes
- 17:34 Changeset [33999] by
- Common crawl 12 month urls and CC provided stats
2020-03-06:
- 17:49 Changeset [33998] by
- Removed import statement that is no longer used, and was stopping …
- 15:55 Changeset [33997] by
- Top-level folder for MARS related Greenstone3 code
- 15:18 Changeset [33996] by
- Accidentally committed the wrong thing in previous commit. Attempting …
- 15:14 Changeset [33995] by
- There was no Expat.so for perl 5.18 so am recompiling and committing that
2020-03-03:
- 14:42 Changeset [33994] by
- The introduction of UTF8Control class means we can now work directly …
2020-03-02:
- 14:10 Changeset [33993] by
- when downloading a pdf, browsers seem to make more than one request - …
2020-03-01:
- 16:41 Changeset [33992] by
- Notes at start of file updated
- 16:35 Changeset [33991] by
- A version of the tomcat/conf/server.xml file that is better aligned …
- 16:29 Changeset [33990] by
- Some white-space changes for consistency with newer …
- 15:16 Changeset [33989] by
- In a default setup, AJP is not used => so not needed. Commented out to …
2020-02-28:
- 22:09 Changeset [33988] by
- 1. Print out which web pages of which web site's dump.txt were empty. …
- 22:08 Changeset [33987] by
- Output of re-running NutchTextDumpToMongoDB to print out which web …
- 22:07 Changeset [33986] by
- Dr Bainbridge investigated the original data set more
2020-02-27:
- 21:49 Changeset [33985] by
- Data to back the piechart I need to make that will illustrate how we …
- 21:44 Changeset [33984] by
- Simple class to summarise some basic counts of the input common crawl data
- 20:26 Changeset [33983] by
- More sensible name for method which had too long kept its old name …
2020-02-26:
- 21:59 Changeset [33982] by
- SummaryTool.java now processes the handcrafted UNIQUE domains counts …
- 21:19 Changeset [33981] by
- As Dr Bainbridge suggested, code now opens a new firefox tab with a …
- 21:11 Changeset [33980] by
- Additional comments
- 21:00 Changeset [33979] by
- Clearly stating that counts are of unique domains
- 19:57 Changeset [33978] by
- Opens all geoJSON maps in new tabs instead of waiting for user to have …
- 18:37 Changeset [33977] by
- Added something on precision vs recall being applicable to our …
- 18:28 Changeset [33976] by
- Adding in what I could remember of Dr Bainbridge's statement about the …
2020-02-25:
- 14:46 Changeset [33975] by
- some mods to do with allowing multiple oaiservers. need …
- 14:14 Changeset [33974] by
- added in new oai.servlets field - if you want to run two oaiservlets, …
- 14:01 Changeset [33973] by
- tidied up the file a bit. added new servlet_url param to oaiserver - …
- 13:47 Changeset [33972] by
- fixed a typo in a comment
- 13:47 Changeset [33971] by
- get servlet_url param and pass to getOAIConfigXML, as now the files …
- 13:46 Changeset [33970] by
- changed OAIConfig naming to OAIConfig-oaiserver.xml - so multiple …
- 13:39 Changeset [33969] by
- we no longer use OAIConfig.xml as the filename, now we use eg …
- 13:37 Changeset [33968] by
- pass in oai_config from server, rather than reading it in itself
- 13:36 Changeset [33967] by
- you might want to change the oaiserver url, eg if you have 2 oai …
2020-02-21:
- 21:00 Changeset [33966] by
- Added the origSequence and basicDomain columns to the random 260 web …
- 20:59 Changeset [33965] by
- 1. Adding a basicDomain column (stripped of http/https and www prefix) …
- 19:57 Changeset [33964] by
- 2 records were missing a value for the qualityLevel column.
2020-02-20:
- 22:12 Changeset [33963] by
- Added a new helper method to MongoDBQueryer.java to add numPagesInMRI …
- 22:07 Changeset [33962] by
- 2 fields changed, as one was missed out and the other incorrectly …
- 20:24 Changeset [33961] by
- New category, LINK_TEXT, introduced for the random web page URL samples.
- 20:22 Changeset [33960] by
- Reviewed all the random sample web page URLs marked …
- 20:06 Changeset [33959] by
- URIEncoding the mapData makes it unparseable by geojson.io
- 19:32 Changeset [33958] by
- There were other xsl files using the original depositorTitleAndLink …
- 19:24 Changeset [33957] by
- 1. depositor related interface display modified to work with recent …
- 18:28 Changeset [33956] by
- Related to commit 33953: made lots of accidental commits in rev 33953, …
- 18:26 Changeset [33955] by
- Undoing accidental commit of unintended files.
- 18:21 Changeset [33954] by
- Accidentally committed with other files. Undoing.
- 18:19 Changeset [33953] by
- Depositor link not used
2020-02-18:
- 23:35 Changeset [33952] by
- Minor changes for processing
- 23:33 Changeset [33951] by
- Reviewed the qualityLevel column where LITTLE_TEXT was assigned.
- 23:28 Changeset [33950] by
- Reviewed the qualityLevel column where MIXED_TEXT was assigned.
- 23:22 Changeset [33949] by
- Reviewed the qualityLevel column where NAV was assigned.
- 22:56 Changeset [33948] by
- Reviewed the random sampled web page URLs marked as …
- 22:07 Changeset [33947] by
- Some more questionmarked field values assigned.
- 21:58 Changeset [33946] by
- 1. New function to handle user input assigning the newly introduced …
- 21:48 Changeset [33945] by
- Added a 4th column for all 260 sample web page URLs and have used the …
- 16:44 Changeset [33944] by
- Added the isReallyInMRI column after manually inspecting the remaining …
- 15:56 Changeset [33943] by
- Further tweaking of javah check after it failed to work on Bedrock LSB
- 15:55 Changeset [33942] by
- Further tweaking of javah check after it failed to work on Bedrock LSB
- 15:18 Changeset [33941] by
- 1. Uppercase 3rd field (Y/N/? field) read back in from file before …
2020-02-17:
- 22:16 Changeset [33940] by
- 1. In order to make it easier to do the manual work of inspecting 260 …
- 16:22 Changeset [33939] by
- 1. Old random samples file doesn't apply as we're not sampling by …
- 16:10 Changeset [33938] by
- 1. Don't regenerate random sample of web page urls and full web page …
- 16:06 Changeset [33937] by
- New counts of manual sites after reingesting into MongoDB. Forgot to …
- 16:05 Changeset [33936] by
- Renaming old file to place with new counts after reingesting into MongoDB.