Timeline
2020-03-14:
- 23:30 Changeset [34073] by
- cwd and site need to be set earlier on
- 23:21 Changeset [34072] by
- Updated to add download link
- 23:20 Changeset [34071] by
- Script changed to work out the site-name from the pwd (in addition to …
- 18:24 Changeset [34070] by
- Don't want this file under svn control for a collection that starts …
- 18:18 Changeset [34069] by
- Update on files to ignore
- 18:14 Changeset [34068] by
- Standardizing on new name for this type of script
- 18:13 Changeset [34067] by
- Fixed typo, added pw hint
- 18:12 Changeset [34066] by
- Update on files to ignore
- 18:10 Changeset [34065] by
- Do not want these under svn control for a collection with an empty …
- 18:09 Changeset [34064] by
- Edited to include pw hint
- 18:06 Changeset [34063] by
- Script rename
- 18:05 Changeset [34062] by
- Fixed typo in script
- 17:53 Changeset [34061] by
- Promoting the Alternative interface to now be the main Atea interface
- 17:51 Changeset [34060] by
- Deprecating this version of the Atea interface
- 17:48 Changeset [34059] by
- Moving the original localhost _LL.properties files out of the way for …
- 17:47 Changeset [34058] by
- To be used to hold the original localhost _LL.properties files
- 17:41 Changeset [34057] by
- Updated reference to glTF.png icon
- 17:40 Changeset [34056] by
- Moved from interface location, as referenced in the siteConfig.xml file
- 17:39 Changeset [34055] by
- files to ignore
- 17:38 Changeset [34054] by
- Directories to ignore
- 17:36 Changeset [34053] by
- Don't want these under SVN control in a collection that starts with no …
- 17:34 Changeset [34052] by
- Not working with GLIL, so having such a file will only increasingly …
- 17:26 Changeset [34051] by
- Update on files to ignore
- 17:23 Changeset [34050] by
- Once the zip files are added in, do not want svn to report the files …
- 17:19 Changeset [34049] by
- files to ignore
- 17:18 Changeset [34048] by
- directories to ignore
- 17:15 Changeset [34047] by
- Further directory to ignore
- 17:10 Changeset [34046] by
- Some directories to ignore
- 17:09 Changeset [34045] by
- Some files to ignore
- 17:08 Changeset [34044] by
- Some directories to ignore
- 17:07 Changeset [34043] by
- Some files to ignore
- 17:06 Changeset [34042] by
- Initial set of files
- 17:06 Changeset [34041] by
- Initial set of files
- 17:00 Changeset [34040] by
- Top-level folder for MP3s collection sourced from hemi-dl
- 17:00 Changeset [34039] by
- Top-level folder for PDF collection sourced from hemi-dl
- 16:54 Changeset [34038] by
- Run this script to populate the import folder
- 16:52 Changeset [34037] by
- No longer needed as will be formed by untarring import.tar.gz
- 16:52 Changeset [34036] by
- import folder with the result of running the NZ Digital API searching …
- 16:49 Changeset [34035] by
- Directories to ignore
- 16:32 Changeset [34034] by
- Some files to ignore
- 16:32 Changeset [34033] by
- Solr schema.xml changes that have flowed through from changes in the …
- 16:25 Changeset [34032] by
- Files to ignore
- 16:24 Changeset [34031] by
- Dedicated scripts for this collection to build and activate it
- 16:23 Changeset [34030] by
- Ignore archives and index dirs
- 16:15 Changeset [34029] by
- Moved to atea site
2020-03-13:
- 23:19 Changeset [34028] by
- Tweaks to overall interface look-and-feel
- 23:19 Changeset [34027] by
- Tweaks to overall interface look-and-feel
- 23:18 Changeset [34026] by
- Used to provide the gray jquery-ui theme to Atea
- 23:16 Changeset [34025] by
- Icon to glTF 3D model/zip files
- 23:15 Changeset [34024] by
- Couple of overlooked files for the initial set of files for Global …
- 23:14 Changeset [34023] by
- Initial set of files for Global Digital Heritage glTF demonstration …
- 23:09 Changeset [34022] by
- Collection for demonstration VR model artefacts
2020-03-12:
- 17:22 Changeset [34021] by
- Tidy up on help/usage message
- 17:20 Changeset [34020] by
- Changed to using newer version (8.5.51) of Tomcat
- 15:04 Changeset [34019] by
- replaced a couple of text strings
- 13:42 Changeset [34018] by
- check for error element in response - add that in if present, instead …
- 13:41 Changeset [34017] by
- add error element, don't just print a message to log, if we have …
- 13:32 Changeset [34016] by
- added cpan folder to @INC, as something is expecting to find JSON.pm - …
2020-03-10:
- 21:03 Changeset [34015] by
- Further elimination of PJ related HTML/templates
- 21:02 Changeset [34014] by
- Added in video player template; remove PJ templates
- 21:01 Changeset [34013] by
- Added hr line to break up sections
- 20:53 Changeset [34012] by
- Images for Atea alt interface
- 20:45 Changeset [34011] by
- Piechart data for sites prepared for crawling and the piecharts for these
- 20:45 Changeset [34010] by
- icon image for MP4 video
- 20:25 Changeset [34009] by
- PJ based alternative interface for Atea
- 20:17 Changeset [34008] by
- Alternative interface look-and-feel for the Atea project
- 19:56 Changeset [34007] by
- Prepared more data for the piecharts. This time for empty web pages vs …
- 18:51 Changeset [34006] by
- Committing more data I've collected for generating pie charts and the …
- 17:33 Changeset [34005] by
- InfoOnEmptyPagesNotInMongoDB.txt is now written out to a file, instead …
- 17:27 Changeset [34004] by
- Renaming csv file to have csv extension
- 17:26 Changeset [34003] by
- Redid the file with info on empty URL web pages as a csv file with …
- 12:09 Changeset [34002] by
- Comment-based changes resulting from: (i) merging in differences from …
2020-03-09:
- 18:56 Changeset [34001] by
- Tentative total urls from common crawl 12 month crawl data.
- 18:55 Changeset [34000] by
- Some debugging and other minor changes
- 17:34 Changeset [33999] by
- Common crawl 12 month urls and CC provided stats
2020-03-06:
- 17:49 Changeset [33998] by
- Removed import statement that is no longer used, and was stopping …
- 15:55 Changeset [33997] by
- Top-level folder for MARS related Greenstone3 code
- 15:18 Changeset [33996] by
- Accidentally committed the wrong thing in previous commit. Attempting …
- 15:14 Changeset [33995] by
- There was no Expat.so for perl 5.18 so am recompiling and committing that
2020-03-03:
- 14:42 Changeset [33994] by
- The introduction of UTF8Control class means we can now work directly …
2020-03-02:
- 14:10 Changeset [33993] by
- when downloading a pdf, browsers seem to make more than one request - …
2020-03-01:
- 16:41 Changeset [33992] by
- Notes at start of file updated
- 16:35 Changeset [33991] by
- A version of the tomcat/conf/server.xml file that is better aligned …
- 16:29 Changeset [33990] by
- Some white-space changes for consistency with newer …
- 15:16 Changeset [33989] by
- In a default setup, AJP is not used => so not needed. Commented out to …
2020-02-28:
- 22:09 Changeset [33988] by
- 1. Print out which web pages of which web site's dump.txt were empty. …
- 22:08 Changeset [33987] by
- Output of re-running NutchTextDumpToMongoDB to print out which web …
- 22:07 Changeset [33986] by
- Dr Bainbridge investigated the original data set more
2020-02-27:
- 21:49 Changeset [33985] by
- Data to back the piechart I need to make that will illustrate how we …
- 21:44 Changeset [33984] by
- Simple class to summarise some basic counts of the input common crawl data
- 20:26 Changeset [33983] by
- More sensible name for method which had too long kept its old name …
2020-02-26:
- 21:59 Changeset [33982] by
- SummaryTool.java now processed the handcrafted UNIQUE domains counts …
- 21:19 Changeset [33981] by
- As Dr Bainbridge suggested, code now opens a new firefox tab with a …
- 21:11 Changeset [33980] by
- Additional comments
- 21:00 Changeset [33979] by
- Clearly stating that counts are of unique domains
- 19:57 Changeset [33978] by
- Opens all geoJSON maps in new tabs instead of waiting for user to have …
- 18:37 Changeset [33977] by
- Added something on precision vs recall being applicable to our …
- 18:28 Changeset [33976] by
- Adding in what I could remember of Dr Bainbridge's statement about the …
2020-02-25:
- 14:46 Changeset [33975] by
- some mods to do with allowing multiple oaiservers. need …
- 14:14 Changeset [33974] by
- added in new oai.servlets field - if you want to run two oaiservlets, …
- 14:01 Changeset [33973] by
- tidied up the file a bit. added new servlet_url param to oaiserver - …
- 13:47 Changeset [33972] by
- fixed a typo in a comment
- 13:47 Changeset [33971] by
- get servlet_url param and pass to getOAIConfigXML, as now the files …
- 13:46 Changeset [33970] by
- changed OAIConfig naming to OAIConfig-oaiserver.xml - so multiple …
- 13:39 Changeset [33969] by
- we no longer use OAIConfig.xml as the filename, now we use eg …
- 13:37 Changeset [33968] by
- pass in oai_config from server, rather than reading it in itself
- 13:36 Changeset [33967] by
- you might want to change the oaiserver url, eg if you have 2 oai …
2020-02-21:
- 21:00 Changeset [33966] by
- Added the origSequence and basicDomain columns to the random 260 web …
- 20:59 Changeset [33965] by
- 1. Adding a basicDomain column (stripped of http/https and www prefix) …
- 19:57 Changeset [33964] by
- 2 records were missing a value for the qualityLevel column.
2020-02-20:
- 22:12 Changeset [33963] by
- Added a new helper method to MongoDBQueryer.java to add numPagesInMRI …
- 22:07 Changeset [33962] by
- 2 fields changed, as one was missed out and the other incorrectly …
- 20:24 Changeset [33961] by
- New category, LINK_TEXT, introduced for the random web page URL samples.
- 20:22 Changeset [33960] by
- Reviewed all the random sample web page URLs marked …
- 20:06 Changeset [33959] by
- URIEncoding the mapData makes it unparseable by geojson.io
- 19:32 Changeset [33958] by
- There were other xsl files using the original depositorTitleAndLink …
- 19:24 Changeset [33957] by
- 1. depositor related interface display modified to work with recent …
- 18:28 Changeset [33956] by
- Related to commit 33953: made lots of accidental commits in rev 33953, …
- 18:26 Changeset [33955] by
- Undoing accidental commit of unintended files.
- 18:21 Changeset [33954] by
- Accidentally committed with other files. Undoing.
- 18:19 Changeset [33953] by
- Depositor link not used
2020-02-18:
- 23:35 Changeset [33952] by
- Minor changes for processing
- 23:33 Changeset [33951] by
- Reviewed the qualityLevel column where LITTLE_TEXT was assigned.
- 23:28 Changeset [33950] by
- Reviewed the qualityLevel column where MIXED_TEXT was assigned.
- 23:22 Changeset [33949] by
- Reviewed the qualityLevel column where NAV was assigned.
- 22:56 Changeset [33948] by
- Reviewed the random sampled web page URLs marked as …
- 22:07 Changeset [33947] by
- Some more questionmarked field values assigned.
- 21:58 Changeset [33946] by
- 1. New function to handle user input assigning the newly introduced …
- 21:48 Changeset [33945] by
- Added a 4th column for all 260 sample web page URLs and have used the …
- 16:44 Changeset [33944] by
- Added the isReallyInMRI column after manually inspecting the remaining …
- 15:56 Changeset [33943] by
- Further tweaking of javah check after it failed to work on Bedrock LSB
- 15:55 Changeset [33942] by
- Further tweaking of javah check after it failed to work on Bedrock LSB
- 15:18 Changeset [33941] by
- 1. Uppercase 3rd field (Y/N/? field) read back in from file before …
2020-02-17:
- 22:16 Changeset [33940] by
- 1. In order to make it easier to do the manual work of inspecting 260 …
- 16:22 Changeset [33939] by
- 1. Old random samples file doesn't apply as we're not sampling by …
- 16:10 Changeset [33938] by
- 1. Don't regenerate random sample of web page urls and full web page …
- 16:06 Changeset [33937] by
- New counts of manual sites after reingesting into MongoDB. Forgot to …
- 16:05 Changeset [33936] by
- Renaming old file to place with new counts after reingesting into MongoDB.
2020-02-16:
- 18:16 Changeset [33935] by
- Additional check added into get-isis target
- 17:34 Changeset [33934] by
- Removal of static code block calling ancient/deprecated static …
- 14:19 Changeset [33933] by
- Changed 8-spaces to tab chars in Makefile.in. Original problem caused …
2020-02-15:
- 19:14 Changeset [33932] by
- Commented out Java version warning message, as it presents as …
- 19:10 Changeset [33931] by
- Two changes to setup file. The first was to move the test for ant to …
- 19:00 Changeset [33930] by
- Code used to assume that major number was a single digit, as in 1.6 or …
- 18:57 Changeset [33929] by
- Newer JDKs don't have javah => make file change that takes account of this
- 18:55 Changeset [33928] by
- Streamlining of how test for JDK/javac is done
- 14:57 Changeset [33927] by
- Reworking of javah test
2020-02-14:
- 23:03 Changeset [33926] by
- Investigated some other options for screen capturing and Google chrome …
- 20:41 Changeset [33925] by
- 1. Bugfix: oversight, should return uri encoded URL for mapData, …
- 19:22 Changeset [33924] by
- Adding in Dr Bainbridge's command to check the JSON generated is …
- 18:45 Changeset [33923] by
- Removed non-UTF8 valid char from comment; regenerated tar file
- 18:13 Changeset [33922] by
- Notes about using this site
- 18:11 Changeset [33921] by
- Newer Java versions don't have 'javah' any more. The functionality has been …
- 16:55 Changeset [33920] by
- Found to be needed when compiling up on a Google Compute Engine (GCE) …
2020-02-13:
- 22:40 Changeset [33919] by
- SummaryTool now uses the CountryCodeCountsMapData.java class to …
- 19:34 Changeset [33918] by
- Country codes added to each domain's URL of the manual site/domain …
- 18:18 Changeset [33917] by
- Added some better reporting when confirming sample size was correct
- 17:42 Changeset [33916] by
- Updated the rest of the file after reingest
- 17:12 Changeset [33915] by
- Forgot to add a (manual) counts file created last week, and am now …
- 17:09 Changeset [33914] by
- Shortlisted just the domain sites by country into ManualShortlist2.txt …