Timeline



2020-02-17:

22:16 Changeset [33940] by ak19
1. In order to make it easier to do the manual work of inspecting 260 …
16:22 Changeset [33939] by ak19
1. Old random samples file doesn't apply as we're not sampling by …
16:10 Changeset [33938] by ak19
1. Don't regenerate random sample of web page urls and full web page …
16:06 Changeset [33937] by ak19
New counts of manual sites after reingesting into MongoDB. Forgot to …
16:05 Changeset [33936] by ak19
Renaming old file to place with new counts after reingesting into MongoDB.

2020-02-16:

18:16 Changeset [33935] by davidb
Additional check added into get-isis target
17:34 Changeset [33934] by davidb
Removal of static code block calling ancient/deprecated static …
14:19 Changeset [33933] by davidb
Changed 8-spaces to tab chars in Makefile.in. Original problem caused …

2020-02-15:

19:14 Changeset [33932] by davidb
Commented out Java version warning message, as it presents as …
19:10 Changeset [33931] by davidb
Two changes to setup file. The first was to move the test for ant to …
19:00 Changeset [33930] by davidb
Code used to assume that major number was a single digit, as in 1.6 or …
18:57 Changeset [33929] by davidb
Newer JDKs don't have javah => make file change that takes account of this
18:55 Changeset [33928] by davidb
Streamlining of how test for JDK/javac is done
14:57 Changeset [33927] by davidb
Reworking of javah test

2020-02-14:

23:03 Changeset [33926] by ak19
Investigated some other options for screen capturing and Google chrome …
20:41 Changeset [33925] by ak19
1. Bugfix: oversight, should return uri encoded URL for mapData, …
19:22 Changeset [33924] by ak19
Adding in Dr Bainbridge's command to check the JSON generated is …
18:45 Changeset [33923] by davidb
Removed non-UTF8 valid char from comment; regenerated tar file
18:13 Changeset [33922] by davidb
Notes about using this site
18:11 Changeset [33921] by davidb
Newer Java versions don't have 'javah' any more. The functionality has been …
16:55 Changeset [33920] by davidb
Found to be needed when compiling up on a Google Compute Engine (GCE) …

2020-02-13:

22:40 Changeset [33919] by ak19
SummaryTool now uses the CountryCodeCountsMapData.java class to …
19:34 Changeset [33918] by ak19
Country codes added to each domain's URL of the manual site/domain …
18:18 Changeset [33917] by ak19
Added some better reporting when confirming sample size was correct
17:42 Changeset [33916] by ak19
Updated the rest of the file after reingest
17:12 Changeset [33915] by ak19
Forgot to add a (manual) counts file created last week, and am now …
17:09 Changeset [33914] by ak19
Shortlisted just the domain sites by country into ManualShortlist2.txt …

2020-02-12:

21:27 Changeset [33913] by ak19
1. Adjusted table mongodb query statements to be more exact, but same …
19:53 Changeset [33912] by ak19
Forgot to svn add the new MongoDBQueryer.java class with commit 33909. …
19:12 Changeset [33911] by ak19
Correct commit message for previous and current commit: 1. After …
19:05 Changeset [33910] by ak19
1. Implementing tables 3 to 5. 2. Rolled back the introduction of the …
19:02 Changeset [33909] by ak19
1. Implementing tables 3 to 5. 2. Rolled back the introduction of the …

2020-02-10:

09:41 Changeset [33908] by kjdon
meta values are already escaped. Don't want to escape them again …

2020-02-05:

23:38 Changeset [33907] by ak19
See previous commit message. This will be the file with the results …
23:36 Changeset [33906] by ak19
Code is intermediate state. 1. Introduced basicDomain field to MongoDB …
18:49 Changeset [33905] by ak19
More notes
18:48 Changeset [33904] by ak19
Shouldn't greylist anglican.org, as this prevented crawling of …

2020-02-04:

15:50 Changeset [33903] by ak19
My notes when preparing for today's meetings. Some of this may be …
13:05 Changeset [33902] by kjdon
pass in new casefold and accentfold options to format_metadata_for_sorting
13:04 Changeset [33901] by kjdon
new casefold_metadata_for_formatting and …
13:03 Changeset [33900] by kjdon
BaseClassifier casefold/accentfold options
13:03 Changeset [33899] by kjdon
pass in new casefold and accentfold options (BaseClassifier) to …
12:59 Changeset [33898] by kjdon
format_metadata_for_sorting now takes two additional args - casefold …
10:06 Changeset [33897] by kjdon
elsewhere in the code - GSXML.xmlSafe, we are escaping ' => ' we …

2020-02-03:

23:29 Changeset [33896] by ak19
Clarification in comments
23:20 Changeset [33895] by ak19
Minor rename
23:20 Changeset [33894] by ak19
1. Adding map, counts.json and geo-json files for 5b count of sites by …
22:41 Changeset [33893] by ak19
1. Left out region code column. 2. Two more sheets of work in progress …
22:28 Changeset [33892] by ak19
Sheets renamed and spreadsheet renamed
22:27 Changeset [33891] by ak19
Site level detected vs manual inspected data: working shown in file …
20:31 Changeset [33890] by ak19
Finished going through NZ sites listing of numPagesContainingMRI > 0 …
15:48 Changeset [33889] by ak19
1. Additional column: totalPagesAcrossMatchingSites. 2. Screengrab of …
13:08 Changeset [33888] by kjdon
added propertyFile attribute to gsf:interfaceText so that you can …

2020-01-31:

23:49 Changeset [33887] by ak19
1. Added support for writing out tables in csv format too. 2. Second …
23:17 Changeset [33886] by ak19
Minor. File rename
22:54 Changeset [33885] by ak19
Attempting to write the tables. csv not yet supported. Table 1 done.
22:21 Changeset [33884] by ak19
0. Previous commit had lots of modifications, and only 2 files matched …
21:50 Changeset [33883] by ak19
Clarifications

2020-01-30:

22:54 Changeset [33882] by ak19
Code now writes both a listing of all non-autotranslated websites and …
22:08 Changeset [33881] by ak19
Uses lambda expression to process each doc in a mongodb aggregate …
21:17 Changeset [33880] by ak19
Write out the 5counts_tentativeNonAutotranslatedSites.json file with …
20:21 Changeset [33879] by ak19
Have the 2 mongodb aggregate() calls working that
20:18 Changeset [33878] by ak19
Better comment
20:07 Changeset [33877] by ak19
Reordering to have proper descending order of counts

2020-01-29:

21:48 Changeset [33876] by ak19
Some missteps, but have got complex collection.aggregate() working at last.
19:18 Changeset [33875] by ak19
Renaming 2 more files correctly
19:15 Changeset [33874] by ak19
Renaming 2 files correctly

2020-01-24:

21:49 Changeset [33873] by ak19
Beginnings of WebPageURLsListing program whose purpose Dr Bainbridge …
21:44 Changeset [33872] by ak19
1. Added the file containing the 255 random NZ page URLs to sample. 2. …
20:59 Changeset [33871] by ak19
Removed mostly duplicated older version of method but left the …
20:48 Changeset [33870] by ak19
Got the mongodb query working in Java in 2 different ways: the fully …

2020-01-23:

22:59 Changeset [33869] by ak19
First cut at the RandomURLsForDomainGenerator.java class and the …
21:16 Changeset [33868] by ak19
With the updated code for generating the maps from 6a and 6b manual …
21:12 Changeset [33867] by ak19
Moved the code handling of special case large rectangles and those …
18:56 Changeset [33866] by ak19
Dr Bainbridge's fix to Android mobile macronizer user (on Chrome …
18:49 Changeset [33865] by ak19
1. The gs3 context name changed from macronizer to macron-restoration. …
14:09 Changeset [33864] by davidb
Changes to make the Whakatohea banner narrower
11:32 Changeset [33863] by davidb
Script to get sample content for the DL collection
11:17 Changeset [33862] by davidb
Change to specifying the About page text done through about.xml so it …
11:16 Changeset [33861] by davidb
About page text done through about.xml so it can include xslt tags
10:22 Changeset [33860] by davidb
Addition of 3 further CPAN packages, found to be needed on CentOS build
09:56 Changeset [33859] by davidb
Additional CPAN Perl packages found to be needed when compiling up …

2020-01-22:

19:31 Changeset [33858] by ak19
Fixes to the code committed yesterday: correct calculation of the …
16:49 Changeset [33857] by davidb
Next iteration of the about text
16:33 Changeset [33856] by ak19
Forgot to commit. Last week, Dr Bainbridge had properly cropped the …
15:03 Changeset [33855] by davidb
Code added to detect if the CGI parameter already specifies a …

2020-01-21:

22:01 Changeset [33854] by ak19
Manually gone over around 150 webpages of sample size of 255 webpages …
21:58 Changeset [33853] by ak19
Handling map coordinates that are horizontally excessive (beyond …
13:37 Changeset [33852] by davidb
Unused. XSL filename extension potentially causing a problem with how …