Timeline



2019-12-19:

22:33 Changeset [33816] by ak19
Finished manually going through the sites that I couldn't easily …
17:17 Changeset [33815] by ak19
Removed old results from before bugfix and improvement to …
17:13 Changeset [33814] by ak19
Put the important mongodb queries and results into …

2019-12-18:

21:38 Changeset [33813] by ak19
With the bugfix from yesterday and the inclusion of http(s):mi.* …
21:36 Changeset [33812] by ak19
Better handling of multi-line comment symbols, so I can now include …
16:51 Changeset [33811] by ak19
Returning to using a single variable, urlContainsLangCodeInPath, to …

2019-12-17:

21:48 Changeset [33810] by ak19
Bugfix: mi in url path should be checked for each page of site, …
19:53 Changeset [33809] by ak19
Some more GS_README.txt instructions. Not put the mongodb queries in …
19:31 Changeset [33808] by ak19
Storing not just whether /mi(/) suffix is in path, but also whether …
19:29 Changeset [33807] by ak19
Trying to manually go through a shortlisted set of domains to see if …

2019-12-13:

21:31 Changeset [33806] by ak19
More mongodb querying revealed that excluding tentative product sites …
20:08 Changeset [33805] by ak19
1. Moving the static countrycodes.json file to conf folder and updated …
20:00 Changeset [33804] by ak19
1. Updated results from mongodb querying after yesterday's …
19:27 Changeset [33803] by ak19
geojson mapdata and map for mongodb results on …
18:42 Changeset [33802] by ak19
With an extra adult site removed and with setting countrycodes that …
18:40 Changeset [33801] by ak19
1. NutchTextDumpToMongoDB Added an extra field to each document in …

2019-12-12:

18:04 Changeset [33800] by ak19
Removed an adult site from crawled contents and added its url to …
16:08 Changeset [33799] by ak19
1. Adding breadcrumb for next step at end of running …
15:57 Changeset [33798] by ak19
Adding the geojson related files related to querying mongodb for sites …
15:42 Changeset [33797] by ak19
Updated json and images files, and new files for when /mi(/) is in the …
15:42 Changeset [33796] by ak19
Instead of a hack for US' count being too great that its histogram …
11:38 Changeset [33795] by kjdon
remove edit bar and right side bar from print view of document

2019-12-11:

21:57 Changeset [33794] by ak19
Wrote the geojson map data created from the site counts per …
18:28 Changeset [33793] by ak19
Changes for getting a running GS3 server to display collections on …
18:18 Changeset [33792] by ak19
Correcting spellings
17:44 Changeset [33791] by ak19
1. Kathy renamed the gs3interface properties filename from …

2019-12-10:

20:43 Changeset [33790] by ak19
Got the MultiPoint geojson mapdata of the country code counts working: …
20:39 Changeset [33789] by ak19
Redid the mongodb query to get the countrycode counts for all the …
20:39 Changeset [33788] by ak19
Adding all the jar files needed to work in Java with geojson Simple …
20:36 Changeset [33787] by ak19
Documented another mongodb query that I'm using, the one to produce …
11:43 Changeset [33786] by kjdon
add google analytics, stop the outputting of fr, es, ru links as those …
11:42 Changeset [33785] by kjdon
added google analytics
11:40 Changeset [33784] by kjdon
only generate english versions for now as ru, fr, es haven't been …
11:38 Changeset [33783] by kjdon
only generate english versions for now as ru, fr, es haven't been …
11:35 Changeset [33782] by kjdon
updated README
11:35 Changeset [33781] by kjdon
tidied up the intro
11:07 Changeset [33780] by kjdon
added code to add google-analytics to each page
10:51 Changeset [33779] by kjdon
ServiceRack.properties has been renamed to …

2019-12-09:

21:55 Changeset [33778] by ak19
Made a beginning on getting the geojson map data automated. Couldn't …
17:48 Changeset [33777] by ak19
Forgot to document a link, with sample code to use nativecall jar file …
15:41 Changeset [33776] by ak19
Field Separator (IFS) conflicting with backticks and other ways of …
11:37 Changeset [33775] by kjdon
fixed a typo in a comment
11:37 Changeset [33774] by kjdon
getTextString code moved to Dictionary.getTextString, as it's no longer …
11:36 Changeset [33773] by kjdon
the default dictionary is not ServiceRack.properties any more. Instead …
11:35 Changeset [33772] by kjdon
don't need to pass in ServiceRack to getTextString anymore. …
11:33 Changeset [33771] by kjdon
use the new Dictionary.getTextString instead of repeating all the code here
11:32 Changeset [33770] by kjdon
updated to match new args for Dictionary.getTextString
11:29 Changeset [33769] by kjdon
updated getTextString to contain all the functionality from …
11:26 Changeset [33768] by kjdon
removed some code that was commented out, and some methods that were …
11:12 Changeset [33767] by kjdon
renaming ServiceRack.properties to core_servlet_dictionary.properties
11:11 Changeset [33766] by kjdon
renaming ServiceRack.properties to core_servlet_dictionary.properties
11:10 Changeset [33765] by kjdon
renaming ServiceRack.properties to core_servlet_dictionary.properties
11:07 Changeset [33764] by kjdon
renaming ServiceRack.properties to core_servlet_dictionary.properties
11:03 Changeset [33763] by kjdon
renaming ServiceRack.properties to core_servlet_dictionary.properties
11:01 Changeset [33762] by kjdon
renaming ServiceRack.properties to core_servlet_dictionary.properties
10:58 Changeset [33761] by kjdon
renaming ServiceRack.properties to core_servlet_dictionary.properties

2019-12-07:

02:05 Changeset [33760] by ak19
AUTOCOMMIT by gen-model-colls.sh script. Message: Rebuilding after GLI …
02:05 Changeset [33759] by ak19
AUTOCOMMIT by gen-model-colls.sh script. Message: Rebuilding after GLI …
01:56 Changeset [33758] by ak19
Removed debugging and last bit of cleanup.
01:40 Changeset [33757] by ak19
1. Windows bugfix for getting exMeta to be loaded into GLI where there …

2019-12-05:

21:58 Changeset [33756] by ak19
Attempted bugfix for ex meta not always loading in gli for docs that …
14:13 Changeset [33755] by kjdon
set the encoding to utf-8 for all the files

2019-12-04:

21:26 Changeset [33754] by davidb
Correcting spelling error in code
21:22 Changeset [33753] by davidb
Made the change Dr Bainbridge wanted: in the two locations that …
21:14 Changeset [33752] by davidb
Followed Dr Bainbridge's suggestion to correct hnz.Identifiers that …
17:56 Changeset [33751] by ak19
Related to previous commit. Dr Bainbridge came up with a better …
16:47 Changeset [33750] by ak19
Fixed a NullPointerException without stacktrace, noticed with …
15:58 Changeset [33749] by ak19
Still on the bugfix for GLI with non-ascii filenames assigned …

2019-12-03:

21:06 Changeset [33748] by ak19
Linux bugfixes to recent commits to do with getting file-level meta …
17:50 Changeset [33747] by ak19
Tidying up code some more and moving unused (but reusable and possibly …
17:31 Changeset [33746] by ak19
1. Bugfix for dealing with + in filenames: file-level metadata now …
16:38 Changeset [33745] by ak19
Fix to function decodeStringContainingHexEntities that I recently …
15:04 Changeset [33744] by ak19
Refactored code to do more inside functions rather than make callers …
12:15 Changeset [33743] by kjdon
added extra info to depositor line
12:14 Changeset [33742] by kjdon
get depositor name from dictionary
11:29 Changeset [33741] by kjdon
changed a comment
11:19 Changeset [33740] by kjdon
added a format statement to Titles classifier. This uses gsf:metadata …