| @33909 | 4 years | ak19 | 1. Implementing tables 3 to 5. 2. Rolled back the introduction of the … |
| @33906 | 4 years | ak19 | Code is intermediate state. 1. Introduced basicDomain field to MongoDB … |
| @33887 | 4 years | ak19 | 1. Added support for writing out tables in csv format too. 2. Second … |
| @33885 | 4 years | ak19 | Attempting to write the tables. csv not yet supported. Table 1 done. |
| @33884 | 4 years | ak19 | 0. Previous commit had lots of modifications, and only 2 files matched … |
| @33883 | 4 years | ak19 | Clarifications |
| @33882 | 4 years | ak19 | Code now writes both a listing of all non-autotranslated websites and … |
| @33881 | 4 years | ak19 | Uses lambda expression to process each doc in a mongodb aggregate … |
| @33880 | 4 years | ak19 | Write out the 5counts_tentativeNonAutotranslatedSites.json file with … |
| @33879 | 4 years | ak19 | Have the 2 mongodb aggregate() calls working that |
| @33876 | 4 years | ak19 | Some missteps, but have got complex collection.aggregate() working at last. |
| @33873 | 4 years | ak19 | Beginnings of WebPageURLsListing program whose purpose Dr Bainbridge … |
| @33871 | 4 years | ak19 | Removed mostly duplicated older version of method but left the … |
| @33870 | 4 years | ak19 | Got the mongodb query working in Java in 2 different ways: the fully … |
| @33869 | 4 years | ak19 | First cut at the RandomURLsForDomainGenerator.java class and the … |
| @33867 | 4 years | ak19 | Moved the code handling of special case large rectangles and those … |
| @33858 | 4 years | ak19 | Fixes to the code committed yesterday: correct calculation of the … |
| @33853 | 4 years | ak19 | Handling map coordinates that are horizontally excessive (beyond … |
| @33812 | 4 years | ak19 | Better handling of multi-line comment symbols, so I can now include … |
| @33811 | 4 years | ak19 | Returning to using a single variable, urlContainsLangCodeInPath, to … |
| @33810 | 4 years | ak19 | Bugfix: mi in url path should be checked for for each page of site, … |
| @33808 | 4 years | ak19 | Storing not just whether /mi(/) suffix is in path, but also whether … |
| @33805 | 4 years | ak19 | 1. Moving the static countrycodes.json file to conf folder and updated … |
| @33801 | 4 years | ak19 | 1. NutchTextDumpToMongoDB Added an extra field to each document in … |
| @33800 | 4 years | ak19 | Removed an adult site from crawled contents and added its url to … |
| @33799 | 4 years | ak19 | 1. Adding breadcrumb for next step at end of running … |
| @33796 | 4 years | ak19 | Instead of a hack for US' count being too great that its histogram … |
| @33794 | 4 years | ak19 | Wrote the geojson map data created from the site counts per … |
| @33790 | 4 years | ak19 | Got the MultiPoint geojson mapdata of the country code counts working: … |
| @33778 | 4 years | ak19 | Made a beginning on getting the geojson map data automated. Couldn't … |
| @33698 | 4 years | ak19 | Links to more reading |
| @33674 | 4 years | ak19 | Changes to support the top 5 predicted langcodes and their confidence … |
| @33666 | 4 years | ak19 | Having finished sending all the crawl data to mongodb 1. Recrawled the … |
| @33657 | 4 years | ak19 | Some fixes after brief testing against 1/3 of the crawl. Restarted … |
| @33656 | 4 years | ak19 | Final minor changes before I start processing the crawls of node2. |
| @33655 | 4 years | ak19 | Minor change to print statement |
| @33653 | 4 years | ak19 | 1. As suggested by Dr Bainbridge, made the code changes to use Morphia … |
| @33652 | 4 years | ak19 | Introducing morphia subpackage |
| @33651 | 4 years | ak19 | 1. Bugfix: overlappingSentences works. 2. storing numSentencesInMaor |
| @33645 | 4 years | ak19 | Fix to 2 bugs when sending data to MongoDB: 1. overlappingSentences … |
| @33635 | 4 years | ak19 | Maori-language-detection doesn't use Greenstone 3 at present, it's not … (copied from gs3-extensions/maori-lang-detection/src/org/greenstone/atea) |
| @33634 | 4 years | ak19 | Rewrote NutchTextDumpProcessor as NutchTextDumpToMongoDB.java, which … |