root/other-projects

Rev Chgset Date Author Log Message
(edit) @33828 [33828] 7 days ak19 Additions and modifications to the write-up.
(edit) @33825 [33825] 8 days ak19 Beginnings of first draft of write up.
(edit) @33824 [33824] 9 days ak19 More instructions and explaining the contents of the mongodb-data folder.
(edit) @33823 [33823] 9 days ak19 Recommitting mongo-data folder with renamed files with numbering.
(edit) @33822 [33822] 9 days ak19 Removing as I'm renaming all the files with prefixes. There are too many …
(edit) @33821 [33821] 9 days ak19 Manually created a shortlist of MRI sites from longer …
(edit) @33820 [33820] 9 days ak19 Forgot to commit before holidays.
(edit) @33816 [33816] 5 weeks ak19 Finished manually going through the sites that I couldn't easily filter …
(edit) @33815 [33815] 5 weeks ak19 Removed old results from before bugfix and improvement to …
(edit) @33814 [33814] 5 weeks ak19 Put the important mongodb queries and results into …
(edit) @33813 [33813] 5 weeks ak19 With the bugfix from yesterday and the inclusion of http(s)://mi.* type …
(edit) @33812 [33812] 5 weeks ak19 Better handling of multi-line comment symbols, so I can now include proper …
(edit) @33811 [33811] 5 weeks ak19 Returning to using a single variable, urlContainsLangCodeInPath, to record …
(edit) @33810 [33810] 5 weeks ak19 Bugfix: mi in url path should be checked for for each page of site, not …
(edit) @33809 [33809] 5 weeks ak19 Some more GS_README.txt instructions. Not put the mongodb queries in here …
(edit) @33808 [33808] 5 weeks ak19 Storing not just whether /mi(/) suffix is in path, but also whether …
(edit) @33807 [33807] 5 weeks ak19 Trying to manually go through a shortlisted set of domains to see if …
(edit) @33806 [33806] 6 weeks ak19 More mongodb querying revealed that excluding tentative product sites (if …
(edit) @33805 [33805] 6 weeks ak19 1. Moving the static countrycodes.json file to conf folder and updated …
(edit) @33804 [33804] 6 weeks ak19 1. Updated results from mongodb querying after yesterday's modifications …
(edit) @33803 [33803] 6 weeks ak19 geojson mapdata and map for mongodb results on sitesWithPagesContainingMRI …
(edit) @33802 [33802] 6 weeks ak19 With an extra adult site removed and with setting countrycodes that …
(edit) @33801 [33801] 6 weeks ak19 1. NutchTextDumpToMongoDB Added an extra field to each document in …
(edit) @33800 [33800] 6 weeks ak19 Removed an adult site from crawled contents and added its url to blacklist …
(edit) @33799 [33799] 6 weeks ak19 1. Adding breadcrumb for next step at end of running …
(edit) @33798 [33798] 6 weeks ak19 Adding the geojson related files related to querying mongodb for sites …
(edit) @33797 [33797] 6 weeks ak19 Updated json and images files, and new files for when /mi(/) is in the URL …
(edit) @33796 [33796] 6 weeks ak19 Instead of a hack for US' count being too great that its histogram goes …
(edit) @33794 [33794] 6 weeks ak19 Wrote the geojson map data created from the site counts per country/region …
(edit) @33790 [33790] 6 weeks ak19 Got the MultiPoint geojson mapdata of the country code counts working: the …
(edit) @33789 [33789] 6 weeks ak19 Redid the mongodb query to get the countrycode counts for all the …
(edit) @33788 [33788] 6 weeks ak19 Adding all the jar files needed to work in Java with geojson Simple …
(edit) @33787 [33787] 6 weeks ak19 Documented another mongodb query that I'm using, the one to produce the …
(edit) @33778 [33778] 6 weeks ak19 Made a beginning on getting the geojson map data automated. Couldn't work …
(edit) @33776 [33776] 6 weeks ak19 Field Separator (IFS) conflicting with backticks and other ways of getting …
(edit) @33760 [33760] 7 weeks ak19 AUTOCOMMIT by gen-model-colls.sh script. Message: Rebuilding after GLI …
(edit) @33759 [33759] 7 weeks ak19 AUTOCOMMIT by gen-model-colls.sh script. Message: Rebuilding after GLI …
(edit) @33723 [33723] 8 weeks ak19 On linux 64 bit, the additional wrap command did not work because the …
(edit) @33722 [33722] 2 months ak19 Adding in additional instructions in mongodb.txt, before I forgot how to …
(edit) @33710 [33710] 2 months ak19 Working queries and map coords for geojson.tools (ironically, Lat and Lng …
(edit) @33698 [33698] 2 months ak19 Links to more reading
(edit) @33675 [33675] 2 months ak19 Committing the newer query results (but from before today's reingestion in …
(edit) @33674 [33674] 2 months ak19 Changes to support the top 5 predicted langcodes and their confidence …
(edit) @33666 [33666] 2 months ak19 Having finished sending all the crawl data to mongodb 1. Recrawled the 2 …
(edit) @33657 [33657] 2 months ak19 Some fixes after brief testing against 1/3 of the crawl. Restarted …
(edit) @33656 [33656] 2 months ak19 Final minor changes before I start processing the crawls of node2.
(edit) @33655 [33655] 2 months ak19 Minor change to print statement
(edit) @33654 [33654] 2 months ak19 Removing jar file that wasn't used after all.
(edit) @33653 [33653] 2 months ak19 1. As suggested by Dr Bainbridge, made the code changes to use Morphia as …
(edit) @33652 [33652] 2 months ak19 Introducing morphia subpackage
(edit) @33651 [33651] 2 months ak19 1. Bugfix: overlappingSentences works. 2. storing numSentencesInMaor
(edit) @33646 [33646] 2 months ak19 Saving the mongodb queries and learning links that Dr Bainbridge found …
(edit) @33645 [33645] 2 months ak19 Fix to 2 bugs when sending data to MongoDB: 1. overlappingSentences was …
(edit) @33644 [33644] 2 months ak19 Just committing the growing mongodb.txt file with links and instructions …
(edit) @33643 [33643] 2 months ak19 Brought the template log4j.properties.in back up to speed. I forgot it …
(edit) @33642 [33642] 2 months ak19 Forgot to commit the java driver for mongodb when I committed the Java …
(edit) @33635 [33635] 2 months ak19 Maori-language-detection doesn't use Greenstone 3 at present, it's not a …
(edit) @33589 [33589] 3 months cpb16 final01. Need Map results still
(edit) @33521 [33521] 4 months ak19 AUTOCOMMIT by gen-model-colls.sh script. Message: Redoing the CDS-ISIS …
(edit) @33520 [33520] 4 months ak19 AUTOCOMMIT by gen-model-colls.sh script. Message: Redoing the CDS-ISIS …
(edit) @33512 [33512] 4 months ak19 AUTOCOMMIT by gen-model-colls.sh script. Message: Rebuilding all the …
(edit) @33511 [33511] 4 months ak19 AUTOCOMMIT by gen-model-colls.sh script. Message: Rebuilding all the …
(edit) @33458 [33458] 5 months cpb16 Running new morphology version after quick meeting with david last week. …
(edit) @33455 [33455] 5 months cpb16 Started implementing David's suggested morphology sequence, codeversion9
(edit) @33449 [33449] 5 months cpb16 terminal version executes correctly. (Didn't include init threshold in …
(edit) @33447 [33447] 5 months cpb16 starting to implement terminal version of new morphology. need to fix. …
(edit) @33444 [33444] 5 months cpb16 //Have created a preprocess to remove large objects. …
(edit) @33439 [33439] 5 months cpb16 Have created properties file and accessibility from …
(edit) @33437 [33437] 5 months cpb16 made progress with morphology. Need to have a better area dimension …
(edit) @33427 [33427] 5 months davidb Some initial files on how to get going
(edit) @33426 [33426] 5 months davidb Folder to details on how to standup the HTRC DevEnv locally
(edit) @33418 [33418] 5 months cpb16 made progress with morphology, based one image, need to refine further, …
(edit) @33415 [33415] 5 months cpb16 updated, after unable to commit due to setup.bash being out of date. Added …
(edit) @33384 [33384] 6 months cpb16 backup before intellij working
(edit) @33375 [33375] 6 months cpb16 Full backup after running first successful highres classifier run
(edit) @33367 [33367] 6 months cpb16 Pre-hires classification w/o MU
(edit) @33354 [33354] 6 months davidb Template file for producing OpenOffice spreadsheet format
(edit) @33353 [33353] 6 months davidb Initial set of files to page scrape and turn in the OpenOffice spreadsheet …
(edit) @33352 [33352] 6 months davidb Top-level folder for code to page-scrape BookStumper site
(edit) @33351 [33351] 6 months davidb Top-level folder for code to page-scrape BookStumper site
(edit) @33340 [33340] 6 months cpb16 transferred backup of low res images. Classifiers work as expected. …
(edit) @33332 [33332] 6 months ak19 AUTOCOMMIT by gen-model-colls.sh script. Message: Recommitting the only …
(edit) @33331 [33331] 6 months ak19 AUTOCOMMIT by gen-model-colls.sh script. Message: Recommitting the only …
(edit) @33326 [33326] 6 months cpb16 Completed linecluster with x position detection, need to test
(edit) @33325 [33325] 6 months cpb16 Added x pos checker, needs testing, and remove errors
(edit) @33324 [33324] 6 months cpb16 Backup for 4th crash of the day. Need to reimplement x coordinate checker
(edit) @33319 [33319] 7 months cpb16 added high res download sorter
(edit) @33310 [33310] 7 months cpb16 developing line clustering. Have completed line cluster algorithm. need to …
(edit) @33304 [33304] 7 months cpb16 Backup for computer crash, only lost 5 lines of code in development …
(edit) @33243 [33243] 7 months cpb16 Had breakthrough with the refined houghlinesP algorithm overall accuracy …
(edit) @33221 [33221] 7 months cpb16 back up pre-houghlineP-refinement progress
(edit) @33170 [33170] 7 months cpb16 refined houghlineP algorithm
(edit) @33141 [33141] 8 months cpb16 Completed end-to-end pipeline and one classifier
(edit) @33138 [33138] 8 months davidb Scripts that focus on language (for non-music related work)
(edit) @33137 [33137] 8 months davidb Added a bit more detail to instructions for ssl
(edit) @33136 [33136] 8 months davidb Extra echo statement added, to help with details printed as script runs
(edit) @33135 [33135] 8 months davidb These should not be committed into the repository
(edit) @33134 [33134] 8 months davidb Avoid having hyphen at start of filename
(edit) @33133 [33133] 8 months davidb No need for this backup file in the repository
(edit) @33132 [33132] 8 months davidb No need for this backup file in the repository