source: gs3-extensions/maori-lang-detection/conf
Change  Rev     Age      Author  Log Message
edit    @33569  5 years  ak19    1. batchcrawl.sh now does what it should have from the start, which is …
edit    @33568  5 years  ak19    1. More sites greylisted and blacklisted, discovered as I attempted to …
edit    @33565  5 years  ak19    CCWETProcessor: domain url now goes in as a seedURL after the …
edit    @33562  5 years  ak19    1. The sites-too-big-to-exhaustively-crawl.txt is now a csv file of a …
edit    @33561  5 years  ak19    1. sites-too-big-to-exhaustively-crawl.txt is now a comma separated …
edit    @33559  5 years  ak19    1. Special string COPY changed to SUBDOMAIN-COPY after Dr Bainbridge …
edit    @33556  5 years  ak19    Blacklisted wikipedia pages that are actually in other languages which …
edit    @33555  5 years  ak19    Modified top sites list as Dr Bainbridge described: suffixes for the …
edit    @33554  5 years  ak19    Added more to blacklist and greylist. And removed remaining duplicates …
edit    @33553  5 years  ak19    Comments
edit    @33551  5 years  ak19    Added in top 500 urls from moz.com/top500 and removed duplicates, and …
edit    @33550  5 years  ak19    First stage of introducing sites-too-big-to-exhaustively-crawl.tx: …
edit    @33532  5 years  ak19    Found the other top 500 sites link again at last which Dr Bainbridge …
edit    @33531  5 years  ak19    Added whitelist for mi.wikipedia.org, and updates to blacklist and …
edit    @33502  5 years  ak19    Current url pattern blacklist and greylist filter files. Used by …
edit    @33480  5 years  ak19    Much harder to remove pages where words are fused together as some are …
edit    @33467  5 years  ak19    Improved the code to use a static block to load the needed properties …
edit    @33412  5 years  ak19    config command for wgetting a single file
edit    @33400  5 years  ak19    1. Setting up log4j.properties based on the macronizer's basic one …
add     @33399  5 years  ak19    Putting properties files into the conf folder and keeping the lib …