Timestamp:
2019-10-09T18:11:19+13:00
Author:
ak19
Message:

Added more entries to the blacklist and greylist, and removed the remaining duplicates from the top sites list. Committing this before changing the top sites list's URL patterns again.

File:
1 edited

Legend:

  (no marker) Unmodified
  + Added
  - Removed

  • gs3-extensions/maori-lang-detection/conf/sites-too-big-to-exhaustively-crawl.txt

    r33553  r33554
         1          - # URL blacklist
         2          - # FORMAT:
         3          - # precede URL by ^ to blacklist urls that match the given prefix
         4          - # succeed URL by $ to blacklist urls that match the given suffix
         5          - # ^url$ will blacklist urls that match the given url completely
         6          - # Without either ^ or $ symbol, urls containing the given url will get blacklisted
                 1  + # top sites - base url forms
         7       2
         8       3    # Contains alexa top sites (where only the first 50 were visible)
         …       …
       441     436    wikimedia.org
       442     437    wikipedia.org
       443          - wikipedia.org
       444          - wikipedia.org
       445     438    wiktionary.org
       446     439    wiley.com
         …       …
       457     450    yahoo.co.
       458     451    yahoo.com
       459          - yahoo.com
       460     452    yale.edu
       461     453    yandex.ru
         …       …
       469     461    zendesk.com
       470     462
       471          -
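
The FORMAT comments removed from the file header in this changeset describe four matching modes (`^` prefix, `$` suffix, `^url$` exact, and plain substring). The sketch below is one possible reading of those rules in Python, for illustration only. The function names `parse_rule`, `load_rules` and `is_listed` are hypothetical and are not taken from the maori-lang-detection code, whose actual implementation may differ.

```python
def parse_rule(line):
    """Turn one list entry into (mode, pattern); return None for comments and blank lines.

    Hypothetical helper: the modes follow the FORMAT comments in r33553.
    """
    text = line.strip()
    if not text or text.startswith("#"):
        return None
    anchored_start = text.startswith("^")
    anchored_end = text.endswith("$")
    if anchored_start:
        text = text[1:]
    if anchored_end:
        text = text[:-1]
    if not text:
        return None  # nothing left to match after stripping the markers
    if anchored_start and anchored_end:
        return ("exact", text)
    if anchored_start:
        return ("prefix", text)
    if anchored_end:
        return ("suffix", text)
    return ("contains", text)


def load_rules(lines):
    """Parse an iterable of lines, skipping comments and blank lines."""
    rules = []
    for line in lines:
        rule = parse_rule(line)
        if rule is not None:
            rules.append(rule)
    return rules


def is_listed(url, rules):
    """Return True if the URL matches any rule."""
    for mode, pattern in rules:
        if mode == "exact" and url == pattern:
            return True
        if mode == "prefix" and url.startswith(pattern):
            return True
        if mode == "suffix" and url.endswith(pattern):
            return True
        if mode == "contains" and pattern in url:
            return True
    return False
```

Under this reading, a bare entry such as `wikipedia.org` would be treated as a substring rule, so it would also match `https://en.wikipedia.org/wiki/X`. Whether the bare domains in this file are still interpreted that way after r33554 is not stated in the changeset.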