source: gs3-extensions/maori-lang-detection/conf/url-whitelist-filter.txt@ 33559

Last change on this file since 33559 was 33559, checked in by ak19, 5 years ago
  1. Special string COPY changed to SUBDOMAIN-COPY after Dr Bainbridge explained why it was more accurate to the behaviour.
  2. Comments to explain how the sites-too-big-to-exhaustively-crawl.txt should be formatted, what values are expected and how they work.
  3. Special blacklisting and whitelisting of urls on yale.edu, coupled with special treatment in topsites file too.
File size: 565 bytes
# URL 'whitelist': urls of these forms go into the keep pile.
# whitelist overrides blacklist and greylist.
# FORMAT:
# precede URL by ^ to whitelist urls that match the given prefix
# follow URL with $ to whitelist urls that match the given suffix
# ^url$ will whitelist urls that match the given url completely
# Without either ^ or $ symbol, urls containing the given url will get whitelisted

# Special exception for this url on yale.edu, since we needed to blacklist
# some particular other urls on yale.edu
http://korora.econ.yale.edu/phillips/archive/hauraki.htm
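
The FORMAT comments in this file describe four match modes: prefix (^url), suffix (url$), exact (^url$) and substring (no marker). The following is a minimal Python sketch of how such lines could be interpreted. It is not the project's actual filter code, which is not shown here; the function names parse_rule, matches and is_whitelisted are hypothetical and chosen only for illustration. It also assumes that ^ and $ are plain markers at the very start and end of a line, not regular-expression syntax.

```python
def parse_rule(line):
    """Turn one filter line into (mode, pattern); return None for blanks and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    has_prefix_marker = line.startswith("^")
    has_suffix_marker = line.endswith("$")
    pattern = line[1:] if has_prefix_marker else line
    if has_suffix_marker:
        pattern = pattern[:-1]
    if has_prefix_marker and has_suffix_marker:
        mode = "exact"
    elif has_prefix_marker:
        mode = "prefix"
    elif has_suffix_marker:
        mode = "suffix"
    else:
        mode = "contains"
    return mode, pattern


def matches(rule, url):
    """Check a single (mode, pattern) rule against a URL."""
    mode, pattern = rule
    if mode == "exact":
        return url == pattern
    if mode == "prefix":
        return url.startswith(pattern)
    if mode == "suffix":
        return url.endswith(pattern)
    return pattern in url  # "contains" mode


def is_whitelisted(url, filter_lines):
    """Return True if any rule parsed from filter_lines matches the URL."""
    rules = [r for r in map(parse_rule, filter_lines) if r is not None]
    return any(matches(rule, url) for rule in rules)


# Assumed usage: the single rule from this file has no ^ or $ marker,
# so it is treated as a substring match.
filter_lines = [
    "# Special exception for this url on yale.edu",
    "http://korora.econ.yale.edu/phillips/archive/hauraki.htm",
]
print(is_whitelisted(
    "http://korora.econ.yale.edu/phillips/archive/hauraki.htm", filter_lines))
```

Because the yale.edu entry carries neither marker, any URL that contains it as a substring would be kept under this reading. Wrapping it as ^url$ would restrict the exception to that one page.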