source: gs3-extensions/maori-lang-detection/conf/url-greylist-filter.txt@33502

Last change on this file since 33502 was 33502, checked in by ak19, 5 years ago

Current url pattern blacklist and greylist filter files. Used by CCWETProcessor.java

File size: 568 bytes
# URL 'greylist': save matching urls to one side, to eyeball later and confirm if they should
# be included or skipped
# FORMAT:
# precede URL by ^ to greylist urls that match the given prefix
# succeed URL by $ to greylist urls that match the given suffix
# ^url$ will greylist urls that match the given url completely
# Without either ^ or $ symbol, urls containing the given url will get greylisted


# Product sites: unwanted auto-translation pages of online product stores
/product/
/products/
/product-page/
/product-category/

# Add alexa top sites to greylist
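
The FORMAT rules in the file's header comments can be illustrated with a minimal sketch. This is not the logic in CCWETProcessor.java, which is not shown here; it is a Python illustration under stated assumptions: blank lines and lines starting with # are skipped, matching is case-sensitive, and the names parse_filter_lines and is_greylisted are hypothetical.

```python
def parse_filter_lines(lines):
    """Turn filter-file lines into (kind, text) rules.

    Assumption: blank lines and lines starting with '#' carry no rule.
    """
    rules = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        is_prefix = line.startswith("^")
        is_suffix = line.endswith("$")
        if is_prefix:
            line = line[1:]   # drop the leading ^
        if is_suffix:
            line = line[:-1]  # drop the trailing $
        if is_prefix and is_suffix:
            kind = "exact"
        elif is_prefix:
            kind = "prefix"
        elif is_suffix:
            kind = "suffix"
        else:
            kind = "contains"
        rules.append((kind, line))
    return rules


def is_greylisted(url, rules):
    """Return True if the url matches any rule (case-sensitive, by assumption)."""
    for kind, text in rules:
        if kind == "exact" and url == text:
            return True
        if kind == "prefix" and url.startswith(text):
            return True
        if kind == "suffix" and url.endswith(text):
            return True
        if kind == "contains" and text in url:
            return True
    return False


# Example rules; only /product/ comes from the file above, the rest are made up.
rules = parse_filter_lines([
    "# comment",
    "",
    "/product/",
    "^http://example.com/shop",
    ".pdf$",
    "^http://example.com/$",
])
```

With these rules, a url such as https://store.example.org/product/123 would be set aside because it contains /product/, while https://example.org/about would pass through.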