Timestamp:
2019-10-17T23:12:38+13:00
Author:
ak19
Message:

NutchTextDumpProcessor prints stats for each crawled site: the number of webpages crawled for that site and how many of those OpenNLP detected as being in Maori (mri). This required making a reusable method in CCWETProcessor, getDomainForURL(), public and static.

File:
1 edited

  • gs3-extensions/maori-lang-detection/src/org/greenstone/atea/CCWETProcessor.java

    r33575 r33582
    @@ -240,5 +240,5 @@
          * This retains any www. or subdomain prefix.
          */
    -    private String getDomainForURL(String url, boolean withProtocol) {
    +    public static String getDomainForURL(String url, boolean withProtocol) {
         int startIndex = startIndex = url.indexOf("//"); // for http:// or https:// prefix
         startIndex = (startIndex == -1) ? 0 : (startIndex+2); // skip past the protocol's // portion
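The changeset shows only the first two statements of getDomainForURL(), so the following is a minimal sketch of how a static domain helper could feed per-site page counts, not the actual Greenstone code. The class DomainStatsSketch, the method countPagesPerDomain, the end-of-domain logic, the handling of withProtocol, and the example URLs are all assumptions made for illustration. The first two statements follow the context lines of the diff, minus the redundant double assignment.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DomainStatsSketch {

    // Hypothetical stand-in for CCWETProcessor.getDomainForURL().
    // Only the first two statements come from the changeset; the rest is assumed.
    public static String getDomainForURL(String url, boolean withProtocol) {
        int startIndex = url.indexOf("//"); // for http:// or https:// prefix
        startIndex = (startIndex == -1) ? 0 : (startIndex + 2); // skip past the protocol's // portion

        // Assumption: the domain ends at the first '/' after the protocol, if there is one.
        int endIndex = url.indexOf("/", startIndex);
        if (endIndex == -1) {
            endIndex = url.length();
        }
        // Assumption: withProtocol keeps the URL from its very start up to the end of the domain.
        return withProtocol ? url.substring(0, endIndex) : url.substring(startIndex, endIndex);
    }

    // Tallies pages per domain. Because the helper is static, no processor
    // instance is needed, which is the kind of reuse the commit message describes.
    public static Map<String, Integer> countPagesPerDomain(List<String> urls) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String url : urls) {
            counts.merge(getDomainForURL(url, false), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList(
            "https://www.example.org/a.html",
            "https://www.example.org/b.html",
            "http://mi.example.net/");
        System.out.println(countPagesPerDomain(urls)); // prints {www.example.org=2, mi.example.net=1}
    }
}
```

A static method suits this case because the domain extraction depends only on its arguments, so a caller such as NutchTextDumpProcessor can use it without constructing a CCWETProcessor.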