Changeset 33914


Timestamp:
2020-02-13T17:09:07+13:00
Author:
ak19
Message:

Shortlisted just the domain sites by country into ManualShortlisting2.txt, taking the reingest into MongoDB into account. Then put all the shortlisted domains for which containsMRI=true, as per manual inspection, into a separate new file.

Location:
other-projects/maori-lang-detection
Files:
1 added
3 edited
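
The second step in the commit message (collecting the shortlisted domains marked containsMRI=true into a separate file) could be sketched roughly as below. This is only an illustration, not the script used for this changeset: the record layout, the `domain` field, the example hosts and the function names are assumptions, and only the `containsMRI` flag comes from the message itself.

```python
from pathlib import Path

# Hypothetical records. The real documents live in MongoDB and may be shaped
# differently; only the `containsMRI` flag is named in the commit message.
SITES = [
    {"domain": "site-a.example", "containsMRI": True},
    {"domain": "site-b.example", "containsMRI": False},
    {"domain": "site-c.example", "containsMRI": True},
    {"domain": "site-a.example", "containsMRI": True},  # duplicate entry
]


def shortlist_domains(records):
    """Return the sorted, de-duplicated domains flagged containsMRI=true."""
    return sorted({r["domain"] for r in records if r.get("containsMRI") is True})


def write_shortlist(records, path):
    """Write one shortlisted domain per line to `path`."""
    lines = shortlist_domains(records)
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


if __name__ == "__main__":
    print(shortlist_domains(SITES))
```
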

  • other-projects/maori-lang-detection/MoreReading/mongodb.txt

    r33913 r33914  
    11031103- RUSSIA: https://www.gismeteo.lv - misidentification of an email address
    11041104- JAPAN: http://yutaka.it-n.jp - many pages of scientific names of (plants?) which are often misdetected as MRI
    1105 !! - Ireland, ie: https://coggle.it
     1105!! - IRELAND, IE: https://coggle.it
    11061106- IRAN: https://www.dideo.ir/v/yt/d6cgya0ze-E - video title from MaoriTelevision website
    11071107- CZECH republic:
     
    13711371X https://docs.google.com, timetable with occasional Maori language word
    13721372+ https://drive.google.com, https://drive.google.com/file/d/1NwuzafjddaP8gxI7O_Zapts5bM7mrtwn/preview is an image of Maori number names. But the other page on drive.google.com is an NZ certificate or ID (in English) stating a person's position.
    1373 http://ritusehji.blogspot.com - no page with more than 1 sentence detected. But short string of actual MRI content. Educator blog with pictures and English language content.
     1373~+ http://ritusehji.blogspot.com - no page with more than 1 sentence detected. But short string of actual MRI content. Educator blog with pictures and English language content.
    13741374
    13751375
     
    15411541X https://mi.lawyers.cafe - autotranslated
    15421542    X https://mi.centr-zashity.ru - same as lawyers.cafe above: autotranslated
    1543 ! https://policies.oclc.org - not completely translated. Copyright page, privacy statement and cookie statement pages appear to be in Maori. Not sure if autotranslated since other pages aren't available in MI. Dutch equivalent pages seem human translated.
     1543~! https://policies.oclc.org - not completely translated. Copyright page, privacy statement and cookie statement pages appear to be in Maori. Not sure if autotranslated since other pages aren't available in MI. Dutch equivalent pages seem human translated.
    15441544X http://jobdescriptionsample.org - autotranslated
    15451545X http://mi.broadcastbeat.com - autotranslated product site
     
    16191619   IT, AT, RO, CH, RU, BG, MX, JP, CN, IE, IR, FI same
    16201620
    1621 US gained 3:
    1622 anglican.org (NEW)
    1623 articles.imperialtometric.com (from CA)
    1624 daandehn.com (CA)
     1621US gained 3 + 1 from mi in URL path:
     1622+ anglican.org (NEW)
     1623X articles.imperialtometric.com (from CA)
     1624X daandehn.com (from CA)
     1625+ kiwiproperty.com (from AU)
    16251626
    16261627CA lost 2:
    1627 articles.imperialtometric.com (to US)
    1628 daandehn.com (to US)
     1628X articles.imperialtometric.com (to US)
     1629X daandehn.com (to US)
    16291630
    16301631AU:
    1631 lost kiwiproperty.com (to US - mi in URL path version file!)
     1632! lost kiwiproperty.com (to US - mi in URL path version file!)
    16321633
    16331634
    16341635CZ:
    1635 gained viveipcl.com (from UNKNOWN)
     1636X gained viveipcl.com (from UNKNOWN)
    16361637
    16371638UNKNOWN:
    1638 gained hitiaotera.com from IL
     1639X gained hitiaotera.com from IL
    16391640
    16401641IL:
    1641 lost one to (UNKNOWN)
    1642 
     1642X lost one (hitiaotera.com to UNKNOWN)
     1643
     1644
     1645FINAL SITE COUNT (contain >= 1 page with >= 1 MRI sentence)
     1646
     1647DK:
     1648http://ngapuhiradio.com
     1649http://ngapuhitelevision.com
     1650    [http://akona.ngapuhitelevision.com
     1651    http://waiatarangatiratanga.ngapuhitelevision.com
     1652    http://jazz.ngapuhitelevision.com
     1653    http://powhiri.ngapuhitelevision.com
     1654    http://komisch.ngapuhitelevision.com]
     1655
     1656DE
     1657http://www.udhr.de
     1658https://www.cartogiraffe.com/
     1659
     1660AU
     1661https://koreromaori.com
     1662(https://infogram.com/)
     1663
     1664FR
     1665http://chantsdeluttes.free.fr/
     1666
     1667ES
     1668https://www.uv.es/
     1669
     1670IE
     1671https://coggle.it
     1672
     1673CZ:
     1674http://www.henryklahola.nazory.cz
     1675
     1676BG:
     1677http://anitra.net/
     1678
     1679US finals:
     1680http://anglican.org
     1681http://anglicanhistory.org
     1682http://www.unicode.org
     1683https://static-promote.weebly.com
     1684http://aclhokiangarocks.blogspot.com
     1685http://bahaiprayers.net
     1686https://biblehub.com
     1687http://www.muhammad.com
     1688http://www.godrules.net
     1689http://m.biblepub.com
     1690http://www.krassotkin.ru
     1691http://www.gotquestions.org
     1692https://maorinews.com
     1693http://maaori.com
     1694http://kiaorahola.blogspot.com
     1695https://kjohnsonnz.blogspot.com
     1696http://pumanawawhangara.blogspot.com
     1697http://dannykahei.tripod.com
     1698http://burkekm001.tripod.com
     1699http://tkkpipipaopao.blogspot.com
     1700http://manateina.blogspot.com
     1701http://tatai09.blogspot.com
     1702http://www.twttoa.com
     1703http://tuhua2010.blogspot.com
     1704http://piripi.blogspot.com
     1705https://www.breaker.audio
     1706https://drive.google.com
     1707http://ritusehji.blogspot.com
     1708https://in.pinterest.com
     1709
     171029
     1711
     1712https://www.kiwiproperty.com
     1713http://indigenousblogs.com
     1714https://mi.m.wikipedia.org, https://mi.wikipedia.org
     1715http://csunplugged.org, https://www.csunplugged.org
     1716(https://policies.oclc.org)
     1717
     171834 incl with MI in URL Path
     1719
     1720
     1721---------------------
     1722NZ:
     1723    http://www.teipukarea.maori.nz
     1724        http://ngatipahauwera.co.nz
     1725        http://www.oag.govt.nz
     1726        https://sexualviolence.victimsinfo.govt.nz
     1727        http://tmoa.tki.org.nz
     1728        http://www.tewhanake.maori.nz
     1729        http://www.matarikifestival.org.nz
     1730        http://www.otepoti.school.nz
     1731        https://www.maoritelevision.com
     1732        http://pukapuka.nz
     1733        http://community.nzdl.org
     1734        http://maori.livingheritage.org.nz [http://www.livingheritage.org.nz]
     1735        http://pukoro.co.nz
     1736    https://cdn.tehiku.nz [DOMAIN: tehiku.nz]
     1737        http://www.runanga.co.nz
     1738        http://kuraaiwi.maori.nz
     1739        http://kurataiao.tki.org.nz
     1740        http://satellites.co.nz
     1741        http://teaohou.natlib.govt.nz
     1742        http://www.tuwharetoa.iwi.nz
     1743        https://www.terito.school.nz
     1744        https://ttw1.cwp.govt.nz
     1745        https://www.whanau-tahi.school.nz
     1746        https://e-ako-pangarau.nzmaths.co.nz
     1747        https://teaomaori.news
     1748        http://tetaurawhiri.govt.nz
     1749        https://www.tuiatematangi.ac.nz
     1750        http://animations.tewhanake.maori.nz
     1751        https://www.dnc.org.nz
     1752        http://firstworldwar.tki.org.nz [http://www.firstworldwar.tki.org.nz]
     1753        http://www.28maoribattalion.org.nz
     1754        http://www.tewikiotereomaori.co.nz
     1755        http://www.brettgraham.co.nz
     1756        https://hepatakakupu.nz
     1757    http://anglicanprayerbook.nz
     1758        http://arataua.nz
     1759        http://maori.tki.org.nz
     1760        https://paekupu.co.nz
     1761        https://haereheikaiako.co.nz
     1762        https://curriculumtool.education.govt.nz
     1763        http://kurakokiri.maori.nz [includes: http://www.kurakokiri.maori.nz]
     1764        http://www.kkmmaungarongo.co.nz
     1765        http://www.heartland.co.nz
     1766        http://oilcrash.com
     1767        http://www.kura-porirua.school.nz
     1768        https://www.sporty.co.nz
     1769        https://www.tematawai.maori.nz
     1770        https://www.terakipaewhenua.school.nz
     1771        http://www.tetaurawhiri.govt.nz
     1772        http://archive.stats.govt.nz
     1773        http://tiritiowaitangi.govt.nz
     1774        http://www.waiata.maori.nz [includes: http://waiata.maori.nz]
     1775        http://hana.co.nz
     1776        http://kaupare.co.nz
     1777        http://www.tereowrap.nz
     1778        http://www.hrc.co.nz
     1779        http://ngatiporoukiponeke.org.nz
     1780        http://rurued.school.nz
     1781        http://www.twtop.school.nz
     1782        http://www.huri-translations.pf
     1783        https://teara.govt.nz/ [https://admin.teara.govt.nz, http://blog.teara.govt.nz]
     1784        https://tiritiowaitangi.govt.nz
     1785        http://www.tmoa.tki.org.nz
     1786        https://www.komako.org.nz
     1787        http://www.wcl.govt.nz [included: http://kete.wcl.govt.nz]       
     1788        http://punareo.co.nz
     1789        https://rapuatearatika.education.govt.nz
     1790        http://tmmkkm.school.nz
     1791        http://www.cs.waikato.ac.nz
     1792        http://www.kupengahao.co.nz
     1793        https://www.hapuhauora.health.nz
     1794        http://cms.sunsmartschools.co.nz [http://sunsmartschools.co.nz/]
     1795        http://kuraproductions.co.nz
     1796        https://keepourmoneyclean.govt.nz
     1797        http://www.tekura.school.nz
     1798        http://www.tkkmmokopuna.school.nz
     1799        http://hangaraumatihiko.tki.org.nz
     1800        http://www.pakanae.maori.nz
     1801
     1802
     1803    http://holyspirit.nz
     1804    https://www.ngamanawainc.co.nz, [includes http://www.ngamanawainc.co.nz]
     1805    http://www.finlaysonpark.school.nz
     1806    http://www.w3vietnam.org.nz [includes http://w3vietnam.org.nz]
     1807    https://www.takitimu.ac.nz
     1808        https://kotahimiriona.co.nz
     1809        https://rehuamarae.co.nz
     1810        http://reoora.co.nz
     1811
     1812        https://manawatuheritage.pncc.govt.nz
     1813        http://rsnz.natlib.govt.nz
     1814        https://www.taitokerautrust.org.nz
     1815        http://tewikiotereomaori.nz
     1816        https://www.korokikahukura.co.nz
     1817        https://www.pinterest.nz
     1818        https://www.rereahu.maori.nz
     1819        http://givealittle.co.nz
     1820        https://kaiiwicamp.nz [includes http://kaiiwicamp.nz]
     1821        http://ngarauhuia.ngatiapakiterato.iwi.nz
     1822        https://m.wairarapatv.co.nz
     1823
     1824        http://avonside.net
     1825        http://www.maoriinvestments.co.nz
     1826        http://conference.tpwt.maori.nz
     1827        https://www.puau.school.nz
     1828        http://tehauora.org.nz
     1829
     1830        http://temahurehure.maori.nz
     1831        http://www.temarareo.org
     1832        http://www.tetaumuturunanga.iwi.nz
     1833        http://www.writersfestival.co.nz
     1834        http://www.kmk.maori.nz
     1835        https://www.stats.govt.nz [includes http://archive.stats.govt.nz]
     1836
     1837+?       http://ngatiwhakaue.iwi.nz
     1838+?       https://interactives.stuff.co.nz
     1839+?       http://whatonga.school.nz
     1840+?       https://player.vimeo.com
     1841+?       http://southerntribes.co.nz
     1842
     1843?X      https://www.e-agent.nz [includes: https://office.e-agent.nz, http://videos.e-agent.nz]
  • other-projects/maori-lang-detection/mongodb-data/ManualShortlisting.txt

    r33891 r33914  
    17621762        "http://teaohou.natlib.govt.nz", 4/4, 2/4
    17631763        "http://www.tuwharetoa.iwi.nz", 2/3 0/3
    1764 +        "http://auturoa.nz", 0/4 0/3 [lots of MRI terms among English] - COMMUNITY (But there are pages in MRI to be found by non-random sampling, e.g. http://auturoa.nz/KarakiaMoKuaToRangiTeRaa.html)
     1764X        "http://auturoa.nz", 0/4 0/3 [lots of MRI terms among English] - COMMUNITY (But there are pages in MRI to be found by non-random sampling, e.g. http://auturoa.nz/KarakiaMoKuaToRangiTeRaa.html)
    17651765        "https://www.terito.school.nz", 3/3, 0/2 total
    17661766        "https://ttw1.cwp.govt.nz", 3/3 3/3
     
    199119913. GRAND TOTALS
    19921992
    1993 Count per country of web SITES that contain at least 1 web page containing at least 1 genuine MRI sentence:
    1994 
     1993Count per country of web SITES that contain at least 1 web page containing at least 1 genuine MRI sentence. (Number in brackets for overseas is number of sites of that geolocation if nz TLDs were NOT grouped with NZ geolocation under "NZ". Number in brackets for NZ indicates the number of sites that are only of NZ geolocation ignoring nz TLDs hosted overseas.)
     1994
     1995OLD
    19951996countryCode, num manually inspected sites as having pages containing MRI, num sites openNLP detected as having pages containing MRI
    1996 NZ: 126 actual sites out of 176 detected sites
    1997 US: 29 actual out of 486 detected sites
    1998 AU: 2 actual out of 21 detected sites
     1997NZ: 126 actual sites out of 176 (89) detected sites
     1998US: 29 actual out of 422 (486) detected sites
     1999AU: 2 actual out of 5 (21) detected sites
    19992000DE, Germany: 2 actual out of 27 detected sites
    20002001DK, Denmark: 2 out of 8
    20012002BG, Bulgaria: 1 out of 1
    20022003CZ, Czech Republic: 1 out of 4
    2003 ES, Spain: 1 out of 7
    2004 FR, France: 1 out of 36
     2004ES, Spain: 1 out of 5 (7)
     2005FR, France: 1 out of 35 (36)
    20052006IE, Ireland: 1 out of 2
     2007
    20062008
    20072009TOTAL: 166 sites of all the crawled sites where the crawled set of pages per site actually contained at least one sentence in Māori based on manual inspection.
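
As a cross-check of the TOTAL line, the per-country "actual" counts in the table above can be summed, and the share of detected sites confirmed by manual inspection worked out per country. The sketch below uses the figures outside the brackets on the added lines; the dictionary layout and function names are illustrative assumptions, not part of the notes file.

```python
# (actual, detected) per country code, copied from the table above using the
# figures outside the brackets.
COUNTS = {
    "NZ": (126, 176),
    "US": (29, 422),
    "AU": (2, 5),
    "DE": (2, 27),
    "DK": (2, 8),
    "BG": (1, 1),
    "CZ": (1, 4),
    "ES": (1, 5),
    "FR": (1, 35),
    "IE": (1, 2),
}


def total_actual(counts):
    """Sum the manually confirmed site counts over all countries."""
    return sum(actual for actual, _detected in counts.values())


def confirmed_share(counts):
    """Fraction of detected sites per country that manual inspection confirmed."""
    return {cc: actual / detected for cc, (actual, detected) in counts.items()}


if __name__ == "__main__":
    print("total actual sites:", total_actual(COUNTS))
    for cc, share in confirmed_share(COUNTS).items():
        print(f"{cc}: {share:.1%}")
```
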
  • other-projects/maori-lang-detection/mongodb-data/ManualShortlisting2.txt

    r33907 r33914  
    200820083. GRAND TOTALS
    20092009
    2010 Count per country of web SITES that contain at least 1 web page containing at least 1 genuine MRI sentence:
    2011 
     2010Count per country of web SITES that contain at least 1 web page containing at least 1 genuine MRI sentence. (Number in brackets for overseas is number of sites of that geolocation if nz TLDs were NOT grouped with NZ geolocation under "NZ". Number in brackets for NZ indicates the number of sites that are only of NZ geolocation ignoring nz TLDs hosted overseas. Numbers only present where different from counts of site by geolocation, which is the number indicated out of brackets.)
     2011
     2012OLD
    20122013countryCode, num manually inspected sites as having pages containing MRI, num sites openNLP detected as having pages containing MRI
    2013 NZ: 126 actual sites out of 176 detected sites
    2014 US: 29 actual out of 486 detected sites
    2015 AU: 2 actual out of 21 detected sites
     2014NZ: 126 actual sites out of 176 (89) detected sites
     2015US: 29 actual out of 422 (486) detected sites
     2016AU: 2 actual out of 5 (21) detected sites
    20162017DE, Germany: 2 actual out of 27 detected sites
    20172018DK, Denmark: 2 out of 8
    20182019BG, Bulgaria: 1 out of 1
    20192020CZ, Czech Republic: 1 out of 4
    2020 ES, Spain: 1 out of 7
    2021 FR, France: 1 out of 36
     2021ES, Spain: 1 out of 5 (7)
     2022FR, France: 1 out of 35 (36)
     2023IE, Ireland: 1 out of 2
     2024
     2025NEW - Adjusted grand totals above with changes to values after reingesting into MongoDB (the adjusted values are from section C below). The numbers in brackets here are the counts of UNIQUE domain names/sites that OpenNLP detected as having pages containing MRI, where different.
     2026
     2027countryCode, num manually inspected sites as having pages containing MRI, num sites openNLP detected as having pages containing MRI
     2028NZ: 124 (113 + 11 non-unique) actual sites out of 176 (159) detected sites
     2029US: 32 actual out of 422 (405) detected sites
     2030AU: 1 actual out of 5 detected sites
     2031DE, Germany: 2 actual out of 26 (24) detected sites
     2032DK, Denmark: 2 out of 8
     2033BG, Bulgaria: 1 out of 1
     2034CZ, Czech Republic: 1 out of 5 (4)
     2035ES, Spain: 1 out of 5
     2036FR, France: 1 out of 35 (34)
    20222037IE, Ireland: 1 out of 2
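
The grouping rule described in the note above (sites with an nz TLD are counted under "NZ" even when hosted overseas, otherwise the geolocation country is used) might be expressed as in the following sketch. The function name and the example URLs with their geolocations are hypothetical and are not taken from the crawl data.

```python
from urllib.parse import urlsplit


def grouping_country(url, geo_country):
    """Country bucket for a site: 'NZ' for any host under the nz TLD,
    otherwise the geolocation country code that was supplied."""
    host = urlsplit(url).hostname or ""
    if host == "nz" or host.endswith(".nz"):
        return "NZ"
    return geo_country.upper()


if __name__ == "__main__":
    # Hypothetical inputs: an nz-TLD site hosted in the US, and a .com site
    # geolocated to AU.
    print(grouping_country("http://www.example.co.nz", "US"))
    print(grouping_country("https://example.com", "au"))
```

Counting with and without this override would give the two numbers per row: the figure outside the brackets (nz TLDs grouped under NZ) and the figure inside the brackets (geolocation only).
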
    20232038
     
    20262041
    20272042========================================
     2043Adjusted grand totals in manualShortlisting.txt with the following.
     2044
     2045----------------------------------------------------------------------
     2046C GEOLOCATION CHANGES AFTER REINGESTING UPON INTRODUCING ANGLICAN.ORG:
     2047----------------------------------------------------------------------
     2048NZ the same as before
     2049   NL, DE, FR, DK, ES, GB same
     2050   IT, AT, RO, CH, RU, BG, MX, JP, CN, IE, IR, FI same
     2051
     2052US gained 3:
     2053+ anglican.org (NEW)
     2054X articles.imperialtometric.com (from CA)
     2055X daandehn.com (CA)
     2056
     2057CA lost 2:
     2058X articles.imperialtometric.com (to US)
     2059X daandehn.com (to US)
     2060
     2061AU:
     2062+ ! lost kiwiproperty.com (to US - mi in URL path version file!)
     2063
     2064
     2065CZ:
     2066X gained viveipcl.com (from UNKNOWN)
     2067
     2068UNKNOWN:
     2069X gained hitiaotera.com from IL
     2070
     2071IL:
     2072X lost one (hitiaotera.com to UNKNOWN)
     2073
     2074-----------------
     2075FINAL SITE COUNT (contain >= 1 page with >= 1 MRI sentence)
     2076-----------------
     2077DK (2):
     2078http://ngapuhiradio.com
     2079http://ngapuhitelevision.com
     2080    [http://akona.ngapuhitelevision.com
     2081    http://waiatarangatiratanga.ngapuhitelevision.com
     2082    http://jazz.ngapuhitelevision.com
     2083    http://powhiri.ngapuhitelevision.com
     2084    http://komisch.ngapuhitelevision.com]
     2085
     2086DE (2)
     2087http://www.udhr.de
     2088https://www.cartogiraffe.com
     2089
     2090AU (1)
     2091https://koreromaori.com
     2092
     2093FR (1)
     2094http://chantsdeluttes.free.fr
     2095
     2096ES (1)
     2097https://www.uv.es
     2098
     2099IE (1)
     2100https://coggle.it
     2101
     2102CZ: (1)
     2103http://www.henryklahola.nazory.cz
     2104
     2105BG: (1)
     2106http://anitra.net
     2107
     2108US finals 31 (33):
     2109http://anglican.org
     2110http://anglicanhistory.org
     2111http://www.unicode.org
     2112https://static-promote.weebly.com
     2113http://aclhokiangarocks.blogspot.com
     2114http://bahaiprayers.net
     2115https://biblehub.com
     2116http://www.muhammad.com
     2117http://www.godrules.net
     2118http://m.biblepub.com
     2119http://www.krassotkin.ru
     2120http://www.gotquestions.org
     2121https://maorinews.com
     2122http://maaori.com
     2123http://kiaorahola.blogspot.com
     2124https://kjohnsonnz.blogspot.com
     2125http://pumanawawhangara.blogspot.com
     2126http://dannykahei.tripod.com
     2127http://burkekm001.tripod.com
     2128http://tkkpipipaopao.blogspot.com
     2129http://manateina.blogspot.com
     2130http://tatai09.blogspot.com
     2131http://www.twttoa.com
     2132http://tuhua2010.blogspot.com
     2133http://piripi.blogspot.com
     2134https://drive.google.com
     2135https://in.pinterest.com
     2136+? https://www.breaker.audio [AUDIO]
     2137+X http://ritusehji.blogspot.com
     213827 (28)
     2139
     2140https://www.kiwiproperty.com
     2141http://indigenousblogs.com
     2142https://mi.m.wikipedia.org [https://mi.wikipedia.org]
     2143http://csunplugged.org [includes https://www.csunplugged.org]
     2144?~ https://policies.oclc.org
     2145
     2146+ 4 (5) = 31 (33) incl with MI in URL Path
     2147
     2148
     2149NZ: 113 unique + 11 non-unique
     2150http://www.teipukarea.maori.nz
     2151http://ngatipahauwera.co.nz
     2152http://www.oag.govt.nz
     2153https://sexualviolence.victimsinfo.govt.nz
     2154http://tmoa.tki.org.nz
     2155http://www.tewhanake.maori.nz
     2156http://www.matarikifestival.org.nz
     2157http://www.otepoti.school.nz
     2158https://www.maoritelevision.com
     2159http://pukapuka.nz
     2160http://community.nzdl.org
     2161http://maori.livingheritage.org.nz [http://www.livingheritage.org.nz]
     2162http://pukoro.co.nz
     2163https://cdn.tehiku.nz [DOMAIN: tehiku.nz]
     2164http://www.runanga.co.nz
     2165http://kuraaiwi.maori.nz
     2166http://kurataiao.tki.org.nz
     2167http://satellites.co.nz
     2168http://teaohou.natlib.govt.nz
     2169http://www.tuwharetoa.iwi.nz
     2170https://www.terito.school.nz
     2171https://ttw1.cwp.govt.nz
     2172https://www.whanau-tahi.school.nz
     2173https://e-ako-pangarau.nzmaths.co.nz
     2174https://teaomaori.news
     2175http://tetaurawhiri.govt.nz
     2176https://www.tuiatematangi.ac.nz
     2177http://animations.tewhanake.maori.nz
     2178https://www.dnc.org.nz
     2179http://firstworldwar.tki.org.nz [http://www.firstworldwar.tki.org.nz]
     2180http://www.28maoribattalion.org.nz
     2181http://www.tewikiotereomaori.co.nz
     2182http://www.brettgraham.co.nz
     2183https://hepatakakupu.nz
     2184http://anglicanprayerbook.nz
     2185http://arataua.nz
     2186http://maori.tki.org.nz
     2187https://paekupu.co.nz
     2188https://haereheikaiako.co.nz
     2189https://curriculumtool.education.govt.nz
     2190http://kurakokiri.maori.nz [includes: http://www.kurakokiri.maori.nz]
     2191http://www.kkmmaungarongo.co.nz
     2192http://www.heartland.co.nz
     2193http://oilcrash.com
     2194http://www.kura-porirua.school.nz
     2195https://www.sporty.co.nz
     2196https://www.tematawai.maori.nz
     2197https://www.terakipaewhenua.school.nz
     2198http://www.tetaurawhiri.govt.nz
     2199http://archive.stats.govt.nz
     2200http://tiritiowaitangi.govt.nz
     2201http://www.waiata.maori.nz [includes: http://waiata.maori.nz]
     2202http://hana.co.nz
     2203http://kaupare.co.nz
     2204http://www.tereowrap.nz
     2205http://www.hrc.co.nz
     2206http://ngatiporoukiponeke.org.nz
     2207http://rurued.school.nz
     2208http://www.twtop.school.nz
     2209http://www.huri-translations.pf
     2210https://teara.govt.nz [https://admin.teara.govt.nz, http://blog.teara.govt.nz]
     2211https://tiritiowaitangi.govt.nz
     2212http://www.tmoa.tki.org.nz
     2213https://www.komako.org.nz
     2214http://www.wcl.govt.nz [included:http://kete.wcl.govt.nz]
     2215http://punareo.co.nz
     2216https://rapuatearatika.education.govt.nz
     2217http://tmmkkm.school.nz
     2218http://www.cs.waikato.ac.nz
     2219http://www.kupengahao.co.nz
     2220https://www.hapuhauora.health.nz
     2221http://cms.sunsmartschools.co.nz [http://sunsmartschools.co.nz/]
     2222http://kuraproductions.co.nz
     2223https://keepourmoneyclean.govt.nz
     2224http://www.tekura.school.nz
     2225http://www.tkkmmokopuna.school.nz
     2226http://hangaraumatihiko.tki.org.nz
     2227http://www.pakanae.maori.nz
     2228--- 78+9
     2229http://holyspirit.nz
     2230https://www.ngamanawainc.co.nz [includes http://www.ngamanawainc.co.nz]
     2231http://www.finlaysonpark.school.nz
     2232http://www.w3vietnam.org.nz [includes http://w3vietnam.org.nz]
     2233https://www.takitimu.ac.nz
     2234https://kotahimiriona.co.nz
     2235https://rehuamarae.co.nz
     2236http://reoora.co.nz
     2237https://manawatuheritage.pncc.govt.nz
     2238http://rsnz.natlib.govt.nz
     2239https://www.taitokerautrust.org.nz
     2240http://tewikiotereomaori.nz
     2241https://www.korokikahukura.co.nz
     2242https://www.pinterest.nz
     2243https://www.rereahu.maori.nz
     2244http://givealittle.co.nz
     2245https://kaiiwicamp.nz [includes http://kaiiwicamp.nz]
     2246http://ngarauhuia.ngatiapakiterato.iwi.nz
     2247https://m.wairarapatv.co.nz
     2248http://avonside.net
     2249http://www.maoriinvestments.co.nz
     2250http://conference.tpwt.maori.nz
     2251https://www.puau.school.nz
     2252http://tehauora.org.nz
     2253http://temahurehure.maori.nz
     2254http://www.temarareo.org
     2255http://www.tetaumuturunanga.iwi.nz
     2256http://www.writersfestival.co.nz
     2257http://www.kmk.maori.nz
     2258https://www.stats.govt.nz [includes http://archive.stats.govt.nz]
     2259---30+4
     2260+? http://ngatiwhakaue.iwi.nz
     2261+? https://interactives.stuff.co.nz
     2262+? http://whatonga.school.nz
     2263+? https://player.vimeo.com
     2264+? http://southerntribes.co.nz
     2265---78+30+(5)=113 unique + 11 non-unique
     2266?X https://www.e-agent.nz [includes: https://office.e-agent.nz,http://videos.e-agent.nz]
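
Several entries in the lists above fold a www and a non-www variant of the same host into one site via an "[includes ...]" note. A minimal sketch of that kind of folding follows, assuming only the leading "www." and the scheme are ignored. It does not reproduce the manual decisions in the list (for example, folding other subdomains into a parent domain), and the function names are illustrative.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def site_key(url):
    """Host name with scheme, path and a leading 'www.' dropped."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def group_variants(urls):
    """Map each site key to the list of URL variants that share it."""
    groups = defaultdict(list)
    for url in urls:
        groups[site_key(url)].append(url)
    return dict(groups)


if __name__ == "__main__":
    urls = [
        "http://kurakokiri.maori.nz",
        "http://www.kurakokiri.maori.nz",
        "https://kaiiwicamp.nz",
        "http://kaiiwicamp.nz",
    ]
    for key, variants in group_variants(urls).items():
        print(key, variants)
```
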