source: gs3-extensions/maori-lang-detection/lib/config.properties@ 33394

Last change on this file since 33394 was 33394, checked in by ak19, 5 years ago
  1. Started a file on feasibility with the data now available and some links that have interesting or useful information. 2. Minor simplification to get_commoncrawl_nz_urls.sh script. 3. config.props file to be used by Java. Can't find wget configuration settings to limit mirroring of a site to a certain number of pages, but can limit the overall download size (--quota or -Q).
File size: 526 bytes
# https://www.linuxjournal.com/content/downloading-entire-web-site-wget
# https://linuxreviews.org/Wget:_download_whole_or_parts_of_websites_with_ease
# https://www.webhostface.com/kb/knowledgebase/examples-using-wget/
# "You can replicate the HTML content of a website with the --mirror option (or -m for short)
# wget -m http://domain.com"
# https://www.linuxquestions.org/questions/linux-server-73/wget-how-to-download-more-than-one-file-at-once-instead-of-file-after-file-704693/
wget.cmd=wget -Q10m -m %%BASE_URL%%
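
The commit message says this file is meant to be read from Java. The following is a minimal sketch of how that could look, not the project's actual code: the class name `WgetCommandSketch` and both helper methods are hypothetical, and treating `%%BASE_URL%%` as a plain text placeholder is an assumption. It loads the properties text, reads `wget.cmd`, and substitutes a URL.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch; none of these names come from the repository.
public class WgetCommandSketch {

    // Parse properties-format text and return the raw wget.cmd template.
    static String loadTemplate(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String template = props.getProperty("wget.cmd");
        if (template == null) {
            throw new IllegalStateException("wget.cmd is not set");
        }
        return template;
    }

    // Assumption: %%BASE_URL%% is a literal placeholder replaced verbatim.
    static String buildCommand(String template, String baseUrl) {
        return template.replace("%%BASE_URL%%", baseUrl);
    }

    public static void main(String[] args) {
        String text = "# sample comment\nwget.cmd=wget -Q10m -m %%BASE_URL%%\n";
        String command = buildCommand(loadTemplate(text), "http://domain.com");
        System.out.println(command); // prints: wget -Q10m -m http://domain.com
    }
}
```

In a real caller the text would come from the file on disk (for example via a `FileReader`), and the resulting string would need to be split into arguments before being passed to something like `ProcessBuilder`. In the command itself, `-Q10m` sets a download quota of 10 megabytes and `-m` turns on mirroring.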