source: gs3-extensions/maori-lang-detection/MoreReading/WebScraping.txt@ 33409

Last change on this file since 33409 was 33409, checked in by ak19, 5 years ago

Forgot to commit 2 files with links and shuffling some links around into the correct files after moving between computers.

File size: 353 bytes

http://www.basicsbehind.com/extract-text-webpage/
http://www.l3s.de/~kohlschuetter/publications/wsdm187-kohlschuetter.pdf

https://jsoup.org/
https://uhack-guide.readthedocs.io/en/latest/technical/scraping/

https://blog.ouseful.info/2015/02/09/getting-text-of-anything-docs-pdfs-images-using-apache-tika/
https://tika.apache.org/1.20/examples.html