root/other-projects/hathitrust/vagrant-hadoop-cluster/trunk/README.txt @ 30904

Revision 30904, 0.9 KB (checked in by davidb, 4 years ago)

Extra resource/links added

Vagrant provisioning files to spin up a modest (4 node) Hadoop
cluster for experiments in processing HTRC Extracted Features JSON
files into a form suitable for ingesting into Solr.

The top-level code uses Apache Spark to process JSON files stored in
HDFS, hence the need for an underlying Hadoop cluster.
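
As a rough illustration of the kind of per-file processing involved,
the sketch below turns one Extracted Features JSON document into
per-page records that could be posted to Solr. It is not the project's
actual code. It assumes a simplified Extracted Features layout (a
top-level "id", and "features" -> "pages", each page carrying "seq" and
"body" -> "tokenPosCount"), and the function name ef_to_solr_docs and
the output field names volumeid_s and tokens_txt are hypothetical.

```python
import json


def ef_to_solr_docs(ef_json_text):
    """Turn one Extracted Features JSON document into per-page dicts.

    Assumed (simplified) input layout:
      {"id": ..., "features": {"pages": [{"seq": ..., "body":
          {"tokenPosCount": {token: {pos_tag: count}}}}]}}
    The output field names are hypothetical, chosen for illustration.
    """
    volume = json.loads(ef_json_text)
    volume_id = volume["id"]
    docs = []
    for page in volume.get("features", {}).get("pages", []):
        # tokenPosCount maps each token to {part-of-speech tag: count}
        token_pos = page.get("body", {}).get("tokenPosCount", {})
        if not token_pos:
            continue  # skip pages with no body tokens
        docs.append({
            "id": f"{volume_id}.page-{page['seq']}",
            "volumeid_s": volume_id,
            "tokens_txt": sorted(token_pos),
        })
    return docs


if __name__ == "__main__":
    # Tiny made-up sample; real files are far larger.
    sample = json.dumps({
        "id": "vol.1",
        "features": {"pages": [
            {"seq": "00000001",
             "body": {"tokenPosCount": {"whale": {"NN": 2}, "a": {"DT": 1}}}},
            {"seq": "00000002", "body": {"tokenPosCount": {}}},
        ]},
    })
    for doc in ef_to_solr_docs(sample):
        print(json.dumps(doc))
```

On the cluster, a function like this could be applied to each file read
from HDFS, for example with PySpark's SparkContext.wholeTextFiles
followed by flatMap over the file contents. That wiring is an
assumption here, not something this README specifies.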

Provisioning is based on the following online resources, but updated to
use newer versions of Ubuntu, Java, and Hadoop.

  http://cscarioni.blogspot.co.nz/2012/09/setting-up-hadoop-virtual-cluster-with.html

  https://github.com/calo81/vagrant-hadoop-cluster


For useful documentation about setting up a Hadoop cluster, read:

  http://chaalpritam.blogspot.co.nz/2015/05/hadoop-270-single-node-cluster-setup-on.html
then
  http://chaalpritam.blogspot.co.nz/2015/05/hadoop-270-multi-node-cluster-setup-on.html

OR

  https://xuri.me/2015/03/09/setup-hadoop-on-ubuntu-single-node-cluster.html
then
  https://xuri.me/2016/03/22/setup-hadoop-on-ubuntu-multi-node-cluster.html