Revision 30903, 0.5 KB (checked in by davidb, 4 years ago)

Vagrant provisioning files for a 4-node Hadoop cluster. See README.txt for more details

Vagrant provisioning files to spin up a modest (4 node) Hadoop
cluster for experiments processing HTRC Extracted Feature JSON files
into a form suitable for ingesting into Solr.

The top-level code is Apache Spark, processing JSON files stored in
HDFS, hence the need for an underlying Hadoop cluster.

Provisioning is based on the following online resources, but updated
to use newer versions of Ubuntu, Java, and Hadoop:

  http://cscarioni.blogspot.co.nz/2012/09/setting-up-hadoop-virtual-cluster-with.html

  https://github.com/calo81/vagrant-hadoop-cluster
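
As a rough illustration of the processing step described above, the
sketch below turns one Extracted Feature volume into per-page records
that could be posted to Solr. It is not the code from this project.
The JSON layout it assumes (id, metadata.title, features.pages[],
body.tokenPosCount), the function name volume_to_solr_docs, and the
Solr field names are all assumptions made for the example.

```python
import json


def volume_to_solr_docs(volume):
    """Flatten one Extracted Feature volume (a dict) into per-page docs.

    Assumed input layout (an assumption, not taken from this project):
      volume["id"], volume["metadata"]["title"],
      volume["features"]["pages"][i]["seq"],
      volume["features"]["pages"][i]["body"]["tokenPosCount"]
    where tokenPosCount maps token -> {POS tag -> count}.
    """
    vol_id = volume["id"]
    title = volume.get("metadata", {}).get("title", "")
    docs = []
    for page in volume.get("features", {}).get("pages", []):
        token_pos = page.get("body", {}).get("tokenPosCount", {})
        # Collapse the per-POS counts into a single count per token.
        counts = {tok: sum(pos.values()) for tok, pos in token_pos.items()}
        docs.append({
            # Hypothetical Solr field names, chosen for illustration only.
            "id": f"{vol_id}.page-{page['seq']}",
            "volume_id_s": vol_id,
            "title_t": title,
            "token_count_i": sum(counts.values()),
            "tokens_txt": sorted(counts),
        })
    return docs


if __name__ == "__main__":
    # Tiny made-up volume, used only to exercise the function.
    sample = {
        "id": "test.001",
        "metadata": {"title": "Example volume"},
        "features": {"pages": [
            {"seq": "00000001",
             "body": {"tokenPosCount": {"the": {"DT": 2},
                                        "run": {"NN": 1, "VB": 3}}}},
        ]},
    }
    for doc in volume_to_solr_docs(sample):
        print(json.dumps(doc))
```

On the cluster, a function of this shape would typically be applied
per file from Spark, for example by reading the HDFS directory with
SparkContext.wholeTextFiles and passing each file's parsed JSON
through flatMap. Keeping the transform free of Spark calls means it
can be tried locally before the cluster is up.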