Timestamp:
2019-09-13T17:44:41+12:00
Author:
ak19
Message:

Improved the code to use a static block that loads the needed properties from config.properties and initialises some static final ints from them. The code now uses the logger for debugging, and new properties were added to config.properties. Returned the code to using a counter, recordCount, re-zeroed for each WETProcessor, since the count is used for unique filenames and the filename prefixes are unique for each warc.wet file. These prefixes, combined with the per-file record count, give each WET record written out to a file a unique filename, so a running total of WET records across all processed warc.wet files is no longer needed to ensure uniqueness. All appears to still work as in the previous commit in creating the discard and keep subfolders.
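The pattern described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual committed code: the property name `WETprocessor.min-content-length`, its default value, and the filename scheme are all assumptions made for the example.

```java
import java.io.InputStream;
import java.util.Properties;
import java.util.logging.Logger;

public class WETProcessor {
    private static final Logger logger = Logger.getLogger(WETProcessor.class.getName());

    // Initialised once, from config.properties, in the static block below.
    public static final int MIN_CONTENT_LENGTH;

    static {
        int minLen = 100; // fallback default; the real default is an assumption
        try (InputStream in = WETProcessor.class.getClassLoader()
                .getResourceAsStream("config.properties")) {
            if (in != null) {
                Properties props = new Properties();
                props.load(in);
                minLen = Integer.parseInt(props.getProperty(
                        "WETprocessor.min-content-length", String.valueOf(minLen)));
            } else {
                logger.warning("config.properties not on classpath, using defaults");
            }
        } catch (Exception e) {
            logger.severe("Error reading config.properties: " + e.getMessage());
        }
        MIN_CONTENT_LENGTH = minLen;
    }

    // Per-instance counter, re-zeroed for each warc.wet file's processor.
    // Combined with the per-file prefix, it yields a unique filename per record.
    private int recordCount = 0;
    private final String filePrefix;

    public WETProcessor(String filePrefix) {
        this.filePrefix = filePrefix;
    }

    public String nextRecordFilename() {
        recordCount++;
        return filePrefix + "-" + recordCount + ".txt";
    }

    public static void main(String[] args) {
        WETProcessor p = new WETProcessor("WET-00000");
        System.out.println(p.nextRecordFilename()); // WET-00000-1.txt
        System.out.println(p.nextRecordFilename()); // WET-00000-2.txt
    }
}
```

Because each warc.wet file gets its own WETProcessor (and hence its own zeroed recordCount) and its own unique prefix, no cross-file running total is needed.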

File:
1 edited

  • gs3-extensions/maori-lang-detection/MoreReading/Vagrant-Spark-Hadoop.txt

    r33457 → r33467

    @@ -221,5 +221,4 @@
     
     
    -
     vagrant@node1:~$ locate guava.jar
     /usr/share/java/guava.jar
    @@ -243,5 +242,5 @@
     vagrant@node1:~/ia-hadoop-tools$ hdfs dfs -put /usr/share/java/guava.jar /usr/local/hadoop/share/hadoop/common/.
     put: `/usr/local/hadoop/share/hadoop/common/.': No such file or directory
    -# hadoop classpath locations are not hdfs filesystem
    +# hadoop classpath locations are not on the hdfs filesystem, but on the regular fs
     
     vagrant@node1:~/ia-hadoop-tools$ sudo cp /usr/share/java/guava.jar /usr/local/hadoop/share/hadoop/common/.