Ignore:
Timestamp:
2016-10-30T21:43:02+13:00 (7 years ago)
Author:
davidb
Message:

Adjustment of NUM_PARTITIONS to be based on Spark's recommended calculation

File:
1 edited

Legend:

Unmodified
Added
Removed
  • other-projects/hathitrust/solr-extracted-features/trunk/src/main/java/org/hathitrust/PrepareForIngest.java

    r30990 r30995  
    1212    private static final long serialVersionUID = 1L;
    1313
    14     public static final int NUM_PARTITIONS = 6; // default would appear to be 2
     14    // Following details on number of partitions to use given in
     15    //  "Parallelized collections" section of:
     16    //   https://spark.apache.org/docs/2.0.1/programming-guide.html
     17    //
     18    // For a more detailed discussion see:
     19    //   http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/
     20   
     21    public static final int NUM_CORES = 6;
     22    public static final int NUM_PARTITIONS = 2*NUM_CORES; // default would appear to be 2
    1523   
    1624    protected String _input_dir;
     
    155163            System.exit(1);
    156164        }
     165        if (read_only) {
     166            // For this case, need to ensure solr-url and output-dir are null
     167            output_dir = null;
     168            solr_url = null;
     169        }
    157170       
    158171        String input_dir  = filtered_args[0];
Note: See TracChangeset for help on using the changeset viewer.