Changeset 31256 for other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForWhitelist.java
Timestamp: 2016-12-20T16:44:40+13:00
Files: 1 edited
other-projects/hathitrust/wcsa/extracted-features-solr/trunk/solr-ingest/src/main/java/org/hathitrust/extractedfeatures/ProcessForWhitelist.java
@@ -60,2 +60,11 @@ (r31255 → r31256)
 	JavaSparkContext jsc = new JavaSparkContext(conf);

+	String filename_root = _json_list_filename.replaceAll(".*/","").replaceAll("\\..*$","");
+	String output_directory = "whitelist-" + filename_root + "-out";
+	if (ClusterFileIO.exists(output_directory))
+	{
+		System.err.println("Error: " + output_directory + " already exists. Spark unable to write output data");
+		jsc.close();
+		System.exit(1);
+	}
+
 	int num_partitions = Integer.getInteger("wcsa-ef-ingest.num-partitions", DEFAULT_NUM_PARTITIONS);
 	JavaRDD<String> json_list_data = jsc.textFile(_json_list_filename,num_partitions).cache();
…
@@ -127,4 +136,2 @@
 	count_sorted.setName("descending-word-frequency");

-	String filename_root = _json_list_filename.replaceAll(".*/","").replaceAll("\\..*$","");
-	String output_directory = "whitelist-" + filename_root + "-out";

 	//sorted_swaped_back_pair.saveAsTextFile(output_directory);
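The change moves the output-directory computation and existence check to the top of the job, so it can fail fast before any Spark work starts rather than erroring when Spark tries to write at the end. A minimal standalone sketch of the same pattern is below; the class name `OutputGuard` and the use of `java.nio.file` in place of the project-specific `ClusterFileIO` are assumptions for illustration, while the regex and naming scheme are taken from the changeset itself.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class OutputGuard {
    // Derive "whitelist-<root>-out" from the JSON list filename, mirroring the
    // changeset's two-step regex: strip the leading directories, then the extension.
    static String outputDirectoryFor(String jsonListFilename) {
        String filenameRoot = jsonListFilename.replaceAll(".*/", "").replaceAll("\\..*$", "");
        return "whitelist-" + filenameRoot + "-out";
    }

    // Pre-flight check: refuse to proceed if the output directory already exists,
    // before any expensive (e.g. Spark) processing is started.
    // (Files.exists stands in for the project's ClusterFileIO.exists.)
    static boolean outputAvailable(String jsonListFilename) {
        Path out = Paths.get(outputDirectoryFor(jsonListFilename));
        if (Files.exists(out)) {
            System.err.println("Error: " + out + " already exists");
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(outputDirectoryFor("/data/ht/pd-list.json")); // whitelist-pd-list-out
    }
}
```

Failing fast like this matters for Spark jobs in particular, because `saveAsTextFile` refuses to overwrite an existing directory, so a late check wastes the entire computation.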