Changeset 31013 for other-projects/hathitrust/solr-extracted-features/trunk/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java
Timestamp: 2016-10-31T20:51:39+13:00
Files: 1 edited
--- other-projects/hathitrust/solr-extracted-features/trunk/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (r31011)
+++ other-projects/hathitrust/solr-extracted-features/trunk/src/main/java/org/hathitrust/extractedfeatures/ProcessForSolrIngest.java (r31013)
@@ -117,10 +117,16 @@
     double per_vol = 100.0/(double)num_volumes;
 
-    DoubleAccumulator progress_accum = jsc.sc().doubleAccumulator("Progress Percent");
-
-    PerPageJSONFlatmap paged_solr_json_flatmap = new PerPageJSONFlatmap(_input_dir,_solr_url,_output_dir,_verbosity, progress_accum,per_vol);
+    DoubleAccumulator per_vol_progress_accum = jsc.sc().doubleAccumulator("Per Volume Progress Percent");
+
+    PerPageJSONFlatmap paged_solr_json_flatmap
+        = new PerPageJSONFlatmap(_input_dir,_solr_url,_output_dir,_verbosity, per_vol_progress_accum,per_vol);
     JavaRDD<JSONObject> per_page_jsonobjects = json_list_data.flatMap(paged_solr_json_flatmap).cache();
 
-    PerPageJSONMap paged_json_id_map = new PerPageJSONMap(_input_dir,_solr_url,_output_dir,_verbosity, progress_accum,per_vol);
+    //long num_page_ids = per_page_jsonobjects.count(); // trigger lazy eval of: flatmap:per-vol
+
+    DoubleAccumulator per_page_progress_accum = jsc.sc().doubleAccumulator("Pages Processed");
+
+    PerPageJSONMap paged_json_id_map
+        = new PerPageJSONMap(_input_dir,_solr_url,_output_dir,_verbosity, per_page_progress_accum,1.0);
     JavaRDD<String> per_page_ids = per_page_jsonobjects.map(paged_json_id_map);
 
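The effect of splitting the single accumulator into two can be sketched without Spark. The following is a rough illustration only: `DoubleAdder` stands in for Spark's `DoubleAccumulator`, the class and method names are hypothetical, and it assumes that `PerPageJSONFlatmap` adds its increment once per volume and `PerPageJSONMap` adds its increment once per page (neither class body appears in this changeset).

```java
import java.util.concurrent.atomic.DoubleAdder;

public class AccumulatorSplitSketch {

    // Hypothetical helper: simulates both stages and returns
    // { per-volume progress percent, pages processed }.
    static double[] simulate(int[] pagesPerVolume) {
        // Same formula as in the changeset: each volume is worth an equal share of 100%.
        double perVol = 100.0 / (double) pagesPerVolume.length;

        // Stand-ins for the two Spark accumulators introduced in r31013.
        DoubleAdder perVolProgress = new DoubleAdder();   // "Per Volume Progress Percent"
        DoubleAdder perPageProgress = new DoubleAdder();  // "Pages Processed"

        for (int pages : pagesPerVolume) {
            // Assumed flatMap stage behaviour: one increment of perVol per volume.
            perVolProgress.add(perVol);
            for (int p = 0; p < pages; p++) {
                // Assumed map stage behaviour: one increment of 1.0 per page.
                perPageProgress.add(1.0);
            }
        }
        return new double[] { perVolProgress.sum(), perPageProgress.sum() };
    }

    public static void main(String[] args) {
        // Hypothetical input: four volumes with 3, 5, 2 and 6 pages.
        double[] totals = simulate(new int[] {3, 5, 2, 6});
        System.out.println("Per Volume Progress Percent: " + totals[0]);
        System.out.println("Pages Processed: " + totals[1]);
    }
}
```

Under these assumptions the first total reaches 100.0 once every volume has been seen, while the second is a plain page count. In r31011 both stages received the same `progress_accum` with the `per_vol` increment, so a single total would have mixed the two quantities.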