PagedJSON.java

Timestamp: 2016-10-29T16:17:22+13:00

Author: davidb

Message: Changed to run main processing method as action rather than transform. Done to help accumulator add

File: 1 edited

other-projects/hathitrust/solr-extracted-features/trunk/src/main/java/org/hathitrust/PagedJSON.java (modified) (6 diffs)

other-projects/hathitrust/solr-extracted-features/trunk/src/main/java/org/hathitrust/PagedJSON.java

-              r30984
+              r30985
 import org.apache.commons.compress.compressors.CompressorException;
 import org.apache.spark.api.java.function.FlatMapFunction;
+import org.apache.spark.api.java.function.VoidFunction;
 import org.apache.spark.util.DoubleAccumulator;
 import org.json.JSONArray;
 …
-class PagedJSON implements FlatMapFunction<String, String>
+//class PagedJSON implements FlatMapFunction<String, String>
+class PagedJSON implements VoidFunction<String>
 {
     private static final long serialVersionUID = 1L;
 …
             String decodedString;
             while ((decodedString = in.readLine()) != null) {
-                //System.out.println(decodedString);
                 sb.append(decodedString);
             }
 …
     }
-    public Iterator<String> call(String json_file_in)
+    //public Iterator<String> call(String json_file_in)
+    public void call(String json_file_in)
     {
         JSONObject extracted_feature_record = readJSONFile(json_file_in);
 …
                     System.out.println("Sample output Solr add JSON [page 20]: " + solr_add_doc_json.toString());
                     System.out.println("==================");
-                    //System.out.println("Sample text [page 20]: " + solr_add_doc_json.getString("_text_"));
                 }
-                // create JSON obj of just the page (for now), and write it out
-                // write out the JSONOBject as a bz2 compressed file
-                /*
-                try {
-                    BufferedWriter bw = ClusterFileIO.getBufferedWriterForCompressedFile(_output_dir + "/" + output_json_bz2);
-                    bw.write(ef_page.toString());
-                    bw.close();
-                } catch (IOException e) {
-                    e.printStackTrace();
-                } catch (CompressorException e) {
-                    e.printStackTrace();
-                }
-                */
                 if (_solr_url != null) {
 …
         }
-        /*
-        for (int i = 0; i < ef_num_pages; i++)
-        {
-            //String post_id = ef_pages.getJSONObject(i).getString("post_id");
-            //......
-        }
-        */
-        //String pageName = json_obj.getJSONObject("pageInfo").getString("pageName");
-/*
-        JSONArray arr = obj.getJSONArray("posts");
-        for (int i = 0; i < arr.length(); i++)
-        {
-            String post_id = arr.getJSONObject(i).getString("post_id");
-            ......
-        }
-*/
         ids.add(volume_id);
         _progress_accum.add(_progress_step);
-        return ids.iterator();
+        //return ids.iterator();
     }
 }
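
Reviewer sketch (not part of the changeset): the diff switches PagedJSON from FlatMapFunction (a lazy transformation) to VoidFunction (used with an action such as foreach). The snippet below is only an analogy in plain Java, assuming java.util.stream as a stand-in for an RDD and DoubleAdder as a stand-in for _progress_accum; the class and method names are hypothetical and no Spark API is called. It shows why a side effect such as a progress counter may never fire inside a lazy transform, while it fires for every element once the work runs as an action.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.DoubleAdder;
import java.util.stream.Stream;

public class ActionVsTransform {
    // Hypothetical stand-in for the Spark accumulator (_progress_accum in the diff).
    static final DoubleAdder progress = new DoubleAdder();

    // Transform style: returns a value per input, like the old Iterator<String> call().
    static String processAsTransform(String jsonFile) {
        progress.add(1.0);
        return jsonFile + ":done";
    }

    // Action style: returns nothing, like the new void call().
    static void processAsAction(String jsonFile) {
        progress.add(1.0);
    }

    // Builds a lazy pipeline but never consumes it, so no element is processed.
    static double progressAfterLazyTransform(List<String> files) {
        progress.reset();
        Stream<String> pending = files.stream().map(ActionVsTransform::processAsTransform);
        return progress.sum();
    }

    // forEach is a terminal operation, so every element is processed right away.
    static double progressAfterAction(List<String> files) {
        progress.reset();
        files.stream().forEach(ActionVsTransform::processAsAction);
        return progress.sum();
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("a.json", "b.json", "c.json");
        System.out.println("lazy transform: " + progressAfterLazyTransform(files)); // 0.0
        System.out.println("action:         " + progressAfterAction(files));        // 3.0
    }
}
```

In Spark itself the same distinction applies at cluster scale: a transformation only runs when some later action demands its result, and the Spark programming guide only guarantees exactly-once accumulator updates for updates made inside actions. That is one plausible reading of the truncated commit message, not a statement of the author's intent.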

Changeset 30985 for other-projects/hathitrust/solr-extracted-features/trunk/src/main/java/org/hathitrust/PagedJSON.java
