Changeset 30923
Timestamp: 2016-10-25T23:28:22+13:00
Files: 1 edited
Index: other-projects/hathitrust/solr-extracted-features/trunk/RUN.bash
===================================================================
--- other-projects/hathitrust/solr-extracted-features/trunk/RUN.bash	(r30918)
+++ other-projects/hathitrust/solr-extracted-features/trunk/RUN.bash	(r30923)
@@ -1,13 +1,17 @@
 #!/bin/bash
 
-input_dir=pd-ef-json-files
+#input_dir=pd-ef-json-files
+input_dir="hdfs://10.10.0.52:9000/user/htrc/pd-ef-json-files"
 output_dir=pd-solr-json-files
 
-master_opt="--master local[4]"
+#master_opt="--master local[4]"
+master_opt="--master spark://10.10.0.52:7077"
+
 self_contained_jar=target/htrc-ef-ingest-0.9-jar-with-dependencies.jar
 base_cmd="spark-submit --class org.hathitrust.PrepareForIngest $master_opt $self_contained_jar"
 
 if [ $# -ge 1 ] ; then
-    file_listing=shift $*
+    file_listing=$1
+    shift
     $base_cmd --json-filelist="$file_listing" $input_dir $output_dir $*
 else
@@ -15,5 +19,5 @@
     echo "* Processing all files in: $input_dir"
     echo "****"
-    $base_cmd $input_dir $output_dir $*
+    $base_cmd $input_dir/*.json.bz2 $output_dir $*
 fi
 
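The argument-handling change in this revision can be read in isolation. In bash, a line of the form `file_listing=shift $*` does not shift anything: it sets `file_listing` to the literal string `shift` in the environment of a command built from the expanded arguments. The replacement, `file_listing=$1` followed by `shift`, stores the first argument and then drops it from the positional parameters. The sketch below illustrates that pattern; the function name `split_args` and the sample arguments are hypothetical and do not appear in RUN.bash.

```shell
#!/bin/bash
# Hypothetical helper, not part of RUN.bash: mirrors the r30923 pattern of
# taking the first positional argument and shifting it away.
split_args() {
    local file_listing=$1   # first positional argument
    shift                   # drop it; the remaining arguments move down by one
    echo "listing=$file_listing rest=$*"
}

# Sample invocation with made-up arguments.
split_args files.txt --verbose 10
```

After the `shift`, `$*` holds only the trailing options, which is why the script can pass it straight through to `spark-submit` without repeating the file listing.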