Last change on this file since 30926 was 30926, checked in by davidb, 7 years ago: Restructuring of RUN scripts to be more flexible

Property svn:executable set to *

File size: 1.1 KB

#!/bin/bash

# To work, the following bash variables need to have been set:
#
#   json_filelist input_dir output_dir
#
# Typically done through running a wrapper script, such as:
#
#   RUN-PD-CLUSTER.bash

# Report each missing variable on stderr and stop with a non-zero status
if [ "x$json_filelist" = "x" ] ; then
    echo "_RUN.bash: Failed to set 'json_filelist'" 1>&2
    exit 1
fi

if [ "x$input_dir" = "x" ] ; then
    echo "_RUN.bash: Failed to set 'input_dir'" 1>&2
    exit 1
fi

if [ "x$output_dir" = "x" ] ; then
    echo "_RUN.bash: Failed to set 'output_dir'" 1>&2
    exit 1
fi

# $master_opt is not checked above; when it is unset, spark-submit is run
# without an explicit --master option
self_contained_jar=target/htrc-ef-ingest-0.9-jar-with-dependencies.jar
base_cmd="spark-submit --class org.hathitrust.PrepareForIngest $master_opt $self_contained_jar"

# Forward any extra command-line arguments unchanged
$base_cmd --json-filelist="$json_filelist" "$input_dir" "$output_dir" "$@"

# spark-submit --class org.hathitrust.PrepareForIngest --master local[4] target/htrc-ef-ingest-0.9-jar-with-dependencies.jar --json-filelist=pd-file-listing-step10000.txt pd-ef-json-files pd-solr-json-files $*

# spark-submit --class org.hathitrust.PrepareForIngest --master local[4] target\htrc-ef-ingest-0.9-jar-with-dependencies.jar --json-filelist=pd-file-listing-step1000.txt json-files solr-files $*
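
The header comment says the variables are typically set by a wrapper script such as RUN-PD-CLUSTER.bash, which is not shown here. As a rough illustration only, a local wrapper might look like the sketch below. It is hypothetical and is not the repository's cluster wrapper: it assumes the checked script is saved as `./_RUN.bash` (a name inferred from its error messages) and reuses the values from the commented-out `local[4]` invocation.

```shell
#!/bin/bash

# Hypothetical local wrapper (a sketch, not RUN-PD-CLUSTER.bash itself).
# Values are copied from the commented-out example invocation in the script.
json_filelist=pd-file-listing-step10000.txt
input_dir=pd-ef-json-files
output_dir=pd-solr-json-files
master_opt="--master local[4]"

# Export so the variables are visible when the script runs as a child process
export json_filelist input_dir output_dir master_opt

# Assumption: the checked script lives in the current directory as _RUN.bash
if [ -x ./_RUN.bash ] ; then
    ./_RUN.bash "$@"
else
    echo "wrapper: ./_RUN.bash not found; variables are set but nothing was run" 1>&2
fi
```

Exporting matters here because the wrapper executes the script as a separate process; if the wrapper sourced it instead (`. ./_RUN.bash`), plain assignments would be enough.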