Last change on this file since 31252 was 31252, checked in by davidb, 7 years ago:
Support for icu-tokenize property added, plus relevant refactoring.

File size: 809 bytes

#wcsa-ef-ingest.process-ef-json-mode = per-volume
wcsa-ef-ingest.process-ef-json-mode = per-page

#wcsa-ef-ingest.solr-cloud-nodes = 10.11.0.53:8983,10.11.0.54:8983,10.11.0.55:8983
wcsa-ef-ingest.solr-cloud-nodes = gc0:8983,gc1:8983,gc2:8983,gc3:8983,gc4:8983,gc5:8983,gc6:8983,gc7:8983,gc8:8983,gc9:8983
wcsa-ef-ingest.strict-file-io = false
wcsa-ef-ingest.icu-tokenize = false

# For guidance on the number of partitions to use, see the "Parallelized collections" section of:
#   https://spark.apache.org/docs/2.0.1/programming-guide.html
# which suggests 2-4 partitions per CPU core in the cluster.
#
# For a more detailed discussion see:
#   http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/

# wcsa-ef-ingest.num-partitions = 12
wcsa-ef-ingest.num-partitions = 120

spark.executor.cores=11
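
The rule of thumb quoted in the comments above (2-4 partitions per CPU core) is plain arithmetic, and a small sketch may help when picking a value for `wcsa-ef-ingest.num-partitions`. The snippet below is illustrative only: the function name `suggested_partitions` is hypothetical, and the core count of 40 in the example is made up rather than taken from this configuration, since the file does not state how many executors the cluster runs.

```python
from typing import Tuple


def suggested_partitions(total_cores: int, low: int = 2, high: int = 4) -> Tuple[int, int]:
    """Return the (minimum, maximum) partition counts for the
    '2-4 partitions per CPU core' rule of thumb.

    total_cores is the number of cores available to the job across
    the whole cluster; it is an input the caller must supply.
    """
    if total_cores < 1:
        raise ValueError("total_cores must be a positive integer")
    return low * total_cores, high * total_cores


if __name__ == "__main__":
    # Hypothetical cluster size, NOT read from the properties file.
    example_cores = 40
    minimum, maximum = suggested_partitions(example_cores)
    print(f"suggested num-partitions: between {minimum} and {maximum}")
```

Under that assumed core count, the range would be 80 to 160 partitions. The actual total depends on how many executors are launched and on `spark.executor.cores`, so the real figure should be substituted before comparing against the configured value.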