source: other-projects/maori-lang-detection/src/org/greenstone/atea/morphia/SentenceInfo.java@ 33674

Last change on this file since 33674 was 33674, checked in by ak19, 4 years ago

Changes to support storing the top 5 predicted langcodes and their confidence values per sentence/overlapping sentence (storing all 103 made some documents, like those of site 00006, too big to go into MongoDB). Have re-run NutchTextDumpToMongDB to send the new form of the docs into MongoDB.
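
As an illustration of the trimming described in the change note, here is a minimal sketch of keeping only the N most confident predictions out of a full list. The names TopPredictions, Prediction and topN are hypothetical and do not come from the repository; Prediction is only a stand-in for the project's LanguageInfo, and the project may select its top 5 differently.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of the repository.
final class TopPredictions {

    // Stand-in for the project's LanguageInfo: a langCode and its confidence value.
    static final class Prediction {
        final String langCode;
        final double confidence;

        Prediction(String langCode, double confidence) {
            this.langCode = langCode;
            this.confidence = confidence;
        }
    }

    // Returns at most n predictions, ordered from highest to lowest confidence.
    // The input list is left unmodified.
    static List<Prediction> topN(List<Prediction> all, int n) {
        List<Prediction> sorted = new ArrayList<>(all);
        sorted.sort(Comparator.comparingDouble((Prediction p) -> p.confidence).reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }
}
```

Calling topN(allPredictions, 5) on a list of 103 predictions would yield the 5 entries to store per sentence, which keeps each stored document small.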

package org.greenstone.atea.morphia;

import java.util.Map;
import java.util.HashMap;

import dev.morphia.annotations.*;


/** A sentence together with its predicted languages and their confidence values. */
@Entity("Sentences")
public class SentenceInfo {

    public final String sentence;
    public final Map<String, Double> languageToConfidenceMap;
    @Embedded
    public final LanguageInfo[] languagesInfo; // array of langCode and confidence value pairs


    public SentenceInfo(String sentence, LanguageInfo[] languages) {
        this.sentence = sentence;
        this.languagesInfo = languages;

        // Store a (langCode -> confidence) lookup in a Map:
        this.languageToConfidenceMap = new HashMap<String, Double>();
        for (LanguageInfo li : languages) {
            String langCode = li.langCode;
            // Double.valueOf replaces the deprecated new Double(...) constructor
            Double confidence = Double.valueOf(li.confidenceLevel);
            languageToConfidenceMap.put(langCode, confidence);
        }
    }

}

// BACK WHEN WE ONLY STORED THE BEST PREDICTED LANGUAGE META FOR EACH SENTENCE:
/*
@Entity("Sentences")
public class SentenceInfo {
    public final double confidenceLevel;
    // 3 letter lang code
    public final String langCode;
    public final String sentence;

    public SentenceInfo(double confidence, String langCode, String sentence) {
        this.confidenceLevel = confidence;
        this.langCode = langCode;
        this.sentence = sentence;
    }
}
*/
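
The (langCode -> confidence) map built in the SentenceInfo constructor can be exercised without Morphia or MongoDB. The sketch below is only an illustration under stated assumptions: LanguageInfoSketch and SentenceInfoSketch are hypothetical stand-ins for the real classes, LanguageInfo is assumed to expose langCode and a double confidenceLevel as the code above suggests, and confidenceFor is an invented helper that is not part of the original class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for LanguageInfo (assumed fields: langCode, confidenceLevel).
final class LanguageInfoSketch {
    final String langCode;
    final double confidenceLevel;

    LanguageInfoSketch(String langCode, double confidenceLevel) {
        this.langCode = langCode;
        this.confidenceLevel = confidenceLevel;
    }
}

// Hypothetical stand-in for SentenceInfo, without the Morphia annotations.
final class SentenceInfoSketch {
    final String sentence;
    final Map<String, Double> languageToConfidenceMap = new HashMap<>();

    SentenceInfoSketch(String sentence, LanguageInfoSketch[] languages) {
        this.sentence = sentence;
        // Same lookup as in SentenceInfo: langCode -> confidence
        for (LanguageInfoSketch li : languages) {
            languageToConfidenceMap.put(li.langCode, li.confidenceLevel);
        }
    }

    // Invented helper: confidence for a langCode, or 0.0 if it was not among the stored predictions.
    double confidenceFor(String langCode) {
        return languageToConfidenceMap.getOrDefault(langCode, 0.0);
    }
}
```

With only the top predictions stored, a lookup for a language outside that set finds no entry, so a default such as 0.0 is one possible convention; callers that need to distinguish "not stored" from "zero confidence" could check containsKey instead.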