public class LanguageProfileBuilder
extends java.lang.Object
Builder for LanguageProfile.
This class does no internal synchronization.
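Because the builder does no internal synchronization, callers that share one instance across threads have to provide their own locking. The sketch below shows one possible pattern. It uses a hypothetical stand-in class (`UnsafeCounter` is not part of the library) so that it runs on its own; the same caller-owned lock could guard calls on a shared builder.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an object that does no internal synchronization. */
class UnsafeCounter {
    private final Map<String, Integer> counts = new HashMap<>();

    void add(String gram) {
        counts.merge(gram, 1, Integer::sum);
    }

    int get(String gram) {
        return counts.getOrDefault(gram, 0);
    }
}

class ExternalLockSketch {
    /** Runs two writer threads against one shared counter, guarded by a caller-owned lock. */
    static int countWithTwoThreads(int perThread) {
        UnsafeCounter shared = new UnsafeCounter();
        Object lock = new Object();
        Runnable writer = () -> {
            for (int i = 0; i < perThread; i++) {
                synchronized (lock) { // every access goes through the same lock
                    shared.add("ab");
                }
            }
        };
        Thread t1 = new Thread(writer);
        Thread t2 = new Thread(writer);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        synchronized (lock) {
            return shared.get("ab");
        }
    }
}
```

An alternative that avoids locking altogether is to give each thread its own builder and combine the results afterwards.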
| Modifier and Type | Field and Description |
|---|---|
| private @NotNull LdLocale | locale |
| private int | minimalFrequency |
| private NgramExtractor | ngramExtractor |
| private java.util.Map<java.lang.Integer,java.util.Map<java.lang.String,java.lang.Integer>> | ngrams |
| Constructor and Description |
|---|
| LanguageProfileBuilder(@NotNull LdLocale locale) |
| LanguageProfileBuilder(@NotNull java.lang.String locale): Deprecated. |
| Modifier and Type | Method and Description |
|---|---|
| LanguageProfileBuilder | addGram(java.lang.String ngram): Shortcut for addGram(ngram, 1). |
| LanguageProfileBuilder | addGram(java.lang.String ngram, int frequency): If the builder already has this ngram, the given frequency is added to the current count. |
| LanguageProfileBuilder | addText(java.lang.CharSequence text): In order to use this you must set the ngramExtractor first. |
| LanguageProfile | build() |
| LanguageProfileBuilder | minimalFrequency(int minimalFrequency) |
| LanguageProfileBuilder | ngramExtractor(@NotNull NgramExtractor ngramExtractor) |
| private void | removeNgramsWithLessFrequency() |
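The accumulation rule described for addGram(ngram, frequency) can be illustrated with a small self-contained sketch. The class `GramCounterSketch` is hypothetical and is not the library's code. It assumes that the outer key of the `ngrams` map is the n-gram length, which the field's type suggests but this page does not state.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in; not the library's implementation. */
class GramCounterSketch {
    // Assumption: outer key = n-gram length, inner map = n-gram -> count.
    private final Map<Integer, Map<String, Integer>> ngrams = new HashMap<>();

    /** Shortcut for addGram(ngram, 1). */
    GramCounterSketch addGram(String ngram) {
        return addGram(ngram, 1);
    }

    /** Adds the given frequency to the current count of the n-gram. */
    GramCounterSketch addGram(String ngram, int frequency) {
        ngrams.computeIfAbsent(ngram.length(), k -> new HashMap<>())
              .merge(ngram, frequency, Integer::sum);
        return this; // fluent style, mirroring the builder's return types
    }

    /** Returns the accumulated count, or 0 if the n-gram was never added. */
    int count(String ngram) {
        Map<String, Integer> byLength = ngrams.get(ngram.length());
        return byLength == null ? 0 : byLength.getOrDefault(ngram, 0);
    }
}
```

With this sketch, `new GramCounterSketch().addGram("ab").addGram("ab", 4).count("ab")` accumulates 1 and 4 into a single count for "ab".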
@NotNull private final LdLocale locale
private int minimalFrequency
private NgramExtractor ngramExtractor
private final java.util.Map<java.lang.Integer,java.util.Map<java.lang.String,java.lang.Integer>> ngrams
public LanguageProfileBuilder(@NotNull LdLocale locale)
@Deprecated
public LanguageProfileBuilder(@NotNull java.lang.String locale)
public LanguageProfileBuilder ngramExtractor(@NotNull NgramExtractor ngramExtractor)
public LanguageProfileBuilder minimalFrequency(int minimalFrequency)
minimalFrequency - 1-n, the default is 1. N-grams that occurred less often than this in the text are removed. The default should usually be raised: adjust the value until the profile file is of acceptable size and still produces good language detection results.
public LanguageProfileBuilder addText(java.lang.CharSequence text)
In order to use this you must set the ngramExtractor first.
public LanguageProfileBuilder addGram(java.lang.String ngram)
public LanguageProfileBuilder addGram(java.lang.String ngram, int frequency)
public LanguageProfile build()
private void removeNgramsWithLessFrequency()
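The effect of minimalFrequency on the collected n-grams can be sketched as follows. This is a hypothetical illustration and not the library's code: the class and method names are invented, and it assumes that "less often" means strictly fewer than minimalFrequency occurrences, so that the default of 1 removes nothing.

```java
import java.util.Map;

/** Hypothetical illustration of frequency-based pruning; not the library's implementation. */
class FrequencyPruneSketch {
    /**
     * Removes every n-gram whose count is below minimalFrequency.
     * Assumption: counts equal to the threshold are kept.
     */
    static void removeBelow(Map<Integer, Map<String, Integer>> ngrams, int minimalFrequency) {
        for (Map<String, Integer> byLength : ngrams.values()) {
            // Removing from the values view also removes the matching map entries.
            byLength.values().removeIf(count -> count < minimalFrequency);
        }
    }
}
```

Raising the threshold trades profile size against coverage: rare n-grams are dropped first, which is why the parameter description suggests experimenting with the value.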