Class LangProfile
- java.lang.Object
  - com.optimaize.langdetect.cybozu.util.LangProfile
- All Implemented Interfaces:
Serializable
@Deprecated public class LangProfile extends Object implements Serializable
Deprecated. Replaced by LanguageProfile.
LangProfile is a language profile class. Users don't use this class directly.
TODO: split into builder and immutable class.
TODO: currently this only makes n-grams with the space before a word included, and no n-gram with the space after the word. Example: "foo" creates " fo" as a 3-gram, but not "oo ". Either this is a bug, or, if intended, it needs documentation.
- Author:
- Nakatani Shuyo
- See Also:
- Serialized Form
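The n-gram padding behavior noted in the class description can be illustrated with a small standalone sketch. It does not use or reproduce the library's code; the class name NGramPaddingSketch, the method trigrams, and the padAfter flag are hypothetical names chosen for illustration. It only shows which 3-grams appear when a word is padded with a leading space alone, versus a leading and a trailing space.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the library's n-gram extraction code.
class NGramPaddingSketch {

    // Returns all 3-grams of the word, padded with a leading space and,
    // if padAfter is true, also with a trailing space.
    static List<String> trigrams(String word, boolean padAfter) {
        String padded = " " + word + (padAfter ? " " : "");
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + 3 <= padded.length(); i++) {
            grams.add(padded.substring(i, i + 3));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Leading space only, as described above: "oo " is never produced.
        System.out.println(trigrams("foo", false)); // [ fo, foo]
        // Leading and trailing space: "oo " is produced as well.
        System.out.println(trigrams("foo", true));  // [ fo, foo, oo ]
    }
}
```

With leading-only padding, word endings are never captured as n-grams, which is the gap the TODO above points out.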
-
-
Constructor Summary
Constructors
Constructor               Description
LangProfile()             Deprecated. Constructor for JSONIC
LangProfile(String name)  Deprecated. Normal constructor
-
Method Summary
Modifier and Type    Method                             Description
void                 add(@NotNull String gram)          Deprecated. Add n-gram to profile
Map<String,Integer>  getFreq()                          Deprecated.
String               getName()                          Deprecated.
int[]                getNWords()                        Deprecated.
void                 omitLessFreq()                     Deprecated. Removes n-grams that occur fewer times than MINIMUM_FREQ to get rid of rare n-grams.
void                 setFreq(Map<String,Integer> freq)  Deprecated.
void                 setName(String name)               Deprecated.
void                 setNWords(int[] nWords)            Deprecated.
-
-
-
Constructor Detail
-
LangProfile
public LangProfile()
Deprecated. Constructor for JSONIC
-
LangProfile
public LangProfile(String name)
Deprecated. Normal constructor
- Parameters:
name - language name
-
-
Method Detail
-
add
public void add(@NotNull String gram)
Deprecated. Add n-gram to profile
- Parameters:
gram
-
-
omitLessFreq
public void omitLessFreq()
Deprecated. Removes n-grams that occur fewer times than MINIMUM_FREQ to get rid of rare n-grams. Also removes ASCII n-grams if the total number of ASCII n-grams is less than one third of the total. This is done because non-Latin text (such as Chinese) often has some Latin noise in between.
TODO: split the two cleaning steps into separate methods.
TODO: distinguish ASCII/Latin; currently it looks for Latin only, but should include characters with diacritics, e.g. Vietnamese.
TODO: the current code counts ASCII but removes any Latin. Is that desired? If so, this needs documentation.
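The two cleaning steps described above can be sketched as follows. This is a minimal standalone illustration based only on the description on this page, not the library's implementation. The class name OmitLessFreqSketch, the MINIMUM_FREQ value of 2, and the exact way ASCII n-grams are counted and matched are all assumptions made for the example.

```java
import java.util.Map;

// Hypothetical illustration only; not the library's omitLessFreq() code.
class OmitLessFreqSketch {

    // Assumed value; the real MINIMUM_FREQ is not stated on this page.
    static final int MINIMUM_FREQ = 2;

    static void omitLessFreq(Map<String, Integer> freq) {
        // Step 1: drop n-grams that occur fewer times than MINIMUM_FREQ.
        freq.values().removeIf(count -> count < MINIMUM_FREQ);

        // Step 2: sum the occurrences of all remaining n-grams and of those
        // made up only of ASCII letters (an assumed reading of "ascii ngrams").
        long total = 0;
        long ascii = 0;
        for (Map.Entry<String, Integer> e : freq.entrySet()) {
            total += e.getValue();
            if (e.getKey().matches("[A-Za-z]+")) {
                ascii += e.getValue();
            }
        }

        // If ASCII n-grams make up less than one third of the total, treat
        // them as noise and remove every n-gram containing an ASCII letter.
        if (ascii * 3 < total) {
            freq.keySet().removeIf(gram -> gram.matches(".*[A-Za-z].*"));
        }
    }
}
```

For example, a map holding "中"=10, "国"=8, "a"=3, "b"=1 would first lose "b" (below the assumed threshold) and then "a" (ASCII share 3 of 21 is under one third), while a mostly ASCII map would keep its ASCII entries.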
-
getName
public String getName()
Deprecated.
-
setName
public void setName(String name)
Deprecated.
-
getNWords
public int[] getNWords()
Deprecated.
-
setNWords
public void setNWords(int[] nWords)
Deprecated.
-
-