Package com.optimaize.langdetect.text
Provides functionality for concatenating and cleaning text that is used
a) as learning text to produce com.optimaize.langdetect.LanguageProfile objects, and
b) as the text for which the language is to be guessed.
Author: Fabian Kessler
Interface Summary
Interface     Description
TextFilter    Filters out content of a text that should be ignored by the n-gram analysis.
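As an illustration of what a text filter of this kind does, here is a minimal, self-contained sketch. It does not use the library: SimpleTextFilter, the two regular expressions, and the chain helper are hypothetical stand-ins invented for this example, and the library's real TextFilter signature and URL matching rules may differ.

```java
import java.util.regex.Pattern;

public class TextFilterSketch {

    /** Hypothetical stand-in for a text filter; NOT the library's actual interface. */
    interface SimpleTextFilter {
        String filter(CharSequence text);
    }

    // Deliberately simplified patterns (assumptions for illustration only).
    private static final Pattern URL = Pattern.compile("(?:https?|ftp)://\\S+");
    private static final Pattern EMAIL = Pattern.compile("[\\w.+-]+@[\\w-]+(?:\\.[\\w-]+)+");

    /** Replaces URLs and email addresses with a single space each. */
    static final SimpleTextFilter URL_AND_EMAIL_FILTER = text -> {
        String withoutUrls = URL.matcher(text).replaceAll(" ");
        return EMAIL.matcher(withoutUrls).replaceAll(" ");
    };

    /** Collapses runs of whitespace and trims the ends. */
    static final SimpleTextFilter WHITESPACE_FILTER =
            text -> text.toString().replaceAll("\\s+", " ").trim();

    /** Groups several filters as one and runs them in the given order. */
    static SimpleTextFilter chain(SimpleTextFilter... filters) {
        return text -> {
            String current = text.toString();
            for (SimpleTextFilter f : filters) {
                current = f.filter(current);
            }
            return current;
        };
    }

    public static void main(String[] args) {
        SimpleTextFilter cleaner = chain(URL_AND_EMAIL_FILTER, WHITESPACE_FILTER);
        // prints "See or mail now"
        System.out.println(cleaner.filter("See http://example.com or mail bob@example.org now"));
    }
}
```

The order of the filters matters in this sketch: whitespace is collapsed last, so the gaps left by removed URLs and addresses do not survive into the text handed to the n-gram analysis.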
Class Summary
Class                              Description
CharNormalizerTextFilterImpl       Deprecated. Can't be used because it would be a big loss to not inline this code.
CommonTextObjectFactories          Contains some standard TextObjectFactory instances ready to use for common use cases.
MultiTextFilter                    Groups multiple TextFilter instances as one and runs them in the given order.
RemoveMinorityScriptsTextFilter    Removes text written in scripts that are not the dominant script of the text.
TextObject                         A convenient text object implementing CharSequence and Appendable.
TextObjectFactory                  Factory for TextObject instances.
TextObjectFactoryBuilder           Builder for TextObjectFactory.
UrlTextFilter                      Removes URLs and email addresses from the text.
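
The idea behind removing minority scripts can be sketched with the JDK alone. The class and method below are hypothetical and are not the library's implementation: this sketch simply keeps letters of the most frequent Unicode script and drops letters of every other script, whereas the library's RemoveMinorityScriptsTextFilter may apply different rules.

```java
import java.lang.Character.UnicodeScript;
import java.util.Collections;
import java.util.EnumMap;
import java.util.Map;

public class MinorityScriptSketch {

    /**
     * Hypothetical helper: keeps letters of the dominant script,
     * drops letters of all other scripts, and leaves non-letters untouched.
     */
    static String keepDominantScript(String text) {
        // Count letters per Unicode script.
        Map<UnicodeScript, Integer> counts = new EnumMap<>(UnicodeScript.class);
        text.codePoints()
            .filter(Character::isLetter)
            .forEach(cp -> counts.merge(UnicodeScript.of(cp), 1, Integer::sum));
        if (counts.isEmpty()) {
            return text; // no letters, nothing to decide
        }

        // The script with the highest letter count is treated as dominant.
        UnicodeScript dominant =
                Collections.max(counts.entrySet(), Map.Entry.comparingByValue()).getKey();

        // Rebuild the text without letters from minority scripts.
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (!Character.isLetter(cp) || UnicodeScript.of(cp) == dominant) {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // Latin letters outnumber Cyrillic ones, so the Cyrillic word is removed.
        System.out.println(keepDominantScript("Hello мир world"));
    }
}
```

Working on code points rather than chars keeps the sketch correct for characters outside the Basic Multilingual Plane.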