Class LanguageDetectorImpl

  • All Implemented Interfaces:
    LanguageDetector

    public final class LanguageDetectorImpl
    extends Object
    implements LanguageDetector

    This class is immutable and thus thread-safe.

    Author:
    Nakatani Shuyo, Fabian Kessler, Elmer Garduno
    • Method Detail

      • detect

        public com.google.common.base.Optional<LdLocale> detect(CharSequence text)
        Description copied from interface: LanguageDetector
        Returns the best detected language if the algorithm is very confident.

        Note: you may want to use getProbabilities() instead. This method is very strict and sometimes returns absent even though the first choice in getProbabilities() is correct.

        Specified by:
        detect in interface LanguageDetector
        Parameters:
        text - The text to analyze. You probably want to pass a TextObject.
        Returns:
        The language if confident, absent if unknown or not confident enough.
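        The note above suggests consulting getProbabilities() when detect returns absent. A minimal sketch of that fallback follows. It uses hypothetical stand-in types so that it compiles on its own: java.util.Optional instead of Guava's Optional, a plain String instead of LdLocale, and a Candidate class in place of DetectedLanguage. The minProbability threshold is an assumption chosen by the caller, not a library setting.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-ins so the sketch compiles on its own; these are NOT the library's types.
final class DetectFallbackSketch {

    /** Stand-in for DetectedLanguage: a locale code plus a probability. */
    static final class Candidate {
        final String locale;
        final double probability;

        Candidate(String locale, double probability) {
            this.locale = locale;
            this.probability = probability;
        }
    }

    /**
     * Prefers the strict result. If it is absent, accepts the top-ranked
     * candidate when it reaches minProbability (an assumed, caller-chosen
     * threshold). Expects the list sorted from better to worse.
     */
    static Optional<String> bestGuess(Optional<String> strict,
                                      List<Candidate> ranked,
                                      double minProbability) {
        if (strict.isPresent()) {
            return strict;
        }
        if (!ranked.isEmpty() && ranked.get(0).probability >= minProbability) {
            return Optional.of(ranked.get(0).locale);
        }
        return Optional.empty();
    }
}
```

        In real code, the strict argument would presumably come from detect(text) and the ranked list from getProbabilities(text), both called on the same input.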
      • getProbabilities

        public List<DetectedLanguage> getProbabilities(CharSequence text)
        Description copied from interface: LanguageDetector
        Returns all languages with at least some likelihood.

        A configurable cutoff is applied to languages with very low probability.

        As the algorithm currently works, this method may return, for example, 0.99 for Danish and less than 0.01 for Norwegian even though both are almost equally likely. Ideally, this would be improved in future versions.

        Specified by:
        getProbabilities in interface LanguageDetector
        Parameters:
        text - The text to analyze. You probably want to pass a TextObject.
        Returns:
        The detected languages, sorted from better to worse. May be empty: the list is empty if the program failed to detect any language, or if the input did not contain any usable text (just noise).
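
        Because the returned list is sorted and may be empty, and because the reported probabilities for closely related languages can be misleading, a caller may prefer to treat the result as a ranked set of candidates rather than rely on the margin between them. The sketch below shows one possible way to do that: it picks the highest-ranked candidate that the application supports. The Candidate class is a hypothetical stand-in for DetectedLanguage, and firstSupported is an illustrative helper, not part of the library.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical stand-ins so the sketch compiles on its own; these are NOT the library's types.
final class RankedResultSketch {

    /** Stand-in for DetectedLanguage: a locale code plus a probability. */
    static final class Candidate {
        final String locale;
        final double probability;

        Candidate(String locale, double probability) {
            this.locale = locale;
            this.probability = probability;
        }
    }

    /**
     * Walks the candidates in ranked order (better to worse) and returns the
     * first one whose locale the application supports. Returns empty when the
     * list is empty or contains no supported locale.
     */
    static Optional<String> firstSupported(List<Candidate> ranked, Set<String> supported) {
        for (Candidate candidate : ranked) {
            if (supported.contains(candidate.locale)) {
                return Optional.of(candidate.locale);
            }
        }
        return Optional.empty();
    }
}
```

        An empty result from this helper covers both documented cases: no language was detected, or the input contained only noise.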