Class AnalyzedSentence


  • public final class AnalyzedSentence
    extends Object
    A sentence that has been tokenized and analyzed.
    Author:
    Daniel Naber
    • Method Detail

      • getTokensWithoutWhitespace

        public AnalyzedTokenReadings[] getTokensWithoutWhitespace()
        Returns the AnalyzedTokenReadings of the analyzed text, with whitespace tokens removed but with the artificial SENT_START token included.
      • getOriginalPosition

        public int getOriginalPosition(int nonWhPosition)
        Gets the position that a non-whitespace token has in the original sentence, where whitespace tokens are counted.
        Parameters:
        nonWhPosition - position of a non-whitespace token
        Returns:
        position in the original sentence.
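
        The relationship between getTokensWithoutWhitespace() and getOriginalPosition(int) can be illustrated with a small standalone sketch. It does not use LanguageTool classes: plain strings stand in for tokens, the literal "SENT_START" is only a placeholder for the artificial start token, and the class and method names (OriginalPositionSketch, withoutWhitespace, originalPosition) are hypothetical. The actual implementation may precompute the mapping rather than scan the list on each call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the LanguageTool API.
final class OriginalPositionSketch {

    // A token counts as whitespace if it is non-empty and consists only of whitespace.
    static boolean isWhitespace(String token) {
        return !token.isEmpty() && token.trim().isEmpty();
    }

    // Analogous in spirit to getTokensWithoutWhitespace(): drops whitespace tokens,
    // keeps the start marker.
    static List<String> withoutWhitespace(List<String> tokens) {
        List<String> result = new ArrayList<>();
        for (String token : tokens) {
            if (!isWhitespace(token)) {
                result.add(token);
            }
        }
        return result;
    }

    // Analogous in spirit to getOriginalPosition(int): maps an index in the
    // whitespace-free list back to the index in the full token list.
    static int originalPosition(List<String> tokens, int nonWhPosition) {
        int seen = -1;
        for (int i = 0; i < tokens.size(); i++) {
            if (!isWhitespace(tokens.get(i))) {
                seen++;
                if (seen == nonWhPosition) {
                    return i;
                }
            }
        }
        throw new IndexOutOfBoundsException("No non-whitespace token at " + nonWhPosition);
    }

    public static void main(String[] args) {
        // Assumed tokenization of "Hello world." with a placeholder start marker.
        List<String> tokens = Arrays.asList("SENT_START", "Hello", " ", "world", ".");
        System.out.println(withoutWhitespace(tokens));   // [SENT_START, Hello, world, .]
        System.out.println(originalPosition(tokens, 2)); // 3
    }
}
```

        In this sketch, index 2 in the whitespace-free list ("world") maps back to index 3 in the full list because one whitespace token precedes it.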
      • toShortString

        public String toShortString(String readingDelimiter)
        Returns a string representation without chunk information.
        Since:
        2.3
      • getText

        public String getText()
        Return the original text.
        Since:
        2.7
      • toString

        public String toString(String readingDelimiter)
        Returns a string representation with chunk information.
      • getAnnotations

        public String getAnnotations()
        Gets the log of disambiguator actions.
      • getTokenSet

        public Set<String> getTokenSet()
        Get the lowercase tokens of this sentence in a set. Used internally for performance optimization.
        Since:
        2.4
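
        One way a lowercase token set can serve as a performance optimization is as a cheap pre-filter: if a check requires certain words and the set lacks any of them, the more expensive token-by-token matching can be skipped. The sketch below is an assumption about that general technique, not a description of how LanguageTool uses getTokenSet() internally; TokenSetSketch, lowercaseTokenSet and mightMatch are hypothetical names, and plain strings stand in for tokens.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not part of the LanguageTool API.
final class TokenSetSketch {

    // Builds a set of lowercased tokens, comparable in spirit to getTokenSet().
    static Set<String> lowercaseTokenSet(List<String> tokens) {
        Set<String> set = new HashSet<>();
        for (String token : tokens) {
            set.add(token.toLowerCase(Locale.ROOT));
        }
        return set;
    }

    // Cheap pre-check: a check that needs all of requiredTokens cannot match
    // a sentence whose token set is missing any of them.
    static boolean mightMatch(Set<String> sentenceTokens, Set<String> requiredTokens) {
        return sentenceTokens.containsAll(requiredTokens);
    }

    public static void main(String[] args) {
        Set<String> sentence = lowercaseTokenSet(Arrays.asList("This", "is", "A", "test", "."));
        Set<String> needsTest = new HashSet<>(Arrays.asList("a", "test"));
        Set<String> needsExam = new HashSet<>(Arrays.asList("an", "exam"));
        System.out.println(mightMatch(sentence, needsTest)); // true
        System.out.println(mightMatch(sentence, needsExam)); // false
    }
}
```

        The set lookup only rules sentences out. A result of true still requires the full match, since the set carries no information about token order or position.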
      • getLemmaSet

        public Set<String> getLemmaSet()
        Get the lowercase lemmas of this sentence in a set. Used internally for performance optimization.
        Since:
        2.5
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class Object
      • hasParagraphEndMark

        public boolean hasParagraphEndMark(Language lang)
        Returns true if the sentence ends with a paragraph break.
        Since:
        4.3