Class MultiWordChunker

    • Constructor Detail

      • MultiWordChunker

        public MultiWordChunker(String filename)
        Parameters:
        filename - text file listing the multiwords and their tags
      • MultiWordChunker

        public MultiWordChunker(String filename,
                                boolean allowFirstCapitalized)
        Parameters:
        filename - text file listing the multiwords and their tags
        allowFirstCapitalized - if true, the first word of a multiword may be capitalized and still match
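A minimal sketch of what the allowFirstCapitalized flag could mean in practice. This is not the LanguageTool implementation: the class MultiWordLexiconSketch, its tagFor method, and the file layout (one multiword per line, a tab, then the tag, with # starting a comment) are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of the documented API. */
final class MultiWordLexiconSketch {
    private final Map<String, String> tagsByPhrase = new LinkedHashMap<>();
    private final boolean allowFirstCapitalized;

    /** Takes the file's lines directly so the sketch needs no file I/O. */
    MultiWordLexiconSketch(List<String> lines, boolean allowFirstCapitalized) {
        this.allowFirstCapitalized = allowFirstCapitalized;
        for (String line : lines) {
            String trimmed = line.trim();
            // Assumed layout: "multi word<TAB>TAG"; blank lines and # comments are skipped.
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            int tab = trimmed.indexOf('\t');
            if (tab < 0) {
                continue; // no tag on this line, ignore it
            }
            tagsByPhrase.put(trimmed.substring(0, tab).trim(),
                             trimmed.substring(tab + 1).trim());
        }
    }

    /** Returns the tag for the phrase, or null if the phrase is not listed. */
    String tagFor(String phrase) {
        String tag = tagsByPhrase.get(phrase);
        if (tag == null && allowFirstCapitalized && !phrase.isEmpty()
                && Character.isUpperCase(phrase.charAt(0))) {
            // Retry with the first letter lowercased, e.g. "A priori" -> "a priori".
            String lowered = Character.toLowerCase(phrase.charAt(0)) + phrase.substring(1);
            tag = tagsByPhrase.get(lowered);
        }
        return tag;
    }
}
```

With the flag off, only the exact spelling from the file matches; with it on, a sentence-initial capitalized form of the same entry matches as well.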
    • Method Detail

      • disambiguate

        public final AnalyzedSentence disambiguate(AnalyzedSentence input)
        Adds multiword POS tags to the sentence: an opening tag marks the start of a multiword and a closing tag marks its end, e.g., <ELLIPSIS> at the start of an ellipsis (...) and </ELLIPSIS> at its end.
        Parameters:
        input - The tokens to be chunked.
        Returns:
        AnalyzedSentence with additional markers.
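The start and end markers described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual disambiguate logic: tokens are plain strings with whitespace already removed, the multiword list is a simple map from phrase to tag, and MultiWordMarkerSketch and mark are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of the documented API. */
final class MultiWordMarkerSketch {

    /**
     * Returns one entry per token: "<TAG>" on the first token of a multiword,
     * "</TAG>" on its last token, and null everywhere else.
     */
    static List<String> mark(List<String> tokens, Map<String, String> tagsByPhrase) {
        List<String> markers = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            markers.add(null);
        }
        for (int start = 0; start < tokens.size(); start++) {
            StringBuilder phrase = new StringBuilder(tokens.get(start));
            // Grow the candidate phrase one token at a time and look it up.
            for (int end = start + 1; end < tokens.size(); end++) {
                phrase.append(' ').append(tokens.get(end));
                String tag = tagsByPhrase.get(phrase.toString());
                if (tag != null) {
                    // Simplification: a later match overwrites an earlier marker.
                    markers.set(start, "<" + tag + ">");
                    markers.set(end, "</" + tag + ">");
                }
            }
        }
        return markers;
    }
}
```

For the tokens He, argued, a, priori and a map containing "a priori" with tag RB, only the third and fourth tokens receive markers, <RB> and </RB> respectively.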