Package org.languagetool.tokenizers
Class SRXSentenceTokenizer
- java.lang.Object
-
- org.languagetool.tokenizers.SRXSentenceTokenizer
-
- All Implemented Interfaces:
SentenceTokenizer, Tokenizer
- Direct Known Subclasses:
SimpleSentenceTokenizer
public class SRXSentenceTokenizer extends Object implements SentenceTokenizer
Class to tokenize sentences using rules from an SRX file.
- Author:
- Marcin Miłkowski, Jarek Lipski
-
-
Constructor Summary
Constructor Description
SRXSentenceTokenizer(Language language)
Build a sentence tokenizer based on the rules in the segment.srx file that comes with LanguageTool.
SRXSentenceTokenizer(Language language, String srxInClassPath)
-
Method Summary
Modifier and Type Method Description
void
setSingleLineBreaksMarksParagraph(boolean lineBreakParagraphs)
boolean
singleLineBreaksMarksPara()
List<String>
tokenize(String text)
Tokenize the given string to sentences.
-
-
-
Constructor Detail
-
SRXSentenceTokenizer
public SRXSentenceTokenizer(Language language)
Build a sentence tokenizer based on the rules in the segment.srx file that comes with LanguageTool.
-
-
Method Detail
-
tokenize
public final List<String> tokenize(String text)
Description copied from interface: SentenceTokenizer
Tokenize the given string to sentences.
- Specified by:
tokenize
in interface SentenceTokenizer
- Specified by:
tokenize
in interface Tokenizer
-
singleLineBreaksMarksPara
public final boolean singleLineBreaksMarksPara()
- Specified by:
singleLineBreaksMarksPara
in interface SentenceTokenizer
-
setSingleLineBreaksMarksParagraph
public final void setSingleLineBreaksMarksParagraph(boolean lineBreakParagraphs)
- Specified by:
setSingleLineBreaksMarksParagraph
in interface SentenceTokenizer
- Parameters:
lineBreakParagraphs - if true, single line breaks are assumed to end a paragraph; if false, only two or more consecutive line breaks end a paragraph
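The two settings of the lineBreakParagraphs flag can be illustrated with a small standalone sketch. It does not use LanguageTool and does not reflect how SRXSentenceTokenizer is implemented; the class ParagraphBreakSketch and the method splitParagraphs are hypothetical names introduced only to show the paragraph-boundary rule described above, using plain java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of LanguageTool.
final class ParagraphBreakSketch {

    // Flag is true: any run of one or more line breaks ends a paragraph.
    private static final Pattern SINGLE_BREAK = Pattern.compile("(?:\\r?\\n)+");
    // Flag is false: only two or more consecutive line breaks end a paragraph.
    private static final Pattern MULTI_BREAK = Pattern.compile("(?:\\r?\\n){2,}");

    static List<String> splitParagraphs(String text, boolean lineBreakParagraphs) {
        Pattern separator = lineBreakParagraphs ? SINGLE_BREAK : MULTI_BREAK;
        List<String> paragraphs = new ArrayList<>();
        for (String part : separator.split(text)) {
            // Skip the empty piece produced when the text starts with a break.
            if (!part.isEmpty()) {
                paragraphs.add(part);
            }
        }
        return paragraphs;
    }

    public static void main(String[] args) {
        String text = "First line.\nSecond line.\n\nThird line.";
        // With true, every line becomes its own paragraph.
        System.out.println(splitParagraphs(text, true));
        // With false, the single break stays inside the first paragraph.
        System.out.println(splitParagraphs(text, false));
    }
}
```

In this sketch the sample text yields three paragraphs when the flag is true and two when it is false, because the single break between the first two lines only counts as a paragraph boundary in the first case.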
-
-