Package org.languagetool
Class JLanguageTool
java.lang.Object
    org.languagetool.JLanguageTool
Direct Known Subclasses:
    MultiThreadedJLanguageTool
public class JLanguageTool extends Object
The main class used for checking text against different rules:
- built-in Java rules (for English: a vs. an, whitespace after commas, ...)
- built-in pattern rules loaded from external XML files (usually called grammar.xml)
- your own implementations of the abstract Rule classes, added with addRule(Rule)
You will probably want to use the subclass MultiThreadedJLanguageTool for best performance.
Thread-safety: this class is not thread-safe. Create one instance per thread, but create the language only once (e.g. new AmericanEnglish()) and use it for all instances of JLanguageTool.
- See Also:
MultiThreadedJLanguageTool
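The thread-safety advice above (one instance per thread, one shared language) can be sketched without any LanguageTool dependency. In this minimal illustration, SharedLanguage and Checker are hypothetical stand-ins for Language and JLanguageTool, and ThreadLocal is only one possible way to obtain a per-thread instance.

```java
/** Hypothetical stand-in for an expensive object that is created once, such as a Language. */
final class SharedLanguage {
    final String name;

    SharedLanguage(String name) {
        this.name = name;
    }
}

/** Hypothetical stand-in for a checker that is not thread-safe, such as JLanguageTool. */
final class Checker {
    final SharedLanguage language;

    Checker(SharedLanguage language) {
        this.language = language;
    }
}

final class PerThreadCheckers {
    // Created only once and shared by all threads.
    static final SharedLanguage LANGUAGE = new SharedLanguage("en-US");

    // Each thread lazily gets its own Checker, built around the shared language.
    static final ThreadLocal<Checker> CHECKER =
            ThreadLocal.withInitial(() -> new Checker(LANGUAGE));

    /** Returns the checker owned by the calling thread. */
    static Checker current() {
        return CHECKER.get();
    }

    /** Calls current() on a fresh thread and returns the checker that thread received. */
    static Checker currentOnNewThread() {
        Checker[] seen = new Checker[1];
        Thread worker = new Thread(() -> seen[0] = current());
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

With real LanguageTool types, the same shape would hold under the assumptions stated in the class description: the Language object is shared, while each thread constructs and keeps its own JLanguageTool.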
-
-
Nested Class Summary
Nested Classes
Modifier and Type, Class
    Description
static class JLanguageTool.Mode
static class JLanguageTool.ParagraphHandling
    Constants for correct paragraph-rule handling.
-
Field Summary
Fields
Modifier and Type, Field
    Description
static @Nullable String BUILD_DATE
    LanguageTool build date and time like 2013-10-17 16:10, or null if not run from JAR.
static String DICTIONARY_FILENAME_EXTENSION
    Extension of dictionary files read by Spellers.
static String FALSE_FRIEND_FILE
    The name of the file with false friend information.
static @Nullable String GIT_SHORT_ID
    Abbreviated git id, or null if not available.
static String MESSAGE_BUNDLE
    Name of the message bundle for translations.
static String PARAGRAPH_END_TAGNAME
    The internal tag used to mark the end of a paragraph.
static String PATTERN_FILE
    The name of the file with error patterns.
static String SENTENCE_END_TAGNAME
    The internal tag used to mark the end of a sentence.
static String SENTENCE_START_TAGNAME
    The internal tag used to mark the beginning of a sentence.
static String VERSION
    LanguageTool version as a string like 2.3 or 2.4-SNAPSHOT.
-
Constructor Summary
Constructors
Constructor
    Description
JLanguageTool(Language language)
    Create a JLanguageTool and set up the built-in Java rules for the given language.
JLanguageTool(Language language, List<Language> altLanguages, Language motherTongue, ResultCache cache, GlobalConfig globalConfig, UserConfig userConfig)
    Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
JLanguageTool(Language lang, Language motherTongue)
    Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
JLanguageTool(Language language, Language motherTongue, ResultCache cache)
    Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
JLanguageTool(Language language, Language motherTongue, ResultCache cache, UserConfig userConfig)
    Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
JLanguageTool(Language language, ResultCache cache, UserConfig userConfig)
    Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
-
Method Summary
All Methods
Modifier and Type, Method
    Description
void activateLanguageModelRules(File indexDir)
    Activate rules that depend on a language model.
void activateNeuralNetworkRules(File modelDir)
    Activate rules that depend on pretrained neural network models.
void activateWord2VecModelRules(File indexDir)
    Activate rules that depend on a word2vec language model.
void addMatchFilter(@NotNull RuleMatchFilter filter)
    Add a RuleMatchFilter for post-processing of rule matches. Filters are called sequentially in the same order as added.
void addRule(Rule rule)
    Add a rule to be used by the next call to the check methods like check(String).
static void addTemporaryFile(File file)
    Adds a temporary file to the internal list (internal method, you should never need to call this as a user of LanguageTool).
RuleMatch adjustRuleMatchPos(RuleMatch match, int charCount, int columnCount, int lineCount, String sentence, AnnotatedText annotatedText)
    Change RuleMatch positions so they are relative to the complete text, not just to the sentence.
protected List<AnalyzedSentence> analyzeSentences(List<String> sentences)
List<AnalyzedSentence> analyzeText(String text)
    Use this method if you want to access LanguageTool's otherwise internal analysis of the text.
protected List<RuleMatch> applyCustomFilters(List<RuleMatch> matches, AnnotatedText text)
    Should be called just once with the complete list of matches, before returning them to the caller.
List<RuleMatch> check(String text)
    The main check method.
List<RuleMatch> check(String text, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode)
List<RuleMatch> check(String text, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode, RuleMatchListener listener)
List<RuleMatch> check(String text, RuleMatchListener listener)
    The main check method.
List<RuleMatch> check(AnnotatedText text)
    The main check method.
List<RuleMatch> check(AnnotatedText annotatedText, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode)
    The main check method.
List<RuleMatch> check(AnnotatedText annotatedText, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode, RuleMatchListener listener)
    The main check method.
List<RuleMatch> check(AnnotatedText annotatedText, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode, RuleMatchListener listener, JLanguageTool.Mode mode)
    The main check method.
List<RuleMatch> check(AnnotatedText text, RuleMatchListener listener)
List<RuleMatch> checkAnalyzedSentence(JLanguageTool.ParagraphHandling paraMode, List<Rule> rules, AnalyzedSentence analyzedSentence)
    This is an internal method that's public only for technical reasons. Please use one of the check(String) methods instead.
void disableCategory(CategoryId id)
    Disable the given rule category so the check methods like check(String) won't use it.
void disableRule(String ruleId)
    Disable a given rule so the check methods like check(String) won't use it.
void disableRules(List<String> ruleIds)
    Disable the given rules so the check methods like check(String) won't use them.
void enableRule(String ruleId)
    Enable a given rule so the check methods like check(String) will use it.
void enableRuleCategory(CategoryId id)
    Enable all rules of the given category so the check methods like check(String) will use them.
List<Rule> getAllActiveOfficeRules()
    Works like getAllActiveRules but overrides defaults by office defaults.
List<Rule> getAllActiveRules()
    Get all active (not disabled) rules for the current language that are built-in or that have been added using e.g. addRule(Rule).
List<Rule> getAllRules()
    Get all rules for the current language that are built-in or that have been added using addRule(Rule).
AnalyzedSentence getAnalyzedSentence(String sentence)
    Tokenizes the given sentence into words and analyzes it, and then disambiguates POS tags.
Map<CategoryId,Category> getCategories()
    Get all rule categories for the current language.
static ResourceDataBroker getDataBroker()
    The grammar checker needs resources from the following directories: /resource and /rules.
Set<String> getDisabledRules()
    Get rule ids of the rules that have been explicitly disabled.
Language getLanguage()
    Get the language that was used to configure this instance.
static ResourceBundle getMessageBundle()
    Gets the ResourceBundle (i18n strings) for the default language of the user's system.
static ResourceBundle getMessageBundle(Language lang)
    Gets the ResourceBundle (i18n strings) for the given user interface language.
List<AbstractPatternRule> getPatternRulesByIdAndSubId(String id, String subId)
    Get pattern rules by Id and SubId.
AnalyzedSentence getRawAnalyzedSentence(String sentence)
    Tokenizes the given sentence into words and analyzes it.
List<String> getUnknownWords()
    Get the alphabetically sorted list of unknown words in the latest run of one of the check(String) methods.
boolean isCategoryDisabled(CategoryId id)
    Returns true if a category is explicitly disabled.
static boolean isPremiumVersion()
List<AbstractPatternRule> loadFalseFriendRules(String filename)
    Load false friend rules from an XML file.
List<AbstractPatternRule> loadPatternRules(String filename)
    Load pattern rules from an XML file.
protected List<RuleMatch> performCheck(List<AnalyzedSentence> analyzedSentences, List<String> sentences, List<Rule> allRules, JLanguageTool.ParagraphHandling paraMode, AnnotatedText annotatedText, JLanguageTool.Mode mode)
protected List<RuleMatch> performCheck(List<AnalyzedSentence> analyzedSentences, List<String> sentences, List<Rule> allRules, JLanguageTool.ParagraphHandling paraMode, AnnotatedText annotatedText, RuleMatchListener listener, JLanguageTool.Mode mode)
protected void printIfVerbose(String s)
protected void printSentenceInfo(AnalyzedSentence analyzedSentence)
protected void rememberUnknownWords(AnalyzedSentence analyzedText)
static void removeTemporaryFiles()
    Clean up all temporary files, if there are any.
List<String> sentenceTokenize(String text)
    Tokenizes the given text into sentences.
void setCleanOverlappingMatches(boolean cleanOverlappingMatches)
    Whether the check(String) methods remove overlapping errors.
void setConfigValues(Map<String,Integer> v)
static void setDataBroker(ResourceDataBroker broker)
    The grammar checker needs resources from the following directories: /resource and /rules.
void setListUnknownWords(boolean listUnknownWords)
    Whether the check(String) methods store unknown words.
void setMaxErrorsPerWordRate(float maxErrorsPerWordRate)
    Maximum errors per word rate; checking will stop with an exception if the rate is higher.
void setOutput(PrintStream printStream)
    Set a PrintStream that will receive verbose output.
-
-
-
Field Detail
-
VERSION
public static final String VERSION
LanguageTool version as a string like 2.3 or 2.4-SNAPSHOT.
- See Also:
- Constant Field Values
-
BUILD_DATE
@Nullable public static final String BUILD_DATE
LanguageTool build date and time like 2013-10-17 16:10, or null if not run from JAR.
-
GIT_SHORT_ID
@Nullable public static final String GIT_SHORT_ID
Abbreviated git id, or null if not available.
- Since:
- 4.5
-
PATTERN_FILE
public static final String PATTERN_FILE
The name of the file with error patterns.
- See Also:
- Constant Field Values
-
FALSE_FRIEND_FILE
public static final String FALSE_FRIEND_FILE
The name of the file with false friend information.
- See Also:
- Constant Field Values
-
SENTENCE_START_TAGNAME
public static final String SENTENCE_START_TAGNAME
The internal tag used to mark the beginning of a sentence.
- See Also:
- Constant Field Values
-
SENTENCE_END_TAGNAME
public static final String SENTENCE_END_TAGNAME
The internal tag used to mark the end of a sentence.
- See Also:
- Constant Field Values
-
PARAGRAPH_END_TAGNAME
public static final String PARAGRAPH_END_TAGNAME
The internal tag used to mark the end of a paragraph.
- See Also:
- Constant Field Values
-
MESSAGE_BUNDLE
public static final String MESSAGE_BUNDLE
Name of the message bundle for translations.
- See Also:
- Constant Field Values
-
DICTIONARY_FILENAME_EXTENSION
public static final String DICTIONARY_FILENAME_EXTENSION
Extension of dictionary files read by Spellers.
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
JLanguageTool
public JLanguageTool(Language lang, Language motherTongue)
Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
- Parameters:
lang - the language of the text to be checked
motherTongue - the user's mother tongue, used for false friend rules, or null. The mother tongue may also be used as a source language for checking bilingual texts.
-
JLanguageTool
public JLanguageTool(Language language)
Create a JLanguageTool and set up the built-in Java rules for the given language.
- Parameters:
language - the language of the text to be checked
-
JLanguageTool
public JLanguageTool(Language language, Language motherTongue, ResultCache cache)
Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
- Parameters:
language - the language of the text to be checked
motherTongue - the user's mother tongue, used for false friend rules, or null. The mother tongue may also be used as a source language for checking bilingual texts.
cache - a cache to speed up checking if the same sentences get checked more than once, e.g. when LT is running as a server and texts are re-checked due to changes
- Since:
- 3.7
-
JLanguageTool
@Experimental public JLanguageTool(Language language, ResultCache cache, UserConfig userConfig)
Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
- Parameters:
language - the language of the text to be checked
cache - a cache to speed up checking if the same sentences get checked more than once, e.g. when LT is running as a server and texts are re-checked due to changes. Use null to deactivate the cache.
- Since:
- 4.2
-
JLanguageTool
@Experimental public JLanguageTool(Language language, List<Language> altLanguages, Language motherTongue, ResultCache cache, GlobalConfig globalConfig, UserConfig userConfig)
Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
- Parameters:
language - the language of the text to be checked
altLanguages - The languages that are accepted as alternative languages. Currently this means words are accepted if they are in an alternative language and not similar to a word from language. If there's a similar word in language, there will be an error of type RuleMatch.Type.Hint (EXPERIMENTAL).
motherTongue - the user's mother tongue, used for false friend rules, or null. The mother tongue may also be used as a source language for checking bilingual texts.
cache - a cache to speed up checking if the same sentences get checked more than once, e.g. when LT is running as a server and texts are re-checked due to changes
- Since:
- 4.3
-
JLanguageTool
@Experimental public JLanguageTool(Language language, Language motherTongue, ResultCache cache, UserConfig userConfig)
Create a JLanguageTool and set up the built-in rules for the given language and false friend rules for the text language / mother tongue pair.
- Parameters:
language - the language of the text to be checked
motherTongue - the user's mother tongue, used for false friend rules, or null. The mother tongue may also be used as a source language for checking bilingual texts.
cache - a cache to speed up checking if the same sentences get checked more than once, e.g. when LT is running as a server and texts are re-checked due to changes
- Since:
- 4.2
-
-
Method Detail
-
isPremiumVersion
public static boolean isPremiumVersion()
- Since:
- 4.2
-
getDataBroker
public static ResourceDataBroker getDataBroker()
The grammar checker needs resources from the following directories:
/resource
/rules
- Returns:
- The currently set data broker, which provides access to resources in the directories listed above. If no data broker was set, a new DefaultResourceDataBroker will be instantiated and returned.
- Since:
- 1.0.1
-
setDataBroker
public static void setDataBroker(ResourceDataBroker broker)
The grammar checker needs resources from the following directories:
/resource
/rules
- Parameters:
broker - The new resource broker to be used.
- Since:
- 1.0.1
-
setListUnknownWords
public void setListUnknownWords(boolean listUnknownWords)
Whether the check(String) methods store unknown words. If set to true (default: false), you can get the list of unknown words using getUnknownWords().
-
setCleanOverlappingMatches
public void setCleanOverlappingMatches(boolean cleanOverlappingMatches)
Whether the check(String) methods remove overlapping errors. If set to true (default: true), overlapping errors are removed according to the priorities established for the language.
- Since:
- 3.6
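To illustrate what removing overlapping errors by priority can mean, here is a minimal sketch. It is not LanguageTool's actual algorithm: Span is a hypothetical stand-in for a rule match, and the tie-breaking rule (the earlier span wins) is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a rule match: a character span plus a priority. */
final class Span {
    final int from;      // inclusive start offset
    final int to;        // exclusive end offset
    final int priority;  // higher priority wins when two spans overlap

    Span(int from, int to, int priority) {
        this.from = from;
        this.to = to;
        this.priority = priority;
    }
}

final class OverlapSketch {
    /** Returns the spans sorted by start offset, keeping only the higher-priority span of each overlap. */
    static List<Span> removeOverlaps(List<Span> spans) {
        List<Span> sorted = new ArrayList<>(spans);
        sorted.sort(Comparator.comparingInt((Span s) -> s.from));
        List<Span> kept = new ArrayList<>();
        for (Span candidate : sorted) {
            int lastIndex = kept.size() - 1;
            if (lastIndex >= 0 && candidate.from < kept.get(lastIndex).to) {
                // Overlap with the last kept span: the higher priority stays (ties keep the earlier span).
                if (candidate.priority > kept.get(lastIndex).priority) {
                    kept.set(lastIndex, candidate);
                }
            } else {
                kept.add(candidate);
            }
        }
        return kept;
    }
}
```

For spans (0, 5, priority 1), (3, 8, priority 2) and (10, 12, priority 0), this sketch keeps the second and third: the first overlaps the second and has the lower priority.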
-
setMaxErrorsPerWordRate
@Experimental public void setMaxErrorsPerWordRate(float maxErrorsPerWordRate)
Maximum errors per word rate; checking will stop with an exception if the rate is higher. For example, with a rate of 0.33, checking would stop if the user's text has so many errors that more than every 3rd word causes a rule match. Note that this may not apply to very short texts.
- Since:
- 4.0
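The arithmetic behind the 0.33 example can be sketched as a simple ratio test. This shows only the comparison described above, not LanguageTool's implementation; the class and method names are hypothetical, and the real check may treat short texts differently.

```java
/** Illustrative arithmetic only; names are hypothetical and not part of LanguageTool. */
final class ErrorRateSketch {
    /** Returns true if the number of matches per word is higher than the allowed rate. */
    static boolean exceedsRate(int matchCount, int wordCount, float maxErrorsPerWordRate) {
        if (wordCount <= 0) {
            return false; // no words, so there is no rate to compare
        }
        return (float) matchCount / wordCount > maxErrorsPerWordRate;
    }
}
```

With a rate of 0.33, four matches in ten words (0.4) would exceed the limit, while three matches in ten words (0.3) would not.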
-
getMessageBundle
public static ResourceBundle getMessageBundle()
Gets the ResourceBundle (i18n strings) for the default language of the user's system.
-
getMessageBundle
public static ResourceBundle getMessageBundle(Language lang)
Gets the ResourceBundle (i18n strings) for the given user interface language.
- Since:
- 2.4 (public since 2.4)
-
setOutput
public void setOutput(PrintStream printStream)
Set a PrintStream that will receive verbose output. Set to null (which is the default) to disable verbose output.
-
loadPatternRules
public List<AbstractPatternRule> loadPatternRules(String filename) throws IOException
Load pattern rules from an XML file. Use addRule(Rule) to add these rules to the checking process.
- Parameters:
filename - path to an XML file in the classpath or in the filesystem - the classpath is checked first
- Returns:
- a List of PatternRule objects
- Throws:
IOException
-
loadFalseFriendRules
public List<AbstractPatternRule> loadFalseFriendRules(String filename) throws ParserConfigurationException, SAXException, IOException
Load false friend rules from an XML file. Only those pairs will be loaded that match the current text language and the mother tongue specified in the JLanguageTool constructor. Use addRule(Rule) to add these rules to the checking process.
- Parameters:
filename - path to an XML file in the classpath or in the filesystem - the classpath is checked first
- Returns:
- a List of PatternRule objects, or an empty list if mother tongue is not set
- Throws:
ParserConfigurationException
SAXException
IOException
-
activateNeuralNetworkRules
public void activateNeuralNetworkRules(File modelDir) throws IOException
Activate rules that depend on pretrained neural network models.
- Parameters:
modelDir - root dir of exported models
- Throws:
IOException
- Since:
- 4.4
-
activateLanguageModelRules
public void activateLanguageModelRules(File indexDir) throws IOException
Activate rules that depend on a language model. The language model currently consists of Lucene indexes with ngram occurrence counts.
- Parameters:
indexDir - directory with a '3grams' sub directory which contains a Lucene index with 3gram occurrence counts
- Throws:
IOException
- Since:
- 2.7
-
activateWord2VecModelRules
public void activateWord2VecModelRules(File indexDir) throws IOException
Activate rules that depend on a word2vec language model.
- Parameters:
indexDir - directory with subdirectories like 'en', each containing dictionary.txt and final_embeddings.txt
- Throws:
IOException
- Since:
- 4.0
-
addMatchFilter
public void addMatchFilter(@NotNull RuleMatchFilter filter)
Add a RuleMatchFilter for post-processing of rule matches. Filters are called sequentially in the same order as added.
- Parameters:
filter - filter to add
- Since:
- 4.7
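The ordering guarantee (filters run sequentially, in the order they were added) can be sketched as a plain list of functions. This is an illustration of the ordering only: FilterChainSketch is a hypothetical class, and strings stand in for RuleMatch objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical filter chain; strings stand in for rule matches. */
final class FilterChainSketch {
    private final List<UnaryOperator<List<String>>> filters = new ArrayList<>();

    /** Registers a filter; filters run in registration order. */
    void addFilter(UnaryOperator<List<String>> filter) {
        filters.add(filter);
    }

    /** Passes the matches through every filter, each one seeing the previous filter's output. */
    List<String> apply(List<String> matches) {
        List<String> result = matches;
        for (UnaryOperator<List<String>> filter : filters) {
            result = filter.apply(result);
        }
        return result;
    }
}
```

Because each filter receives the output of the one before it, the order of the add calls can change the final list of matches.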
-
addRule
public void addRule(Rule rule)
Add a rule to be used by the next call to the check methods like check(String).
-
disableRule
public void disableRule(String ruleId)
Disable a given rule so the check methods like check(String) won't use it.
- Parameters:
ruleId - the id of the rule to disable - no error will be thrown if the id does not exist
- See Also:
enableRule(String)
-
disableRules
public void disableRules(List<String> ruleIds)
Disable the given rules so the check methods like check(String) won't use them.
- Parameters:
ruleIds - the ids of the rules to disable - no error will be thrown if an id does not exist
- Since:
- 2.4
-
disableCategory
public void disableCategory(CategoryId id)
Disable the given rule category so the check methods like check(String) won't use it.
- Parameters:
id - the id of the category to disable - no error will be thrown if the id does not exist
- Since:
- 3.3
- See Also:
enableRuleCategory(CategoryId)
-
isCategoryDisabled
public boolean isCategoryDisabled(CategoryId id)
Returns true if a category is explicitly disabled.
- Parameters:
id - the id of the category to check - no error will be thrown if the id does not exist
- Returns:
- true if this category is explicitly disabled.
- Since:
- 3.5
- See Also:
disableCategory(org.languagetool.rules.CategoryId)
-
getLanguage
public Language getLanguage()
Get the language that was used to configure this instance.
-
getDisabledRules
public Set<String> getDisabledRules()
Get rule ids of the rules that have been explicitly disabled.
-
enableRule
public void enableRule(String ruleId)
Enable a given rule so the check methods like check(String) will use it. This will not throw an exception if the given rule id doesn't exist.
- Parameters:
ruleId - the id of the rule to enable
- See Also:
disableRule(String)
-
enableRuleCategory
public void enableRuleCategory(CategoryId id)
Enable all rules of the given category so the check methods like check(String) will use them. This will not throw an exception if the given category id doesn't exist.
- Since:
- 3.3
- See Also:
disableCategory(org.languagetool.rules.CategoryId)
-
sentenceTokenize
public List<String> sentenceTokenize(String text)
Tokenizes the given text into sentences.
-
check
public List<RuleMatch> check(String text) throws IOException
The main check method. Tokenizes the text into sentences and matches these sentences against all currently active rules.
- Parameters:
text - the text to be checked
- Returns:
- a List of RuleMatch objects
- Throws:
IOException
-
check
public List<RuleMatch> check(String text, RuleMatchListener listener) throws IOException
The main check method. Tokenizes the text into sentences and matches these sentences against all currently active rules.
- Parameters:
text - the text to be checked
- Returns:
- a List of RuleMatch objects
- Throws:
IOException
- Since:
- 3.7
-
check
public List<RuleMatch> check(String text, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode) throws IOException
- Throws:
IOException
-
check
public List<RuleMatch> check(String text, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode, RuleMatchListener listener) throws IOException
- Throws:
IOException
- Since:
- 3.7
-
check
public List<RuleMatch> check(AnnotatedText text) throws IOException
The main check method. Tokenizes the text into sentences and matches these sentences against all currently active rules, adjusting error positions so they refer to the original text including markup.
- Throws:
IOException
- Since:
- 2.3
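The idea of adjusting error positions so they refer to the original text including markup can be sketched as a position map that is built while text and markup are appended. This is a simplified illustration, not LanguageTool's AnnotatedText implementation: MarkupMappingSketch and all of its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: records, for each plain-text character, its position in the original markup text. */
final class MarkupMappingSketch {
    private final StringBuilder plainText = new StringBuilder();
    private final List<Integer> originalPositions = new ArrayList<>();
    private int originalLength = 0;

    /** Appends text that should be checked and remembers where each character sits in the original. */
    MarkupMappingSketch addText(String text) {
        for (int i = 0; i < text.length(); i++) {
            originalPositions.add(originalLength + i);
        }
        plainText.append(text);
        originalLength += text.length();
        return this;
    }

    /** Appends markup that is skipped by the checker but still counts towards original positions. */
    MarkupMappingSketch addMarkup(String markup) {
        originalLength += markup.length();
        return this;
    }

    /** The text without markup, i.e. what a checker would actually analyze. */
    String getPlainText() {
        return plainText.toString();
    }

    /** Maps a position in the plain text back to the position in the original text. */
    int toOriginalPosition(int plainPosition) {
        return originalPositions.get(plainPosition);
    }
}
```

For the original text `<b>Hello</b> wrold`, the checker would see `Hello wrold`. An error starting at plain position 6 maps back to position 13 in the original.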
-
check
public List<RuleMatch> check(AnnotatedText text, RuleMatchListener listener) throws IOException
- Throws:
IOException
- Since:
- 3.9
-
check
public List<RuleMatch> check(AnnotatedText annotatedText, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode) throws IOException
The main check method. Tokenizes the text into sentences and matches these sentences against all currently active rules.
- Parameters:
annotatedText - The text to be checked, created with AnnotatedTextBuilder. Call this method with the complete text to be checked. If you call it repeatedly with smaller chunks like paragraphs or sentences, those rules that work across paragraphs/sentences won't work (their status gets reset whenever this method is called).
tokenizeText - If true, then the text is tokenized into sentences. Otherwise, it is assumed it's already tokenized, i.e. it is only one sentence.
paraMode - Controls whether paragraph-level rules are applied (see JLanguageTool.ParagraphHandling).
- Returns:
- a List of RuleMatch objects, describing potential errors in the text
- Throws:
IOException
- Since:
- 2.3
-
check
public List<RuleMatch> check(AnnotatedText annotatedText, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode, RuleMatchListener listener) throws IOException
The main check method. Tokenizes the text into sentences and matches these sentences against all currently active rules.
- Throws:
IOException
- Since:
- 3.7
-
check
public List<RuleMatch> check(AnnotatedText annotatedText, boolean tokenizeText, JLanguageTool.ParagraphHandling paraMode, RuleMatchListener listener, JLanguageTool.Mode mode) throws IOException
The main check method. Tokenizes the text into sentences and matches these sentences against all currently active rules depending on mode.
- Throws:
IOException
- Since:
- 4.3
-
analyzeText
public List<AnalyzedSentence> analyzeText(String text) throws IOException
Use this method if you want to access LanguageTool's otherwise internal analysis of the text. For actual text checking, use the check... methods instead.
- Parameters:
text - The text to be analyzed
- Throws:
IOException
- Since:
- 2.5
-
analyzeSentences
protected List<AnalyzedSentence> analyzeSentences(List<String> sentences) throws IOException
- Throws:
IOException
-
printSentenceInfo
protected void printSentenceInfo(AnalyzedSentence analyzedSentence)
-
performCheck
protected List<RuleMatch> performCheck(List<AnalyzedSentence> analyzedSentences, List<String> sentences, List<Rule> allRules, JLanguageTool.ParagraphHandling paraMode, AnnotatedText annotatedText, JLanguageTool.Mode mode) throws IOException
- Throws:
IOException
-
performCheck
protected List<RuleMatch> performCheck(List<AnalyzedSentence> analyzedSentences, List<String> sentences, List<Rule> allRules, JLanguageTool.ParagraphHandling paraMode, AnnotatedText annotatedText, RuleMatchListener listener, JLanguageTool.Mode mode) throws IOException
- Throws:
IOException
- Since:
- 3.7
-
checkAnalyzedSentence
public List<RuleMatch> checkAnalyzedSentence(JLanguageTool.ParagraphHandling paraMode, List<Rule> rules, AnalyzedSentence analyzedSentence) throws IOException
This is an internal method that's public only for technical reasons. Please use one of the check(String) methods instead.
- Throws:
IOException
- Since:
- 2.3
-
adjustRuleMatchPos
public RuleMatch adjustRuleMatchPos(RuleMatch match, int charCount, int columnCount, int lineCount, String sentence, AnnotatedText annotatedText)
Change RuleMatch positions so they are relative to the complete text, not just to the sentence.
- Parameters:
charCount - Count of characters in the sentences before
columnCount - Current column number
lineCount - Current line number
sentence - The text being checked
- Returns:
- The RuleMatch object with adjustments
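The character-offset part of this adjustment can be sketched as adding the length of all preceding sentences to a sentence-relative position. The sketch is a simplification under stated assumptions: it ignores line and column numbers and any markup handled through AnnotatedText, and SentenceOffsetSketch is a hypothetical class.

```java
import java.util.List;

/** Hypothetical sketch of turning sentence-relative offsets into text-relative offsets. */
final class SentenceOffsetSketch {
    /** Counts the characters in all sentences that come before the sentence at sentenceIndex. */
    static int charCountBefore(List<String> sentences, int sentenceIndex) {
        int charCount = 0;
        for (int i = 0; i < sentenceIndex; i++) {
            charCount += sentences.get(i).length();
        }
        return charCount;
    }

    /** Shifts a position inside one sentence so that it is relative to the complete text. */
    static int toTextPosition(int charCount, int positionInSentence) {
        return charCount + positionInSentence;
    }
}
```

For the sentences `First one. ` and `Second oen.`, the misspelled word starts at position 7 of the second sentence. With 11 characters before that sentence, its position in the complete text is 18.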
-
rememberUnknownWords
protected void rememberUnknownWords(AnalyzedSentence analyzedText)
-
getUnknownWords
public List<String> getUnknownWords()
Get the alphabetically sorted list of unknown words in the latest run of one of the check(String) methods.
- Throws:
IllegalStateException - if setListUnknownWords(boolean) has been set to false
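The contract described above (collect words only when enabled, return them sorted, fail if collection was never enabled) can be sketched as follows. UnknownWordsSketch is a hypothetical class, not LanguageTool code, and its sorting uses Java's natural String order, which is an assumption about what "alphabetically" means here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of the unknown-words contract. */
final class UnknownWordsSketch {
    private boolean listUnknownWords = false;                  // mirrors the documented default
    private final Set<String> unknownWords = new TreeSet<>();  // sorted, no duplicates

    void setListUnknownWords(boolean listUnknownWords) {
        this.listUnknownWords = listUnknownWords;
    }

    /** Records a word, but only while collection is enabled. */
    void remember(String word) {
        if (listUnknownWords) {
            unknownWords.add(word);
        }
    }

    /** Returns the sorted words, or fails if collection is not enabled. */
    List<String> getUnknownWords() {
        if (!listUnknownWords) {
            throw new IllegalStateException("listUnknownWords is set to false, unknown words not stored");
        }
        return new ArrayList<>(unknownWords);
    }
}
```

A caller would enable collection before checking and read the list afterwards; calling getUnknownWords() without enabling collection raises IllegalStateException in this sketch.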
-
getAnalyzedSentence
public AnalyzedSentence getAnalyzedSentence(String sentence) throws IOException
Tokenizes the given sentence into words and analyzes it, and then disambiguates POS tags.
- Parameters:
sentence - sentence to be analyzed
- Throws:
IOException
-
getRawAnalyzedSentence
public AnalyzedSentence getRawAnalyzedSentence(String sentence) throws IOException
Tokenizes the given sentence into words and analyzes it. This is the same as getAnalyzedSentence(String) but it does not run the disambiguator.
- Parameters:
sentence - sentence to be analyzed
- Throws:
IOException
- Since:
- 0.9.8
-
getCategories
public Map<CategoryId,Category> getCategories()
Get all rule categories for the current language.
- Returns:
- a map of Categories, keyed by their id.
- Since:
- 3.5
-
getAllRules
public List<Rule> getAllRules()
Get all rules for the current language that are built-in or that have been added using addRule(Rule). Please note that XML rules that are grouped will appear as multiple rules with the same id. To tell them apart, check if they are of type AbstractPatternRule, cast them to that type and call their AbstractPatternRule.getSubId() method.
- Returns:
- a List of Rule
objects
-
getAllActiveRules
public List<Rule> getAllActiveRules()
Get all active (not disabled) rules for the current language that are built-in or that have been added using e.g. addRule(Rule). See getAllRules() for hints about rule ids.
- Returns:
- a List of Rule
objects
-
getAllActiveOfficeRules
public List<Rule> getAllActiveOfficeRules()
Works like getAllActiveRules but overrides defaults by office defaults.
- Returns:
- a List of Rule objects
- Since:
- 4.0
-
getPatternRulesByIdAndSubId
public List<AbstractPatternRule> getPatternRulesByIdAndSubId(String id, String subId)
Get pattern rules by Id and SubId. This returns a list because rules that use <or>...</or> are internally expanded into several rules.
- Returns:
- a List of Rule objects
- Since:
objects - Since:
- 2.3
-
printIfVerbose
protected void printIfVerbose(String s)
-
addTemporaryFile
public static void addTemporaryFile(File file)
Adds a temporary file to the internal list (internal method, you should never need to call this as a user of LanguageTool).
- Parameters:
file
- the file to be added.
-
removeTemporaryFiles
public static void removeTemporaryFiles()
Clean up all temporary files, if there are any.
-
applyCustomFilters
protected List<RuleMatch> applyCustomFilters(List<RuleMatch> matches, AnnotatedText text)
Should be called just once with the complete list of matches, before returning them to the caller.
- Parameters:
matches - matches after applying rules and default filters
text - text that matches refer to
- Returns:
- transformed matches (after applying filters in matchFilters)
- Since:
- 4.7
-
-