Package org.languagetool.languagemodel
Class LuceneLanguageModel
- java.lang.Object
-
- org.languagetool.languagemodel.BaseLanguageModel
-
- org.languagetool.languagemodel.LuceneLanguageModel
-
- All Implemented Interfaces: AutoCloseable, LanguageModel
public class LuceneLanguageModel extends BaseLanguageModel
Like LuceneSingleIndexLanguageModel, but can merge the results of lookups in several independent indexes into one result.
- Since: 2.7
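The merging idea can be illustrated with a minimal sketch. It does not use the LanguageTool classes: each independent index is modeled as a plain `Map`, and the helper `mergedCount` is hypothetical. It also assumes that merging means summing the per-index counts, which this page does not state.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MergedCountSketch {

    // Hypothetical stand-in: each Map plays the role of one independent index,
    // mapping a space-joined ngram to its occurrence count.
    // Assumption: the merged result is the sum of the per-index counts.
    static long mergedCount(List<Map<String, Long>> indexes, List<String> tokens) {
        String ngram = String.join(" ", tokens);
        long sum = 0;
        for (Map<String, Long> index : indexes) {
            // An ngram missing from an index contributes 0.
            sum += index.getOrDefault(ngram, 0L);
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Long> index1 = new HashMap<>();
        index1.put("the cat", 40L);

        Map<String, Long> index2 = new HashMap<>();
        index2.put("the cat", 2L);
        index2.put("the dog", 7L);

        List<Map<String, Long>> indexes = Arrays.asList(index1, index2);
        System.out.println(mergedCount(indexes, Arrays.asList("the", "cat")));
        System.out.println(mergedCount(indexes, Arrays.asList("the", "dog")));
    }
}
```

With the real class, the same lookup would go through `getCount(List<String> tokens)` on a `LuceneLanguageModel` built from a directory that holds several indexes.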
-
Field Summary
-
Fields inherited from interface org.languagetool.languagemodel.LanguageModel
GOOGLE_SENTENCE_END, GOOGLE_SENTENCE_START
-
Constructor Summary
Constructor and Description:
- LuceneLanguageModel(File topIndexDir)
-
Method Summary
Modifier and Type, Method, Description:
- void close()
- long getCount(String token): Get the occurrence count for token.
- long getCount(List<String> tokens): Get the occurrence count for the given token sequence.
- long getTotalTokenCount()
- String toString()
- static void validateDirectory(File topIndexDir)
-
Methods inherited from class org.languagetool.languagemodel.BaseLanguageModel
getPseudoProbability, getPseudoProbabilityStupidBackoff
-
Constructor Detail
-
LuceneLanguageModel
public LuceneLanguageModel(File topIndexDir)
- Parameters:
topIndexDir - a directory which contains either: 1) subdirectories called 1grams, 2grams, 3grams, which are Lucene indexes with ngram occurrences as created by org.languagetool.dev.FrequencyIndexCreator, or 2) subdirectories index-1, index-2, etc. that contain the subdirectories described under 1)
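The two accepted directory layouts can be told apart with a small check. This is a sketch only, not the implementation of `validateDirectory`. The class `IndexLayoutSketch` and its methods are hypothetical. Since the page does not say whether all of 1grams, 2grams and 3grams must be present, the sketch assumes that the presence of 1grams is enough.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class IndexLayoutSketch {

    // Layout 1: the directory directly contains the ngram index directories.
    // Assumption: checking for "1grams" alone is sufficient for this sketch.
    static boolean isSingleIndexDir(File dir) {
        return new File(dir, "1grams").isDirectory();
    }

    // Layout 2: subdirectories index-1, index-2, ... that each follow layout 1.
    static boolean isMultiIndexDir(File dir) {
        File[] subDirs = dir.listFiles(
                f -> f.isDirectory() && f.getName().matches("index-\\d+"));
        if (subDirs == null || subDirs.length == 0) {
            return false;
        }
        for (File sub : subDirs) {
            if (!isSingleIndexDir(sub)) {
                return false;
            }
        }
        return true;
    }

    // Returns "single", "multi", or "unknown" for the given top-level directory.
    static String describeLayout(File topIndexDir) {
        if (isSingleIndexDir(topIndexDir)) {
            return "single";
        }
        if (isMultiIndexDir(topIndexDir)) {
            return "multi";
        }
        return "unknown";
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway layout-2 directory tree and classify it.
        File top = Files.createTempDirectory("ngram-index").toFile();
        new File(top, "index-1/1grams").mkdirs();
        new File(top, "index-2/1grams").mkdirs();
        System.out.println(describeLayout(top));
    }
}
```

A directory that matches neither layout is reported as "unknown" here. How the real class reacts to such a directory is not described on this page.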
-
Method Detail
-
validateDirectory
public static void validateDirectory(File topIndexDir)
-
getCount
public long getCount(List<String> tokens)
Description copied from class: BaseLanguageModel
Get the occurrence count for the given token sequence.
- Specified by: getCount in class BaseLanguageModel
-
getCount
public long getCount(String token)
Description copied from class: BaseLanguageModel
Get the occurrence count for token.
- Specified by: getCount in class BaseLanguageModel
-
getTotalTokenCount
public long getTotalTokenCount()
- Specified by: getTotalTokenCount in class BaseLanguageModel
-
close
public void close()
-