Package edu.berkeley.nlp.lm.cache
Class ContextEncodedCachingLmWrapper<T>

- java.lang.Object
  - edu.berkeley.nlp.lm.AbstractNgramLanguageModel<T>
    - edu.berkeley.nlp.lm.AbstractContextEncodedNgramLanguageModel<T>
      - edu.berkeley.nlp.lm.cache.ContextEncodedCachingLmWrapper<T>

Type Parameters:
  T - the type of words in the language

All Implemented Interfaces:
  ContextEncodedNgramLanguageModel<T>, NgramLanguageModel<T>, java.io.Serializable
public class ContextEncodedCachingLmWrapper<T> extends AbstractContextEncodedNgramLanguageModel<T>

This class wraps a ContextEncodedNgramLanguageModel with a cache.

Author:
  adampauls
See Also:
  Serialized Form
-
-
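As a usage sketch (hedged: it assumes a ContextEncodedNgramLanguageModel<String> has already been loaded elsewhere, e.g. from an ARPA or binary file; the wrapper itself adds no loading API), a single-threaded caller would wrap the model like this:

```java
import edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel;
import edu.berkeley.nlp.lm.cache.ContextEncodedCachingLmWrapper;

public class CacheWrapDemo {
    // `lm` stands in for a model loaded elsewhere.
    static ContextEncodedCachingLmWrapper<String> wrapForSingleThread(
            ContextEncodedNgramLanguageModel<String> lm) {
        // Single-threaded use: the non-thread-safe wrapper suffices here.
        return ContextEncodedCachingLmWrapper.wrapWithCacheNotThreadSafe(lm);
    }
}
```

The wrapper implements the same ContextEncodedNgramLanguageModel interface, so it can be dropped in wherever the uncached model was used.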
Nested Class Summary
-
Nested classes/interfaces inherited from interface edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel
ContextEncodedNgramLanguageModel.DefaultImplementations, ContextEncodedNgramLanguageModel.LmContextInfo
-
Nested classes/interfaces inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel
NgramLanguageModel.StaticMethods
-
-
Field Summary
-
Fields inherited from class edu.berkeley.nlp.lm.AbstractNgramLanguageModel
lmOrder, oovWordLogProb
-
-
Method Summary

- float getLogProb(long contextOffset, int contextOrder, int word, ContextEncodedNgramLanguageModel.LmContextInfo contextOutput)
  Get the score for an n-gram, and also get the context offset of the n-gram's suffix.
- int[] getNgramForOffset(long contextOffset, int contextOrder, int word)
  Gets the n-gram referred to by a context-encoding.
- ContextEncodedNgramLanguageModel.LmContextInfo getOffsetForNgram(int[] ngram, int startPos, int endPos)
  Gets the offset which refers to an n-gram.
- WordIndexer<T> getWordIndexer()
  Each LM must have a WordIndexer which assigns integer IDs to each word W in the language.
- static <T> ContextEncodedCachingLmWrapper<T> wrapWithCacheNotThreadSafe(ContextEncodedNgramLanguageModel<T> lm)
  This type of caching is only thread-safe if you have one cache wrapper per thread.
- static <T> ContextEncodedCachingLmWrapper<T> wrapWithCacheNotThreadSafe(ContextEncodedNgramLanguageModel<T> lm, int cacheBits)
- static <T> ContextEncodedCachingLmWrapper<T> wrapWithCacheThreadSafe(ContextEncodedNgramLanguageModel<T> lm)
  This type of caching is thread-safe and (internally) maintains a separate cache for each thread that calls it.
- static <T> ContextEncodedCachingLmWrapper<T> wrapWithCacheThreadSafe(ContextEncodedNgramLanguageModel<T> lm, int cacheBits)
-
Methods inherited from class edu.berkeley.nlp.lm.AbstractContextEncodedNgramLanguageModel
getLogProb, scoreSentence
-
Methods inherited from class edu.berkeley.nlp.lm.AbstractNgramLanguageModel
getLmOrder, setOovWordLogProb
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel
getLmOrder, setOovWordLogProb
-
-
-
-
Method Detail
-
wrapWithCacheNotThreadSafe

public static <T> ContextEncodedCachingLmWrapper<T> wrapWithCacheNotThreadSafe(ContextEncodedNgramLanguageModel<T> lm)

This type of caching is only thread-safe if you have one cache wrapper per thread.

Type Parameters:
  T
Parameters:
  lm
Returns:
wrapWithCacheNotThreadSafe
public static <T> ContextEncodedCachingLmWrapper<T> wrapWithCacheNotThreadSafe(ContextEncodedNgramLanguageModel<T> lm, int cacheBits)
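One way to satisfy the one-wrapper-per-thread requirement is a ThreadLocal around the shared underlying model. This is a sketch of a caller-side pattern, not part of the library's API:

```java
import edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel;
import edu.berkeley.nlp.lm.cache.ContextEncodedCachingLmWrapper;

public class PerThreadCaches {
    // Each thread lazily receives its own non-thread-safe wrapper
    // around the single shared (read-only) language model.
    static <T> ThreadLocal<ContextEncodedCachingLmWrapper<T>> create(
            final ContextEncodedNgramLanguageModel<T> lm) {
        return ThreadLocal.withInitial(
                () -> ContextEncodedCachingLmWrapper.wrapWithCacheNotThreadSafe(lm));
    }
}
```

Worker threads then call `caches.get()` to obtain their private wrapper; this achieves the same effect as wrapWithCacheThreadSafe but keeps cache lifetime under the caller's control.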
-
wrapWithCacheThreadSafe

public static <T> ContextEncodedCachingLmWrapper<T> wrapWithCacheThreadSafe(ContextEncodedNgramLanguageModel<T> lm)

This type of caching is thread-safe and (internally) maintains a separate cache for each thread that calls it. Note that each thread has its own cache, so if you have lots of threads, memory usage could be substantial.

Type Parameters:
  T
Parameters:
  lm
Returns:
-
wrapWithCacheThreadSafe
public static <T> ContextEncodedCachingLmWrapper<T> wrapWithCacheThreadSafe(ContextEncodedNgramLanguageModel<T> lm, int cacheBits)
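The cacheBits overloads presumably size the cache as a power of two — an assumption drawn from the parameter name, not stated in these docs. Under that assumption, and given that the thread-safe wrapper keeps one cache per calling thread, total entry counts grow with the thread count:

```java
public class CacheSizing {
    // Assumed interpretation: cacheBits = log2 of the number of cache entries.
    static long cacheEntries(int cacheBits) {
        return 1L << cacheBits;
    }

    // With the thread-safe wrapper, each calling thread gets its own cache,
    // so total entries scale linearly with the number of threads.
    static long totalEntries(int cacheBits, int numThreads) {
        return cacheEntries(cacheBits) * numThreads;
    }
}
```

For example, under this assumption cacheBits = 20 gives about one million entries per thread, which is why the memory-usage warning above matters for highly threaded callers.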
-
getWordIndexer

public WordIndexer<T> getWordIndexer()

Description copied from interface: NgramLanguageModel
Each LM must have a WordIndexer which assigns integer IDs to each word W in the language.

Specified by:
  getWordIndexer in interface NgramLanguageModel<T>
Overrides:
  getWordIndexer in class AbstractNgramLanguageModel<T>
Returns:
-
getOffsetForNgram

public ContextEncodedNgramLanguageModel.LmContextInfo getOffsetForNgram(int[] ngram, int startPos, int endPos)

Description copied from interface: ContextEncodedNgramLanguageModel
Gets the offset which refers to an n-gram. If the n-gram is not in the model, then it returns the shortest suffix of the n-gram which is. This operation is not necessarily fast.

Specified by:
  getOffsetForNgram in interface ContextEncodedNgramLanguageModel<T>
Specified by:
  getOffsetForNgram in class AbstractContextEncodedNgramLanguageModel<T>
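A hedged sketch of using the returned context info to seed a scoring query (it assumes LmContextInfo exposes public offset and order fields, as in the berkeleylm source, and that the int[] holds word IDs from the WordIndexer):

```java
import edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel;
import edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel.LmContextInfo;

public class OffsetLookup {
    // Score `word` given a context n-gram of word IDs, via the context's offset.
    static float scoreAfterContext(
            ContextEncodedNgramLanguageModel<String> lm, int[] context, int word) {
        // Not necessarily fast: intended for setup, not for inner loops.
        LmContextInfo ctx = lm.getOffsetForNgram(context, 0, context.length);
        // contextOutput is null here because we do not need the suffix offset back.
        return lm.getLogProb(ctx.offset, ctx.order, word, null);
    }
}
```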
-
getNgramForOffset

public int[] getNgramForOffset(long contextOffset, int contextOrder, int word)

Description copied from interface: ContextEncodedNgramLanguageModel
Gets the n-gram referred to by a context-encoding. This operation is not necessarily fast.

Specified by:
  getNgramForOffset in interface ContextEncodedNgramLanguageModel<T>
Specified by:
  getNgramForOffset in class AbstractContextEncodedNgramLanguageModel<T>
-
getLogProb

public float getLogProb(long contextOffset, int contextOrder, int word, ContextEncodedNgramLanguageModel.LmContextInfo contextOutput)

Description copied from interface: ContextEncodedNgramLanguageModel
Get the score for an n-gram, and also get the context offset of the n-gram's suffix.

Specified by:
  getLogProb in interface ContextEncodedNgramLanguageModel<T>
Specified by:
  getLogProb in class AbstractContextEncodedNgramLanguageModel<T>
Parameters:
  contextOffset - Offset of context (prefix) of an n-gram
  contextOrder - The (0-based) length of context (i.e. order == 0 iff context refers to a unigram).
  word - Last word of the n-gram
  contextOutput - Offset of the suffix of the input n-gram. If the parameter is null, it will be ignored. This can be passed to future queries for efficient access.
Returns:
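Passing contextOutput back in as the next query's context gives the left-to-right scoring pattern this cache accelerates. A sketch (again assuming LmContextInfo's public offset and order fields, a no-arg constructor denoting the empty context, and word IDs obtained from the WordIndexer):

```java
import edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel;
import edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel.LmContextInfo;

public class SequenceScorer {
    // Sum log-probs of a word-ID sequence, threading the context offset forward.
    static float scoreSequence(ContextEncodedNgramLanguageModel<String> lm, int[] wordIds) {
        float total = 0f;
        LmContextInfo ctx = new LmContextInfo(); // assumed to start as the empty context
        for (int word : wordIds) {
            // The current ctx fields are read as arguments before getLogProb
            // overwrites ctx with the suffix context for the next step.
            total += lm.getLogProb(ctx.offset, ctx.order, word, ctx);
        }
        return total;
    }
}
```

Reusing ctx as both input and output avoids recomputing each step's context from scratch, which is exactly the access pattern the cache wrapper is designed to speed up.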
-
-