Package edu.berkeley.nlp.lm
Class AbstractContextEncodedNgramLanguageModel<W>
- java.lang.Object
  - edu.berkeley.nlp.lm.AbstractNgramLanguageModel<W>
    - edu.berkeley.nlp.lm.AbstractContextEncodedNgramLanguageModel<W>
- All Implemented Interfaces:
ContextEncodedNgramLanguageModel<W>, NgramLanguageModel<W>, java.io.Serializable
- Direct Known Subclasses:
ContextEncodedCachingLmWrapper, ContextEncodedProbBackoffLm
public abstract class AbstractContextEncodedNgramLanguageModel<W> extends AbstractNgramLanguageModel<W> implements ContextEncodedNgramLanguageModel<W>, java.io.Serializable
Default implementation of all ContextEncodedNgramLanguageModel functionality except getLogProb(long, int, int, LmContextInfo), getOffsetForNgram(int[], int, int), and getNgramForOffset(long, int, int).
- See Also:
- Serialized Form
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel
ContextEncodedNgramLanguageModel.DefaultImplementations, ContextEncodedNgramLanguageModel.LmContextInfo
-
Nested classes/interfaces inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel
NgramLanguageModel.StaticMethods
-
-
Field Summary
-
Fields inherited from class edu.berkeley.nlp.lm.AbstractNgramLanguageModel
lmOrder, oovWordLogProb
-
-
Constructor Summary
AbstractContextEncodedNgramLanguageModel(int lmOrder, WordIndexer<W> wordIndexer, float oovWordLogProb)
-
Method Summary
Modifier and Type, Method, Description:
abstract float
getLogProb(long contextOffset, int contextOrder, int word, ContextEncodedNgramLanguageModel.LmContextInfo outputContext)
Get the score for an n-gram, and also get the context offset of the n-gram's suffix.
float
getLogProb(java.util.List<W> phrase)
Scores an n-gram.
abstract int[]
getNgramForOffset(long contextOffset, int contextOrder, int word)
Gets the n-gram referred to by a context-encoding.
abstract ContextEncodedNgramLanguageModel.LmContextInfo
getOffsetForNgram(int[] ngram, int startPos, int endPos)
Gets the offset which refers to an n-gram.
float
scoreSentence(java.util.List<W> sentence)
Scores a complete sentence, taking appropriate care with the start- and end-of-sentence symbols.
-
Methods inherited from class edu.berkeley.nlp.lm.AbstractNgramLanguageModel
getLmOrder, getWordIndexer, setOovWordLogProb
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface edu.berkeley.nlp.lm.NgramLanguageModel
getLmOrder, getWordIndexer, setOovWordLogProb
-
-
-
-
Constructor Detail
-
AbstractContextEncodedNgramLanguageModel
public AbstractContextEncodedNgramLanguageModel(int lmOrder, WordIndexer<W> wordIndexer, float oovWordLogProb)
-
-
Method Detail
-
scoreSentence
public float scoreSentence(java.util.List<W> sentence)
Description copied from interface: NgramLanguageModel
Scores a complete sentence, taking appropriate care with the start- and end-of-sentence symbols. This is a convenience method and will generally be inefficient.
- Specified by:
scoreSentence in interface NgramLanguageModel<W>
- Returns:
-
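The boundary handling described for scoreSentence can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: the class name SentenceScoringSketch, the boundary strings "<s>" and "</s>", and the scorer callback are all hypothetical stand-ins so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical sketch; none of these names come from the berkeleylm API.
class SentenceScoringSketch {
    static final String START = "<s>";  // assumed start-of-sentence symbol
    static final String END = "</s>";   // assumed end-of-sentence symbol

    // Pads the sentence with boundary symbols, then sums the score of the
    // n-gram window (at most lmOrder words) ending at each position after START.
    static double scoreSentence(List<String> sentence, int lmOrder,
                                ToDoubleFunction<List<String>> ngramScorer) {
        List<String> padded = new ArrayList<>();
        padded.add(START);
        padded.addAll(sentence);
        padded.add(END);

        double total = 0.0;
        for (int end = 2; end <= padded.size(); end++) {
            int start = Math.max(0, end - lmOrder);
            total += ngramScorer.applyAsDouble(padded.subList(start, end));
        }
        return total;
    }

    public static void main(String[] args) {
        // Toy scorer: every n-gram gets log-probability -1.0.
        double score = scoreSentence(List.of("a", "b"), 3, ngram -> -1.0);
        System.out.println(score); // prints -3.0 (three windows are scored)
    }
}
```

Each window is scored independently here, which hints at why such a convenience method tends to be inefficient: consecutive windows share most of their words, yet nothing is reused between queries.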
getLogProb
public float getLogProb(java.util.List<W> phrase)
Description copied from interface: NgramLanguageModel
Scores an n-gram. This is a convenience method and will generally be relatively inefficient. More efficient versions are available in ArrayEncodedNgramLanguageModel.getLogProb(int[], int, int) and ContextEncodedNgramLanguageModel.getLogProb(long, int, int, edu.berkeley.nlp.lm.ContextEncodedNgramLanguageModel.LmContextInfo).
- Specified by:
getLogProb in interface NgramLanguageModel<W>
-
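One plausible source of the overhead in a List-based overload is that word objects must be mapped to integer ids on every call before an int-based query can run. The sketch below illustrates only that conversion step. The class PhraseIndexingSketch and its methods are hypothetical and are not the WordIndexer API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; this is not the berkeleylm WordIndexer interface.
class PhraseIndexingSketch {
    private final Map<String, Integer> ids = new HashMap<>();

    // Returns the existing id for a word, or assigns the next free id.
    int getOrAddIndex(String word) {
        Integer id = ids.get(word);
        if (id == null) {
            id = ids.size();
            ids.put(word, id);
        }
        return id;
    }

    // Converts a phrase to the int[] form that array-based queries expect.
    int[] toArray(List<String> phrase) {
        int[] out = new int[phrase.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = getOrAddIndex(phrase.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        PhraseIndexingSketch indexer = new PhraseIndexingSketch();
        int[] encoded = indexer.toArray(List.of("the", "cat", "the"));
        System.out.println(java.util.Arrays.toString(encoded)); // prints [0, 1, 0]
    }
}
```

A caller that scores many n-grams could perform this conversion once and then work with int arrays directly.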
getLogProb
public abstract float getLogProb(long contextOffset, int contextOrder, int word, ContextEncodedNgramLanguageModel.LmContextInfo outputContext)
Description copied from interface: ContextEncodedNgramLanguageModel
Get the score for an n-gram, and also get the context offset of the n-gram's suffix.
- Specified by:
getLogProb in interface ContextEncodedNgramLanguageModel<W>
- Parameters:
contextOffset - Offset of context (prefix) of an n-gram
contextOrder - The (0-based) length of context (i.e. order == 0 iff context refers to a unigram).
word - Last word of the n-gram
outputContext - Offset of the suffix of the input n-gram. If the parameter is null it will be ignored. This can be passed to future queries for efficient access.
- Returns:
-
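The way an output context feeds the next query can be sketched with a toy bigram model. Everything here is an assumption made for illustration: the class ContextChainingSketch, its Context holder (standing in for LmContextInfo), the use of the word id as the offset, and the use of order -1 for an empty context. Real implementations encode contexts differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy bigram model; names and storage layout are illustrative only.
class ContextChainingSketch {
    // Mutable holder playing the role of the outputContext parameter.
    static final class Context {
        long offset = 0L;
        int order = -1; // assumption: -1 means "empty context"
    }

    private final double[] unigramLogProb;
    private final double[] backoff;
    private final Map<Long, Double> bigramLogProb = new HashMap<>();

    ContextChainingSketch(double[] unigramLogProb, double[] backoff) {
        this.unigramLogProb = unigramLogProb;
        this.backoff = backoff;
    }

    void addBigram(int first, int second, double logProb) {
        bigramLogProb.put(key(first, second), logProb);
    }

    private long key(int first, int second) {
        return (long) first * unigramLogProb.length + second;
    }

    // Scores `word` given the encoded context. If `out` is non-null, it is
    // overwritten with the context to pass to the next query.
    double getLogProb(long contextOffset, int contextOrder, int word, Context out) {
        double score;
        if (contextOrder < 0) {
            score = unigramLogProb[word];
        } else {
            int prev = (int) contextOffset;
            Double bigram = bigramLogProb.get(key(prev, word));
            score = (bigram != null) ? bigram : backoff[prev] + unigramLogProb[word];
        }
        if (out != null) {
            out.offset = word; // in this toy model the offset is just the word id
            out.order = 0;     // a unigram context, matching order == 0 above
        }
        return score;
    }

    public static void main(String[] args) {
        ContextChainingSketch lm = new ContextChainingSketch(
                new double[] {-1.0, -2.0, -3.0}, new double[] {-0.5, -0.5, 0.0});
        lm.addBigram(0, 1, -0.25);

        Context ctx = new Context();
        double total = 0.0;
        for (int word : new int[] {0, 1, 2}) {
            // The context written by one call is the input to the next.
            total += lm.getLogProb(ctx.offset, ctx.order, word, ctx);
        }
        System.out.println(total); // prints -4.75
    }
}
```

Because each call hands back the context for the next word, a left-to-right scorer never has to look up the shared prefix again.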
getOffsetForNgram
public abstract ContextEncodedNgramLanguageModel.LmContextInfo getOffsetForNgram(int[] ngram, int startPos, int endPos)
Description copied from interface: ContextEncodedNgramLanguageModel
Gets the offset which refers to an n-gram. If the n-gram is not in the model, then it returns the shortest suffix of the n-gram which is. This operation is not necessarily fast.
- Specified by:
getOffsetForNgram in interface ContextEncodedNgramLanguageModel<W>
-
getNgramForOffset
public abstract int[] getNgramForOffset(long contextOffset, int contextOrder, int word)
Description copied from interface: ContextEncodedNgramLanguageModel
Gets the n-gram referred to by a context-encoding. This operation is not necessarily fast.
- Specified by:
getNgramForOffset in interface ContextEncodedNgramLanguageModel<W>
-
-