Package org.languagetool
Class AnalyzedTokenReadings
- java.lang.Object
-
- org.languagetool.AnalyzedTokenReadings
-
- All Implemented Interfaces:
Iterable<AnalyzedToken>
public final class AnalyzedTokenReadings extends Object implements Iterable<AnalyzedToken>
An array of AnalyzedTokens used to store multiple POS tags and lemmas for a given single token.
- Author:
- Marcin Milkowski
-
-
Constructor Summary
Constructor Description
AnalyzedTokenReadings(List<AnalyzedToken> tokens, int startPos)
AnalyzedTokenReadings(AnalyzedToken[] tokens, int startPos)
AnalyzedTokenReadings(AnalyzedToken token, int startPos)
AnalyzedTokenReadings(AnalyzedTokenReadings oldAtr, List<AnalyzedToken> newReadings, String ruleApplied)
-
Method Summary
Modifier and Type Method Description
void
addReading(AnalyzedToken token, String ruleApplied)
Add a new reading.
boolean
equals(Object obj)
AnalyzedToken
getAnalyzedToken(int idx)
Get a token reading.
List<ChunkTag>
getChunkTags()
int
getEndPos()
String
getHistoricalAnnotations()
Used to track disambiguator actions.
List<AnalyzedToken>
getReadings()
int
getReadingsLength()
Number of readings.
int
getStartPos()
String
getToken()
String
getWhitespaceBefore()
boolean
hasAnyLemma(String... lemmas)
Checks if one of the token's readings has one of the given lemmas.
boolean
hasAnyPartialPosTag(String... posTags)
Checks if the token has any of the given POS tags (only a part of the given POS tag needs to match).
int
hashCode()
boolean
hasLemma(String lemma)
Checks if one of the token's readings has a particular lemma.
boolean
hasPartialPosTag(String posTag)
Checks if the token has a particular POS tag, where only a part of the given POS tag needs to match.
boolean
hasPosTag(String posTag)
Checks if the token has a particular POS tag.
boolean
hasPosTagAndLemma(String posTag, String lemma)
Checks if the token has a particular POS tag and lemma.
boolean
hasPosTagStartingWith(String posTag)
Checks if the token has a POS tag starting with the given string.
boolean
hasReading()
Checks if there is at least one POS tag.
boolean
hasSameLemmas()
Used to optimize pattern matching.
void
ignoreSpelling()
Make the token ignored by all spelling rules.
void
immunize()
boolean
isFieldCode()
boolean
isIgnoredBySpeller()
Test if the token can be ignored by spelling rules.
boolean
isImmunized()
boolean
isLinebreak()
Returns true if the token equals \n, \r, \n\r, or \r\n.
boolean
isNonWord()
boolean
isParagraphEnd()
boolean
isPosTagUnknown()
Test if the token's POS tag equals null.
boolean
isSentenceEnd()
boolean
isSentenceStart()
boolean
isTagged()
boolean
isWhitespace()
boolean
isWhitespaceBefore()
Iterator<AnalyzedToken>
iterator()
void
leaveReading(AnalyzedToken token)
Removes all readings but the one that matches the token given.
boolean
matchesPosTagRegex(String posTagRegex)
Checks if at least one of the readings matches a given POS tag regex.
void
removeReading(AnalyzedToken token, String ruleApplied)
Removes a reading from the list of readings.
void
setChunkTags(List<ChunkTag> chunkTags)
void
setParagraphEnd()
Add a reading with a paragraph end token unless this is already a paragraph end.
void
setSentEnd()
Add a SENT_END tag.
void
setStartPos(int position)
void
setWhitespaceBefore(String prevToken)
String
toString()
-
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
Constructor Detail
-
AnalyzedTokenReadings
public AnalyzedTokenReadings(AnalyzedToken[] tokens, int startPos)
-
AnalyzedTokenReadings
public AnalyzedTokenReadings(AnalyzedToken token, int startPos)
-
AnalyzedTokenReadings
public AnalyzedTokenReadings(List<AnalyzedToken> tokens, int startPos)
-
AnalyzedTokenReadings
public AnalyzedTokenReadings(AnalyzedTokenReadings oldAtr, List<AnalyzedToken> newReadings, String ruleApplied)
-
-
Method Detail
-
getReadings
public List<AnalyzedToken> getReadings()
-
getAnalyzedToken
public AnalyzedToken getAnalyzedToken(int idx)
Get a token reading.
-
hasPosTag
public boolean hasPosTag(String posTag)
Checks if the token has a particular POS tag.
- Parameters:
posTag - POS tag to look for
-
hasPosTagAndLemma
public boolean hasPosTagAndLemma(String posTag, String lemma)
Checks if the token has a particular POS tag and lemma.
- Parameters:
posTag - POS tag to look for
lemma - lemma to look for
-
hasReading
public boolean hasReading()
Checks if there is at least one POS tag.
- Since:
- 4.7
-
hasLemma
public boolean hasLemma(String lemma)
Checks if one of the token's readings has a particular lemma.
- Parameters:
lemma - lemma to look for
-
hasAnyLemma
public boolean hasAnyLemma(String... lemmas)
Checks if one of the token's readings has one of the given lemmas.
- Parameters:
lemmas - lemmas to look for
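As a rough illustration of the difference between hasLemma and hasAnyLemma, the sketch below runs the same checks over a plain list of lemma strings. LemmaCheckSketch and its list-based signatures are hypothetical stand-ins, not LanguageTool code; the real methods inspect the lemma of each AnalyzedToken reading.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in: each list entry plays the role of one reading's lemma.
final class LemmaCheckSketch {

  // True if at least one reading carries exactly this lemma.
  static boolean hasLemma(List<String> readingLemmas, String lemma) {
    for (String candidate : readingLemmas) {
      if (lemma.equals(candidate)) { // a null lemma never matches
        return true;
      }
    }
    return false;
  }

  // True if at least one reading carries any of the given lemmas.
  static boolean hasAnyLemma(List<String> readingLemmas, String... lemmas) {
    for (String lemma : lemmas) {
      if (hasLemma(readingLemmas, lemma)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    List<String> lemmas = Arrays.asList("be", null);
    System.out.println(hasLemma(lemmas, "be"));            // true
    System.out.println(hasAnyLemma(lemmas, "have", "be")); // true
    System.out.println(hasAnyLemma(lemmas, "have", "do")); // false
  }
}
```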
-
hasPartialPosTag
public boolean hasPartialPosTag(String posTag)
Checks if the token has a particular POS tag, where only a part of the given POS tag needs to match.
- Parameters:
posTag - POS tag substring to look for
- Since:
- 1.8
-
hasAnyPartialPosTag
public boolean hasAnyPartialPosTag(String... posTags)
Checks if the token has any of the given POS tags (only a part of the given POS tag needs to match).
- Parameters:
posTags - POS tag substrings to look for
- Since:
- 4.0
-
hasPosTagStartingWith
public boolean hasPosTagStartingWith(String posTag)
Checks if the token has a POS tag starting with the given string.
- Parameters:
posTag - POS tag prefix to look for
- Since:
- 4.0
-
matchesPosTagRegex
public boolean matchesPosTagRegex(String posTagRegex)
Checks if at least one of the readings matches a given POS tag regex.
- Parameters:
posTagRegex - POS tag regular expression to look for
- Since:
- 2.9
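The four POS tag checks above differ only in how a reading's tag is compared with the argument: exact, substring, prefix, or regular expression. The sketch below shows those four modes on plain strings. PosTagMatchSketch is a hypothetical stand-in rather than the LanguageTool implementation, the tags are invented, and it assumes the regular expression must match the whole tag.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in: each list entry plays the role of one reading's POS tag.
final class PosTagMatchSketch {

  // Exact comparison, as described for hasPosTag.
  static boolean hasPosTag(List<String> tags, String posTag) {
    for (String tag : tags) {
      if (posTag.equals(tag)) return true;
    }
    return false;
  }

  // Substring comparison, as described for hasPartialPosTag.
  static boolean hasPartialPosTag(List<String> tags, String part) {
    for (String tag : tags) {
      if (tag != null && tag.contains(part)) return true;
    }
    return false;
  }

  // Prefix comparison, as described for hasPosTagStartingWith.
  static boolean hasPosTagStartingWith(List<String> tags, String prefix) {
    for (String tag : tags) {
      if (tag != null && tag.startsWith(prefix)) return true;
    }
    return false;
  }

  // Regex comparison; assumption: the pattern has to match the entire tag.
  static boolean matchesPosTagRegex(List<String> tags, String regex) {
    Pattern pattern = Pattern.compile(regex);
    for (String tag : tags) {
      if (tag != null && pattern.matcher(tag).matches()) return true;
    }
    return false;
  }

  public static void main(String[] args) {
    List<String> tags = Arrays.asList("NN:S", "VB", null); // invented tags
    System.out.println(hasPosTag(tags, "NN"));             // false
    System.out.println(hasPartialPosTag(tags, ":S"));      // true
    System.out.println(hasPosTagStartingWith(tags, "NN")); // true
    System.out.println(matchesPosTagRegex(tags, "NN.*"));  // true
    System.out.println(matchesPosTagRegex(tags, "N"));     // false
  }
}
```

A reading without a POS tag (null here) is skipped by every check in this sketch.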
-
addReading
public void addReading(AnalyzedToken token, String ruleApplied)
Add a new reading.
- Parameters:
token - new reading, given as AnalyzedToken
-
removeReading
public void removeReading(AnalyzedToken token, String ruleApplied)
Removes a reading from the list of readings. Note: if the token has only one reading, then a new reading with an empty POS tag and an empty lemma is created.
- Parameters:
token - reading to be removed
-
leaveReading
public void leaveReading(AnalyzedToken token)
Removes all readings but the one that matches the token given.
- Parameters:
token - Token to be matched
- Since:
- 1.5
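The behavior described for addReading, removeReading and leaveReading can be modelled roughly as below. ReadingsSketch and Reading are hypothetical stand-ins, not the LanguageTool classes. The ruleApplied argument is left out, the "empty" POS tag and lemma are represented as null, and leaveReading is assumed to change nothing when no reading matches; all three are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for illustration only; not the LanguageTool classes.
final class ReadingsSketch {

  // Minimal reading: surface token plus POS tag and lemma (either may be null).
  static final class Reading {
    final String token;
    final String posTag;
    final String lemma;

    Reading(String token, String posTag, String lemma) {
      this.token = token;
      this.posTag = posTag;
      this.lemma = lemma;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Reading)) return false;
      Reading r = (Reading) o;
      return Objects.equals(token, r.token)
          && Objects.equals(posTag, r.posTag)
          && Objects.equals(lemma, r.lemma);
    }

    @Override
    public int hashCode() {
      return Objects.hash(token, posTag, lemma);
    }
  }

  private final String token;
  private final List<Reading> readings = new ArrayList<>();

  ReadingsSketch(Reading first) {
    this.token = first.token;
    readings.add(first);
  }

  void addReading(Reading reading) {
    readings.add(reading);
  }

  // Removing the only remaining reading leaves a placeholder reading without
  // POS tag or lemma (modelled as null here, which is an assumption).
  void removeReading(Reading reading) {
    readings.remove(reading);
    if (readings.isEmpty()) {
      readings.add(new Reading(token, null, null));
    }
  }

  // Keeps only the matching reading; assumption: no change if nothing matches.
  void leaveReading(Reading reading) {
    if (readings.contains(reading)) {
      readings.clear();
      readings.add(reading);
    }
  }

  List<Reading> getReadings() {
    return readings;
  }

  public static void main(String[] args) {
    Reading noun = new Reading("walk", "NOUN", "walk");
    Reading verb = new Reading("walk", "VERB", "walk");
    ReadingsSketch atr = new ReadingsSketch(noun);
    atr.addReading(verb);
    atr.leaveReading(verb);                              // only the verb reading stays
    System.out.println(atr.getReadings().size());        // 1
    atr.removeReading(verb);                             // last reading removed
    System.out.println(atr.getReadings().get(0).posTag); // null
  }
}
```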
-
getReadingsLength
public int getReadingsLength()
Number of readings.
-
isWhitespace
public boolean isWhitespace()
-
isLinebreak
public boolean isLinebreak()
Returns true if the token equals \n, \r, \n\r, or \r\n.
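The linebreak test described above amounts to comparing the token text with four literal strings. A minimal sketch follows, using a hypothetical helper that takes the token text directly (the real method takes no argument and inspects the object's own token):

```java
// Hypothetical stand-in: checks a token string against the four linebreak forms.
final class LinebreakSketch {

  static boolean isLinebreak(String token) {
    return "\n".equals(token)
        || "\r".equals(token)
        || "\n\r".equals(token)
        || "\r\n".equals(token);
  }

  public static void main(String[] args) {
    System.out.println(isLinebreak("\r\n")); // true
    System.out.println(isLinebreak("\n\n")); // false, not one of the four forms
    System.out.println(isLinebreak(" "));    // false
  }
}
```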
-
isSentenceStart
public boolean isSentenceStart()
- Since:
- 2.3
-
isParagraphEnd
public boolean isParagraphEnd()
- Returns:
- true when the token is the last token in a paragraph.
- Since:
- 2.3
-
setParagraphEnd
public void setParagraphEnd()
Add a reading with a paragraph end token unless this is already a paragraph end.
- Since:
- 2.3
-
isSentenceEnd
public boolean isSentenceEnd()
- Returns:
- true when the token is the last token in a sentence.
- Since:
- 2.3
-
isFieldCode
public boolean isFieldCode()
- Returns:
- true if the token is a LibreOffice/OpenOffice field code.
- Since:
- 0.9.9
-
setSentEnd
public void setSentEnd()
Add a SENT_END tag.
-
getStartPos
public int getStartPos()
-
getEndPos
public int getEndPos()
- Since:
- 2.9
-
setStartPos
public void setStartPos(int position)
-
getToken
public String getToken()
-
setWhitespaceBefore
public void setWhitespaceBefore(String prevToken)
-
getWhitespaceBefore
public String getWhitespaceBefore()
-
isWhitespaceBefore
public boolean isWhitespaceBefore()
-
immunize
public void immunize()
-
isImmunized
public boolean isImmunized()
-
ignoreSpelling
public void ignoreSpelling()
Make the token ignored by all spelling rules.
- Since:
- 2.5
-
isIgnoredBySpeller
public boolean isIgnoredBySpeller()
Test if the token can be ignored by spelling rules.
- Returns:
- true if the token should be ignored.
- Since:
- 2.5
-
isPosTagUnknown
public boolean isPosTagUnknown()
Test if the token's POS tag equals null.
- Returns:
- true if the token does not have a POS tag
- Since:
- 3.9
-
getHistoricalAnnotations
public String getHistoricalAnnotations()
Used to track disambiguator actions.
- Returns:
- the historicalAnnotations
-
isTagged
public boolean isTagged()
- Returns:
- true if AnalyzedTokenReadings has some real POS tag (= not null or a special tag)
- Since:
- 2.3
-
hasSameLemmas
public boolean hasSameLemmas()
Used to optimize pattern matching.
- Returns:
- true if all AnalyzedToken lemmas are the same.
-
isNonWord
public boolean isNonWord()
- Returns:
- true if AnalyzedTokenReadings is a punctuation mark, bracket, etc.
- Since:
- 4.4
-
iterator
public Iterator<AnalyzedToken> iterator()
- Specified by:
iterator in interface Iterable<AnalyzedToken>
- Since:
- 2.3
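Because the class implements Iterable, its readings can be walked with a for-each loop. The sketch below shows that pattern on a hypothetical stand-in that iterates over plain strings; with the real class the loop variable would be an AnalyzedToken.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in: iterates over reading labels instead of AnalyzedToken objects.
final class IterableReadingsSketch implements Iterable<String> {

  private final List<String> readings;

  IterableReadingsSketch(String... readings) {
    this.readings = Arrays.asList(readings);
  }

  @Override
  public Iterator<String> iterator() {
    return readings.iterator();
  }

  // Counts the readings by using the for-each loop that Iterable enables.
  static int count(Iterable<String> source) {
    int n = 0;
    for (String ignored : source) {
      n++;
    }
    return n;
  }

  public static void main(String[] args) {
    IterableReadingsSketch atr = new IterableReadingsSketch("walk/NOUN", "walk/VERB");
    for (String reading : atr) {
      System.out.println(reading); // prints walk/NOUN, then walk/VERB
    }
    System.out.println(count(atr)); // 2
  }
}
```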
-
-