Package morfologik.stemming
Class TrimPrefixAndSuffixEncoder
- java.lang.Object
  - morfologik.stemming.TrimPrefixAndSuffixEncoder
- All Implemented Interfaces:
ISequenceEncoder
public class TrimPrefixAndSuffixEncoder extends Object implements ISequenceEncoder
Encodes dst relative to src by trimming whatever non-equal suffix and prefix src and dst have. The output code is (bytes): {P}{K}{suffix}, where (P - 'A') bytes should be trimmed from the start of src, (K - 'A') bytes should be trimmed from the end of src, and then the suffix should be appended to the resulting byte sequence.
Examples:
src: abc dst: abcd encoded: AAd
src: abc dst: xyz encoded: ADxyz
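The {P}{K}{suffix} layout described above can be illustrated with a small standalone sketch. This is not the library's implementation: the class name TrimCodeSketch, the MAX_TRIM limit, and the search strategy (try every start offset in src and keep the longest run shared with the start of dst) are assumptions made for illustration only. It does reproduce the two examples given above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the {P}{K}{suffix} code; not the library's code.
final class TrimCodeSketch {
  // Largest trim count that still fits in one unsigned byte after adding 'A'
  // (an assumption of this sketch, not a documented limit).
  static final int MAX_TRIM = 255 - 'A';

  /** Encodes dst relative to src as {P}{K}{suffix}. */
  static byte[] encode(byte[] src, byte[] dst) {
    int bestStart = 0;
    int bestLen = 0;
    // For every candidate number of bytes trimmed from the start of src,
    // measure how long a prefix of dst the remainder of src shares.
    for (int start = 0; start < src.length; start++) {
      int len = 0;
      while (start + len < src.length && len < dst.length && src[start + len] == dst[len]) {
        len++;
      }
      if (len > bestLen) {
        bestLen = len;
        bestStart = start;
      }
    }
    // Everything in src after the shared run has to be trimmed from the end.
    int trimEnd = src.length - (bestStart + bestLen);
    if (bestStart > MAX_TRIM || trimEnd > MAX_TRIM) {
      throw new IllegalArgumentException("trim count does not fit in one byte");
    }
    byte[] out = new byte[2 + dst.length - bestLen];
    out[0] = (byte) ('A' + bestStart); // P
    out[1] = (byte) ('A' + trimEnd);   // K
    // The suffix is whatever part of dst the shared run does not cover.
    System.arraycopy(dst, bestLen, out, 2, dst.length - bestLen);
    return out;
  }

  /** Convenience wrapper for single-byte-per-character strings. */
  static String encodeToString(String src, String dst) {
    byte[] code = encode(
        src.getBytes(StandardCharsets.ISO_8859_1),
        dst.getBytes(StandardCharsets.ISO_8859_1));
    return new String(code, StandardCharsets.ISO_8859_1);
  }

  public static void main(String[] args) {
    System.out.println(encodeToString("abc", "abcd")); // AAd
    System.out.println(encodeToString("abc", "xyz"));  // ADxyz
  }
}
```

In the second example nothing is shared, so all three bytes of src are trimmed from the end (K = 'A' + 3 = 'D') and the whole of dst becomes the suffix.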
-
Constructor Summary
Constructor / Description
TrimPrefixAndSuffixEncoder()
-
Method Summary
Modifier and Type / Method / Description
ByteBuffer decode(ByteBuffer reuse, ByteBuffer source, ByteBuffer encoded)
ByteBuffer encode(ByteBuffer reuse, ByteBuffer source, ByteBuffer target)
int prefixBytes()
    The number of prefix bytes in the encoded form that should be ignored (needed for separator lookup).
String toString()
-
Method Detail
-
encode
public ByteBuffer encode(ByteBuffer reuse, ByteBuffer source, ByteBuffer target)
Description copied from interface: ISequenceEncoder
- Specified by:
encode in interface ISequenceEncoder
- Parameters:
reuse - Reuses the provided ByteBuffer or allocates a new one if there is not enough remaining space.
source - The source byte sequence.
target - The target byte sequence to encode relative to source.
- Returns:
- Returns the ByteBuffer with encoded target.
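The reuse contract described for the reuse parameter (use the caller's buffer when it is large enough, otherwise allocate) could look roughly like the following. The class and method names are hypothetical and the capacity check is an assumption of this sketch; the library may decide differently.

```java
import java.nio.ByteBuffer;

// Hypothetical helper illustrating the "reuse or allocate" contract.
final class ReuseSketch {
  /** Returns a cleared buffer with room for at least {@code needed} bytes. */
  static ByteBuffer clearOrAllocate(ByteBuffer reuse, int needed) {
    if (reuse == null || reuse.capacity() < needed) {
      // Not enough room in the caller's buffer: allocate a fresh one.
      return ByteBuffer.allocate(needed);
    }
    // Enough room: reset position and limit, then hand the same buffer back.
    reuse.clear();
    return reuse;
  }
}
```

Because a different buffer may come back, a caller would always continue with the returned ByteBuffer rather than assume the one it passed in was filled.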
-
prefixBytes
public int prefixBytes()
Description copied from interface: ISequenceEncoder
The number of prefix bytes in the encoded form that should be ignored (needed for separator lookup). This is an ugly workaround for GH-85; it should be fixed by knowing in advance whether the dictionary contains tags, so that the separator can be scanned for right-to-left.
- Specified by:
prefixBytes in interface ISequenceEncoder
- See Also:
- "https://github.com/morfologik/morfologik-stemming/issues/85"
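To see why leading bytes may need to be skipped, consider a left-to-right separator scan over an encoded form followed by a tag. The sketch below is illustrative only: the class name, the '_' separator, and the sample bytes are made up, and the value 2 passed for the skipped prefix is an assumption based on the two header bytes {P}{K}, not a value stated on this page.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical separator lookup that skips the encoder's header bytes.
final class SeparatorLookupSketch {
  /** Returns the index of the first separator at or after prefixBytes, or -1. */
  static int indexOfSeparator(byte[] data, byte separator, int prefixBytes) {
    for (int i = prefixBytes; i < data.length; i++) {
      if (data[i] == separator) {
        return i;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    // Made-up bytes: P = 'A', K = '_' (trim 30 bytes from the end), suffix "x",
    // then the separator '_' and a tag.
    byte[] data = "A_x_tag".getBytes(StandardCharsets.US_ASCII);
    System.out.println(indexOfSeparator(data, (byte) '_', 0)); // 1: K is mistaken for the separator
    System.out.println(indexOfSeparator(data, (byte) '_', 2)); // 3: the real separator
  }
}
```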
-
decode
public ByteBuffer decode(ByteBuffer reuse, ByteBuffer source, ByteBuffer encoded)
Description copied from interface: ISequenceEncoder
- Specified by:
decode in interface ISequenceEncoder
- Parameters:
reuse - Reuses the provided ByteBuffer or allocates a new one if there is not enough remaining space.
source - The source byte sequence.
encoded - The previously encoded byte sequence.
- Returns:
- Returns the ByteBuffer with decoded target.
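Decoding reverses the {P}{K}{suffix} rule from the class description: trim (P - 'A') bytes from the start of the source, (K - 'A') bytes from the end, then append the suffix. The following standalone sketch works on plain byte arrays instead of ByteBuffer. The class name and the error handling are assumptions of this sketch, and any special cases the library may handle are ignored.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of decoding {P}{K}{suffix}; not the library's code.
final class TrimDecodeSketch {
  /** Rebuilds the target from src and an encoded {P}{K}{suffix} sequence. */
  static byte[] decode(byte[] src, byte[] encoded) {
    if (encoded.length < 2) {
      throw new IllegalArgumentException("encoded form needs at least P and K");
    }
    // Mask with 0xFF so header bytes above 127 are read as unsigned values.
    int trimStart = (encoded[0] & 0xFF) - 'A';
    int trimEnd = (encoded[1] & 0xFF) - 'A';
    int kept = src.length - trimStart - trimEnd;
    if (trimStart < 0 || trimEnd < 0 || kept < 0) {
      throw new IllegalArgumentException("trim counts do not fit the source");
    }
    int suffixLen = encoded.length - 2;
    byte[] out = new byte[kept + suffixLen];
    // Keep the middle of src, then append the suffix.
    System.arraycopy(src, trimStart, out, 0, kept);
    System.arraycopy(encoded, 2, out, kept, suffixLen);
    return out;
  }

  /** Convenience wrapper for single-byte-per-character strings. */
  static String decodeToString(String src, String encoded) {
    byte[] target = decode(
        src.getBytes(StandardCharsets.ISO_8859_1),
        encoded.getBytes(StandardCharsets.ISO_8859_1));
    return new String(target, StandardCharsets.ISO_8859_1);
  }

  public static void main(String[] args) {
    System.out.println(decodeToString("abc", "AAd"));   // abcd
    System.out.println(decodeToString("abc", "ADxyz")); // xyz
  }
}
```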