Class AhoCorasickDoubleArrayTrie<V>

java.lang.Object
com.hankcs.algorithm.AhoCorasickDoubleArrayTrie<V>
All Implemented Interfaces:
Serializable

public class AhoCorasickDoubleArrayTrie<V> extends Object implements Serializable
An implementation of Aho Corasick algorithm based on Double Array Trie
Author:
hankcs
See Also:
  • Field Details

    • check

      protected int[] check
      check array of the Double Array Trie structure
    • base

      protected int[] base
      base array of the Double Array Trie structure
    • fail

      protected int[] fail
      fail table of the Aho Corasick automata
    • output

      protected int[][] output
      output table of the Aho Corasick automata
    • v

      protected V[] v
      outer value array
    • l

      protected int[] l
      the length of every key
    • size

      protected int size
      the size of base and check array
  • Constructor Details

    • AhoCorasickDoubleArrayTrie

      public AhoCorasickDoubleArrayTrie()
  • Method Details

    • parseText

      Parse text
      Parameters:
      text - The text
      Returns:
      a list of outputs
    • parseText

      public void parseText(CharSequence text, AhoCorasickDoubleArrayTrie.IHit<V> processor)
      Parse text
      Parameters:
      text - The text
      processor - A processor which handles the output
    • parseText

      public void parseText(CharSequence text, AhoCorasickDoubleArrayTrie.IHitCancellable<V> processor)
      Parse text
      Parameters:
      text - The text
      processor - A processor which handles the output
    • parseText

      public void parseText(char[] text, AhoCorasickDoubleArrayTrie.IHit<V> processor)
      Parse text
      Parameters:
      text - The text
      processor - A processor which handles the output
    • parseText

      public void parseText(char[] text, AhoCorasickDoubleArrayTrie.IHitFull<V> processor)
      Parse text
      Parameters:
      text - The text
      processor - A processor which handles the output
    • matches

      public boolean matches(String text)
      Checks that string contains at least one substring
      Parameters:
      text - source text to check
      Returns:
      true if string contains at least one substring
    • findFirst

      public AhoCorasickDoubleArrayTrie.Hit<V> findFirst(String text)
      Search first match in string
      Parameters:
      text - source text to check
      Returns:
      first match or null if there are no matches
    • save

      public void save(ObjectOutputStream out) throws IOException
      Save
      Parameters:
      out - An ObjectOutputStream object
      Throws:
      IOException - Some IOException
    • load

      public void load(ObjectInputStream in) throws IOException, ClassNotFoundException
      Load
      Parameters:
      in - An ObjectInputStream object
      Throws:
      IOException
      ClassNotFoundException
    • get

      public V get(String key)
      Get value by a String key, just like a map.get() method
      Parameters:
      key - The key
      Returns:
    • set

      public boolean set(String key, V value)
      Update a value corresponding to a key
      Parameters:
      key - the key
      value - the value
      Returns:
      successful or not(failure if there is no key)
    • get

      public V get(int index)
      Pick the value by index in value array
      Notice that to be more efficiently, this method DO NOT check the parameter
      Parameters:
      index - The index
      Returns:
      The value
    • transition

      protected int transition(int current, char c)
      transition of a state
      Parameters:
      current -
      c -
      Returns:
    • transitionWithRoot

      protected int transitionWithRoot(int nodePos, char c)
      transition of a state, if the state is root and it failed, then returns the root
      Parameters:
      nodePos -
      c -
      Returns:
    • build

      public void build(Map<String,V> map)
      Build a AhoCorasickDoubleArrayTrie from a map
      Parameters:
      map - a map containing key-value pairs
    • exactMatchSearch

      public int exactMatchSearch(String key)
      match exactly by a key
      Parameters:
      key - the key
      Returns:
      the index of the key, you can use it as a perfect hash function
    • size

      public int size()
      Get the size of the keywords
      Returns: