Class AhoCorasickDoubleArrayTrie<V>

  • All Implemented Interfaces:
    Serializable

    public class AhoCorasickDoubleArrayTrie<V>
    extends Object
    implements Serializable
    An implementation of Aho Corasick algorithm based on Double Array Trie
    Author:
    hankcs
    See Also:
    Serialized Form
    • Field Detail

      • check

        protected int[] check
        check array of the Double Array Trie structure
      • base

        protected int[] base
        base array of the Double Array Trie structure
      • fail

        protected int[] fail
        fail table of the Aho Corasick automata
      • output

        protected int[][] output
        output table of the Aho Corasick automata
      • v

        protected V[] v
        outer value array
      • l

        protected int[] l
        the length of every key
      • size

        protected int size
        the size of base and check array
    • Constructor Detail

      • AhoCorasickDoubleArrayTrie

        public AhoCorasickDoubleArrayTrie()
    • Method Detail

      • parseText

        public void parseText​(char[] text,
                              AhoCorasickDoubleArrayTrie.IHit<V> processor)
        Parse text
        Parameters:
        text - The text
        processor - A processor which handles the output
      • parseText

        public void parseText​(char[] text,
                              AhoCorasickDoubleArrayTrie.IHitFull<V> processor)
        Parse text
        Parameters:
        text - The text
        processor - A processor which handles the output
      • matches

        public boolean matches​(String text)
        Checks that string contains at least one substring
        Parameters:
        text - source text to check
        Returns:
        true if string contains at least one substring
      • findFirst

        public AhoCorasickDoubleArrayTrie.Hit<V> findFirst​(String text)
        Search first match in string
        Parameters:
        text - source text to check
        Returns:
        first match or null if there are no matches
      • get

        public V get​(String key)
        Get value by a String key, just like a map.get() method
        Parameters:
        key - The key
        Returns:
      • set

        public boolean set​(String key,
                           V value)
        Update a value corresponding to a key
        Parameters:
        key - the key
        value - the value
        Returns:
        successful or not(failure if there is no key)
      • get

        public V get​(int index)
        Pick the value by index in value array
        Notice that to be more efficiently, this method DO NOT check the parameter
        Parameters:
        index - The index
        Returns:
        The value
      • transition

        protected int transition​(int current,
                                 char c)
        transition of a state
        Parameters:
        current -
        c -
        Returns:
      • transitionWithRoot

        protected int transitionWithRoot​(int nodePos,
                                         char c)
        transition of a state, if the state is root and it failed, then returns the root
        Parameters:
        nodePos -
        c -
        Returns:
      • build

        public void build​(Map<String,​V> map)
        Build a AhoCorasickDoubleArrayTrie from a map
        Parameters:
        map - a map containing key-value pairs
      • exactMatchSearch

        public int exactMatchSearch​(String key)
        match exactly by a key
        Parameters:
        key - the key
        Returns:
        the index of the key, you can use it as a perfect hash function
      • size

        public int size()
        Get the size of the keywords
        Returns: