Package io.netty.buffer.search
Class AbstractMultiSearchProcessorFactory
java.lang.Object
io.netty.buffer.search.AbstractMultiSearchProcessorFactory
- All Implemented Interfaces:
MultiSearchProcessorFactory, SearchProcessorFactory
- Direct Known Subclasses:
AhoCorasicSearchProcessorFactory
public abstract class AbstractMultiSearchProcessorFactory
extends Object
implements MultiSearchProcessorFactory
Base class for precomputed factories that create MultiSearchProcessors.
The purpose of MultiSearchProcessor is to search efficiently for multiple needles in the haystack simultaneously, scanning every byte of the input sequentially and only once. It can also be used to search for a single needle, but a SearchProcessorFactory is more efficient for that case.
See the documentation of AbstractSearchProcessorFactory for a comprehensive description of common usage.
In addition to the functionality provided by SearchProcessor, MultiSearchProcessor adds a method that returns the index of the needle found at the current position of the MultiSearchProcessor: MultiSearchProcessor.getFoundNeedleId().
Note: in some cases one needle can be a suffix of another needle, e.g. {"BC", "ABC"}, so multiple needles can end at the same position of the haystack. In such a case MultiSearchProcessor.getFoundNeedleId() returns the index of the longest matching needle in the array of needles.
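The longest-needle rule described in the note can be illustrated with a brute-force reference check. This is only a sketch in plain Java and is not how Netty computes the result; the class LongestSuffixNeedle and the method longestNeedleEndingAt are hypothetical names introduced for illustration.

```java
// Hypothetical helper, not part of Netty: a brute-force model of the
// "longest needle ending at this position" rule.
final class LongestSuffixNeedle {

    /**
     * Returns the index (in needles) of the longest needle whose last byte
     * sits exactly at endIndex in haystack, or -1 if no needle ends there.
     */
    static int longestNeedleEndingAt(byte[] haystack, int endIndex, byte[]... needles) {
        int best = -1;
        for (int id = 0; id < needles.length; id++) {
            byte[] needle = needles[id];
            int start = endIndex - needle.length + 1;
            if (needle.length == 0 || start < 0) {
                continue; // empty needle, or needle too long to end at endIndex
            }
            boolean matches = true;
            for (int j = 0; j < needle.length; j++) {
                if (haystack[start + j] != needle[j]) {
                    matches = false;
                    break;
                }
            }
            // Keep the longest needle among those ending at endIndex.
            if (matches && (best == -1 || needle.length > needles[best].length)) {
                best = id;
            }
        }
        return best;
    }
}
```

With the haystack "ABC" and the needles {"BC", "ABC"}, both needles end at position 2, and the helper returns 1, the index of "ABC".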
Usage example (given that the haystack is a ByteBuf containing "ABCD" and the needles are "AB", "BC" and "CD"):
MultiSearchProcessorFactory factory = AbstractMultiSearchProcessorFactory.newAhoCorasicSearchProcessorFactory(
"AB".getBytes(CharsetUtil.UTF_8), "BC".getBytes(CharsetUtil.UTF_8), "CD".getBytes(CharsetUtil.UTF_8));
MultiSearchProcessor processor = factory.newSearchProcessor();
int idx1 = haystack.forEachByte(processor);
// idx1 is 1 (index of the last character of the occurrence of "AB" in the haystack)
// processor.getFoundNeedleId() is 0 (index of "AB" in needles[])
int continueFrom1 = idx1 + 1;
// continue the search starting from the next character
int idx2 = haystack.forEachByte(continueFrom1, haystack.readableBytes() - continueFrom1, processor);
// idx2 is 2 (index of the last character of the occurrence of "BC" in the haystack)
// processor.getFoundNeedleId() is 1 (index of "BC" in needles[])
int continueFrom2 = idx2 + 1;
int idx3 = haystack.forEachByte(continueFrom2, haystack.readableBytes() - continueFrom2, processor);
// idx3 is 3 (index of the last character of the occurrence of "CD" in the haystack)
// processor.getFoundNeedleId() is 2 (index of "CD" in needles[])
int continueFrom3 = idx3 + 1;
int idx4 = haystack.forEachByte(continueFrom3, haystack.readableBytes() - continueFrom3, processor);
// idx4 is -1 (no more occurrences of any of the needles)
// This search session is complete, processor should be discarded.
// To search for the same needles again, reuse the same AbstractMultiSearchProcessorFactory
// to get a new MultiSearchProcessor.
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type: static AhoCorasicSearchProcessorFactory
Method: newAhoCorasicSearchProcessorFactory(byte[]... needles)
Description: Creates a MultiSearchProcessorFactory based on the Aho–Corasick string search algorithm.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface io.netty.buffer.search.MultiSearchProcessorFactory
newSearchProcessor
-
Constructor Details
-
AbstractMultiSearchProcessorFactory
public AbstractMultiSearchProcessorFactory()
-
-
Method Details
-
newAhoCorasicSearchProcessorFactory
public static AhoCorasicSearchProcessorFactory newAhoCorasicSearchProcessorFactory(byte[]... needles)
Creates a MultiSearchProcessorFactory based on the Aho–Corasick string search algorithm.
Precomputation (this method) time is linear in the size of the input (O(Σ|needles|)).
The factory allocates and retains an array of 256 * X ints plus another array of X ints, where X is the sum of the lengths of all entries of needles minus the sum of the lengths of the repeated prefixes of the needles.
Search (the actual application of MultiSearchProcessor) time is linear in the size of the ByteBuf on which the search is performed (O(|haystack|)). Every byte of the ByteBuf is processed only once, sequentially, regardless of the number of needles being searched for.
- Parameters:
needles - a varargs array of arrays of bytes to search for
- Returns:
a new instance of AhoCorasicSearchProcessorFactory precomputed for the given needles
-
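The memory layout and single-pass scan described for this method can be sketched in plain Java. This is a minimal illustration of a dense-table Aho–Corasick automaton under stated assumptions, not Netty's actual implementation: the class AhoCorasickSketch and its members are hypothetical, needles are assumed to be non-empty, and the sketch counts the trie root as a state, so its tables hold one more state than the X defined above.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Netty code: Aho–Corasick with a dense jump table.
final class AhoCorasickSketch {
    private static final int ALPHABET = 256;

    final int[] jump;    // jump[state * 256 + byte] = next state
    final int[] matchId; // longest needle ending in this state, or -1

    AhoCorasickSketch(byte[]... needles) {
        int maxStates = 1; // root plus at most one state per needle byte
        for (byte[] n : needles) maxStates += n.length;
        int[] trie = new int[maxStates * ALPHABET]; // 0 means "no child"
        int[] match = new int[maxStates];
        Arrays.fill(match, -1);
        int states = 1;
        for (int id = 0; id < needles.length; id++) {
            int s = 0;
            for (byte b : needles[id]) {
                int slot = s * ALPHABET + (b & 0xFF);
                if (trie[slot] == 0) trie[slot] = states++; // shared prefixes reuse states
                s = trie[slot];
            }
            if (match[s] == -1) match[s] = id;
        }
        jump = Arrays.copyOf(trie, states * ALPHABET);
        matchId = Arrays.copyOf(match, states);

        // Breadth-first pass: turn the trie into a full automaton via failure links.
        int[] fail = new int[states];
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        for (int c = 0; c < ALPHABET; c++) {
            if (jump[c] != 0) queue.add(jump[c]);
        }
        while (!queue.isEmpty()) {
            int s = queue.poll();
            int f = fail[s];
            // A state's own needle is longer than any suffix needle, so inherit only if absent.
            if (matchId[s] == -1) matchId[s] = matchId[f];
            for (int c = 0; c < ALPHABET; c++) {
                int slot = s * ALPHABET + c;
                int child = jump[slot];
                if (child != 0) {
                    fail[child] = jump[f * ALPHABET + c];
                    queue.add(child);
                } else {
                    jump[slot] = jump[f * ALPHABET + c];
                }
            }
        }
    }

    /** Scans the haystack once; each hit is {index of last matched byte, needle index}. */
    List<int[]> findAll(byte[] haystack) {
        List<int[]> hits = new ArrayList<>();
        int state = 0;
        for (int i = 0; i < haystack.length; i++) {
            state = jump[state * ALPHABET + (haystack[i] & 0xFF)];
            if (matchId[state] != -1) hits.add(new int[] {i, matchId[state]});
        }
        return hits;
    }

    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        AhoCorasickSketch ac = new AhoCorasickSketch(ascii("AB"), ascii("BC"), ascii("CD"));
        for (int[] hit : ac.findAll(ascii("ABCD"))) {
            System.out.println("end=" + hit[0] + " needle=" + hit[1]);
        }
    }
}
```

For the needles "AB" and "AC", the shared prefix "A" is stored once, so the sketch builds 4 states (the root plus X = 3), and its jump table holds 4 * 256 ints. Each haystack byte costs one table lookup, which is why the scan time does not depend on the number of needles.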