THE KNUTH - MORRIS-PRATT ALGOTITHM - AOA MODULES 6
Knuth-Morris-Pratt (KMP) Algorithm
Definition
The Knuth-Morris-Pratt (KMP) algorithm is a pattern matching algorithm that improves upon the brute-force method by preprocessing the pattern to determine the longest prefix which is also a suffix. This allows skipping unnecessary comparisons.
Steps
Step 1: Construct the prefix table (LPS Array) for the pattern.
Step 2: Start the matching process by comparing the text and pattern characters.
Step 3: If a mismatch occurs, use the LPS array to determine the next shift.
Step 4: Continue until the entire text is scanned.
Step 5: If q reaches m (length of pattern), a match is found.
Example
Let's consider the following example:
Text (T): abcabababcbc
Pattern (P): bababcb
Prefix Table (LPS)
q | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|
P[q] | b | a | b | a | b | c | b |
Prefix | 0 | 0 | 1 | 2 | 3 | 1 | 1 |
Matching Process
i = 1, q = 0 → Compare T[1] = a with P[1] = b (Mismatch)
i = 2, q = 0 → Compare T[2] = b with P[1] = b (Match)
i = 3, q = 1 → Compare T[3] = c with P[2] = a (Mismatch)
i = 5, q = 0 → Compare T[5] = b with P[1] = b (Match)
i = 6, q = 1 → Compare T[6] = a with P[2] = a (Match)
i = 7, q = 2 → Compare T[7] = b with P[3] = b (Match)
i = 8, q = 3 → Compare T[8] = a with P[4] = a (Match)
i = 9, q = 4 → Compare T[9] = b with P[5] = b (Match)
i = 10, q = 5 → Compare T[10] = c with P[6] = c (Match)
i = 11, q = 6 → Compare T[11] = b with P[7] = b (Match)
Pattern found at shift 4
Conclusion
The KMP algorithm efficiently finds pattern matches in a text with a time complexity of O(n + m), where n is the length of the text and m is the length of the pattern.
Comments
Post a Comment