String matching algorithm

April 16, 2025

String Matching Algorithms: A Friendly Guide

String matching is the backbone of search engines, plagiarism checkers, and even DNA sequencing. The goal? Find a pattern (substring) within a text (larger string).

Here’s a breakdown of key algorithms, from brute force to genius-level optimizations.

1. Brute-Force (Naive) Algorithm

How it works:

Slide the pattern over the text one character at a time.
At each position, check if the pattern matches the current text window.

Example:
Text: "AABAACAADAABAABA"
Pattern: "AABA"

Checks AABA at position 0 (match!), then shifts right by 1.

Time Complexity:

Worst case: O(n×m) (if no matches exist).

Pros:

Simple to implement.
Cons:
Slow for large texts.

2. Rabin-Karp Algorithm

How it works:

Uses hashing to compare the pattern with text windows.
Computes hash values for the pattern and each text window.
Only checks character-by-character if hashes match (to avoid false positives).

Example:
Text: "GEEKSFORGEEKS"
Pattern: "GEEK"

Hash("GEEK") = some value.
Compare hash with every 4-letter window in the text.

Time Complexity:

Average case: O(n + m).
Worst case: O(n×m) (if all hashes collide).

Pros:

Efficient for multiple pattern searches (like plagiarism checks).
Cons:
Hash collisions can slow it down.

3. Knuth-Morris-Pratt (KMP) Algorithm

How it works:

Preprocesses the pattern to create a prefix table (LPS array).
Uses this table to skip unnecessary comparisons.

Key Idea:

If a mismatch occurs, KMP knows how much to shift the pattern without rechecking.

Example:
Text: "ABABDABACDABABCABAB"
Pattern: "ABABCABAB"

LPS table helps skip redundant checks.

Time Complexity:

O(n + m) (linear time!).

Pros:

No backtracking in text.
Great for long patterns.
Cons:
Slightly complex to implement.

Search This Blog

Radhika Nanda