The Knuth-Morris-Pratt (KMP) algorithm is a string-matching algorithm that efficiently finds occurrences of a pattern string within a larger text string. It was developed by Donald Knuth, Vaughan Pratt, and James Morris in 1977 and is considered to be one of the most efficient string-matching algorithms.
The basic idea behind the KMP algorithm is to avoid unnecessary comparisons by using information from previous comparisons to determine where to begin the next comparison. Specifically, the algorithm precomputes a partial match table, which is used to determine the longest proper prefix of the pattern that is also a suffix of the pattern for each position in the pattern. This information is then used to determine where to begin the next comparison in the text string.
The KMP algorithm has a time complexity of O(m + n), where m is the length of the pattern and n is the length of the text string. This is because the precomputation of the partial match table takes O(m) time, and the matching of the pattern against the text string takes O(n) time.
The KMP algorithm is widely used in text editors, compilers, and other applications that involve searching for patterns within large text strings. It is particularly useful for searching for patterns that contain repeated substrings, since it can exploit the repetition to improve its efficiency.
fun kmp(text: String, pattern: String): List<Int> {
val n = text.length
val m = pattern.length
val lps = computeLPS(pattern)
var i = 0
var j = 0
val indices = mutableListOf<Int>()
while (i < n) {
if (text[i] == pattern[j]) {
i++
j++
}
if (j == m) {
indices.add(i - j)
j = lps[j - 1]
} else if (i < n && text[i] != pattern[j]) {
if (j != 0) {
j = lps[j - 1]
} else {
i++
}
}
}
return indices
}
kmp
function takes two string arguments: text
and pattern
. It first computes the longest proper prefix that is also a suffix of the pattern using the computeLPS
function. It then uses two pointers i
and j
to traverse the text and pattern strings, respectively. If a match is found, the pointers are incremented. If the entire pattern is found, the index is added to a list of indices. If there is a mismatch, the pointer j
is updated using the LPS array. The function returns the list of indices where the pattern is found in the text.
fun computeLPS(pattern: String): IntArray {
val m = pattern.length
val lps = IntArray(m)
var len = 0
var i = 1
while (i < m) {
if (pattern[i] == pattern[len]) {
len++
lps[i] = len
i++
} else {
if (len != 0) {
len = lps[len - 1]
} else {
lps[i] = 0
i++
}
}
}
return lps
}
computeLPS(pattern: String): IntArray
– This function takes a string pattern as input and returns an integer array representing the longest proper prefix which is also a suffix for each position in the pattern. It uses the concept of LPS (Longest Prefix which is also a Suffix) to compute the array. It initializes the LPS value for the first character as 0 and then iterates through the remaining characters of the pattern. For each character, it compares the previous character’s LPS value and checks if it is equal to the current character. If they match, it increments the LPS value for the current character by 1. If not, it updates the LPS value by checking for the longest proper prefix which is also a suffix for the current character.
fun kmp(text: String, pattern: String): List<Int> {
val n = text.length
val m = pattern.length
val lps = computeLPS(pattern)
var i = 0
var j = 0
val indices = mutableListOf<Int>()
while (i < n) {
if (text[i] == pattern[j]) {
i++
j++
}
if (j == m) {
indices.add(i - j)
j = lps[j - 1]
} else if (i < n && text[i] != pattern[j]) {
if (j != 0) {
j = lps[j - 1]
} else {
i++
}
}
}
return indices
}
fun computeLPS(pattern: String): IntArray {
val m = pattern.length
val lps = IntArray(m)
var len = 0
var i = 1
while (i < m) {
if (pattern[i] == pattern[len]) {
len++
lps[i] = len
i++
} else {
if (len != 0) {
len = lps[len - 1]
} else {
lps[i] = 0
i++
}
}
}
return lps
}
fun main() {
val text = "ABABDABACDABABCABAB"
val pattern = "ABABCABAB"
val indices = kmp(text, pattern)
if (indices.isEmpty()) {
println("Pattern not found in text")
} else {
println("Pattern found at indices: $indices")
}
}
main
function demonstrates how to use the kmp
function to find a pattern in a text string. In this example, we search for the pattern “ABABCABAB” in the text “ABABDABACDABABCABAB”. If the pattern is found, the indices at which it occurs are printed. If the pattern is not found, a message is printed indicating that it was not found.
Output : Pattern found at indices: [10]