Find all occurrences of the string word in
the string text.
Input. The first line contains the string text, and the
second line contains the string word. The lengths of both strings are
greater than 0 and less than 50000. The strings consist only of
Latin letters.
Output. Print the starting indices of all occurrences of the
string word in the string text, in increasing order. As is
customary in programming, indexing starts from zero.
Sample
input |
Sample
output |
ababbababa aba |
0 5 7 |
SOLUTION
strings – Knuth–Morris–Pratt
Algorithm analysis
This
task requires implementing the Knuth–Morris–Pratt algorithm for substring
search in a text.
Algorithm implementation
Define
the following variables:
·
text – the string in which we need to find
all occurrences of the pattern word.
·
word – the substring (pattern) we are
searching for in the string text.
·
p – the prefix function array for the
string word.
·
pos – a list of all positions in text
where occurrences of the substring word begin.
string text, word;
vector<int> p, pos;
The
function MaxBorderArray
computes the prefix function of the string s and stores its values in
the array p.
void MaxBorderArray(string &s)
{
p.resize(s.size());
for (int i = 1; i < s.size(); i++)
{
int j = p[i - 1];
while ((j > 0) &&
(s[i] != s[j])) j = p[j - 1];
if (s[i] == s[j]) j++;
p[i] = j;
}
}
The
function kmp
implements the Knuth–Morris–Pratt algorithm. All starting indices of
occurrences of the string word in the text text are stored in the
array pos.
void kmp(string &word, string &text)
{
MaxBorderArray(word);
for (int i = 0, j = 0; i < text.size(); i++)
{
while (j && (text[i] != word[j])) j = p[j - 1];
if (text[i] == word[j]) j++;
If the index j reaches
the last character of the string word, it means a complete occurrence of
the word word has been found in the text. The starting position of this
occurrence is i – j + 1 (character indexing starts from zero).
if (j == word.size()) pos.push_back(i - j + 1);
}
}
The main part of the program. Read the input strings.
cin >> text;
cin >> word;
Run the Knuth-Morris-Pratt
algorithm.
kmp(word, text);
Print the found occurrence positions.
for (auto x : pos)
cout << x << " ";
cout << endl;