1498. KMP

 

Find all occurrences of the string word in the string text.

 

Input. The first line contains the string text, and the second line contains the string word. The lengths of both strings are greater than 0 and less than 50000. The strings consist only of Latin letters.

 

Output. Print the starting indices of all occurrences of the string word in the string text, in increasing order. As is customary in programming, indexing starts from zero.

 

Sample input

Sample output

ababbababa

aba

0 5 7

 

 

SOLUTION

stringsKnuth–Morris–Pratt

 

Algorithm analysis

This task requires implementing the Knuth–Morris–Pratt algorithm for substring search in a text.

 

Algorithm implementation

Define the following variables:

·        textthe string in which we need to find all occurrences of the pattern word.

·        wordthe substring (pattern) we are searching for in the string text.

·        pthe prefix function array for the string word.

·        posa list of all positions in text where occurrences of the substring word begin.

 

string text, word;

vector<int> p, pos;

 

The function MaxBorderArray computes the prefix function of the string s and stores its values in the array p.

 

void MaxBorderArray(string &s)

{

  p.resize(s.size());

  for (int i = 1; i < s.size(); i++)

  {

    int j = p[i - 1];

    while ((j > 0) && (s[i] != s[j])) j = p[j - 1];

    if (s[i] == s[j]) j++;

    p[i] = j;

  }

}

 

The function kmp implements the Knuth–Morris–Pratt algorithm. All starting indices of occurrences of the string word in the text text are stored in the array pos.

 

void kmp(string &word, string &text)

{

  MaxBorderArray(word);

  for (int i = 0, j = 0; i < text.size(); i++)

  {

    while (j && (text[i] != word[j])) j = p[j - 1];

    if (text[i] == word[j]) j++;

 

If the index j reaches the last character of the string word, it means a complete occurrence of the word word has been found in the text. The starting position of this occurrence is ij + 1 (character indexing starts from zero).

 

    if (j == word.size()) pos.push_back(i - j + 1);

  }

}

 

The main part of the program. Read the input strings.

 

cin >> text;

cin >> word;

 

Run the Knuth-Morris-Pratt algorithm.

 

kmp(word, text);

 

Print the found occurrence positions.

 

for (auto x : pos)

  cout << x << " ";

cout << endl;