Genetic scientists of the planet
Scientists have already deduced some primitive organism and want to modify
its genome in such a way as to obtain an ideal organism. They believe that in
the future it will help to find a cure for many diseases.
An organism is considered ideal if any two identical genes are either
located in adjacent positions in the genome, or there is at least one the same
gene between them.
In one operation, scientists can select and remove one or more identical
genes from the body's genome, after which they can be inserted back into the
genome, but perhaps in different positions. Since each such operation weakens
the body, scientists want to achieve their goal, while performing as few
operations as possible.
Write a program that, according to a given representation of the genome,
will determine the smallest number of operations necessary to obtain an ideal
organism.
Input. The first line contains the number of genes n (1 ≤ n ≤ 105) in the genome of the primitive organism.
The next line contains n positive
integer, each does not exceed n – the
sequence of genes in the genome.
Output. Print the least number of
operations for which scientists can get a perfect organism.
Sample
input |
Sample
output |
9 1 2 1 3 1
3 2 4 5 |
2 |
greedy
The genes should
be rearranged in such a way that the same genes stand side by side. In this
case, the number of operations performed should be minimized.
For each positive integer ai associate the segment [si; ei], where si is the position where ai appears the first time and ei is the position where ai appears the last time.
Let res be the largest possible number of
disjoint segments. Find this value using the “order selection” algorithm. This means that each
of the other segments intersects with at least one of the selected ones. And over the genes that
correspond to them, it is necessary to carry out the described procedure.
If k is the number
of all constructed segments, then the answer to the problem will be the value k – res.
Example
For the given
example, the segments will have the form (the positions are numbered starting from zero):
·
for 1: [0; 4];
·
for 2: [1; 6];
·
for 3: [3; 5];
·
for 4: [7; 7];
·
for 5: [8; 8];
Sort the segments
by the coordinate of its end:
[0; 4], [3; 5], [1; 6], [7; 7], [8; 8]
The maximum
number of non-intersecting line segments is 3. For example, you can select line
segments [0; 4], [7; 7], [8; 8]. The procedure described in the problem
statement should be carried out over the genes corresponding to the remaining
segments. These genes are 2 and 3. Two permutations can, for example, be done
as follows:
Let MAX be the maximum possible number of genomes in the
input sequence.
#define MAX 100001
The vector of pairs v stores
the segments [si; ei].
vector<pair<int, int> > v;
int s[MAX], e[MAX];
Read the input data. Initialize s[i] =
-1, e[i] = -1.
scanf("%d", &n);
memset(s, -1, sizeof(s));
memset(e, -1, sizeof(e));
for (i = 0; i < n; i++)
{
scanf("%d", &x);
If number x occurs for the first time, then its
position in the input sequence is stored in s[x].
if (s[x] == -1) s[x] = i;
If number x occurs not for the first time, then its
position is stored in e[x], assuming that x occurs for the
last time.
if (i > e[x]) e[x] = i;
}
Store all constructed segments in the vector v.
for (i = 1; i <= n; i++) //
numbers 1..n
if (s[i] != -1) v.push_back(make_pair(e[i], s[i]));
The segments were inserted to the vector in the reverse order (like [ei; si]), so that default sort function will sort the segments in increasing order of
their ends.
sort(v.begin(),
v.end());
Find the maximum number res
of disjoint segments.
i = res = 0;
while (i < v.size())
{
The i-th segment is added to the
set of disjoint segments. The variable temp
contains the end of this segment (let's call it the current segment).
res++;
temp = v[i++].first; // end
As long as the start of the next segment is less than the end of the current
one, skip such segment (it intersects with the current one).
while (i < v.size() && v[i].second < temp) i++;
}
Print the answer.
printf("%d\n", v.size() -
res);