Let's talk about algorithms. Newbies often perceive them as something difficult, complex and incomprehensible, and this is partly true, but algorithms are the basis. And the better you know the basics of your specialty, the more likely you are to excel in it.
Today we'll take a look at one beautiful algorithm problem. We will not scare people away from working with algorithms at the very start, so in our analysis there will be no spreading trees of segments, no assorted optimizations of the knapsack problem, or probabilistic tests for simplicity. User-friendly algos.
Here's the challenge: find the maximum difference between neighbors.
An array of N integers is given. It is not ordered in any way, and numbers can be repeated. Let's say we sorted it and calculated the difference between each pair of consecutive elements. It is necessary to find the maximum such difference and do it in the most optimal way.
Complicated? You can try to do this before you click "Read more", and then check the solution.
So let's go. Maximum difference between neighbors.
Consider an example:
Let an array of N = 11 elements be given.
First, let's solve the problem naively, it often helps. We can do exactly what the problem requires of us - sort the numbers and find the difference between adjacent elements. The complexity of the solution will vary depending on which sorting we use.
If we use qsort or mergesort , then the time complexity is ... If we know that the numbers are distributed rather densely on the set of integers, then we can use counting sort. Then we get , where U is the difference between the maximum and minimum element. The case is relatively rare, but it is worth remembering about this possibility.
After sorting, we get an array:
Let's write out an array of differences:
We
get that the maximum difference is 6. Now let's think, is it possible to solve the problem faster? When iterating over pairs of neighboring elements, we will consider many pairs with a small difference. Such pairs may never be able to give the greatest difference. Indeed, in the sorted array we have a pair of adjacent numbers, let the difference between maximum and minimum be ... Then the largest difference should have at least ... This estimate is true by the Dirichlet principle.
If the pigeons are placed in cages, and the number of pigeons is greater than the number of cages, then at least one of the cages contains more than one pigeon.
9 cells contain 10 pigeons, according to the Dirichlet principle, at least one cell contains more than one pigeon ( Wikipedia )
Let , ... , - i-th element of the sorted array. Then ...
If we assume that the maximum of all D [i] is less , then the whole amount will be strictly less than U, a contradiction.
Great, we got a lower bound for the maximum difference! Now let's try to somehow localize elements that are too close to each other - we split the segment from the minimum to the maximum element into half-intervals of length ... Each element will fall into exactly one half-interval. We will get a partition of all elements into disjoint groups, they are also called batches.
It is very simple to determine which of them the element x fell into - you need to calculate 1 + int ((x - a_min) / L) (we number from one), where a_min is the minimum element in the array A, it can be found in one pass along the original array. The only exception will be the maximum element - it is better to make elements with such a value in a separate batch.
The figure shows the distribution from the example described above. For clarity, let's do this with our hands:
Batch distribution is performed in linear time and requires additional memory. Please note that items from one batch cannot give the maximum difference between two items. We have specially selected its size in this way.
Where can there be two adjacent elements? Of course, in neighboring non-empty batches! For example, in the figure, elements from the third and sixth batches can go in a row in a sorted array, since the batches between them are empty. It is clear that only two elements will be adjacent - the maximum from the 3rd batch and the minimum from the 6th. Similarly, candidates for a pair with the maximum difference will be the maximum element of the first batch and the minimum element of the third.
Let us write down all possible pairs of elements that can give the greatest difference. Let's denote as min (i) - the minimum element in the i-th group, as max (i) - the maximum element.
max (1) = -1 min (3) = 3
max (3) = 4 min (6) = 10
max (6) = 10 min (8) = 15
max (8) = 15 min (10) = 20
max (10) = 21 min (11) = 22
We have identified pairs that are worth considering. In practice, it is necessary to make one pass through all batches, maintain the last non-empty batch and the maximum value in it. When we come across the next non-empty batch, we will find the minimum in it and try to update the answer.
We will be glad - we solved the problem in time ...
In fact, we used a fairly well-known idea and in fact took the first steps of pocket sorting, in the original it is called bucket-sort .
Items are arranged into baskets, and then the items in each basket are sorted
In the worst case, it works for , but the average running time with a sufficient number of batches and a uniform distribution of elements is ...
“But wait, but what about the case when , why haven't you considered it? ”- the attentive reader will ask. This case is degenerate, so we haven't considered it, but let's do it now for completeness.
If the difference between the minimum and maximum is zero, then the maximum difference is also zero. Actually, that's all.