Indeed, in addition to the performance indicator, there are other characteristics, often equally important. One of them is naturalness . What it is? Sorting is called natural if it is faster when the array is almost sorted. And what array can be considered "almost sorted"? Here are two arrays of the same elements:
{1,2,9,3,4,5,7,6,8,10} and {9,1,6,3,10,5,4,2, 8.7}
Even by eye you can see that the first array is more ordered (only 9 and 7 are "out of place"). Whereas in the second array the numbers are mixed randomly. What is the measure of orderliness? The answer is known - the number of inversions. A pair of elements A [i] and A [j] for j> i constitute an inverse if A [j] <A [i]. (In this note, the purpose of sorting is always ascending order.)
Now we can give a precise definition: sorting is called natural if the speed of its operation decreases with a decrease in the number of inversions in the original array.
Naturalness is a rather “rare fruit” in the world of sorting - neither QuickSort nor Shell sort have this property, alas. But there is one algorithm that, one might say, is absolutely natural - it is insertion sort. Although this algorithm is known to every cultured person, let me briefly repeat its essence.
The array to be sorted is scanned once from beginning to end. As soon as it is found that the i-th and (i-1) -elements form an inversion, the i-th element is "thrown" back (which is achieved by shifting the desired segment of the beginning of the array to the right by one position). It is clear from this description that if there are few inversions in the array, the algorithm will "fly" through the array very quickly. If there are no inversions at all, then the algorithm will complete in O (n) time. But QuickSort in this case will be long and tedious to select a separating element, divide an already ordered array into segments, etc. But with a large number of inversions, the situation will, of course, be the opposite: the performance of insertion sort will drop to O (n ^ 2), and QuickSort will be the champion - O (n * log (n)).
This situation raises a natural question from my point of view: how many inversions outweigh naturalness and insertion sort works faster than QuickSort?
To answer this question, I ran a series of computational experiments. Their essence was as follows. Arrays of integers ranging from 3000 to 30,000 elements were taken, a certain number of inversions were introduced into them, then the arrays were sorted by insertions and quick sort. The sorting time was measured (in system ticks). For averaging, each sort was repeated 10 times. The results were as follows:
The picture shows the dependence of the sorting time on the number of introduced inversions. The raspberry series is, of course, QuickSort, and the blue series is insertion sort. For an array size of 30 thousand elements, up to about 400 thousand inversions "natural wins." For less ordered arrays, QuickSort is already more beneficial.
And the next picture shows the empirical dependence of the critical number of inversions on the size of the array.
It is easy to see that for an array of size n, the critical number of inversions is approximately 10 * n. With more inversions, QuickSort is beneficial. It should be noted that the maximum number of inversions for an array of length n is n * (n-1) / 2. The value 10 * n is a very insignificant part of them. Which is not surprising.
To what has been said, it remains to add that the results of such experiments depend on many factors (programming language, the type of data being sorted, etc.) To be honest, I was too lazy to investigate these issues in more detail. My experiments were done in Microsoft Excel in VBA environment:
Test source code
Private Declare Function GetTickCount Lib "kernel32" () As Long
':::
Sub JSort(A() As Long)
n& = UBound(A, 1)
For i& = 2 To n
If A(i&) < A(i& - 1) Then
j& = i& - 1
x& = A(i&)
Do While (A(j&) > x&)
A(j& + 1) = A(j&)
j& = j& - 1
If j& = 0 Then Exit Do
Loop
A(j& + 1) = x&
End If
Next i&
End Sub
':::
Sub QSort(A() As Long, Optional b As Long = 1, Optional e As Long = 0)
If (e - b) <= 1 Then Exit Sub
i& = b
j& = e
w& = A((i& + j&) / 2)
Do While (i& < j&)
Do While (A(i&) < w&)
i& = i& + 1
Loop
Do While (A(j&) > w&)
j& = j& - 1
Loop
If i& < j& Then
tmp& = A(i&)
A(i&) = A(j&)
A(j&) = tmp&
i& = i& + 1
j& = j& - 1
End If
Loop
If j& > b Then QSort A, b, j&
If i& < e Then QSort A, i&, e
End Sub
'::: ( )
Sub InsInv(A() As Long, k As Long)
n& = UBound(A, 1)
For i& = 1 To k
Do
k1& = n& * Rnd
k2& = n& * Rnd
If (k1& <> k2&) And (k1& >= 1) And (k2& >= 1) Then Exit Do
Loop
tmp& = A(k1&)
A(k1&) = A(k2&)
A(k2&) = tmp&
Next i&
End Sub
':::
Function NumInv(A() As Long) As Long
n& = UBound(A, 1)
For i& = 1 To n& - 1
For j& = i& + 1 To n&
If A(j&) < A(i&) Then NumInv = NumInv + 1
Next j&
Next i&
End Function
':::
Sub Gtest_1()
Dim Eta() As Long
Dim Arr() As Long
Size& = CLng(InputBox("Sz="))
ReDim Eta(1 To Size&) As Long
ReDim Arr(1 To Size&) As Long
Randomize
For i& = 1 To Size&
Eta(i&) = i&
Next i&
q# = Size& * (Size& - 1) / 2
For iii% = 1 To 10
InsInv Eta, CLng(iii%)
ni# = CDbl(NumInv(Eta))
Cells(iii%, 1).Value = ni#
DoEvents
S# = 0
For l% = 1 To 10
For i& = 1 To Size&
Arr(i&) = Eta(i&)
Next i&
TBeg& = GetTickCount
JSort Arr
TEnd& = GetTickCount
S# = S# + TEnd& - TBeg&
Next l%
Cells(iii%, 2).Value = S#
DoEvents
S# = 0
For l% = 1 To 10
For i& = 1 To Size&
Arr(i&) = Eta(i&)
Next i&
TBeg& = GetTickCount
QSort Arr, 1, Size&
TEnd& = GetTickCount
S# = S# + TEnd& - TBeg&
If Not check(Arr) Then Exit Sub
Next l%
Cells(iii%, 3).Value = S#
DoEvents
Next iii%
MsgBox "OK"
End Sub
Thanks for attention!
PS
And thanks to everyone who noted my mistakes! (Incorrect spelling of Timsort - in the first edition there was "TeamSort" and the missing "i" in QuickSort). And yes! - (specially for perfectionists) QuickSort can "degrade" to quadratic performance. But in this post I am considering not the worst, but the best use case for QuickSort.