Why Sorted Array Are Processed Faster Than Unsorted Array?(mayankjain.me) |
Why Sorted Array Are Processed Faster Than Unsorted Array?(mayankjain.me) |
"Well, this is due to something called as Branch Prediction. Branch Prediction is a strategy used in processors to decide if the branch would be taken or not."
Emphatically false. Python's sort is based on timsort. Timsort makes the observation that most real-world lists are partially sorted; either forward or reverse. This can help sorting go faster. The observed timing difference is nearly all to do with that, and not with branch prediction.
Consider the Shell sort. It's O(n*n) for an unsorted list and O(n) for a sorted list. Again, nothing to do with branch prediction and everything to do with how well the data fits the algorithm's expectations.
I am just looping onto a list how is this related to complexity of sorting etc..
Sorry, but I didn't understood you properly.
I still don't believe it's due to branch prediction. I believe it's due to cache coherency.
Here's a more extensive example, where I shuffle the last N elements:
import time
from random import shuffle
x = [i for i in range(1000000)]
tick=time.clock()
sum1=0
for i in x:
sum1=sum1+i
toc=time.clock()
print "ref", toc-tick, sum1
for N in (0, 250000, 500000, 750000, 1000000):
y1 = range(0, N)
y2 = range(N, 1000000)
shuffle(y2)
y = y1 + y2
tick=time.clock()
sum1=0
for i in y:
sum1=sum1+i
toc=time.clock()
print "shuffle last %d: %f %d" % (len(x)-N, toc-tick, sum1)
assert x == y, "These should be identical when N==1000000"
print "== Two summations of the same array =="
tick=time.clock()
sum1=0
for i in x:
sum1=sum1+i
toc=time.clock()
print "ref (again)", toc-tick, sum1
tick=time.clock()
sum1=0
for i in y:
sum1=sum1+i
toc=time.clock()
print "final range (again)", toc-tick, sum1
When I run the above, the output is: ref 0.158759 499999500000
shuffle last 1000000: 0.254425 499999500000
shuffle last 750000: 0.228932 499999500000
shuffle last 500000: 0.252207 499999500000
shuffle last 250000: 0.224307 499999500000
shuffle last 0: 0.255656 499999500000
== Two summations of the same array ==
ref (again) 0.158914 499999500000
final range (again) 0.250253 499999500000
As you can see, the last two tests use arrays x and y where x == y == range(1000000), but their timing numbers are quite different. Branch prediction shouldn't cause that difference. Poor cache behavior might.Here's an attempt to double-check my view. I use an array.array(), which stores the numbers sequentially both for the sorted and random versions:
import array
import time
from random import shuffle
x = [i for i in range(1000000)]
x = array.array("i", x)
tick=time.clock()
sum1=0
for i in x:
sum1=sum1+i
toc=time.clock()
print "sorted", toc-tick, sum1
x = [i for i in range(1000000)]
shuffle(x)
x = array.array("i", x)
tick=time.clock()
sum1=0
for i in x:
sum1=sum1+i
toc=time.clock()
print "unsorted", toc-tick, sum1
When I run the above, I find that both timings are essentially identical: sorted 0.159446 499999500000
unsorted 0.161541 499999500000
This gives very strong evidence (unless I made another mistake) that branch prediction does not play a role, while cache prediction should be the actual suspect.