18

Considering there is an array returned from a function which is of very large size.

What will be the fastest approach to test if the array is sorted?

A simplest approach will be:

/// <summary>
/// Determines if int array is sorted from 0 -> Max
/// </summary>
public static bool IsSorted(int[] arr)
{
for (int i = 1; i < arr.Length; i++)
{
    if (arr[i - 1] > arr[i])
    {
    return false;
    }
}
return true;
}
3
  • are there only sequential elements? Commented Aug 16, 2012 at 14:15
  • 2
    If you want better algorithm than O(n), you will have to tolerate some errors. Commented Aug 16, 2012 at 18:36
  • An alternative might be to wrap the list in a Stream (assuming you know the array is packed, which this int[] in your example is) and read sizeof(int) bytes at a time, comparing with the previous bytes read. That'll save you incrementing an index counter, but not sure what it will do to caching. Commented Mar 16, 2014 at 23:02

9 Answers 9

22

You will have to visit each element of the array to see if anything is unsorted.

Your O(n) approach is about as fast as it gets, without any special knowledge about the likely state of the array.

Your code specifically tests if the array is sorted with smaller values at lower indices. If that is not what you intend, your if becomes slightly more complex. Your code comment does suggest that is what you're after.

If you were to have special knowledge of the probable state (say, you know it's generally sorted but new data might be added to the end), you can optimize the order in which you visit array elements to allow the test to fail faster when the array is unsorted.

You can leverage knowledge of the hardware architecture to check multiple parts of the array in parallel by partitioning the array, first comparing the boundaries of the partition (fail fast check) and then running one array partition per core on a separate thread (no more than 1 thread per CPU core). Note though that if a array partition is much smaller than the size of a cache line, the threads will tend to compete with each other for access to the memory containing the array. Multithreading will only be very efficient for fairly large arrays.

Sign up to request clarification or add additional context in comments.

Comments

5

Faster approach, platform target: Any CPU, Prefer 32-bit.
A sorted array with 512 elements: ~25% faster.

static bool isSorted(int[] a)
{
    int j = a.Length - 1;
    if (j < 1) return true;
    int ai = a[0], i = 1;
    while (i <= j && ai <= (ai = a[i])) i++;
    return i > j;
}

Target: x64, same array: ~40% faster.

static bool isSorted(int[] a)
{
    int i = a.Length - 1;
    if (i <= 0) return true;
    if ((i & 1) > 0) { if (a[i] < a[i - 1]) return false; i--; }
    for (int ai = a[i]; i > 0; i -= 2)
        if (ai < (ai = a[i - 1]) || ai < (ai = a[i - 2])) return false;
    return a[0] <= a[1];
}

Forgot one, marginally slower than my first code block.

static bool isSorted(int[] a)
{
    int i = a.Length - 1; if (i < 1) return true;
    int ai = a[i--]; while (i >= 0 && ai >= (ai = a[i])) i--;
    return i < 0;
}

Measuring it (see greybeard's comment).

using System;                                  //  ????????? DEBUG ?????????
using sw = System.Diagnostics.Stopwatch;       //  static bool abc()    
class Program                                  //  {   // a <= b <= c ?  
{                                              //      int a=4,b=7,c=9;  
    static void Main()                         //      int i = 1;  
    {                                          //      if (a <= (a = b))  
        //abc();                               //      {  
        int i = 512;                           //          i++;  
        int[] a = new int[i--];                //          if (a <= (a = c))
        while (i > 0) a[i] = i--;              //          {    
        sw sw = sw.StartNew();                 //              i++;  
        for (i = 10000000; i > 0; i--)         //          }  
            isSorted(a);                       //      }  
        sw.Stop();                             //      return i > 2;  
        Console.Write(sw.ElapsedMilliseconds); //  }  
        Console.Read();                        //  static bool ABC();
    }                                          //  {
                                               //      int[]a={4,7,9};    
    static bool isSorted(int[] a) // OP Cannon //      int i=1,j=2,ai=a[0]; 
    {                                          //  L0: if(i<=j)    
        for (int i = 1; i < a.Length; i++)     //        if(ai<=(ai=a[i]))  
            if (a[i - 1] > a[i]) return false; //          {i++;goto L0;}  
        return true;                           //      return i > j;  
    }                                          //  }  
}

Target: x64. Four cores/threads. A sorted array with 100,000 elements: ~55%.

static readonly object _locker = new object();
static bool isSorted(int[] a)  // a.Length > 3
{
    bool b = true;
    Parallel.For(0, 4, k =>
    {
        int i = 0, j = a.Length, ai = 0;
        if (k == 0) { j /= 4; ai = a[0]; }                        // 0 1
        if (k == 1) { j /= 2; i = j / 2; ai = a[i]; }             // 1 2
        if (k == 2) { i = j - 1; ai = a[i]; j = j / 2 + j / 4; }  // 4 3
        if (k == 3) { i = j - j / 4; ai = a[i]; j = j / 2; }      // 3 2
        if (k < 2)
            while (b && i <= j)
            {
                if (ai <= (ai = a[i + 1]) && ai <= (ai = a[i + 2])) i += 2;
                else lock (_locker) b = false;
            }
        else
            while (b && i >= j)
            {
                if (ai >= (ai = a[i - 1]) && ai >= (ai = a[i - 2])) i -= 2;
                else lock (_locker) b = false;
            }
    });
    return b;
}

1,000,000 items?

if (k < 2)
    while (b && i < j)
        if (ai <= (ai = a[i + 1]) && ai <= (ai = a[i + 2]) &&
            ai <= (ai = a[i + 3]) && ai <= (ai = a[i + 4])) i += 4;
        else lock (_locker) b = false;
else
    while (b && i > j)
        if (ai >= (ai = a[i - 1]) && ai >= (ai = a[i - 2]) &&
            ai >= (ai = a[i - 3]) && ai >= (ai = a[i - 4])) i -= 4;
        else lock (_locker) b = false;

Let's forget percentages.
Original: 0.77 ns/item, now: 0.22 ns/item.
2,000,000 items? Four cores: 4 times faster.

6 Comments

Does this detect inversions? (I think 1) it does 2) it isn't obvious.) Please disclose your approach to measuring this.
Does this detect inversions? Good question! Wikipedia: Let (A(1),...,A(n)) be a sequence of n distinct numbers. If i < j and A(i) > A(j), then the pair (i,j) is an inversion of A. - If the sequence has duplicates, then not all numbers are distinct. -If a pair (i,j), isSorted uses j = i + 1 , and A(i) > A(j), then the sequence is not sorted, and, YES, an inversion is detected in a sequence of not necessarily distinct numbers, we're done. So it does NOT detect inversionS, but if it detects ONE inversion, then we know the sequence is NOT sorted. I added a code block "Measuring it".
(Do not comment comments asking for additional information or clarification: edit your post. The part that made me comment/I don't (yet) find idiomatic: ai <= (ai = a[i]) (and similar).)
Are you sure those timings were taken with a Release build with the debugger NOT attached? If the debugger is attached (i.e. you pressed F5 in Visual Studio), then your timings do not reflect how the code will run in production. Select "Run without Debugging" from the Debug menu if you want real timings.
Yes I am sure. My timings were taken with a Release build (Optimize code "on"). I just checked it on a slower machine, a laptop. Results for a sorted array with 512 elements: Any CPU, prefer 32-bit: 24%, x64: 35%. And to put it simply: run it on your system, check whether my approach is faster (or not).
|
1

Linq solution.

public static bool IsSorted<T>(IEnumerable<T> list) where T:IComparable<T>
{
    var y = list.First();
    return list.Skip(1).All(x =>
    {
        bool b = y.CompareTo(x) < 0;
        y = x;
        return b;
    });
}

2 Comments

The OP asked What will be the fastest approach to test if the array is sorted?. This solution is at best as fast as the OP's solution, and is probably a little slower. Linq may be the new black, but it's not the best solution all of the time.
In fact, this could fail if the passed Enumerable is an iterate-once collection. For example, if somebody were to write x = IsSorted(File.ReadLines("filename.txt")); the method would fail because it would require multiple enumeration of the Enumerable.
1

Here is my version of the function IsSorted

public static bool IsSorted(int[] arr)
{               
    int last = arr.Length - 1;
    if (last < 1) return true;

    int i = 0;

    while(i < last && arr[i] <= arr[i + 1])
        i++;

    return i == last;
}

While this function is a bit faster than in the question, it will do fewer assignments and comparisons than anything has been posted so far. In the worst case, it does 2n+1 comparisons. It still can be improved if you can make a reasonable assumption about the nature of the data like minimum data size or array contains even number of elements.

Comments

0

The only improvement i can think of is check both ends of the array at the same time, this little change will do it in half time...

public static bool IsSorted(int[] arr)
{
int l = arr.Length;
for (int i = 1; i < l/2 + 1 ; i++)
{
    if (arr[i - 1] > arr[i] || arr[l-i] < arr[l-i-1])
    {
    return false;
    }
}
return true;
}

6 Comments

How do you say that this algorithm can execute in half time? Effective you are doing the same amount of comparisons (l/2 * 2 i.e. l).
ok, i see your point, anyway if there's something wrong it might find it sooner, this is when the error is in the first or fourth quarter of the array
Then again, you will find it later if the unsorted portion is in the middle. Adds complexity without adding value. Plus checking from both ends reduces the likelihood that the data you are accessing is in the CPU cache (depending on the size of the array and the underlying architecture).
Not to mention that this will likely be slower because of cache behavior.
@AndrejBauer can you elaborate about the cache behaviour?
|
0

This is what I came up with and find works better particularly with greater sized arrays. The function is recursive and will be called for the very first time, say in a while loop like this

while( isSorted( yourArray, 0 )

The if statement checks if the bounds of the array have been reached.

The else if statement will call itself recursively and break at any time when the condition becomes false

 public static bool IsSorted(int[] arr, int index)
    {
        if (index >= arr.Length - 1)
        {
            return true;
        }
        else if ((arr[index] <= arr[ index + 1]) && IsSorted(arr, index + 1))
        {
            return true;
        }
        else
        {
            return false;
        }
    }

4 Comments

This is what I […] find works better a) better than what? (Better than the code presented in the question?) b) better in which regard: how can I reproduce your finding?
Sorry Michael, but 1)Did you test your solution, int[] arr = new int[]{0,0}, isSorted(arr,0) returns false (change "<" into "<="). 2)On my box it's (much) slower than OP's solution. 3)With a large array ("int[] arr = new int[1000000]"): stack overflow.
Hi @greybeard, yes I was referring to the code presented in the question, this in regard to how execution time for larger arrays. I tried it with samples of random data. If you find this is otherwise please do let me know
@P_P Thank you for that, I had somehow overlooked cases where neighbouring elements are equal. I had tested the solution and found it was faster for larger data sets, however I will do further tests on it
0

If the order doesn't matter(descending or ascending).

private bool IsSorted<T>(T[] values) where T:IComparable<T>
{
    if (values == null || values.Length == 0) return true;

    int sortOrder = 0;

    for (int i = 0; i < values.Length - 1; i++)
    {
        int newSortOrder = values[i].CompareTo(values[i + 1]);

        if (sortOrder == 0) sortOrder = newSortOrder;

        if (newSortOrder != 0 && sortOrder != newSortOrder) return false;
    }

    return true;
}

Comments

-1

The question that comes to my mind is "why"?

Is it to avoid re-sorting an already-sorted list? If yes, just use Timsort (standard in Python and Java). It's quite good at taking advantage of an array/list being already sorted, or almost sorted. Despite how good Timsort is at this, it's better to not sort inside a loop.

Another alternative is to use a datastructure that is innately sorted, like a treap, red-black tree or AVL tree. These are good alternatives to sorting inside a loop.

2 Comments

Innately sorted data structures still incur a cost to sort them. It's just spread out between the various Add/Insert/Delete calls. That may or may not be a good thing, depending on use case.
I would say that other than initial population, you don't sort a treap. Updating a treap inside a loop is not sorting - it's preserving sortedness, which is less expensive than even Timsort.
-1

This might not be the fastest but it's the complete solution. Every value with index lower than i is checked against the current value at i. This is written in php but can easily be translated into c# or javascript

for ($i = 1; $i < $tot; $i++) {
        for ($j = 0; $j <= $i; $j++) {
            //Check all previous values with indexes lower than $i
            if ($chekASCSort[$i - $j] > $chekASCSort[$i]) {
                return false;
            }
        }
    }

3 Comments

Please take a stab at the asymptotic order of growth of the number of comparisons astot grows large. Would there be any chance to capitalise on order relations being transitive?
I agree that the number of comparisons grows large when tot is large however, we can't assume the sequence to be transitive because we don't know what the sequence order is and if it was transitive (by definition) it is assumed to already be in some type order.
If you can't assume that the sequence order is transitive, then it's not possible to sort in the first place. Sorting in the general sense implies transitivity.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.