
First, this isn't about an array whose subsequences happen to be in some order before the sort starts; it's about an array with a special structure.

I'm writing a simple method that sorts data. Until now I used Array.Sort, but PLINQ's OrderBy outperforms the standard Array.Sort on large arrays.

So I decided to write my own multithreaded sort. The idea is simple: split the array into partitions, sort each partition in parallel, then merge all the results into one array.

I'm now done with the partitioning and sorting:

public class PartitionSorter
{
    public static void Sort(int[] arr)
    {
        var ranges = Range.FromArray(arr);
        var allDone = new ManualResetEventSlim(false, ranges.Length*2);
        int completed = 0;
        foreach (var range in ranges)
        {
            ThreadPool.QueueUserWorkItem(r =>
            {
                var rr = (Range) r;
                Array.Sort(arr, rr.StartIndex, rr.Length);
                if (Interlocked.Increment(ref completed) == ranges.Length)
                    allDone.Set();
            }, range);
        }
        allDone.Wait();
    }
}

public class Range
{
    public int StartIndex { get; }
    public int Length { get; }

    public Range(int startIndex, int length)
    {
        StartIndex = startIndex;
        Length = length;
    }

    public static Range[] FromArray<T>(T[] source)
    {
        int processorCount = Environment.ProcessorCount;
        int partitionLength = (int) (source.Length/(double) processorCount);
        var result = new Range[processorCount];
        int start = 0;
        for (int i = 0; i < result.Length - 1; i++)
        {
            result[i] = new Range(start, partitionLength);
            start += partitionLength;
        }
        result[result.Length - 1] = new Range(start, source.Length - start);
        return result;
    }
}

As a result I get an array with a special structure, for example:

[1 3 5 | 2 4 7 | 6 8 9]

Now, how can I use this information to finish the sort? Insertion sort and the other general-purpose sorts don't exploit the fact that the data within each block is already sorted and only needs to be merged. I tried to apply some algorithms from merge sort, but failed.
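For reference, the merging step being asked about boils down to a plain two-way merge of adjacent sorted blocks, repeated pairwise until one block remains. A minimal sketch (the `BlockMerger` helper is hypothetical and not part of the code above):

```csharp
using System;

static class BlockMerger
{
    // Merge two adjacent sorted ranges [left, mid) and [mid, right) via a
    // temporary buffer, then copy the merged result back into the array.
    public static void MergeAdjacent(int[] arr, int left, int mid, int right)
    {
        var tmp = new int[right - left];
        int i = left, j = mid, k = 0;
        while (i < mid && j < right)
            tmp[k++] = arr[i] <= arr[j] ? arr[i++] : arr[j++];
        while (i < mid) tmp[k++] = arr[i++];
        while (j < right) tmp[k++] = arr[j++];
        Array.Copy(tmp, 0, arr, left, tmp.Length);
    }
}
```

On the example structure above, `MergeAdjacent(a, 0, 3, 6)` merges the first two blocks, and `MergeAdjacent(a, 0, 6, 9)` then merges the result with the third. The temporary buffer is the usual price of merge sort; avoiding it (in-place merging) is what makes the in-place variant hard.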

  • Since you are essentially doing a merge sort, you should continue in that direction! Why did you fail to implement the merge sort? Commented Feb 25, 2016 at 14:22
  • @MrPaulch Because an in-place merge sort is hard to implement. Before this I used a quicksort, which is a good in-place sort, but its performance is worse than a naive single-threaded Array.Sort because of random memory access. Commented Feb 25, 2016 at 14:23
  • Then you should adapt your algorithm to quicksort. Array.Sort actually uses introsort, which is a hybrid of quicksort and heapsort. I once implemented a highly specialized quicksort that outperformed Array.Sort because it knew the kind of data it had to sort. Commented Feb 25, 2016 at 14:30
  • en.wikipedia.org/wiki/Timsort is normally quite fast on partially sorted collections: "The algorithm finds subsets of the data that are already ordered, and uses that knowledge to sort the remainder more efficiently." Commented Feb 25, 2016 at 14:41
  • The thing to remember is that having more threads than cores can be useful if one or more of the threads is blocked for some reason. However, if all the threads are active, then having more threads than cores will always slow things down due to the extra context switching. I'm not saying you should never do it; I'm just explaining why a multithreaded implementation can be slower than a sequential one. Commented Feb 26, 2016 at 8:52

1 Answer


I've done some testing with a parallel Quicksort implementation.

I tested the following code as a RELEASE build on Windows 10 x64, compiled with C# 6 (Visual Studio 2015) targeting .NET 4.6.1, and run outside any debugger.

My processor is a quad core with hyperthreading (which is certainly going to help any parallel implementation!).

The array size is 20,000,000 (so a fairly large array).

I got these results:

LINQ OrderBy()  took 00:00:14.1328090
PLINQ OrderBy() took 00:00:04.4484305
Array.Sort()    took 00:00:02.3695607
Sequential      took 00:00:02.7274400
Parallel        took 00:00:00.7874578

PLINQ OrderBy() is much faster than LINQ OrderBy(), but slower than Array.Sort().

QuicksortSequential() is around the same speed as Array.Sort().

The interesting thing here is that QuicksortParallelOptimised() is noticeably faster on my system, so it's definitely an efficient way of sorting if you have enough processor cores.

Here's the full compilable console app. Remember to run it in RELEASE mode - if you run it in DEBUG mode the timing results will be woefully incorrect.

using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

namespace Demo
{
    class Program
    {
        static void Main()
        {
            int n = 20000000;
            int[] a = new int[n];
            var rng = new Random(937525);

            for (int i = 0; i < n; ++i)
                a[i] = rng.Next();

            var b = a.ToArray();
            var d = a.ToArray();

            var sw = new Stopwatch();

            sw.Restart();
            var c = a.OrderBy(x => x).ToArray(); // Need ToArray(), otherwise it does nothing.
            Console.WriteLine("LINQ OrderBy() took " + sw.Elapsed);

            sw.Restart();
            var e = a.AsParallel().OrderBy(x => x).ToArray(); // Need ToArray(), otherwise it does nothing.
            Console.WriteLine("PLINQ OrderBy() took " + sw.Elapsed);

            sw.Restart();
            Array.Sort(d);
            Console.WriteLine("Array.Sort() took " + sw.Elapsed);

            sw.Restart();
            QuicksortSequential(a, 0, a.Length-1);
            Console.WriteLine("Sequential took " + sw.Elapsed);

            sw.Restart();
            QuicksortParallelOptimised(b, 0, b.Length-1);
            Console.WriteLine("Parallel took " + sw.Elapsed);

            // Verify that our sort implementations are actually correct!

            Trace.Assert(a.SequenceEqual(c));
            Trace.Assert(b.SequenceEqual(c));
            Trace.Assert(d.SequenceEqual(c));
            Trace.Assert(e.SequenceEqual(c));
        }

        static void QuicksortSequential<T>(T[] arr, int left, int right)
        where T : IComparable<T>
        {
            if (right > left)
            {
                int pivot = Partition(arr, left, right);
                QuicksortSequential(arr, left, pivot - 1);
                QuicksortSequential(arr, pivot + 1, right);
            }
        }

        static void QuicksortParallelOptimised<T>(T[] arr, int left, int right)
        where T : IComparable<T>
        {
            const int SEQUENTIAL_THRESHOLD = 2048;
            if (right > left)
            {
                if (right - left < SEQUENTIAL_THRESHOLD)
                {
                    QuicksortSequential(arr, left, right);
                }
                else
                {
                    int pivot = Partition(arr, left, right);
                    Parallel.Invoke(
                        () => QuicksortParallelOptimised(arr, left, pivot - 1),
                        () => QuicksortParallelOptimised(arr, pivot + 1, right));
                }
            }
        }

        static int Partition<T>(T[] arr, int low, int high) where T : IComparable<T>
        {
            int pivotPos = (high + low) / 2;
            T pivot = arr[pivotPos];
            Swap(arr, low, pivotPos);

            int left = low;
            for (int i = low + 1; i <= high; i++)
            {
                if (arr[i].CompareTo(pivot) < 0)
                {
                    left++;
                    Swap(arr, i, left);
                }
            }

            Swap(arr, low, left);
            return left;
        }

        static void Swap<T>(T[] arr, int i, int j)
        {
            T tmp = arr[i];
            arr[i] = arr[j];
            arr[j] = tmp;
        }
    }
}

Comments

I tried this code yesterday, and as far as I recall this pivot is very bad when the array is already sorted (for example, on new int[20*1000*1000]). Only a naive pivot implementation gets a valid result without cutting off just one element per call. This is why I switched to merge sort, which has no worst case. Well, now the problem is the pivot. I'll try to solve it myself. Thanks.
@AlexZhukovskiy Yes, it looks like the crucial remaining thing is a good choice of pivot.
@AlexZhukovskiy I'm not sure I understand what you're talking about. For an array of zeroes this pivot will return the left bound on every call, so there will be N calls to Partition, which is slow enough. Of course, on real distinct data this pivot works fine, but I wanted to use this sort as a generic sorting method, and it's not good when in some cases performance degrades to a level lower than even bubble sort. I was looking for something that could handle this special case. Just comment out the code where you use rng (lines 3-6) and you will see it yourself.
@AlexZhukovskiy I expect you've already seen this, but just in case you haven't: en.wikipedia.org/wiki/Quicksort#Choice_of_pivot
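For the all-equal-keys case raised in these comments, a better pivot alone doesn't help; the usual fix is three-way ("Dutch national flag") partitioning, which puts all elements equal to the pivot into a middle band that never needs recursing. A sketch (the `ThreeWay` class is illustrative, not part of the answer's code):

```csharp
using System;

static class ThreeWay
{
    // Three-way partition: rearranges arr[low..high] into
    // < pivot | == pivot | > pivot and returns the bounds of the middle band.
    // With many equal keys (e.g. an all-zero array) the whole range falls into
    // the middle band in one pass, so recursion stops instead of degrading to O(n^2).
    public static (int lt, int gt) Partition3<T>(T[] arr, int low, int high)
        where T : IComparable<T>
    {
        T pivot = arr[low + (high - low) / 2];
        int lt = low, i = low, gt = high;
        while (i <= gt)
        {
            int cmp = arr[i].CompareTo(pivot);
            if (cmp < 0) Swap(arr, lt++, i++);
            else if (cmp > 0) Swap(arr, i, gt--);
            else i++;
        }
        return (lt, gt); // recurse on [low, lt - 1] and [gt + 1, high]
    }

    static void Swap<T>(T[] arr, int i, int j)
    {
        T tmp = arr[i];
        arr[i] = arr[j];
        arr[j] = tmp;
    }
}
```

For already-sorted (but distinct) input, median-of-three pivot selection is the standard complementary mitigation; the two are often combined.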
