3

I have an unsorted List with 100 million entries. RAM is not an issue. I am not all together happy with the speed of List.Sort and was wondering whether there is a reliable technique to sort my list to utilize multiple processors.

I looked at Parallel.Invoke but can't figure out how to actually use it for sorting, perhaps it can't be. Any ideas?

6
  • I don't know of an existing implementation in .NET, but algorithmically speaking, I know there's an O(log n) algorithm that's the equivalent of parallel mergesort. There's also an easier O(log^2 n) algorithm called bitonic sort. Commented May 17, 2012 at 18:40
  • Not sure how multiple CPU's come into play here. But if all other straightforward methods fail, I'd try to implement Binary tree Commented May 17, 2012 at 18:44
  • You said that "RAM is not an issue". Is it not an issue to the point that you can fit all your data in memory, or do you have additional memory to fit your data twice? If you have twice the memory, you could sort in parallel K sub-lists of length N/K, where N is the total length and K is the number of CPUs, and then run a K-way merge on a single CPU. While you can do merges in place (search Kronrod's algorithm if you're curious) the implementation is too difficult to attempt as an optimization. Commented May 17, 2012 at 18:52
  • @dasblinkenlight I can fit everything into RAM 100 times over. Commented May 17, 2012 at 19:07
  • @AngryHacker You can try the sort/merge approach then: splitting K ways is trivial, and merging K ways is only marginally more difficult than the classic two-way merging. Commented May 17, 2012 at 19:09

1 Answer 1

3

In another StackOverflow question, I found this:

private void QuicksortSequential<T>(T[] arr, int left, int right)  
where T : IComparable<T> 
{ 
    if (right > left) 
    { 
        int pivot = Partition(arr, left, right); 
        QuicksortSequential(arr, left, pivot - 1); 
        QuicksortSequential(arr, pivot + 1, right); 
    } 
} 

private void QuicksortParallelOptimised<T>(T[] arr, int left, int right)  
where T : IComparable<T> 
{ 
    const int SEQUENTIAL_THRESHOLD = 2048; 
    if (right > left) 
    { 
        if (right - left < SEQUENTIAL_THRESHOLD) 
        { 

            QuicksortSequential(arr, left, right); 
        } 
        else 
        { 
            int pivot = Partition(arr, left, right); 
            Parallel.Do( 
                () => QuicksortParallelOptimised(arr, left, pivot - 1), 
                () => QuicksortParallelOptimised(arr, pivot + 1, right)); 
        } 
    } 
} 

Different (and more complete, though not sure about 'better') implementation from answers to same question:

/// <summary> 
/// Parallel quicksort algorithm. 
/// </summary> 
public class ParallelSort 
{ 
    #region Public Static Methods 

    /// <summary> 
    /// Sequential quicksort. 
    /// </summary> 
    /// <typeparam name="T"></typeparam> 
    /// <param name="arr"></param> 
    public static void QuicksortSequential<T>(T [] arr) where T : IComparable<T> 
    { 
        QuicksortSequential(arr, 0, arr.Length - 1); 
    } 

    /// <summary> 
    /// Parallel quicksort 
    /// </summary> 
    /// <typeparam name="T"></typeparam> 
    /// <param name="arr"></param> 
    public static void QuicksortParallel<T>(T[] arr) where T : IComparable<T> 
    { 
        QuicksortParallel(arr, 0, arr.Length - 1); 
    } 

    #endregion 

    #region Private Static Methods 

    private static void QuicksortSequential<T>(T[] arr, int left, int right)  
        where T : IComparable<T> 
    { 
        if (right > left) 
        { 
            int pivot = Partition(arr, left, right); 
            QuicksortSequential(arr, left, pivot - 1); 
            QuicksortSequential(arr, pivot + 1, right); 
        } 
    } 

    private static void QuicksortParallel<T>(T[] arr, int left, int right)  
        where T : IComparable<T> 
    { 
        const int SEQUENTIAL_THRESHOLD = 2048; 
        if (right > left) 
        { 
            if (right - left < SEQUENTIAL_THRESHOLD) 
            { 
                QuicksortSequential(arr, left, right); 
            } 
            else 
            { 
                int pivot = Partition(arr, left, right); 
                Parallel.Invoke(new Action[] { delegate {QuicksortParallel(arr, left, pivot - 1);}, 
                                               delegate {QuicksortParallel(arr, pivot + 1, right);} 
                }); 
            } 
        } 
    } 

    private static void Swap<T>(T[] arr, int i, int j) 
    { 
        T tmp = arr[i]; 
        arr[i] = arr[j]; 
        arr[j] = tmp; 
    } 

    private static int Partition<T>(T[] arr, int low, int high)  
        where T : IComparable<T> 
    { 
        // Simple partitioning implementation 
        int pivotPos = (high + low) / 2; 
        T pivot = arr[pivotPos]; 
        Swap(arr, low, pivotPos); 

        int left = low; 
        for (int i = low + 1; i <= high; i++) 
        { 
            if (arr[i].CompareTo(pivot) < 0) 
            { 
                left++; 
                Swap(arr, i, left); 
            } 
        } 

        Swap(arr, low, left); 
        return left; 
    } 

    #endregion 
}
Sign up to request clarification or add additional context in comments.

5 Comments

Do you have an implementation of the Partition method? Also my copy of Parallel class (.net 4) does not have a Do method.
In the other question I linked to, there's another implementation that shows Partition and uses Parallel.Invoke rather than Parallel.Do. That might work for you instead.
That is odd, the parallel implementation takes double the time on a Dual Core box. Let me try it on a 24 CPU box, see if any improvements there.
Just an FYI, once I got to an 8 CPU box, the parallel solution started to be slightly faster than a sequential one.
It'd a good idea to not only have a threshold for the parallel version, but also change to InsertionSort for the serial version under some specific threshold (usually something in the order of 8-20 elements). That should give another speedup compared to the library version (which certainly does that too)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.