24

Why? Is it faster or more efficient?

For systems with one core, we can use quicksort. What should we use on systems with two cores, four cores, or eight cores?

5
  • 15
    Arrays.sort() only uses quicksort on primitive types. For Object[] or collections (Collections.sort()) merge sort is used. Commented Nov 29, 2010 at 15:14
  • 4
    ... because it's stable, whereas quicksort isn't. For primitive types it doesn't matter whether a sort is stable or not, because there are no values which compare unequal but have the same sort order. Floating point types raise some complication there, but not enough to require merge sort. Commented Nov 29, 2010 at 15:18
  • 4
    besides the reasons below, quicksort, being a divide & conquer algorithm, also exhibits good cache behaviour. Commented Nov 29, 2010 at 15:20
  • If you see the Arrays.java class, you can see that for a small array (length < 7) they use Insertion Sort, otherwise Quick Sort. Commented Feb 23, 2015 at 22:41
  • 2
    [They are similar questions, could be reference][1] [1]: stackoverflow.com/questions/3707190/… Commented Mar 9, 2015 at 21:59

12 Answers 12

37

Quicksort has the advantage of being completely in place, so it does not require any additional storage, while mergesort (which is actually used by Arrays.sort() for object arrays) and other (all?) guaranteed O(n*log n) algorithm require at least one full copy of the array. For programs that sort very large primitive arrays, that means potentially doubling the overall memory usage.

Sign up to request clarification or add additional context in comments.

4 Comments

+1, for both the memory point, and the primitive vs object difference
Heapsort is an in-place sort with worst-case O(n log n). (As are variants and hybrids like smoothsort and introsort.)
Isn't only the copy of half of the array required in mergesort?
Next, you might want to see stackoverflow.com/questions/3707190/…
20

The answer is in Jon L. Bentley and M. Douglas McIlroy’s “Engineering a Sort Function”, which the sort function cites.

Shopping around for a better qsort, we found that a qsort written at Berkeley in 1983 would consume quadratic time on arrays that contain a few elements repeated many times—in particular arrays of random zeros and ones. In fact, among a dozen different Unix libraries we found no qsort that could not easily be driven to quadratic behavior; all were derived from the Seventh Edition or from the 1983 Berkeley function.…

Unable to find a good enough qsort, we set out to build a better one. The algorithm should avoid extreme slowdowns on reasonable inputs, and should be fast on ‘random’ inputs. It should also be efficient in data space and code space. The sort need not be stable; its specification does not promise to preserve the order of equal elements.

The alternatives were heapsort and mergesort, since Java was created in the early 1990s. Mergesort is less desirable because it requires extra storage space. Heapsort has a better worst-case performance (O(n log n) compared to O(n^2)), but performs more slowly in practice. Thus, if you can control the worst case performance via good heuristics, a tuned quicksort is the way to go.

Java 7 is switching to Timsort, which was invented in 1993 (implemented in Python in 2002) and has a worst-case performance of O(n log n) and is a stable sort.

1 Comment

+1 for Timsort again. This is the first I've heard of it, thanks for sharing!
14

Quicksort has O(n log n) average and O(n^2) worst case performance, that is the best "average case" a sort algorithm can be, there are other sort algorithms that have this performance, but quicksort tends to perform better than most.

See: http://en.wikipedia.org/wiki/Quicksort

5 Comments

average case O(n log n) is not the best performance a sort can have. worst case O(n log n) is the best performance a comparison-based sort can have.
That is only the best that comparison based sorts can be. You can sort faster if you can avoid comparison based sorting.
@lijie Sorry that was what I meant to say, not enough sleep LOL. I'll amend.
It's okie, but the correction is not correct. quicksort does not have O(nlogn) worst case performance; its worst case performance is O(n^2). And Arrays.sort is a comparison based sort, but when statements like "the best possible worst case sort performance is O(nlogn)" are made then it causes people to forget that there are other sorting paradigms available.
Well comparison sorts are the type usually used, a radix or bucket sort is a different kettle of fish.
10

It is a tuned quicksort. If you are really interested you can read the material mentioned in the documentation.

The sorting algorithm is a tuned quicksort, adapted from Jon L. Bentley and M. Douglas McIlroy's "Engineering a Sort Function", Software-Practice and Experience, Vol. 23(11) P. 1249-1265 (November 1993).

And here is a bit of an explanation - the tuned version gives n*log(n) on many data sets:

This algorithm offers n*log(n) performance on many data sets that cause other quicksorts to degrade to quadratic performance

Comments

4

Arrays.sort() uses multiple sorting algorithms depending on the size and elements in the array.

  • Insertion sort for small arrays
  • Merge sort for mostly sorted arrays
  • A highly tuned and adaptable dual-pivot & single pivot quicksort for everything else

So in practice we see that quicksort is very fast for large arrays of primitives but has some pitfalls when it needs to adapt to partially sorted arrays, when comparisons between objects are slow, for stable sorting and more.

Comments

3

Compared with Quicksort, Mergesort has less number of comparisons but larger number of moving elements.

In Java, an element comparison is expensive but moving elements is cheap. Therefore, Mergesort is used in the standard Java library for generic sorting

In C++, copying objects can be expensive while comparing objects often is relatively cheap. Therefore, quicksort is the sorting routine commonly used in C++ libraries.

ref: http://www.cs.txstate.edu/~rp44/cs3358_092/Lectures/qsort.ppt

Comments

3

Since its been a while since last answer on this thread, here is some updates...

It depend on the complexity and its relevancy to size of array plus probability when java researched these algorithms and just decided depending on Measurements and Benchmarks.

According to JAVA JDK 1.8 DOCS its self explanatory where it chooses the algorithm not only one but up to four to choose from according to some thresholds...

/**
     * If the length of an array to be sorted is less than this
     * constant, Quicksort is used in preference to merge sort.
     */
    private static final int QUICKSORT_THRESHOLD = 286;

    /**
     * If the length of an array to be sorted is less than this
     * constant, insertion sort is used in preference to Quicksort.
     */
    private static final int INSERTION_SORT_THRESHOLD = 47;

    /**
     * If the length of a byte array to be sorted is greater than this
     * constant, counting sort is used in preference to insertion sort.
     */
    private static final int COUNTING_SORT_THRESHOLD_FOR_BYTE = 29;

    /**
     * If the length of a short or char array to be sorted is greater
     * than this constant, counting sort is used in preference to Quicksort.
     */
    private static final int COUNTING_SORT_THRESHOLD_FOR_SHORT_OR_CHAR = 3200;

Reference Java DOC JDK 8

It event evolved to use parallel Sorting Sorting in Java

Java 8 comes with a new API – parallelSort – with a similar signature to the Arrays.sort() API:

@Test
public void givenIntArray_whenUsingParallelSort_thenArraySorted() {
    Arrays.parallelSort(toSort);

    assertTrue(Arrays.equals(toSort, sortedInts));
}

Behind the scenes of parallelSort(), it breaks the array into different sub-arrays (as per granularity in the algorithm of parallelSort). Each sub-array is sorted with Arrays.sort() in different threads so that sort can be executed in parallel fashion and are merged finally as a sorted array.

Note that the ForJoin common pool is used for executing these parallel tasks and then merging the results.

The result of the Arrays.parallelSort is going to be same as Array.sort of course, it’s just a matter of leveraging multi-threading.

Finally, there are similar variants of API Arrays.sort in Arrays.parallelSort as well:

Arrays.parallelSort (int [] a, int fromIndex, int toIndex);

Summary: So as the Java API evolves with the HardWare and software in general there is more use for the multi threading and tuning here and there on the thresholds and algorithims.

Comments

1

First of all Arrays.sort doesn't only use quick sort , It uses multiple algorithms java1.6 onwards

See below code from Arrays class

/** * Sorts the specified array into ascending numerical order. * *

Implementation note: The sorting algorithm is a Dual-Pivot Quicksort * by Vladimir Yaroslavskiy, Jon Bentley, and Joshua Bloch. This algorithm * offers O(n log(n)) performance on many data sets that cause other * quicksorts to degrade to quadratic performance, and is typically * faster than traditional (one-pivot) Quicksort implementations. * * @param a the array to be sorted */ public static void sort(int[] a) { DualPivotQuicksort.sort(a); }

DualPivotQuicksort.sort(a); // This uses 5 algorithms internally depending upon dataset size 
do checkout the source code of Arrays class.

Before java 1.6 I think it was using three algorithm quick sort for primitive types such as int and mergesort for objects and when quick sort out performs it start heap sort, See here for more details http://cafe.elharo.com/programming/java-programming/why-java-util-arrays-uses-two-sorting-algorithms

Comments

0

Quicksort is fastest on average O(n log(n)), so Sun probably used that as a good metric.

5 Comments

Lots of sort algorithms are O(n log n).
@Paul: true, but since quicksort has been around for about 50 years, it's well understood and that might have been one reason for using it.
by that reasoning i'm sure insertion sort should be preferred (it probably has been around for more years than quicksort and even more well understood)
@lijie: except its complexity is O(n^2) on average.
yes that wasn't really a serious comment. It was just to point out that the reason given (it is old and understood) is probably not an important consideration at all... in fact among the more popularly known sorts, i think quicksort is relatively "recent"...
0

QuickSort is a common sorting algorithm. It's reasonably fast, except when the data to be sorted is already in inverse order. It's also efficient in space.

6 Comments

Surely Arrays.sort copes with reverse order without going O(n^2). So to the extent that it "is quicksort" (the premise of the question), "quicksort" doesn't mean, "the dumbest quicksort you could possibly write" ;-)
Well, yeah, it's a "Tuned Quicksort", so it's not really QuickSort.
there is no predefined worst case for quicksort, if one does not specify the pivot selection. so the claim about it performing worse when the data is reverse ordered is not right.
@lijie: that's just a name game, though - does "quicksort" mean, "the algorithm presented by Hoare", or does it mean, "any algorithm which looks a bit like the algorithm presented by Hoare"? I think Paul refers to this difference in his comment above, although it's incorrect to say "QuickSort is a common sorting algorithm" while also taking "QuickSort" to mean "original QuickSort with the pivot as the left-most element".
@Steve: well, yes i guess there are way too many variants. but the original one performs essentially as badly with a sorted list as well as a reverse-sorted list. but unless pivot selection is smart (e.g. the one used for a linear quickselect), avoiding O(n^2) is hard.
|
0

It depends on what you want to do. The problem with a normal quicksort is, that it can sometimes be in O(n²). So normaly you could use heap sort, but most times quick sort is faster.

However the Arrays.sort(...) implementation uses a "tuned tuned quicksort, adapted from Jon L. Bentley and M. Douglas McIlroy [...]" (according to the JavaDoc documentation). This algorithm has has some build in optimizations, that enables it to work on O(n*log(n)), where a normal quicksort would use O(n²).

Also the Arrays.sort algorithm is tested over and over again and you can be sure that it works and is bugfree (although this can't be guaranteed.)

iuiz

2 Comments

The docs, say, "This algorithm offers n*log(n) performance on many data sets", not all data sets. Suggests that it isn't O(n log n). What's b? Introsort would give that guarantee, but wasn't invented until 1997, 4 years after the paper cited in the Java docs.
i think he wanted to write n * log ( n ) :)
0

Arrays.sort() does not use quick-sort. Java 7 used TimSort which is a combination of Merge Sort and Insertion Sort. Java 8 uses parallel-sort when there are more number of elements, and uses multiple threads for sorting. Else it uses TimSort.

Thus the worst case time complexity is always O(nlogn)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.