1

I am taking the cs50 course on edx and am doing the hacker edition of pset3 (in essence it is the advanced version). Basically the program takes a value to be searched for as the command-line argument, and then asks for a bunch of numbers to be used in an array. Then it sorts and searches that array for the value entered at the command-line. The way the program is implemented, it uses a pseudo-random number generator to feed the numbers for the array.

The task is to write the search and sorting functions. I already have the searching function, but the sorting function is supposed to be O(n). In the regular version you were supposed to use a O(n ^ 2) algorithm which wasn't a problem to implement. Also using a log n algorithm wouldn't be an issue either. But the problem set specifically ask's for a big O(n) algorithm.

It gives a hint in saying that, since no number in the array is going to be negative, and the not greater than LIMIT (the numbers output by the generator are modulo'd so they are not greater than 65000). But how does that help in getting the algorithm to be O(n)?

But the counting sort algorithm, which purports to be an acceptable solution, returns a new sorted array rather than actually sort the original one, and that contradicts with the pset specification that reads 'As this return type of void implies, this function must not return a sorted array; it must instead "destructively" sort the actual array that it’s passed by moving around the values therein.'

Also, if we decide to copy the sorted array onto the original one using another loop, with so many consecutive loops, I'm not sure if the sorting function can be considered to have a running time of O(n) anymore. Here is the actual pset, the question is about the sorting part.

Any ideas to how to implement such an algorithm would be greatly appreciated. It's not necessary to provide actual code, rather just the logic of you can create a O(n) algorithm under the conditions provided.

5
  • 1
    A quick search via Google for O(n) sorting methods reveals a couple different options, like radix sort on uniform-length keys, or bucket sort, ... Commented May 7, 2014 at 16:23
  • Just to give a hint: does using a BitSet help? Commented May 7, 2014 at 16:23
  • I did search Google, but using something like radix seemed a little more than what they were asking for. It also used a second array, but you're supposed to only use the passed array Commented May 7, 2014 at 16:34
  • 2
    @user3612934 That's a common tradeoff in programming - you can use a naive, but slow, algorithm that uses no extra space for auxiliary data structures, or you can use a faster, usually more complex, algorithm that uses additional storage space to achieve the speedup... Commented May 7, 2014 at 16:41
  • 1
    I highly recommend a book for you: "Cracking the coding interview". It will give you a nice problem-solving cheatsheet which you can use in your daily work (or even on interviews). Here counting sort is the obvious choice since the number of options is limited. Commented May 7, 2014 at 17:07

2 Answers 2

5

It gives a hint in saying that, since no number in the array is going to be negative, and the not greater than LIMIT (the numbers outputted by the generator are modulo'd to not be higher than 65000). But how does that help in getting the algorithm to be O(n).

That hint directly seems to point towards counting sort. You create 65000 buckets and use them to count the number of occurrences of each number. Then, you just revisit the buckets and you have the sorted result.

It takes advantage of the fact that:

  1. They are integers.
  2. They have a limited range.

Its complexity is O(n) and as this is not a comparison-based sort, the O(nlogn) lower bound on sorting does not apply. A very good visualization is here.

Sign up to request clarification or add additional context in comments.

1 Comment

Thanks, this fits exactly. I didn't know about the algorithm, but it seems so obvious once you see it.
2

As @DarkCthulhu said, counting sort is clearly what they were urging you to use. But you could also use a radix sort.

Here is a particularly concise radix 2 sort that exploits a nice connection to Gray codes. In your application it would require 16 passes over the input, one per data bit. For big inputs, the counting sort is likely to be faster. For small ones, the radix sort ought to be fster because you avoid initializing 256K bytes or more of counters.

See this article for explanation.

void sort(unsigned short *a, int len) 
{  
  unsigned short bit, *s = a, *d = safe_malloc(len * sizeof *d), *t;  
  unsigned is, id0, id1;

  for (bit = 1; bit; bit <<= 1, t = s, s = d, d = t)
    for (is = id0 = 0, id1 = len; is < len; ++is)
      if (((s[is] >> 1) ^ s[is]) & bit)
        d[--id1] = s[is];
      else
        d[id0++] = s[is];
  free(d);
}

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.