1

I have an array of strings, each one with a different length. e.g:

s[0] = "sSWXk"
s[1] = "qCk"
s[2] = "sOQQXPbk"
.
.
.
s[x] = "KVfdQk";

I also am given that

n = s[0].length() + s[1].length() + ... + s[x].length()

I need a sorting algorithm with time complexity O(n) for sorting these strings lexicographically, so that (for example)

a < ab < b < bbc < c < ca

Any suggestions? The time complexity is the essential requirement in the algorithm.

8
  • 8
    Is this homework? Commented Jun 12, 2012 at 19:37
  • 2
    The person who gives you an answer to this in general will get an accepted answer and a few million dollars. You can't do sorting (in general) in O(n), without making some serious CS innovations. Commented Jun 12, 2012 at 19:38
  • 2
    @Oleksi: You can do it in this instance. The OP isn't exactly asking about the general case. Commented Jun 12, 2012 at 19:40
  • 1
    take care about definition of "n". here "n" is not number of elements, but number of total characters of all strings! Commented Jun 12, 2012 at 19:40
  • 1
    @EhsanKhodarahmi: So you're saying that if you have m strings each 1000 characters long, you expect to be able to sort them in O(m) time? (since O(1000*m) is the same as O(m)) Commented Jun 12, 2012 at 19:44

3 Answers 3

11

There is a data structure called a trie that is optimally suited for this. If you insert all the words into the trie and then do a DFS over the trie, you will get the words back in sorted order. Doing so takes time O(n) as well, where n is the total number of characters in all the strings.

Since I assume that this is homework, I'll leave the details of how to implement the trie as an exercise. :-)

Hope this helps!

Sign up to request clarification or add additional context in comments.

Comments

1

I've had a similar test question, which I got incorrect, but it has bugged me since then. So I kept looking into it, and I think there are two other methods that might produce results in linear time. One is to treat the strings as a series of integers with a base of 26 and use radix sort on the array after padding the strings to be of equal length (in some way - there's probably a slick way to do this without increasing storage space drastically, I just haven't worked out the details). I haven't built an example or tested it, so I can't say with certainty this would work, but the principle of it seems sound. Another method would be bucket sort - using a 26 element array that contain pointers to 26 lists (the buckets). Sort each string into the appropriate the bucket-linked lists (those starting with 'a' into the list pointed at by the first array element, etc.) Then sort each list using a standard O(n log n) method. I don't fully understand the math, but according to Cormen's textbook "Introduction to Algorithms", using bucket sort in this way winds up having a linear time complexity. It seems the space would be larger than the Radix-style method, though, so long as the right-padding requirement could be met without actually allocating a bunch of storage for it.

Comments

0

Because this is a homework problem I can only give a hint. Hint: Use a modified version of counting sort. It's practical if we assume a char is 8 bits.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.