Remove duplicates from Array without using Hash Table

Question

I have an array that might contain duplicate elements (possibly more than two copies of the same element). I wonder if it's possible to find and remove the duplicates in the array:

- without using a hash table (strict requirement)
- without using a temporary secondary array

There are no restrictions on complexity.

P.S.: This is not a homework question.

It was asked of my friend in a Yahoo technical interview.

Despite "no restrictions on complexity", I'd not personally hire anyone who gave an O(n^2) response to this :P
– Billy ONeal, Commented Dec 9, 2010 at 7:06
@Billy: I think the correct attitude for a candidate is to explain the trade-off: in-place sorting destroys the original order but satisfies the immediate functional requirement, whereas an O(N^2) approach can be expected to be slower for large N but can preserve the order. Neither answer is necessarily better in any absolute sense, at least when the question goes out of its way to say there are no restrictions on complexity.
– Tony Delroy, Commented Dec 9, 2010 at 8:03
@Tony: If you need to preserve order you can always re-sort the target array by the elements' original positions and still avoid quadratic complexity.
– Billy ONeal, Commented Dec 9, 2010 at 8:08
@Billy: Can you? How do you know their original positions? You're not allowed a temporary secondary array to record them. They might have been in some order that's not related to any data inside them, or even necessarily implied by data elsewhere in the program.
– Tony Delroy, Commented Dec 9, 2010 at 8:56
@Tony: Why not? :P (I know the interview question said that the source couldn't be copied, but the interview question didn't also add the order requirement.) In a real-world program, it's better to spend the small space overhead and save quadratic complexity.
– Billy ONeal, Commented Dec 9, 2010 at 16:20

Billy ONeal · Accepted Answer · 2010-12-09 07:31:37Z

8

Sort the source array, then find consecutive elements that are equal (i.e. what std::unique does in C++ land). Total complexity is O(N lg N), or merely O(N) if the input is already sorted.

To remove duplicates, you can copy elements from later in the array over elements earlier in the array, also in linear time. Simply keep a pointer to the new logical end of the container, and copy the next distinct element to that new logical end at each step. (Again, this is exactly what std::unique does. In fact, why not just download an implementation of std::unique and do exactly what it does? :P)
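As a rough illustration of the sort-then-compact idea, here is a minimal Python sketch. The function name dedupe_sorted_in_place is made up for this example, and it assumes the elements are mutually comparable. Note that Python's list.sort mutates the list but may use temporary memory internally, so this shows the technique rather than a strict O(1)-space guarantee.

```python
def dedupe_sorted_in_place(items):
    """Sort items in place, then compact the distinct values to the front.

    Hypothetical helper for illustration; assumes comparable elements.
    """
    items.sort()                          # original order is lost here
    if not items:
        return items
    end = 1                               # items[:end] holds the distinct values so far
    for i in range(1, len(items)):
        # Consecutive-equality check: after sorting, a duplicate always
        # equals the last value that was kept.
        if items[i] != items[end - 1]:
            items[end] = items[i]         # copy the next distinct value to the logical end
            end += 1
    del items[end:]                       # drop the leftover tail
    return items
```

The comparison against items[end - 1] is the "consecutive elements" check: once the array is sorted, only the most recently kept value needs to be inspected.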

edited Dec 9, 2010 at 7:31

answered Dec 9, 2010 at 7:00

codaddict · Accepted Answer · 2010-12-09 07:03:07Z

5

O(N log N): Sort, then replace each run of consecutive equal elements with a single copy.

O(N²): Run a nested loop that compares each element with the remaining elements in the array. If a duplicate is found, swap it with the element at the end of the array and decrease the array size by 1.
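The quadratic alternative might look like the following Python sketch. The name dedupe_swap_with_last is hypothetical, and the sketch overwrites the duplicate with the last live element rather than doing a full swap, since the duplicate's value is discarded anyway. It does not preserve the original order.

```python
def dedupe_swap_with_last(items):
    """Remove duplicates in place in O(n^2) time without sorting.

    Hypothetical helper for illustration; order is NOT preserved.
    """
    n = len(items)                        # logical size of the array
    i = 0
    while i < n:
        j = i + 1
        while j < n:
            if items[j] == items[i]:
                n -= 1
                items[j] = items[n]       # move the last live element over the duplicate
                # j is not advanced: the element just moved in is still unchecked
            else:
                j += 1
        i += 1
    del items[n:]                         # shrink the list to its logical size
    return items
```

Because removal is a single assignment instead of a shift, each duplicate costs O(1) to drop; the quadratic cost comes only from the nested comparisons.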

answered Dec 9, 2010 at 7:03

2 Comments

SuperMan Over a year ago

How do I check consecutive elements for equality?

Tony Delroy Over a year ago

For a couple of seconds there, I thought you were enumerating steps rather than alternatives... I've seen too many of your posts to believe you'd make such a mistake, so I kept puzzling it out :-)

Gabi Purcaru · Accepted Answer · 2010-12-09 07:04:32Z

3

No restrictions on complexity.

So this is a piece of cake.

// A[1], A[2], A[3], ... A[i], ... A[n]

// O(n^2)
for(i=2; i<=n; i++)
{
    duplicate = false;
    for(j=1; j<i; j++)
        if(A[i] == A[j])
             {duplicate = true; break;}
    if(duplicate)
    {
        // "remove" A[i] by shifting every element to its right one slot left
        for(j=i; j<n; j++)
            A[j] = A[j+1];
        n--;
        i--; // re-check position i, which now holds the next element
    }
}

answered Dec 9, 2010 at 7:04

2 Comments

SuperMan Over a year ago

Well, your code was good, but this was an interview question, and the interviewer prefers O(N log N) over O(N^2).

SuperMan Over a year ago

Well, there was no restriction, but given two options, the better one would be chosen. My friend gave him an O(N^2) solution; the result of the interview is awaited.

Theo · Accepted Answer · 2010-12-09 07:27:15Z

2

In-place duplicate removal that preserves the existing order of the list, in quadratic time:

for (var i = 0; i < list.length; i++) {
  for (var j = i + 1; j < list.length;) {
    if (list[i] == list[j]) {
      list.splice(j, 1);
    } else {
      j++;
    }
  }
}

The trick is to start the inner loop on i + 1 and not increment the inner counter when you remove an element.

The code is JavaScript; splice(x, 1) removes the element at index x.
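For comparison, the same order-preserving quadratic idea can be sketched in Python with a write index instead of splicing, so nothing is shifted on removal. The function name dedupe_stable_in_place is an assumption made for this example, not something from the answer above.

```python
def dedupe_stable_in_place(items):
    """Remove duplicates in place in O(n^2) time, keeping first occurrences in order.

    Hypothetical helper for illustration.
    """
    kept = 0                              # items[:kept] holds the distinct values seen so far
    for i in range(len(items)):
        value = items[i]
        # Linear scan of the kept prefix stands in for a hash lookup.
        if all(items[k] != value for k in range(kept)):
            items[kept] = value           # kept <= i, so no unread element is overwritten
            kept += 1
    del items[kept:]                      # truncate to the distinct prefix
    return items
```

The kept prefix doubles as the "seen" set, which is why no secondary array is needed.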

If order preservation isn't an issue, then you can do it quicker:

list.sort();

for (var i = 1; i < list.length;) {
  if (list[i] == list[i - 1]) {
    list.splice(i, 1);
  } else {
    i++;
  }
}

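The scan makes a linear number of comparisons, so if you count the sort (which you should) the total is of the order of the sort: in most cases n × log(n). Note that each splice shifts the rest of the array, so an input with many duplicates can push the removal step toward quadratic time.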

edited Dec 9, 2010 at 7:27

answered Dec 9, 2010 at 7:03

cfern · Accepted Answer · 2010-12-09 09:02:28Z

In functional languages you can combine sorting and unicification (is that a real word?) in one pass. Let's take the standard quick sort algorithm:

- Take the first element of the input (x) and the remaining elements (xs)
- Make two new lists
  - left: all elements in xs smaller than or equal to x
  - right: all elements in xs larger than x
- apply quick sort on the left and right lists
- return the concatenation of the left list, x, and the right list
- P.S. quick sort on an empty list is an empty list (don't forget base case!)

If you want only unique entries, replace

left: all elements in xs smaller than or equal to x

with

left: all elements in xs smaller than x

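This sorts and removes duplicates in one pass, in O(n log n) time on average (like any quick sort that picks the first element as the pivot, it can degrade to O(n²) on already-sorted input).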

Example implementation in F#:

let rec qsort = function
    | [] -> []
    | x::xs -> let left,right = List.partition (fun el -> el <= x) xs
               qsort left @ [x] @ qsort right

let rec qsortu = function
    | [] -> []
    | x::xs -> let left = List.filter (fun el -> el < x) xs
               let right = List.filter (fun el -> el > x) xs
               qsortu left @ [x] @ qsortu right

And a test in interactive mode:

> qsortu [42;42;42;42;42];;
val it : int list = [42]
> qsortu [5;4;4;3;3;3;2;2;2;2;1];;
val it : int list = [1; 2; 3; 4; 5]
> qsortu [3;1;4;1;5;9;2;6;5;3;5;8;9];;
val it : int list = [1; 2; 3; 4; 5; 6; 8; 9]

This ignores the "without...temporary secondary array" requirement.

Matthieu M. · Accepted Answer · 2010-12-09 14:19:51Z

Since it's an interview question, the interviewer usually expects you to ask for clarification about the problem.

With no alternative storage allowed (that is, only O(1) extra storage, since you'll probably use some counters or pointers), it seems obvious that a destructive operation is expected. It might be worth pointing this out to the interviewer.

Now the real question is: do you want to preserve the relative order of the elements? That is, is this operation supposed to be stable?

Stability hugely impacts the available algorithms (and thus the complexity).

The most obvious choice is to look at sorting algorithms; after all, once the data is sorted, it's pretty easy to get the unique elements.

But if you want stability, you cannot actually sort the data (since you could not get the "right" order back), and thus I wonder whether it is solvable in less than O(N**2) if stability is involved.
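One way to get a stable result in O(N log N) is the approach suggested in the comments on the question: remember each element's original position, sort, drop duplicates, then re-sort by position. The Python sketch below illustrates that idea only. It deliberately breaks the "no temporary secondary array" rule, because the positions need O(N) extra storage, and the name dedupe_stable_with_indices is hypothetical. It also assumes the values are mutually comparable.

```python
def dedupe_stable_with_indices(items):
    """Return the distinct values of items in first-occurrence order, in O(n log n).

    Hypothetical helper for illustration; uses O(n) extra storage,
    so it does NOT satisfy the question's no-secondary-array constraint.
    """
    # Pair each value with its original position (the extra storage).
    tagged = sorted((value, index) for index, value in enumerate(items))
    # Equal values are now adjacent, and the smallest index comes first,
    # so the first pair of each run is the earliest occurrence.
    firsts = [tagged[k] for k in range(len(tagged))
              if k == 0 or tagged[k][0] != tagged[k - 1][0]]
    # Re-sort the survivors by original position to restore the input order.
    firsts.sort(key=lambda pair: pair[1])
    return [value for value, _ in firsts]
```

Whether a stable O(N log N) solution exists with only O(1) extra storage is left open here, as in the answer above.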

d48 · Accepted Answer · 2013-03-15 21:19:32Z

0

This doesn't use a hash table per se, but I know that behind the scenes it's an implementation of one. Nevertheless, I thought I might post it in case it helps. This is JavaScript and uses an associative array to record the values already seen, so that duplicates can be skipped.

function removeDuplicates(arr) {
    var results = [], dups = []; 

    for (var i = 0; i < arr.length; i++) {

        // check if not a duplicate
        if (dups[arr[i]] === undefined) {

            // save for next check to indicate duplicate
            dups[arr[i]] = 1; 

            // is unique. append to output array
            results.push(arr[i]);
        }
    }

    return results;
}

answered Mar 15, 2013 at 21:19

Soudipta Dutta · Accepted Answer · 2022-04-04 18:56:13Z

0

Let me do this in Python.

array1 = [1,2,2,3,3,3,4,5,6,4,4,5,5,5,5,10,10,8,7,7,9,10]

array1.sort()
print(array1)

current = None
count = 0

# overwrite the front of the array with the distinct values
for item in array1:
    if item != current:
        array1[count] = item
        count +=1
        current=item
        
       

print(array1)#[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 5, 5, 5, 5, 6, 7, 7, 8, 9, 10, 10, 10]

print(array1[:count])#[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

A shorter alternative uses dict.fromkeys. Note that dict is a hash table, so this does not meet the question's constraint:

array1 = [1,2,2,3,3,3,4,5,6,4,4,5,5,5,5,10,10,8,7,7,9,10]

array1.sort()
print(array1)

print([*dict.fromkeys(array1)])#[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

#OR#
aa = list(dict.fromkeys(array1))
print( aa)#[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

edited Apr 4, 2022 at 18:56

answered Mar 29, 2022 at 12:50
