0

An array contains 1002 numbers, and two of them are the same. How would you find the duplicated number efficiently? Is there an efficient algorithm for this?

Here is my algorithm:

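def find_duplicate(a):
    # Compare every pair (i, j) with i < j: O(n^2) comparisons in the worst case.
    n = len(a)  # 1002 in this question
    for i in range(n):
        for j in range(i + 1, n):
            if a[i] == a[j]:
                return a[i]
    return None  # no duplicate found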
9
  • 4
    What is the range of values of the numbers? Use hashing! It will work in O(n) time! Commented Jun 15, 2014 at 18:42
  • 1
    It depends on the allowed range for the array elements. Commented Jun 15, 2014 at 18:45
  • The number range is not specified as between 1 and 1002. Commented Jun 15, 2014 at 18:47
  • You will still have some range! What is that range? And what is your requirement: a memory-efficient algorithm or a time-efficient one? Commented Jun 15, 2014 at 18:48
  • 2
    Without "efficiency" defined, it's an ambiguous question with no real answer. If I asked this question as an interviewer, I'd be fishing for a candidate to ask what I meant before I cared about an answer. Commented Jun 15, 2014 at 18:57

4 Answers

2

This should work!

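#include <stdio.h>
#include <stdlib.h>

#define N 1002
#define RANGE 1000000001 /* values must lie in [0, RANGE) */

int main(void)
{
  int arr[N]; /* all your numbers */
  long long i;

  /* Placeholder input so the program runs: replace with your own data. */
  for (i = 0; i < N; i++)
    arr[i] = (int)i;
  arr[N - 1] = 7; /* make 7 appear twice */

  /* One flag per possible value. A table this large (about 1 GB) must live
     on the heap, not the stack; calloc returns zeroed memory. */
  unsigned char *hash = calloc(RANGE, sizeof *hash);
  if (hash == NULL) {
    fprintf(stderr, "Could not allocate the lookup table\n");
    return 1;
  }

  for (i = 0; i < N; i++) {
    if (hash[arr[i]] != 0) {
      printf("Duplicate number is: %d\n", arr[i]);
      break;
    }
    hash[arr[i]] = 1; /* mark this value as seen */
  }

  free(hash);
  return 0;
}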

4 Comments

I wouldn't call the identity function a hash function, so the name is a bit misleading :) And you should really use RANGE = 1003
Note that this solution requires at least 1000000001 operations for zeroing out hash, which is about 1000 times more than the original 1000*1001, and cannot cope with numbers greater than 1000000001.
@NiklasB. If I use RANGE = 1003, then I have to limit my values to below 1003, but the OP says no such value range is defined; this is why I used RANGE = 1000000001. And yes, calling it a hash function is a bit misleading! :p
Well in that case you must use a hash table ;)
1

I think the most efficient solution is to use a hash set:

seen = set()  # built-in set; the old `sets` module is gone in Python 3
for x in [1, 2, 3, 4, 5, 2, 3, 1]:
    if x in seen:
        print(x)  # first value seen twice
        break
    seen.add(x)


0

If your values are numbers, you can use radix sort to fill up a buffer and check for an element that appeared twice.
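
A minimal sketch of that idea, assuming the values are non-negative integers. The function names `radix_sort` and `find_duplicate_radix` and the base of 256 are my own choices for illustration. It sorts with an LSD radix sort, then looks for two equal neighbours:

```python
def radix_sort(values, base=256):
    """LSD radix sort for non-negative integers; returns a new list."""
    result = list(values)
    if not result:
        return result
    largest = max(result)
    place = 1
    while place <= largest:
        # Distribute by the current digit; appending keeps the pass stable.
        buckets = [[] for _ in range(base)]
        for v in result:
            buckets[(v // place) % base].append(v)
        result = [v for bucket in buckets for v in bucket]
        place *= base
    return result


def find_duplicate_radix(values):
    """Sort, then return the first value equal to its predecessor (or None)."""
    ordered = radix_sort(values)
    for prev, cur in zip(ordered, ordered[1:]):
        if prev == cur:
            return cur
    return None
```

Each pass is O(n + base), and the number of passes depends on how many digits the largest value has, so this only pays off when the values are not too wide.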


0

Your algorithm isn't bad at all! In the worst case you make n*(n-1)/2 comparisons, meaning a complexity of O(n²).
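
To make that count concrete, here is a small sketch that instruments the nested loop. The helper `count_comparisons` and the test array `worst` are hypothetical, built so that the duplicate pair sits in the last two slots:

```python
def count_comparisons(a):
    """Run the nested-loop search; return (duplicate, comparisons made)."""
    comparisons = 0
    n = len(a)
    for i in range(n):
        for j in range(i + 1, n):
            comparisons += 1
            if a[i] == a[j]:
                return a[i], comparisons
    return None, comparisons


n = 1002
# Worst case: all values distinct except the last two, which are equal.
worst = list(range(n - 1)) + [n - 2]
dup, comparisons = count_comparisons(worst)
print(dup, comparisons, n * (n - 1) // 2)  # prints: 1000 501501 501501
```

For n = 1002 that is 501501 comparisons, which is small in absolute terms even though the growth is quadratic.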

The most favourable condition would be a sorted array. Then you could just loop through it, comparing each element with its predecessor. The worst case is n-1 comparisons, in other words a complexity of O(n).

However, I assume that the array is not sorted. Sorting it would add the cost of the sort. The Quicksort algorithm, which is pretty good here, averages O(n log n) but has a worst case of O(n²). So in the worst case, sorting plus traversing would have a cost comparable to your algorithm.
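
As a sketch of the sort-then-traverse route, here is a version that uses Python's built-in `sorted` rather than quicksort. `sorted` is Timsort, whose worst case is O(n log n), so this variant stays below the quadratic bound. The function name `find_duplicate_by_sorting` is mine:

```python
def find_duplicate_by_sorting(a):
    """Sort a copy, then compare each element with its predecessor."""
    ordered = sorted(a)  # O(n log n); the input list is left untouched
    for i in range(1, len(ordered)):
        if ordered[i] == ordered[i - 1]:
            return ordered[i]
    return None  # no duplicate found
```

The traversal itself is at most n-1 comparisons, so the sort dominates the cost.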

Using a hash... well, it's optimal if memory is not a problem (see the excellent solution from @Nullpointer). The algorithm's cost is that of a simple traversal, which is O(n).

However, in real life you are likely to have memory constraints, meaning a shorter hash table and a hash function with a risk of collisions (for example, the value modulo the size of the table). For this reason you need to store, for each hash value, the list of matching values. In such a situation, the worst case is when all numbers have the same hash H. In this case, you would calculate each hash (a simple O(n) traversal), but when inserting each value you would need to loop through the collision list. A quick calculation shows that you would again make n*(n-1)/2 comparisons, and again have a complexity of O(n²), the same as your original proposal.
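
A rough sketch of that collision scenario, assuming a chained table with the hash h(x) = x modulo the table size. The function `find_duplicate_chained` and the two inputs below are made up for illustration:

```python
def find_duplicate_chained(a, table_size):
    """Chained hash table with h(x) = x % table_size.

    Returns (duplicate, comparisons made while walking collision lists).
    """
    buckets = [[] for _ in range(table_size)]
    comparisons = 0
    for x in a:
        chain = buckets[x % table_size]
        for y in chain:
            comparisons += 1
            if x == y:
                return x, comparisons
        chain.append(x)
    return None, comparisons


n, size = 1002, 64
# Every value is a multiple of `size`, so all of them land in bucket 0.
colliding = [k * size for k in range(n - 1)] + [(n - 2) * size]
# Distinct small values in a table larger than n: no two share a bucket.
spread = list(range(n - 1)) + [n - 2]

print(find_duplicate_chained(colliding, size))
print(find_duplicate_chained(spread, 2048))
```

With the colliding input, the comparison count comes out at n*(n-1)/2 = 501501, while the well-spread input needs a single comparison to find the duplicate.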
