Finding Number of duplicates in an array

Question

I have an array, say 1,3,3,1,2 The output of the code must be 4(2 repetitions of 1 + 2 repetitions of 3=4). How can I do this in C? Here's my attempt.

#include <stdio.h>
int main(){
int n,i,j,temp;
scanf("%d",&n);
int arr[n];
for(i=0;i<n;i++){
  scanf("%d",&arr[i]);
}
for(i=0;i<n;i++){
  int min = i;
  for(j=i+1;j<n;j++){
    if(arr[j]<arr[min]) min=j;
  }
  temp= arr[min];
  arr[min]=arr[i];
  arr[i]=temp;
  
}
  int count=1;
  for(i=0;i<n;i++){
    if(arr[i]==arr[i+1])count++;
    else continue;
  }
  printf("%d",count);
}

Can you explain what is wrong with your attempt? Why are you providing a program that supposedly solves the problem, and then asking how to solve the problem? — paddy
– paddy, Commented Mar 7, 2022 at 7:49
In the last iteration through your last loop when i == n - 1, you are accessing arr[i + 1] == arr[n] out of bounds. Your approach accounts only for actual duplicates, so 1×1 and 1×3. And count should start at 0, not 1. — M Oehm
– M Oehm, Commented Mar 7, 2022 at 7:57

Vlad from Moscow · Accepted Answer · 2022-03-07 09:04:32Z

1

What you need is to change this for loop.

int count=1;
for(i=0;i<n;i++){
  if(arr[i]==arr[i+1])count++;
  else continue;
}

It can look for example the following way

int count = 0;

for ( i = 0; i < n; )
{
    int j = i;
    while ( ++i < n && arr[i-1] == arr[i] );

    if ( !( i - j < 2 ) ) count += i - j;
}

answered Mar 7, 2022 at 9:04

Vlad from Moscow

313k27 gold badges204 silver badges358 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

paddy · Accepted Answer · 2022-03-07 08:29:42Z

It looks like your loop has a couple of problems.

It indexes past the end of the array, which is undefined behavior
It doesn't understand when to count the first item in a group of duplicates

Regarding #1 it's nice to just start your loop at 1 instead of 0 and then check index i-1 against i.

Regarding #2 your code works but only when there's only one number that has duplicates. This is because you started the count at 1. However, when you encounter another group, that assumption breaks down. The simplest way is to just record whether you're starting a new group.

Let's put this all together:

int count = 0;
int first = 1;
for(i = 1; i < n; i++) {
    if (arr[i-1] == arr[i]) {
        count += first + 1;
        first = 0;
    } else {
        first = 1;
    }
}

As for the sorting step, it's using a wildly inefficient algorithm. This is fine for small datasets, but you'll have a problem if you have a very large number of inputs. It would be wise to use something like qsort instead. There are many examples out there for how to do this.

So, your runtime right now is O(N^2). With quicksort it becomes O(N.logN).

You can probably reduce runtime further with something like a hash table that simply stores how many of each value you've found, which you update as they arrive.

If your data ranges are well-defined and small enough, you might also benefit from using a large array instead of a hash table and store a single bit for each possible number representing when a number is seen. Actually for your case you'd need two of these because of the "first in the group" problem. Now, each number that arrives sets the "seen" bit. If it was already seen, set the "duplicate" bit and increment the count. If the "duplicate" bit is not set, increment the count. Now you pretty much guarantee blazing-fast O(N) runtime where testing for and counting a duplicate value is O(1).

Collectives™ on Stack Overflow

Finding Number of duplicates in an array

2 Answers 2

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related