I have a large list of string arrays, and within this List<string[]> there can be arrays that contain all the same values (possibly at different indexes). I want to find and count these duplicate string arrays and end up with a Dictionary<string[], int>, where the int is the count (though if there is a better way than using a dictionary, I would be interested in hearing it). Does anyone have advice on how to achieve this? Any and all input is much appreciated, thanks!

  • Can you give examples of what you are trying to achieve? Commented Mar 22, 2016 at 1:51
  • What is the int key supposed to store? That can't be the count, since you could have many arrays with the same count. Commented Mar 22, 2016 at 1:56
  • Please elaborate your question with sample data. Dictionary<int, string[]> is quite confusing. Commented Mar 22, 2016 at 2:04
  • @bryanmac Ah yes, I was just thinking of the int as the count, but Dictionary<string[], int> would make more sense. Thanks for pointing that out. Commented Mar 22, 2016 at 2:05
  • @Saleem Thanks, I revised the question slightly and will try to have a sample up soon. Commented Mar 22, 2016 at 2:10

3 Answers

You can use LINQ's GroupBy with an IEqualityComparer that compares the string[] arrays regardless of element order:

var items = new List<string[]>() 
    { 
        new []{"1", "2", "3" ,"4" }, 
        new []{"4","3", "2", "1"},
        new []{"1", "2"}
    };

var results = items
        .GroupBy(i => i, new UnorderedEnumerableComparer<string>())
        .ToDictionary(g => g.Key, g => g.Count());

The IEqualityComparer for the unordered comparison:

public class UnorderedEnumerableComparer<T> : IEqualityComparer<IEnumerable<T>>
{
    public bool Equals(IEnumerable<T> x, IEnumerable<T> y)
    {
        return x.OrderBy(i => i).SequenceEqual(y.OrderBy(i => i));
    }
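    // Use the element count as the hash code. This is a valid hash
    // (sequences that are equal always have equal counts) but a weak one:
    // all arrays of the same length collide and are then resolved by Equals.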
    public int GetHashCode(IEnumerable<T> obj)
    {
        return obj.Count();
    }
}
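
If many arrays share the same length, the count-based hash above puts them all in one bucket. A possible refinement, shown here only as a sketch, is to sum the element hash codes: addition is commutative, so the result does not depend on element order. The names UnorderedHashingComparer and DuplicateArrays.CountDuplicates are hypothetical and not part of the original answer; the null checks are also an added assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant of the comparer above; the class name is illustrative.
public class UnorderedHashingComparer<T> : IEqualityComparer<IEnumerable<T>>
{
    public bool Equals(IEnumerable<T> x, IEnumerable<T> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        // Same comparison as the original: sort both sides, then compare pairwise.
        return x.OrderBy(i => i).SequenceEqual(y.OrderBy(i => i));
    }

    public int GetHashCode(IEnumerable<T> obj)
    {
        if (obj == null) return 0;
        // unchecked lets the sum wrap around instead of throwing on overflow.
        unchecked
        {
            int hash = 0;
            foreach (var item in obj)
            {
                // Summing is order-independent, so equal multisets hash equally.
                hash += item == null ? 0 : item.GetHashCode();
            }
            return hash;
        }
    }
}

public static class DuplicateArrays
{
    // Hypothetical helper: counts arrays that hold the same values in any order.
    public static Dictionary<string[], int> CountDuplicates(IEnumerable<string[]> items)
    {
        // IEqualityComparer<in T> is contravariant, so a comparer for
        // IEnumerable<string> can also be used as a comparer for string[].
        IEqualityComparer<string[]> comparer = new UnorderedHashingComparer<string>();
        return items
            .GroupBy(a => a, comparer)
            .ToDictionary(g => g.Key, g => g.Count(), comparer);
    }
}
```

Called on the sample list from this answer, DuplicateArrays.CountDuplicates(items) should return two entries: a count of 2 for the { "1", "2", "3", "4" } group and a count of 1 for { "1", "2" }. Because the same comparer is passed to ToDictionary, a lookup such as counts[new[] { "2", "1" }] also finds the entry regardless of element order.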



If you use the number of occurrences as the dictionary key, you may end up with duplicate keys. I would suggest a Dictionary<string, int> instead, where the key is the string and the value is its number of occurrences. This can be built with LINQ:

var results = items.SelectMany(item=>item)
                   .GroupBy(item=>item)
                   .ToDictionary(g=>g.Key, g=>g.Count()); 

Another approach is to use a Lookup, which maps each key to one or more values:

var lookup = items.SelectMany(item=>item)
                  .GroupBy(item=>item)
                  .ToLookup(c=>c.Count(), c=>c.Key);
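
To make the output of the two queries above concrete, here is a small sketch that wraps them in helper methods. The class name FlattenedCounts and its method names are hypothetical. Note that SelectMany flattens the arrays first, so both queries count individual strings across all arrays rather than whole arrays.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper around the two queries; names are illustrative.
public static class FlattenedCounts
{
    // Maps each distinct string to how often it occurs across all arrays.
    public static Dictionary<string, int> CountStrings(IEnumerable<string[]> items)
    {
        return items
            .SelectMany(item => item)   // flatten string[] elements into one sequence
            .GroupBy(s => s)
            .ToDictionary(g => g.Key, g => g.Count());
    }

    // Maps each occurrence count to the strings that occur that many times.
    public static ILookup<int, string> StringsByCount(IEnumerable<string[]> items)
    {
        return items
            .SelectMany(item => item)
            .GroupBy(s => s)
            .ToLookup(g => g.Count(), g => g.Key);
    }
}
```

For a hypothetical input of { "camera", "lens" } and { "lens", "tripod" }, CountStrings gives camera = 1, lens = 2, tripod = 1, and StringsByCount groups "lens" under key 2 and "camera" and "tripod" under key 1.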

6 Comments

Hmm, I think this is just counting and grouping single strings from all the arrays in the list? It doesn't need to compare individual strings, but rather each array of strings, and group/count the arrays that have all the same string values.
In that case, the second approach (Lookup) should work for you.
Ah, now I get what you mean: you want to group per array and count the duplicates within that array, is that correct?
No, I think your second approach is what I'm looking for; I'm just running into an issue with SelectMany, I think because I actually have the string[]s inside an object. Error CS0411: The type arguments for method 'System.Linq.Enumerable.SelectMany<TSource,TResult>(this System.Collections.Generic.IEnumerable<TSource>, System.Func<TSource,System.Collections.Generic.IEnumerable<TResult>>)' cannot be inferred from the usage. Try specifying the type arguments explicitly
Oh, I got the same results with the second approach. No, there aren't any duplicates within each array. I'm looking to group and count in a scenario like: new [] { "camera", "lens", "tripod" } == new [] { "camera", "tripod", "lens" }

import java.util.Scanner;
public class Q1 {

public static void main(String[] args) {
    System.out.println("String entry here --> ");
    Scanner input = new Scanner(System.in);
    String entry = input.nextLine();
    String[] words = entry.split("\\s");         
    System.out.println(words.length);
    for(int i=0; i<words.length; i++){
        int count = 0;
        if(words[i] != null){
            for(int j=i+1;j<words.length;j++){
                if(words[j] != null){
                    if(words[i].equals(words[j])){
                        words[j] = null;
                        count++;
                    }
                }
                else{
                    continue;
                }
            }
            if(count != 0){
                System.out.println("Count of duplicate " + words[i] + " = " + count );

            }
        }
        else{
            continue;
        }
    }
    input.close();
}
}
