Due to a series of blog posts that I’m writing on TFS and MS Cognitive Services, I came across a requirement to identify duplicate values in a dictionary. For example, imagine you had an actual physical dictionary, and you wanted to find all the words that meant the exact same thing. Here’s the set-up for the test:
Dictionary<int, string> test = new Dictionary<int, string>()
{
    { 1, "one" },
    { 2, "two" },
    { 3, "one" },
    { 4, "three" }
};

DisplayDictionary("Initial Collection", test);
I’m outputting to the console at every stage, so here’s the helper method for that:
private static void DisplayDictionary(string title, Dictionary<int, string> test)
{
    Console.WriteLine(title);

    foreach (var it in test)
    {
        Console.WriteLine($"Key: {it.Key}, Value: {it.Value}");
    }
}
Finding Duplicates
LINQ has a method that looks made for this: Intersect. For flat collections it works excellently at finding the elements that two sequences share, but not so well for dictionaries; here was my first attempt:
Dictionary<int, string> intersect = test.Intersect(test)
    .ToDictionary(i => i.Key, i => i.Value);

DisplayDictionary("Intersect", intersect);
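The result is identical to the initial collection. Intersect compares whole key/value pairs, and every pair in the dictionary trivially matches itself, so all four entries come back. In other words, the intersect doesn’t work very well this time (don’t tell Chuck).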
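To make the difference concrete, here is a minimal sketch of what Intersect does on flat collections compared with a dictionary intersected with itself. The class name IntersectDemo and the two string lists are made up for illustration; only the dictionary comes from the set-up above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: IntersectDemo and the two lists are not part of the original test.
public static class IntersectDemo
{
    public static void Main()
    {
        // Two flat collections: Intersect yields the distinct elements found in both.
        var first = new List<string> { "one", "two", "three" };
        var second = new List<string> { "two", "three", "four" };
        Console.WriteLine(string.Join(", ", first.Intersect(second))); // two, three

        // A dictionary is a sequence of KeyValuePair<int, string>.
        // Every pair equals itself, so a self-intersect keeps all entries.
        var test = new Dictionary<int, string>
        {
            { 1, "one" }, { 2, "two" }, { 3, "one" }, { 4, "three" }
        };
        Console.WriteLine(test.Intersect(test).Count()); // 4
    }
}
```

Because the keys are unique, no two pairs in the dictionary are ever equal to each other, so comparing whole pairs can never reveal a repeated value.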
Manual Intersect
The next stage then is to roll your own; a pretty straightforward lambda in the end:
var intersect2 = test
    .Where(i => test.Any(t => t.Key != i.Key && t.Value == i.Value))
    .ToDictionary(i => i.Key, i => i.Value);

DisplayDictionary("Manual Intersect", intersect2);
This works much better: the result contains only the entries with keys 1 and 3, which both have the value "one".
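The Where/Any version compares every entry against every other entry, which is fine for a handful of items. For larger dictionaries, one possible alternative (not what I used above, just a sketch) is to group the entries by value and keep the groups with more than one member. The class DuplicateValues and the method FindDuplicateValues are hypothetical names introduced for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, named for illustration only.
public static class DuplicateValues
{
    // Returns the entries whose value is shared by at least one other key.
    public static Dictionary<int, string> FindDuplicateValues(Dictionary<int, string> source)
    {
        return source
            .GroupBy(entry => entry.Value)          // one group per distinct value
            .Where(group => group.Count() > 1)      // keep only values that repeat
            .SelectMany(group => group)             // flatten back to key/value pairs
            .ToDictionary(entry => entry.Key, entry => entry.Value);
    }

    public static void Main()
    {
        var test = new Dictionary<int, string>
        {
            { 1, "one" }, { 2, "two" }, { 3, "one" }, { 4, "three" }
        };

        // Prints the entries for keys 1 and 3, the two keys that map to "one".
        foreach (var entry in FindDuplicateValues(test))
        {
            Console.WriteLine($"Key: {entry.Key}, Value: {entry.Value}");
        }
    }
}
```

Both approaches return the same entries for the test data; the grouping version simply avoids the nested scan over the dictionary.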