List Duplicates Ignored in LINQ Merge - c#

Following on from this post, is it possible to create a new list of the duplicate records found (and excluded) in the merge? I want to let the user know which records were excluded. The code I have works: it correctly merges the data and excludes any duplicate keys. What I want in addition is to show the excluded keys after the merge.
var fileLocation = @"D:\TFS2010-UK\Merge_Text\PM.INX";
var fileContents =
File.ReadLines(fileLocation, Encoding.Default)
.Select(line => line.Split(','))
.ToDictionary(line => line[0].Replace("\"", ""), line => line[1] + ',' + line[2] + ',' + line[3]);
// define the items to be added...
var newContent = new Dictionary<string, string>
{
{ "XYZ789", "\"XYZ789\",1,123.789" },
{ "GHI456", "\"GHI456\",2,123.456" },
{ "ABC123", "\"ABC123\",1,123.123" }
};
var uniqueElements = fileContents.Concat(newContent.Where(kvp => !fileContents.ContainsKey(kvp.Key)))
.OrderBy(x => x.Key)
.ToDictionary(y => y.Key, z => z.Value);
// overwrite the existing file with the merged, sorted contents...
using (var writer = new StreamWriter(fileLocation))
{
// loop through the data to be written...
foreach (var pair in uniqueElements)
{
// and write it to the file...
writer.WriteLine("\"{0}\",{1}", pair.Key, pair.Value);
}
}
Many thanks. Martin

var removedKeys = newContent.Where(kvp => fileContents.ContainsKey(kvp.Key))
.Select(kvp => kvp.Key);
Simply select the keys in newContent that are already contained in fileContents.
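As a minimal sketch of how the merge and the report of skipped keys could be produced together, the helper below returns both. The names `MergeReport`, `Merge`, `Merged` and `Skipped` are hypothetical and not part of the original code; the skipped keys are materialized with `ToList()` so the result does not change if the source dictionaries are modified later.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class MergeReport
{
    // Hypothetical helper: returns the merged dictionary and the incoming keys that were skipped.
    public static (Dictionary<string, string> Merged, List<string> Skipped) Merge(
        IDictionary<string, string> existing,
        IDictionary<string, string> incoming)
    {
        // Keys in the incoming data that already exist, i.e. the duplicates that are excluded.
        var skipped = incoming.Keys
            .Where(key => existing.ContainsKey(key))
            .ToList();

        // Same merge as in the question: keep existing entries, add only new keys, sort by key.
        var merged = existing
            .Concat(incoming.Where(kvp => !existing.ContainsKey(kvp.Key)))
            .OrderBy(kvp => kvp.Key)
            .ToDictionary(kvp => kvp.Key, kvp => kvp.Value);

        return (merged, skipped);
    }
}
```

Calling `MergeReport.Merge(fileContents, newContent)` would then give the dictionary to write back to the file and the list of keys to show to the user.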

Related

Show duplicates from lists c#

I'm working on a WPF application, and at one point I have to get and show all the duplicates from a string list, with each duplicated string and the number of times it occurs in the list. For example: "The list contains the String 'Hello' 3 times." So far, I'm getting the string successfully, but I can't get the correct number of times it appears in the list.
This is my code so far:
List<String> answerData = new List<String>();
using (MySqlCommand command = new MySqlCommand(query2, conn))
{
using (MySqlDataReader reader = command.ExecuteReader())
{
while (reader.Read())
{
answerData.Add(reader.GetString(0));
}
}
}
var duplicates = answerData
.GroupBy(i => i)
.Where(g => g.Count() > 1)
.Select(g => g.Key);
foreach (var d in duplicates)
{
MessageBox.Show(""+ d + duplicates.Count().ToString()); //Here I tried to get the number
//with Count() but it doesn't work as I thought it would.
}
What should I add/change to get the result I want?
EDIT
As suggested changed my code to the following:
var duplicates = answerData
.GroupBy(i => i)
.Where(g => g.Count() > 1);
foreach (var d in duplicates)
{
MessageBox.Show(d.Key + " " + d.Count().ToString());
}
And now it works smoothly.
Thank you everyone!
Store the actual groups instead of the keys in duplicates:
var duplicates = answerData
.GroupBy(i => i)
.Where(g => g.Count() > 1);
You could then iterate through the groups:
foreach (var d in duplicates)
{
MessageBox.Show(d.Key + " " + d.Count().ToString());
}
This example counts, and therefore iterates, each group twice. Alternatively, you could store objects that contain both the Key and the Count, as suggested by @HimBromBeere.
You just need to return the number within your Select:
var duplicates = answerData
.GroupBy(i => i)
.Select(g => new { Key = g.Key, Count = g.Count() })
.Where(x => x.Count > 1);
Notice that I changed the order of your statements to avoid a duplicate execution of g.Count().
You can do something like this. A Dictionary lets you count every string in a single pass over the list, which avoids grouping and then re-counting each group:
List<String> answerData = new List<String>();
Dictionary<string,int> map = new Dictionary<string, int>();
foreach (var data in answerData)
{
if (map.ContainsKey(data))
{
map[data]++;
}
else
{
map.Add(data, 1);
}
}
foreach (var item in map)
{
if (item.Value > 1)
{
Console.WriteLine("{0} - {1}", item.Key, item.Value);
}
}
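The counting loop above can also be written with `TryGetValue`, which avoids looking each key up twice. This is only a sketch: `DuplicateCounter` and `CountDuplicates` are hypothetical names, and the input is assumed to contain no null strings (a null key would make the dictionary throw).

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DuplicateCounter
{
    // Hypothetical helper: returns each string that occurs more than once, with its count.
    public static Dictionary<string, int> CountDuplicates(IEnumerable<string> items)
    {
        var counts = new Dictionary<string, int>();
        foreach (var item in items)
        {
            // 'seen' is 0 when the key is not present yet.
            counts.TryGetValue(item, out var seen);
            counts[item] = seen + 1;
        }

        // Keep only the entries that occur more than once.
        return counts
            .Where(pair => pair.Value > 1)
            .ToDictionary(pair => pair.Key, pair => pair.Value);
    }
}
```

With the list from the question, `DuplicateCounter.CountDuplicates(answerData)` would give one entry per duplicated string, and each entry could be shown as `pair.Key + " " + pair.Value`.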

c# read lines and store values in dictionaries

I want to read a csv file and store the values in a correct way in dictionaries.
using (var reader = new StreamReader(@"CSV_testdaten.csv"))
{
while (!reader.EndOfStream)
{
string new_line;
while ((new_line = reader.ReadLine()) != null)
{
var values = new_line.Split(",");
g.add_vertex(values[0], new Dictionary<string, int>() { { values[1], Int32.Parse(values[2]) } });
}
}
}
the add_vertex function looks like this:
Dictionary<string, Dictionary<string, int>> vertices = new Dictionary<string, Dictionary<string, int>>();
public void add_vertex(string name, Dictionary<string, int> edges)
{
vertices[name] = edges;
}
The csv file looks like this:
There are multiple lines with the same values[0] (e.g. values[0] is "0"). Instead of overwriting the existing dictionary, each new entry should be added to the dictionary that already exists for values[0] = "0", like this:
g.add_vertex("0", new Dictionary<string, int>() { { "1", 731 } ,
{ "2", 1623 } , { "3" , 1813 } , { "4" , 2286 } , { "5" , 2358 } ,
{ "6" , 1 } , ... });
I want to add all values which have the same ID (in the first column of the csv file) to one dictionary with this ID. But I'm not sure how to do this. Can anybody help?
When we have complex data and we want to query them, Linq can be very helpful:
var records = File
.ReadLines(@"CSV_testdaten.csv")
.Where(line => !string.IsNullOrWhiteSpace(line)) // to be on the safe side
.Select(line => line.Split(','))
.Select(items => new {
vertex = items[0],
key = items[1],
value = int.Parse(items[2])
})
.GroupBy(item => item.vertex)
.Select(chunk => new {
vertex = chunk.Key,
dict = chunk.ToDictionary(item => item.key, item => item.value)
});
foreach (var record in records)
g.add_vertex(record.vertex, record.dict);
Does this work for you?
vertices =
File
.ReadLines(@"CSV_testdaten.csv")
.Select(x => x.Split(','))
.Select(x => new { vertex = x[0], name = x[1], value = int.Parse(x[2]) })
.GroupBy(x => x.vertex)
.ToDictionary(x => x.Key, x => x.ToDictionary(y => y.name, y => y.value));
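An alternative to grouping the lines up front is to make the storage method merge into an existing inner dictionary instead of replacing it. The sketch below assumes a class shaped like the one in the question; `Graph`, `AddEdge` and `EdgesOf` are hypothetical names, and a later line with the same vertex and neighbour is assumed to overwrite the earlier weight.

```csharp
using System.Collections.Generic;

public class Graph
{
    private readonly Dictionary<string, Dictionary<string, int>> vertices =
        new Dictionary<string, Dictionary<string, int>>();

    // Hypothetical replacement for add_vertex: adds one edge without discarding existing ones.
    public void AddEdge(string name, string neighbour, int weight)
    {
        // Create the inner dictionary only the first time this vertex is seen.
        if (!vertices.TryGetValue(name, out var edges))
        {
            edges = new Dictionary<string, int>();
            vertices[name] = edges;
        }

        // Assumption: a repeated (name, neighbour) pair overwrites the earlier weight.
        edges[neighbour] = weight;
    }

    // Read-only view of the edges stored for one vertex.
    public IReadOnlyDictionary<string, int> EdgesOf(string name) => vertices[name];
}
```

Inside the read loop, the call would then be `g.AddEdge(values[0], values[1], int.Parse(values[2]))`, one call per CSV line.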
You can split your code into two parts. The first reads the CSV lines:
public static IEnumerable<(string, string, string)> ReadCsvLines()
{
using (var reader = new StreamReader(@"CSV_testdaten.csv"))
{
while (!reader.EndOfStream)
{
string newLine;
while ((newLine = reader.ReadLine()) != null)
{
var values = newLine.Split(',');
yield return (values[0], values[1], values[2]);
}
}
}
}
and the second groups those lines into the nested dictionary:
var result = ReadCsvLines()
.ToArray()
.GroupBy(x => x.Item1)
.ToDictionary(x => x.Key, x => x.ToDictionary(t => t.Item2, t => int.Parse(t.Item3)));
With your input result would be:

c# - Remove last n elements from Dictionary

How do I remove the last 2 keyValuePairs from a Dictionary of string, string where the key starts with "MyKey_" ?
var myDict = new Dictionary<string, string>();
myDict.Add("SomeKey1", "SomeValue");
myDict.Add("SomeKey2", "SomeValue");
myDict.Add("MyKey_" + Guid.NewGuid(), "SomeValue");
myDict.Add("MyKey_" + Guid.NewGuid(), "SomeValue");
myDict.Add("MyKey_" + Guid.NewGuid(), "SomeValue");
EDIT:
var noGwInternal = myDict.Where(o => !o.Key.StartsWith("MyKey_")).ToDictionary(o => o.Key, o => o.Value);
var gwInternal = myDict.Where(o => o.Key.StartsWith("MyKey_")).ToDictionary(o => o.Key, o => o.Value);
How to move forward from here? Need to remove 2 of the items from gwInternal and then put noGwInternal + gwInternal into a new Dictionary together
I'm not sure what you mean by 'last', since a Dictionary has no guaranteed order, but this code removes the last 2 matching keys in the order they are encountered when the dictionary is enumerated.
List<string> toRemove = new List<string>();
foreach (KeyValuePair<string, string> pair in myDict.Reverse())
{
if (pair.Key.StartsWith("MyKey_"))
{
// remember the key; removing while enumerating is not allowed
toRemove.Add(pair.Key);
}
if(toRemove.Count == 2)
{
break;
}
}
foreach(string str in toRemove)
{
myDict.Remove(str);
}
This should do what you're trying to do (according to what you posted in your comment).
(edit: It looks like you replaced your comment, now I'm not sure you're going for alphabetical...)
var myDict = new Dictionary<string, string>();
myDict.Add("SomeKey1", "SomeValue");
myDict.Add("SomeKey2", "SomeValue");
myDict.Add("MyKey_B" + Guid.NewGuid(), "SomeValue");
myDict.Add("MyKey_A" + Guid.NewGuid(), "SomeValue");
myDict.Add("MyKey_C" + Guid.NewGuid(), "SomeValue");
var pairsToRemove = myDict.Where(x => x.Key.StartsWith("MyKey_"))
.OrderByDescending(x => x.Key)
.Take(2);
foreach (var pair in pairsToRemove)
{
myDict.Remove(pair.Key);
}
foreach (var pair in myDict)
{
Console.WriteLine(pair);
}
Output: (MyKey_B and MyKey_C are removed)
[SomeKey1, SomeValue]
[SomeKey2, SomeValue]
[MyKey_Ad6c3a25d-5d8c-44e4-9651-39164c0496fc, SomeValue]
I like what tevemadar mentioned about the OrderedDictionary... I'm not sure it'll work for what you're trying to do, but it's worth a look.
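If "last" is meant as "most recently added", one option is to record the insertion order yourself, because a plain Dictionary does not guarantee it. The sketch below assumes the caller appends every key to a separate list when adding to the dictionary; `LastInserted` and `RemoveLast` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LastInserted
{
    // Hypothetical helper: removes the 'count' most recently added keys that start with 'prefix'.
    // Assumption: insertionOrder lists every key of dict in the order it was added.
    public static void RemoveLast(
        Dictionary<string, string> dict,
        List<string> insertionOrder,
        string prefix,
        int count)
    {
        // Collect the keys first so neither collection is modified while it is enumerated.
        var toRemove = insertionOrder
            .Where(key => key.StartsWith(prefix, StringComparison.Ordinal))
            .Reverse()
            .Take(count)
            .ToList();

        foreach (var key in toRemove)
        {
            dict.Remove(key);
            insertionOrder.Remove(key);
        }
    }
}
```

For the example in the question, `LastInserted.RemoveLast(myDict, order, "MyKey_", 2)` would leave the two SomeKey entries and the first MyKey_ entry, where `order` is the assumed list of keys in insertion order.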

C# Get dictionary first keys of duplicated values

I've got the following Dictionary:
Dictionary<int, int> myDict = new Dictionary<int, int>();
myDict.Add(0, 6);
myDict.Add(1, 10);
myDict.Add(2, 6);
myDict.Add(3, 14);
myDict.Add(4, 10);
myDict.Add(5, 10);
I already know how to get all the duplicated values:
var duplicatedValues = myDict.GroupBy(x => x.Value).Where(x => x.Count() > 1);
But what I want instead is the following: a list with all the keys of the duplicated values, excluding the last key for each duplicated value. In my dictionary above the duplicated values are 10 and 6, and what I want is a list of the following keys: 0, 1, 4 (so excluding 2 and 5).
What is the best way to do this?
Any help would be appreciated. Thanks in advance.
EDIT: I did manage to do it with this piece of code by modifying something I found on the internet, but to be honest I find it a bit dumb to first create a string from the keys and then back into ints. I'm kinda new to the Aggregate-command, so any help how to modify the following code would be welcome:
var lookup = allIDs.ToLookup(x => x.Value, x => x.Key).Where(x => x.Count() > 1);
foreach (var item in lookup) {
var keys = item.Aggregate("", (s, v) => s + "," + v);
string[] split = keys.Split(',');
for (int i = 0; i < split.Length - 1; i++) {
if (!split[i].Equals("")) {
Console.WriteLine("removing card nr: " + split[i]);
CurrentField.removeCardFromField(Convert.ToInt32(split[i]));
}
}
}
This should do it:
var firstKeysOfDupeValues = myDict
.GroupBy(x => x.Value)
.SelectMany(x => x.Reverse().Skip(1))
.Select(p => p.Key)
.ToList();
After grouping by value, the last key for each value group is rejected using .Reverse().Skip(1) (this construct serves double duty: it also rejects the single keys of non-duplicated values) and the keys of the remaining key/value pairs are extracted into the result.
You could use
var allButLastDupKeys = myDict.GroupBy(kv => kv.Value)
.Where(g => g.Count() > 1)
.SelectMany(g => g.Take(g.Count() - 1).Select(kv => kv.Key));
string dupKeys = string.Join(",", allButLastDupKeys); // 0,1,4
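Both answers rely on the order in which the dictionary is enumerated, which a Dictionary does not guarantee. As a sketch, assuming "last" means "highest key", the pairs can be sorted by key before grouping; `DuplicateKeys` and `AllButLastKeyPerDuplicatedValue` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DuplicateKeys
{
    // Hypothetical helper: for every value, returns all of its keys except the highest one.
    public static List<int> AllButLastKeyPerDuplicatedValue(IDictionary<int, int> dict)
    {
        return dict
            .OrderBy(kv => kv.Key)              // make "last" well defined: the highest key
            .GroupBy(kv => kv.Value)
            .SelectMany(g =>
            {
                var keys = g.Select(kv => kv.Key).ToList();
                // A value that occurs once yields Take(0), i.e. nothing.
                return keys.Take(keys.Count - 1);
            })
            .ToList();
    }
}
```

For the dictionary in the question this gives the keys 0, 1 and 4.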

Checking a dictionary if there are same values for a key

I have a dictionary object like this:
CustomKeys<int, string>
eg;
1000, F1
1001, F2
1002, F1
1003, F4
1004, F2
I want to know if I have more than 1 of same values in this dictionary. I would also want to keep a note of which keys(unique id) has duplicates.
Is that possible?
It is possible: use GroupBy and then Count() > 1 to keep track of which values have duplicates.
var q = dic.GroupBy(x => x.Value)
.Select (x => new { Item = x, HasDuplicates = x.Count() > 1 });
You can find all the values that are shared by more than one key like this:
Dictionary<int, string> d = new Dictionary<int, string>();
d.Add(1000, "F1");
d.Add(1001, "F2");
d.Add(1002, "F1");
d.Add(1003, "F4");
d.Add(1004, "F2");
var duplicate = d.ToLookup(x => x.Value, x => x.Key).Where(x => x.Count() > 1);
foreach (var i in duplicate)
{
Console.WriteLine(i.Key);
}
But if you only want a boolean that tells you whether a value has duplicates, look at Magnus's answer, which is great.
I'm not sure what you mean by "keeping note of which has duplicate values". If you mean keeping note of the keys, you could do this:
var keys = new Dictionary<int, string>();
keys.Add(1000, "F1");
keys.Add(1001, "F2");
keys.Add(1002, "F1");
keys.Add(1003, "F4");
keys.Add(1004, "F2");
var duplicates = keys.GroupBy(i => i.Value).Select(i => new
{
keys = i.Select(x => x.Key),
value = i.Key,
count = i.Count()
});
foreach (var duplicate in duplicates)
{
Console.WriteLine("Value: {0} Count: {1}", duplicate.value, duplicate.count);
foreach (var key in duplicate.keys)
{
Console.WriteLine(" - {0}", key);
}
}
If you mean keeping track of the duplicate values only, see Sonor's answer.
Another solution could be:
var duplicates = dictionary.GroupBy( g => g.Value )
.Where( x => x.Count( ) > 1 )
.Select( x => new { Item = x.First( ), Count = x.Count( ) } )
.ToList( );
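To keep a note of which keys share a value, the grouping can also be turned into a lookup table from each duplicated value to its keys. This is a minimal sketch; `DuplicateValues` and `KeysByDuplicatedValue` are hypothetical names, and the keys are sorted only to make the result deterministic.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DuplicateValues
{
    // Hypothetical helper: maps every value that occurs more than once to the keys that carry it.
    public static Dictionary<string, List<int>> KeysByDuplicatedValue(IDictionary<int, string> dict)
    {
        return dict
            .GroupBy(kv => kv.Value, kv => kv.Key)   // group the keys by their value
            .Where(g => g.Count() > 1)               // keep only duplicated values
            .ToDictionary(g => g.Key, g => g.OrderBy(key => key).ToList());
    }
}
```

For the sample data in the question, the result would contain F1 with the keys 1000 and 1002, and F2 with the keys 1001 and 1004. An empty result means there are no duplicated values.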
