C#, ToDictionary() with ContainsKey Check as parameter [duplicate]

I have a list of Person objects. I want to convert to a Dictionary where the key is the first and last name (concatenated) and the value is the Person object.
The issue is that I have some duplicated people, so this blows up if I use this code:
private Dictionary<string, Person> _people = new Dictionary<string, Person>();
_people = personList.ToDictionary(
e => e.FirstandLastName,
StringComparer.OrdinalIgnoreCase);
I know it sounds weird, but I don't really care about duplicate names for now. If there are multiple people with the same name, I just want to grab one. Is there any way I can write the code above so it just takes one of them and doesn't blow up on duplicates?

LINQ solution:
// Use the first value in group
var _people = personList
.GroupBy(p => p.FirstandLastName, StringComparer.OrdinalIgnoreCase)
.ToDictionary(g => g.Key, g => g.First(), StringComparer.OrdinalIgnoreCase);
// Use the last value in group
var _people = personList
.GroupBy(p => p.FirstandLastName, StringComparer.OrdinalIgnoreCase)
.ToDictionary(g => g.Key, g => g.Last(), StringComparer.OrdinalIgnoreCase);
If you prefer a non-LINQ solution then you could do something like this:
// Use the first value in list
var _people = new Dictionary<string, Person>(StringComparer.OrdinalIgnoreCase);
foreach (var p in personList)
{
if (!_people.ContainsKey(p.FirstandLastName))
_people[p.FirstandLastName] = p;
}
// Use the last value in list
var _people = new Dictionary<string, Person>(StringComparer.OrdinalIgnoreCase);
foreach (var p in personList)
{
_people[p.FirstandLastName] = p;
}

Here's the obvious, non-LINQ solution:
foreach(var person in personList)
{
if(!myDictionary.ContainsKey(person.FirstAndLastName))
myDictionary.Add(person.FirstAndLastName, person);
}
If you don't mind always getting the last one added, you can avoid the double lookup like this:
foreach(var person in personList)
{
myDictionary[person.FirstAndLastName] = person;
}
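On runtimes where Dictionary<TKey, TValue>.TryAdd is available (.NET Core 2.0 and later, .NET Standard 2.1), keeping the first entry does not need the double lookup either. The following is a minimal sketch, assuming a simplified Person class; the Id property and the sample data are made up for illustration:

```csharp
using System;
using System.Collections.Generic;

var personList = new List<Person>
{
    new Person { FirstAndLastName = "Bob Sanders", Id = 1 },
    new Person { FirstAndLastName = "bob sanders", Id = 2 }, // duplicate, differs only in case
    new Person { FirstAndLastName = "Jane Thomas", Id = 3 },
};

var myDictionary = new Dictionary<string, Person>(StringComparer.OrdinalIgnoreCase);
foreach (var person in personList)
{
    // TryAdd returns false and leaves the existing entry untouched
    // when the key is already present, so the first person wins.
    myDictionary.TryAdd(person.FirstAndLastName, person);
}

Console.WriteLine(myDictionary.Count);             // 2
Console.WriteLine(myDictionary["BOB SANDERS"].Id); // 1

// Simplified stand-in for the Person class in the question (Id is hypothetical).
class Person
{
    public string FirstAndLastName { get; set; } = "";
    public int Id { get; set; }
}
```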

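A LINQ solution using Distinct() and no grouping is:
var _people = personList
.Select(item => new { Key = item.Key, FirstAndLastName = item.FirstAndLastName })
.Distinct()
.ToDictionary(item => item.Key, item => item.FirstAndLastName, StringComparer.OrdinalIgnoreCase);
I don't know if it is nicer than LukeH's solution but it works as well. Note that Distinct() on an anonymous type compares every projected property with default (case-sensitive) equality, so ToDictionary() still throws if two distinct projections share the same Key.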

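This should work with a lambda expression, provided Person overrides Equals() and GetHashCode() to compare by name (otherwise Distinct() compares references and the duplicates remain):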
personList.Distinct().ToDictionary(i => i.FirstandLastName, i => i);

You can create an extension method similar to ToDictionary() with the difference being that it allows duplicates. Something like:
public static Dictionary<TKey, TElement> SafeToDictionary<TSource, TKey, TElement>(
this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector,
Func<TSource, TElement> elementSelector,
IEqualityComparer<TKey> comparer = null)
{
var dictionary = new Dictionary<TKey, TElement>(comparer);
if (source == null)
{
return dictionary;
}
foreach (TSource element in source)
{
dictionary[keySelector(element)] = elementSelector(element);
}
return dictionary;
}
In this case, if there are duplicates, then the last value wins.

You can also use the ToLookup LINQ function, which you then can use almost interchangeably with a Dictionary.
var lookup = personList
.ToLookup(e => e.FirstandLastName, StringComparer.OrdinalIgnoreCase);
_people = lookup.ToDictionary(kl => kl.Key, kl => kl.First(), StringComparer.OrdinalIgnoreCase); // Potentially unnecessary
This will essentially do the GroupBy in LukeH's answer, but will give the hashing that a Dictionary provides. So, you probably don't need to convert it to a Dictionary, but just use the LINQ First function whenever you need to access the value for the key.
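To illustrate using the lookup directly instead of converting it, here is a small sketch. The Person class, its Id property and the sample data are assumptions made for the example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var personList = new List<Person>
{
    new Person { FirstAndLastName = "Bob Sanders", Id = 1 },
    new Person { FirstAndLastName = "bob sanders", Id = 2 },
    new Person { FirstAndLastName = "Jane Thomas", Id = 3 },
};

ILookup<string, Person> lookup =
    personList.ToLookup(p => p.FirstAndLastName, StringComparer.OrdinalIgnoreCase);

// All people sharing a key are kept; pick one with First() when needed.
Person bob = lookup["BOB SANDERS"].First();
int bobCount = lookup["Bob Sanders"].Count();

// Unlike a Dictionary, indexing a missing key returns an empty sequence
// instead of throwing KeyNotFoundException.
bool nobodyFound = lookup["Nobody"].Any();

Console.WriteLine($"{bob.Id} {bobCount} {nobodyFound}");

// Simplified stand-in for the Person class in the question (Id is hypothetical).
class Person
{
    public string FirstAndLastName { get; set; } = "";
    public int Id { get; set; }
}
```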

To handle eliminating duplicates, implement an IEqualityComparer<Person> that can be used in the Distinct() method, and then getting your dictionary will be easy.
Given:
class PersonComparer : IEqualityComparer<Person>
{
public bool Equals(Person x, Person y)
{
return x.FirstAndLastName.Equals(y.FirstAndLastName, StringComparison.OrdinalIgnoreCase);
}
public int GetHashCode(Person obj)
{
return obj.FirstAndLastName.ToUpper().GetHashCode();
}
}
class Person
{
public string FirstAndLastName { get; set; }
}
Get your dictionary:
List<Person> people = new List<Person>()
{
new Person() { FirstAndLastName = "Bob Sanders" },
new Person() { FirstAndLastName = "Bob Sanders" },
new Person() { FirstAndLastName = "Jane Thomas" }
};
Dictionary<string, Person> dictionary =
people.Distinct(new PersonComparer()).ToDictionary(p => p.FirstAndLastName, p => p);
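On .NET 6 or later, the built-in DistinctBy() method can stand in for the hand-written comparer. This is only a sketch under that assumption; the Person class is repeated so the example is self-contained:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var people = new List<Person>
{
    new Person { FirstAndLastName = "Bob Sanders" },
    new Person { FirstAndLastName = "bob sanders" }, // same name, different case
    new Person { FirstAndLastName = "Jane Thomas" },
};

// DistinctBy (available since .NET 6) removes elements whose key was already seen.
// The same comparer is passed to ToDictionary so lookups are case-insensitive too.
Dictionary<string, Person> dictionary = people
    .DistinctBy(p => p.FirstAndLastName, StringComparer.OrdinalIgnoreCase)
    .ToDictionary(p => p.FirstAndLastName, StringComparer.OrdinalIgnoreCase);

Console.WriteLine(dictionary.Count); // 2

class Person
{
    public string FirstAndLastName { get; set; } = "";
}
```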

If we want every Person with a given name (instead of only one) in the resulting dictionary, we could keep the whole group as the value:
var _people = personList
.GroupBy(p => p.FirstandLastName)
.ToDictionary(g => g.Key, g => g.Select(x=>x));

The issue with most of the other answers is that they use Distinct, GroupBy or ToLookup, each of which builds an extra hash-based collection under the hood. Likewise, ToUpper creates an extra string.
This is what I did, which is an almost exact copy of Microsoft's code except for one change:
public static Dictionary<TKey, TSource> ToDictionaryIgnoreDup<TSource, TKey>
(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer = null) =>
source.ToDictionaryIgnoreDup(keySelector, i => i, comparer);
public static Dictionary<TKey, TElement> ToDictionaryIgnoreDup<TSource, TKey, TElement>
(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector, Func<TSource, TElement> elementSelector, IEqualityComparer<TKey> comparer = null)
{
if (keySelector == null)
throw new ArgumentNullException(nameof(keySelector));
if (elementSelector == null)
throw new ArgumentNullException(nameof(elementSelector));
var d = new Dictionary<TKey, TElement>(comparer ?? EqualityComparer<TKey>.Default);
foreach (var element in source)
d[keySelector(element)] = elementSelector(element);
return d;
}
Because a set on the indexer adds the key if it is missing and overwrites the value otherwise, it will not throw, and it does only one key lookup. You can also give it an IEqualityComparer, for example StringComparer.OrdinalIgnoreCase.

DataTable DT = new DataTable();
DT.Columns.Add("first", typeof(string));
DT.Columns.Add("second", typeof(string));
DT.Rows.Add("ss", "test1");
DT.Rows.Add("sss", "test2");
DT.Rows.Add("sys", "test3");
DT.Rows.Add("ss", "test4");
DT.Rows.Add("ss", "test5");
DT.Rows.Add("sts", "test6");
var dr = DT.AsEnumerable().GroupBy(S => S.Field<string>("first")).Select(S => S.First()).
Select(S => new KeyValuePair<string, string>(S.Field<string>("first"), S.Field<string>("second"))).
ToDictionary(S => S.Key, T => T.Value);
foreach (var item in dr)
{
Console.WriteLine(item.Key + "-" + item.Value);
}

Using LINQ's equivalent of foldLeft functionality
persons.Aggregate(new Dictionary<string,Person>(StringComparer.OrdinalIgnoreCase),
(acc, current) => {
acc[current.FirstAndLastName] = current;
return acc;
});

Starting from Carra's solution you can also write it as:
foreach(var person in personList.Where(el => !myDictionary.ContainsKey(el.FirstAndLastName)))
{
myDictionary.Add(person.FirstAndLastName, person);
}

Related

Pass expression to initializer

I would like to pass an expression that represents a variable to be used when instantiating an object.
Instead of:
class MyObject : IMyInterface { ... }
var list = db.MyObjects.Where(x => !x.IsDeleted).ToList();
var anotherList = list.Select(x => new AnotherObject() {
Id = x.Id,
Value = x.Value
});
I would like to make this so that a list of objects of IMyInterface can be transformed into another type of list (AnotherObject as example) using defined expressions as so:
var list = db.MyObjects
.Where(x => !x.IsDeleted)
.ToAnotherObjectList(x => x.Id, x => x.Value);
...
public static List<AnotherObject> ToAnotherObjectList<T>(
this IEnumerable<IMyInterface> list,
Expression id,
Expression value)
{
return list.Select(x => new AnotherObject() { Id = id, Value = value }).ToList();
}
I'm not sure how to accomplish this. I know I can use reflection to create objects and set properties by a string but I'm not sure how to pass expressions.
UPDATE
Well, I thought I'd have to do some reflection, but it's simpler than I was thinking. Here's my solution that works in real life.
public static IEnumerable<AnotherObject> ToAnotherObject<T>(this IEnumerable<T> list, Func<T, int> getId, Func<T, string> getValue, Func<T, bool> getSelected = null) where T : IMyInterface
{
return list.Select(x => new AnotherObject {
Display = getValue(x),
Id = getId(x),
Selected = getSelected != null && getSelected(x),
});
}
You could use a Func<TInput,TReturn> for that. For example:
public static List<AnotherObject> ToAnotherObjectList<T>(
this IEnumerable<T> list,
Func<T, int> getId,
Func<T, object> getValue)
{
return list.Select(x => new AnotherObject() { Id = getId(x), Value = getValue(x) }).ToList();
}
Call:
list.ToAnotherObjectList(i => i.Id, i=> i.Value);
In this example I used Funcs with one parameter (of type T) and return type int/object.

LINQ: Group by index and value [duplicate]

This question already has answers here:
linq group by contiguous blocks
(5 answers)
Closed 4 years ago.
Let's say I have a list of strings with the following values:
["a","a","b","a","a","a","c","c"]
I want to execute a LINQ query that will group them into 4 groups:
Group 1: ["a","a"]
Group 2: ["b"]
Group 3: ["a","a","a"]
Group 4: ["c","c"]
Basically I want to create 2 different groups for the value "a" because they do not come from the same "index sequence".
Does anyone have a LINQ solution for this?
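You just need a grouping key other than the items of the array:
var x = new string[] { "a", "a", "a", "b", "a", "a", "c" };
int groupId = -1;
var result = x.Select((s, i) => new
{
value = s,
// Reuse the current group id while the value repeats; otherwise start a new group.
groupId = (i > 0 && x[i - 1] == s) ? groupId : ++groupId
}).GroupBy(u => u.groupId); // Group on the projected id, not on the captured local variable.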
foreach (var item in result)
{
Console.WriteLine(item.Key);
foreach (var inner in item)
{
Console.WriteLine(" => " + inner.value);
}
}
Calculate the "index sequence" first, then do your group.
private class IndexedData
{
public int Sequence;
public string Text;
}
string[] data = { "a", "a", "b" /* ... */ };
// Calculate "index sequence" for each data element.
List<IndexedData> indexes = new List<IndexedData>();
foreach (string s in data)
{
IndexedData last = indexes.LastOrDefault() ?? new IndexedData();
indexes.Add(new IndexedData
{
Text = s,
Sequence = (last.Text == s
? last.Sequence
: last.Sequence + 1)
});
}
// Group by "index sequence"
var grouped = indexes.GroupBy(i => i.Sequence)
.Select(g => g.Select(i => i.Text));
This is a naive foreach implementation where the whole dataset ends up in memory (probably not an issue for you, since GroupBy does the same):
public static IEnumerable<List<string>> Split(IEnumerable<string> values)
{
var result = new List<List<string>>();
foreach (var value in values)
{
var currentGroup = result.LastOrDefault();
if (currentGroup?.FirstOrDefault()?.Equals(value) == true)
{
currentGroup.Add(value);
}
else
{
result.Add(new List<string> { value });
}
}
return result;
}
Here comes a slightly more complicated implementation using foreach and a yield return enumerator state machine, which keeps only the current group in memory. This is probably how it would be implemented at the framework level:
EDIT: This is apparently also the way MoreLINQ does it.
public static IEnumerable<List<string>> Split(IEnumerable<string> values)
{
var currentValue = default(string);
var group = (List<string>)null;
foreach (var value in values)
{
if (group == null)
{
currentValue = value;
group = new List<string> { value };
}
else if (currentValue.Equals(value))
{
group.Add(value);
}
else
{
yield return group;
currentValue = value;
group = new List<string> { value };
}
}
if (group != null)
{
yield return group;
}
}
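As a usage illustration, the following sketch runs a generic variation of the streaming approach against the input from the question. The Runs class name, the generic type parameter and the optional comparer are assumptions added for the example, not part of the answer above:

```csharp
using System;
using System.Collections.Generic;

var input = new[] { "a", "a", "b", "a", "a", "a", "c", "c" };

var groups = new List<string>();
foreach (var run in Runs.Split(input))
{
    groups.Add(string.Join(",", run));
}
Console.WriteLine(string.Join(" | ", groups)); // a,a | b | a,a,a | c,c

static class Runs
{
    // Yields each run of adjacent equal values as its own list.
    // Only the current run is buffered; finished runs are handed to the caller.
    public static IEnumerable<List<T>> Split<T>(
        IEnumerable<T> values, IEqualityComparer<T> comparer = null)
    {
        comparer = comparer ?? EqualityComparer<T>.Default;
        List<T> group = null;
        foreach (var value in values)
        {
            // A value that differs from the current run closes that run.
            if (group != null && !comparer.Equals(group[0], value))
            {
                yield return group;
                group = null;
            }
            if (group == null)
            {
                group = new List<T>();
            }
            group.Add(value);
        }
        if (group != null)
        {
            yield return group;
        }
    }
}
```

Using the comparer rather than calling Equals on the element also avoids a NullReferenceException when the sequence contains null values.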
And this is a joke version using LINQ only, it is basically the same as the first one but is slightly harder to understand (especially since Aggregate is not the most frequently used LINQ method):
public static IEnumerable<List<string>> Split(IEnumerable<string> values)
{
return values.Aggregate(
new List<List<string>>(),
(lists, str) =>
{
var currentGroup = lists.LastOrDefault();
if (currentGroup?.FirstOrDefault()?.Equals(str) == true)
{
currentGroup.Add(str);
}
else
{
lists.Add(new List<string> { str });
}
return lists;
},
lists => lists);
}
Using an extension method based on the APL scan operator, that is like Aggregate but returns intermediate results paired with source values:
public static IEnumerable<KeyValuePair<TKey, T>> ScanPair<T, TKey>(this IEnumerable<T> src, TKey seedKey, Func<KeyValuePair<TKey, T>, T, TKey> combine) {
using (var srce = src.GetEnumerator()) {
if (srce.MoveNext()) {
var prevkv = new KeyValuePair<TKey, T>(seedKey, srce.Current);
while (srce.MoveNext()) {
yield return prevkv;
prevkv = new KeyValuePair<TKey, T>(combine(prevkv, srce.Current), srce.Current);
}
yield return prevkv;
}
}
}
You can create extension methods for grouping by consistent runs:
public static IEnumerable<IGrouping<int, TResult>> GroupByRuns<TElement, TKey, TResult>(this IEnumerable<TElement> src, Func<TElement, TKey> key, Func<TElement, TResult> result, IEqualityComparer<TKey> cmp = null) {
cmp = cmp ?? EqualityComparer<TKey>.Default;
return src.ScanPair(0,
(kvp, cur) => cmp.Equals(key(kvp.Value), key(cur)) ? kvp.Key : kvp.Key + 1)
.GroupBy(kvp => kvp.Key, kvp => result(kvp.Value));
}
public static IEnumerable<IGrouping<int, TElement>> GroupByRuns<TElement, TKey>(this IEnumerable<TElement> src, Func<TElement, TKey> key) => src.GroupByRuns(key, e => e);
public static IEnumerable<IGrouping<int, TElement>> GroupByRuns<TElement>(this IEnumerable<TElement> src) => src.GroupByRuns(e => e, e => e);
public static IEnumerable<IEnumerable<TResult>> Runs<TElement, TKey, TResult>(this IEnumerable<TElement> src, Func<TElement, TKey> key, Func<TElement, TResult> result, IEqualityComparer<TKey> cmp = null) =>
src.GroupByRuns(key, result, cmp).Select(g => g.Select(s => s)); // pass cmp through so a custom comparer is honored
public static IEnumerable<IEnumerable<TElement>> Runs<TElement, TKey>(this IEnumerable<TElement> src, Func<TElement, TKey> key) => src.Runs(key, e => e);
public static IEnumerable<IEnumerable<TElement>> Runs<TElement>(this IEnumerable<TElement> src) => src.Runs(e => e, e => e);
Using the simplest version, you can get an IEnumerable<IGrouping<int, T>>:
var ans1 = src.GroupByRuns();
Or a version that dumps the IGrouping (and its Key) for an IEnumerable:
var ans2 = src.Runs();

Merge two list based on Id [duplicate]

I am taking a union of two lists using Linq to Sql. Using List1 and List2:
var tr = List1.Union(List2).ToList();
Union works fine, but the problem is that it compares every column and removes some of the rows that I want. So I was wondering if there is a way to perform a union based on one column only, let's say id, of each list?
Something Like:
var t = List1.id.Union(List2.id).ToList();
This doesn't work, but I was wondering if there is a way to do this, either with LINQ or T-SQL.
You should use the Union() overload that takes a custom equality comparer, or something like this:
list1.Concat(list2).GroupBy(x => x.DateProperty).Select(m => m.First());
The first solution is certainly more efficient.
Sure, you need a custom IEqualityComparer with Union. I have one that's really dynamic, big block of code incoming though:
public class PropertyEqualityComparer<TObject, TProperty>
: IEqualityComparer<TObject>
{
Func<TObject, TProperty> _selector;
IEqualityComparer<TProperty> _internalComparer;
public PropertyEqualityComparer(Func<TObject, TProperty> propertySelector,
IEqualityComparer<TProperty> innerEqualityComparer = null)
{
_selector = propertySelector;
// Fall back to the default comparer once, so Equals and GetHashCode always agree.
_internalComparer = innerEqualityComparer ?? EqualityComparer<TProperty>.Default;
}
public int GetHashCode(TObject obj)
{
// Hash with the same comparer that Equals uses; hashing the raw property value
// would break custom comparers such as StringComparer.OrdinalIgnoreCase.
return _internalComparer.GetHashCode(_selector(obj));
}
public bool Equals(TObject x, TObject y)
{
return _internalComparer.Equals(_selector(x), _selector(y));
}
}
public static class PropertyEqualityComparer
{
public static PropertyEqualityComparer<TObject, TProperty>
GetNew<TObject, TProperty>(Func<TObject, TProperty> propertySelector)
{
return new PropertyEqualityComparer<TObject, TProperty>
(propertySelector);
}
public static PropertyEqualityComparer<TObject, TProperty>
GetNew<TObject, TProperty>
(Func<TObject, TProperty> propertySelector,
IEqualityComparer<TProperty> comparer)
{
return new PropertyEqualityComparer<TObject, TProperty>
(propertySelector, comparer);
}
}
Now, all you need to do is call Union with that equality comparer (instantiated with a lambda that fits your circumstance):
var tr = List1.Union(List2, PropertyEqualityComparer.GetNew(n => n.Id)).ToList();
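For in-memory lists on .NET 6 or later, the built-in UnionBy() method covers the same case without a custom comparer class. This is a sketch under that assumption; the Item record and the sample data are hypothetical, and it applies to LINQ to Objects rather than to a LINQ to SQL query:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var list1 = new List<Item> { new Item(1, "a"), new Item(2, "b") };
var list2 = new List<Item> { new Item(2, "b from list2"), new Item(3, "c") };

// UnionBy (available since .NET 6) compares elements by the selected key only,
// so rows that differ in other columns are still treated as the same element.
List<Item> tr = list1.UnionBy(list2, x => x.Id).ToList();

Console.WriteLine(string.Join(", ", tr.Select(x => x.Id))); // 1, 2, 3

// Hypothetical row type used only for this example.
record Item(int Id, string Value);
```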
Try something like this:
var List3 = List1.Join(
List2,
l1 => l1.Id,
l2 => l2.Id,
(l1, l2) => new Model
{
Id = l1.Id,
Val1 = l1.Val1, // or another value
Val2 = l2.Val2 // or another value
});
For a more specific answer, please show your model.
Try this:
var merged = new List<Person>(list1);
merged.AddRange(list2.Where(p2 =>
list1.All(p1 => p1.Id != p2.Id)));

Best method to find element which throws exception in ToDictionary()

I have a list of items which is populated by some larger configuration file.
List<TextEntry> localTextEntries;
with elements of type TextEntry:
public class TextEntry
{
public Guid Id { get; set; }
....
This list is converted to a dictionary:
Dictionary<Guid, TextEntry> textEntries;
and this line throws an exception 'Element with same key already exists':
textEntries = localTextEntries.ToDictionary(x => x.Id);
Obviously my list contains two elements with the same Id.
My question: what is the best way to find out which elements cause the exception?
(This would allow me to produce a meaningful error message.)
You can re-write ToDictionary to use your own implementation that includes the key in the exception message:
//TODO come up with a slightly better name
public static Dictionary<TKey, TValue> MyToDictionary<TSource, TKey, TValue>(
this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector,
Func<TSource, TValue> valueSelector,
IEqualityComparer<TKey> comparer)
{
comparer = comparer ?? EqualityComparer<TKey>.Default;
var dictionary = new Dictionary<TKey, TValue>(comparer);
foreach (var item in source)
{
var key = keySelector(item);
try
{
dictionary.Add(key, valueSelector(item));
}
catch (ArgumentException ex)
{
throw new ArgumentException("Duplicate key: " + key, ex);
}
}
return dictionary;
}
You'd want to create overloads without a comparer or value selector, in which default values for those parameters are used.
You may also want to create a new type of Exception that stores the key as a property, rather than including the string value of the key in the exception message (in the event there is no good string representation of the object).
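The idea of a dedicated exception type that carries the key could look roughly like the sketch below. DuplicateKeyException, ToDictionaryChecked and the Text property are hypothetical names chosen for this example:

```csharp
using System;
using System.Collections.Generic;

var duplicateId = Guid.NewGuid();
var localTextEntries = new List<TextEntry>
{
    new TextEntry { Id = duplicateId, Text = "first" },
    new TextEntry { Id = Guid.NewGuid(), Text = "second" },
    new TextEntry { Id = duplicateId, Text = "third" }, // same Id as "first"
};

Guid? reportedKey = null;
try
{
    localTextEntries.ToDictionaryChecked(x => x.Id);
}
catch (DuplicateKeyException ex)
{
    // The offending key is available as a property, not only inside the message.
    reportedKey = (Guid)ex.Key;
    Console.WriteLine(ex.Message);
}

// Hypothetical exception type that stores the duplicate key.
public class DuplicateKeyException : ArgumentException
{
    public object Key { get; }

    public DuplicateKeyException(object key, Exception inner)
        : base("An element with the same key already exists: " + key, inner)
    {
        Key = key;
    }
}

public static class DictionaryExtensions
{
    // Hypothetical helper: like ToDictionary, but reports which key was duplicated.
    public static Dictionary<TKey, TSource> ToDictionaryChecked<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        var dictionary = new Dictionary<TKey, TSource>();
        foreach (var item in source)
        {
            var key = keySelector(item);
            try
            {
                dictionary.Add(key, item);
            }
            catch (ArgumentException ex)
            {
                throw new DuplicateKeyException(key, ex);
            }
        }
        return dictionary;
    }
}

// Simplified stand-in for the TextEntry class in the question (Text is hypothetical).
public class TextEntry
{
    public Guid Id { get; set; }
    public string Text { get; set; } = "";
}
```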
Run this on your collection to get the ones with duplicate entries:
var duplicateEntries = localTextEntries.GroupBy(k => k.Id)
.Where(g => g.Count() > 1)
.Select(g => g.Key);
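To turn the duplicate keys into a readable error message, the groups themselves can be kept instead of only their keys. A minimal sketch, assuming a TextEntry class with a hypothetical Text property and made-up sample data:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var sharedId = Guid.NewGuid();
var localTextEntries = new List<TextEntry>
{
    new TextEntry { Id = sharedId, Text = "Hello" },
    new TextEntry { Id = Guid.NewGuid(), Text = "World" },
    new TextEntry { Id = sharedId, Text = "Hello again" },
};

// Keep the whole group so the message can name the conflicting entries.
var duplicateGroups = localTextEntries
    .GroupBy(e => e.Id)
    .Where(g => g.Count() > 1)
    .ToList();

var messages = duplicateGroups
    .Select(g => $"Id {g.Key} is used by {g.Count()} entries: " +
                 string.Join(", ", g.Select(e => e.Text)))
    .ToList();

foreach (var message in messages)
{
    Console.WriteLine(message);
}

// Simplified stand-in for the TextEntry class in the question (Text is hypothetical).
public class TextEntry
{
    public Guid Id { get; set; }
    public string Text { get; set; } = "";
}
```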
You can also always add an extension method and get distinct values from your source
IEnumerable<TextEntry> distinctList = localTextEntries.Distinctify(x => x.Id);
public static IEnumerable<TSource> Distinctify<TSource, TKey>(this IEnumerable<TSource> inSrc_, Func<TSource, TKey> keyFunct_)
{
var uniqueSet = new HashSet<TKey>();
return inSrc_.Where(tmp => uniqueSet.Add(keyFunct_(tmp)));
}
You can use group by to check for repeated or not repeated items. To create the dictionary run the ToDictionary method on the grouped items:
// get repeated items
var repeated = localTextEntries.GroupBy(t => t.Id).Where(g => g.Count() > 1).Select(i => i);
// get not repeated items
var notRepeated = localTextEntries.GroupBy(t => t.Id).Where(g => g.Count() == 1).Select(i => i.First());
// find not repeated items and create a dictionary
var dictFromNotRepeated = localTextEntries.GroupBy(t => t.Id)
.Where(g => g.Count() == 1)
.ToDictionary(g => g.Key, g => g.First());
Finally, the approach of @Servy worked best for me. I just chose a matching overload and added some better error handling:
public static Dictionary<TKey, TSource> ToDictionary2<TSource, TKey>(
this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector,
IEqualityComparer<TKey> comparer = null)
{
comparer = comparer ?? EqualityComparer<TKey>.Default;
Dictionary<TKey, TSource> dictionary =
new Dictionary<TKey, TSource>(comparer);
foreach (var item in source)
{
var key = keySelector(item);
try
{
dictionary.Add(key, item);
}
catch (Exception ex)
{
string msg = string.Format("Problems with key {0} value {1}",
key,
item);
throw new Exception(msg, ex);
}
}
return dictionary;
}

C# Extension Method for generic GetOnlyKeys from IEnumerable<KeyValuePair<int, string>>

I have an IEnumerable<KeyValuePair<int, string>> and I want only the list of keys, but cast to the needed type (e.g. perhaps short and not int). This is used in a custom generic multi-select control that binds to it, but the database potentially needs 'short' to save.
public static IEnumerable<T> GetKeysOnly<T>(this IEnumerable<KeyValuePair<int, string>> values)
{
Dictionary<int, string> valuesDictionary = values.ToDictionary(i => i.Key, i => i.Value);
List<int> keyList = new List<int>(valuesDictionary.Keys);
// Returns 0 records cuz nothing matches
//List<T> results = keyList.OfType<T>().ToList();
// Throws exception cuz unable to cast any items
//List<T> results = keyList.Cast<T>().ToList();
// Doesn't compile - can't convert int to T here: (T)i
//List<T> results = keyList.ConvertAll<T>(delegate(int i) { return (T)i; });
throw new NotImplementedException();
}
public static IEnumerable<short> GetKeysOnly(this IEnumerable<KeyValuePair<int, string>> values)
{
Dictionary<int, string> valuesDictionary = values.ToDictionary(i => i.Key, i => i.Value);
List<int> keyList = new List<int>(valuesDictionary.Keys);
// Works but not flexible and requires an extension method for each type
List<short> results = keyList.ConvertAll(i => (short)i);
return results;
}
Any advice how to make my generic extension method work?
Thanks!
You want to get only the keys converted to a short?
var myList = valuesDictionary.Select(x => (short)x.Key).ToList();
// A Dictionary can be enumerated like a List<KeyValuePair<TKey, TValue>>
If you want to go to any type, then you would do something like this:
public static IEnumerable<T> ConvertKeysTo<T>(this IEnumerable<KeyValuePair<int, string>> source)
{
return source.Select(x => (T)Convert.ChangeType(x.Key, typeof(T)));
// Will throw an exception if x.Key cannot be converted to typeof(T)!
}
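A short usage sketch of the generic version follows, including the failure case mentioned in the comment. The KeyExtensions class name and the sample values are assumptions for the example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var values = new List<KeyValuePair<int, string>>
{
    new KeyValuePair<int, string>(1, "one"),
    new KeyValuePair<int, string>(2, "two"),
};

List<short> shortKeys = values.ConvertKeysTo<short>().ToList();
List<long> longKeys = values.ConvertKeysTo<long>().ToList();

// A key outside the target type's range makes Convert.ChangeType throw.
var tooBig = new List<KeyValuePair<int, string>>
{
    new KeyValuePair<int, string>(70000, "does not fit in a short"),
};
bool overflowed = false;
try
{
    tooBig.ConvertKeysTo<short>().ToList();
}
catch (OverflowException)
{
    overflowed = true;
}

Console.WriteLine($"{shortKeys.Count} {longKeys.Count} {overflowed}");

public static class KeyExtensions
{
    // Same conversion as in the answer above, repeated so the example is self-contained.
    public static IEnumerable<T> ConvertKeysTo<T>(
        this IEnumerable<KeyValuePair<int, string>> source)
    {
        return source.Select(x => (T)Convert.ChangeType(x.Key, typeof(T)));
    }
}
```

Because Select is lazy, the exception surfaces only when the result is enumerated, here by ToList().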
