Get last duplicate element in a list - c#

I have a list that contains duplicate items.
List<string> filterList = new List<string>()
{
"postpone", "access", "success", "postpone", "success"
};
I get the output which is postpone, access, success by using
List<string> filter = filterList.Distinct().ToList();
string a = string.Join(",", filter.Select(a => a).ToArray());
Console.WriteLine(a);
I have seen other examples that use GroupBy to get the latest element, but those items have other fields such as an ID. I only have strings, so how can I keep the last occurrence of each item in the list, which would give access, postpone, success? Any suggestions?

One way to do this would be to use the index of each item in the original collection along with GroupBy. For example,
var lastDistinct = filterList.Select((x,index)=> new {Value=x,Index=index})
.GroupBy(x=>x.Value)
.Select(x=> x.Last())
.OrderBy(x=>x.Index)
.Select(x=>x.Value);
var result = string.Join(",",lastDistinct);
Output
access,postpone,success

An OrderedDictionary does this. All you have to do is add your items to it with the logic "if it's already in the dictionary, remove it; then add it". OrderedDictionary preserves insertion order, so removing an earlier entry and re-adding it moves it to the end of the dictionary.
var d = new OrderedDictionary();
filterList.ForEach(x => { if(d.Contains(x)) d.Remove(x); d[x] = null; });
Your d.Keys is now a list of strings
access
postpone
success
OrderedDictionary is in the System.Collections.Specialized namespace.
If you wanted the keys as a CSV, you can use Cast to turn them from object to string
var s = string.Join(",", d.Keys.Cast<string>());

Your input list contains only strings, so using GroupBy doesn't really add anything. In your code, the first line already gives you the distinct list; you only lose the individual items because you call string.Join on the second line. All you need to do is add a line before the join:
List<string> filter = filterList.Distinct().ToList();
string last = filter.LastOrDefault();
string a = string.Join(",", filter.Select(a => a).ToArray());
Console.WriteLine(a);
I suppose you could make your code more terse because you need neither .Select(a => a) nor .ToArray() in your call to string.Join.
GroupBy would be used if you had a list of class/struct/record/tuple items, where you might want to group by a specific key (or keys) rather than using Distinct() on the whole thing. GroupBy is very useful and you should learn that, and also the ToDictionary and ToLookup LINQ helper functionality.

So why shouldn't you return the first occurrence of "postpone"? Because later in the sequence you see the same word "postpone" again. Why would you return the first occurrence of "access"? Because later in the sequence you don't see this word anymore.
So: return a word if the rest of the sequence does not have this word.
This would be easy in LINQ, with recursion, but it is not very efficient: for every word you would have to check the rest of the sequence to see if the word is in the rest.
It would be way more efficient to remember the highest index on which you found a word.
This can be written as an extension method. If you are not familiar with extension methods, see extension methods demystified.
private static IEnumerable<T> FindLastOccurrences<T>(this IEnumerable<T> source)
{
return FindLastOccurrences<T>(source, null);
}
private static IEnumerable<T> FindLastOccurrences<T>(this IEnumerable<T> source,
IEqualityComparer<T> comparer)
{
// TODO: check source not null
if (comparer == null) comparer = EqualityComparer<T>.Default;
Dictionary<T, int> dictionary = new Dictionary<T, int>(comparer);
int index = 0;
foreach (T item in source)
{
// did we already see this T? = is this in the dictionary
if (dictionary.TryGetValue(item, out int highestIndex))
{
// we already saw it at index highestIndex.
dictionary[item] = index;
}
else
{
// it is not in the dictionary, we never saw this item.
dictionary.Add(item, index);
}
++index;
}
// return the keys after sorting by value (which contains the highest index)
return dictionary.OrderBy(keyValuePair => keyValuePair.Value)
.Select(keyValuePair => keyValuePair.Key);
}
So for every item in the source sequence, we check if it is in the dictionary. If not, we add the item as key to the dictionary. The value is the index.
If it is already in the dictionary, then the value was the highest index of where we found this item before. Apparently the current index is higher, so we replace the value in the dictionary.
Finally we order the key value pairs in the dictionary by ascending value, and return only the keys.
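A variation on the same idea, offered only as a sketch: scan the list from the end and keep each item the first time it is seen, which is by definition its last occurrence. The class and method names below are hypothetical, and the sketch assumes the source supports indexing (IReadOnlyList<T>), which the answer above does not require.

```csharp
using System;
using System.Collections.Generic;

public static class LastOccurrenceSketch
{
    // Keeps only the last occurrence of each item, in the order in which
    // those last occurrences appear in the source list.
    public static List<T> LastOccurrences<T>(IReadOnlyList<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        var seen = new HashSet<T>();
        var result = new List<T>();

        // Walk from the end: the first time an item is met here
        // is its last occurrence in the original order.
        for (int i = source.Count - 1; i >= 0; i--)
        {
            if (seen.Add(source[i]))
            {
                result.Add(source[i]);
            }
        }

        result.Reverse(); // restore the original left-to-right order
        return result;
    }
}
```

Called with the list from the question, LastOccurrenceSketch.LastOccurrences(filterList) yields access, postpone, success. This avoids the final sort, at the cost of requiring an indexable source.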

Related

Create dictionary outside and initialize it using LINQ

I have dictionary indices and want to add several keys to it from another dictionary using LINQ.
var indices = new Dictionary<string, int>();
var source = new Dictionary<string, int> { { "1", 1 }, { "2", 2 } };
source.Select(name => indices[name.Key] = 0); // doesn't work
var res = indices.Count; // returns 0
Then I replace Select with Min and everything works as expected, LINQ creates new keys in my dictionary.
source.Min(name => indices[name.Key] = 0); // works!!!
var res = indices.Count; // returns 2
Question
All I want to do is initialize the dictionary without foreach. Why do the dictionary keys disappear when the LINQ query is executed? What iterator or aggregator could I use instead of Min to create keys in a dictionary declared outside of the LINQ query?
Update #1
Decided to go with System.Interactive extension.
Update #2
I appreciate and upvote all answers, but I need to clarify that the purpose of the question is not to copy a dictionary, but to execute some code in a LINQ query. To make more sense of it: I actually have a hierarchical structure of classes with dictionaries, and at some point they need to be synchronized, so I want to create a flat, non-hierarchical dictionary, used for tracking, that includes all hierarchical keys.
class Account
{
Dictionary<string, User> Users;
}
class User
{
Dictionary<string, Activity> Activities;
}
class Activity
{
string Name;
DateTime Time;
}
Now I want to sync all actions by time, so I need a tracker that will help me to align all actions by time, and I don't want to create 3 loops for Account, User, and Activity. Because that would be considered a hierarchical hell of loops, the same as async or callback hell. With LINQ I don't have to create loop inside loop, inside loop, etc.
Accounts.ForEach(
account => account.Value.Users.ForEach(
user => user.Value.Activities.ForEach(
activity => indices[account.Key + user.Key + activity.Key] = 0)));
Also, having loops where it can be replaced with LINQ can be considered as a code smell, not my opinion, but I totally agree, because having too many loops you will probably end up in duplicated code.
https://jasonneylon.wordpress.com/2010/02/23/refactoring-to-linq-part-1-death-to-the-foreach/
You can say that LINQ is used for querying, not for setting a variable, I would say I'm querying ... the KEYS.
Linq is not intended to be used to mutate the elements of a sequence. Rather, it is intended to be used to traverse, filter and project elements of a sequence. In this respect, it is intended to be used more in a "functional programming" style.
As you have discovered, Linq can be used in other than a functional programming style - but by using it in that way you are really misusing it.
Technically, the reason that source.Min() has the effect you were looking for is that it has to visit each of the elements of your sequence in order to determine the minimum element.
Because your selector for Min() has a side-effect (i.e. indices[name.Key] = 0) then a side-effect of finding the minimum value is to add each element's key to indices, but with a value of zero rather than the original value.
(I suspect you might have meant to put indices[name.Key] = name.Value...)
The reason that your use of Select() has no effect is that it has not been used to traverse the sequence - it uses "deferred execution".
You can force it to traverse the sequence by counting the elements, like so:
source.Select(name => indices[name.Key] = 0).Count();
However, that is also counter-intuitive and is a misuse of Linq.
The correct solution is to use foreach. This expresses your intent clearly and unambiguously.
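To make the deferred execution point concrete, here is a small sketch. The class and method names are hypothetical, and it stores name.Value rather than 0, following the guess above about what was intended. It records how many keys exist before and after the Select query is enumerated, then does the same work with a plain foreach.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSketch
{
    // Returns the key counts observed before enumeration, after enumeration,
    // and after a plain foreach into a second dictionary.
    public static int[] Run()
    {
        var source = new Dictionary<string, int> { { "1", 1 }, { "2", 2 } };
        var indices = new Dictionary<string, int>();

        // Building the query runs nothing: the lambda has not been called yet.
        IEnumerable<int> query = source.Select(pair => indices[pair.Key] = pair.Value);
        int beforeEnumeration = indices.Count;

        // Enumerating the query is what finally invokes the lambda per element.
        foreach (int unused in query) { }
        int afterEnumeration = indices.Count;

        // The plain loop states the intent directly, with no hidden side effect.
        var direct = new Dictionary<string, int>();
        foreach (KeyValuePair<string, int> pair in source)
        {
            direct[pair.Key] = pair.Value;
        }

        return new[] { beforeEnumeration, afterEnumeration, direct.Count };
    }
}
```

Run() returns 0, 2, 2: the dictionary stays empty until something enumerates the query.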
An alternative approach is to write an AddRange() extension method for Dictionary like so:
public static class DictionaryExt
{
public static Dictionary<TKey, TValue> AddRange<TKey, TValue>(
this Dictionary<TKey, TValue> self,
IEnumerable<KeyValuePair<TKey, TValue>> items)
{
foreach (var item in items)
{
self[item.Key] = item.Value;
}
return self;
}
}
Then you can just call indices.AddRange(source); to achieve your aim.
Interestingly, the ImmutableDictionary type does already have an AddRange() method that you could use like so:
var indices = ImmutableDictionary.Create<string, int>();
var source = new Dictionary<string, int> { { "1", 1 }, { "2", 2 } };
indices = indices.AddRange(source);
Console.WriteLine(indices.Count);
But I wouldn't recommend you change over to using ImmutableDictionary just so you can use its AddRange().
Also note that ImmutableDictionary is, well, immutable - so you can't just do indices.AddRange(source);; you have to assign the result back as in indices = indices.AddRange(source); (like when you modify a string using ToUpper()).
You wrote:
All I want to do is to initialize dictionary without foreach
Do you want to replace the values in your indices dictionary with the values in source? Use Enumerable.ToDictionary:
indices = source // a Dictionary<string, int> is already a sequence of KeyValuePair<string, int>
.ToDictionary(pair => pair.Key, // the key is the key from original dictionary
pair => pair.Value); // the value is the value from the original
Or do you want to add the values from source to the already existing values in indices? If you don't want a foreach, you'll have to Concat the current contents of indices with the contents of source, then use ToDictionary to create a new Dictionary.
indices = indices
.Concat(source)
.ToDictionary(... etc)
However this would be a waste of processing power.
Consider creating extension functions for Dictionary. See Extension Methods Demystified
public static Dictionary<TKey, TValue> Copy<TKey, TValue>(
this Dictionary<TKey, TValue> source)
{
return source.ToDictionary(x => x.Key, x => x.Value);
}
public static void AddRange<TKey, TValue>(
this Dictionary<TKey, TValue> destination,
Dictionary<TKey, TValue> source)
{
foreach (var keyValuePair in source)
{
destination.Add(keyValuePair.Key, keyValuePair.Value);
// TODO: decide what to do if Key already in Destination
}
}
Usage:
// initialize:
var indices = source.Copy();
// add values:
indices.AddRange(otherDictionary);
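For the hierarchical case described in Update #2, one possible sketch keeps the LINQ part a pure query and builds the dictionary once at the end. It assumes public members on Account, User and Activity (the question shows them without access modifiers), and the FlatIndexSketch class and BuildIndices method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical public versions of the classes from the question.
public class Activity
{
    public string Name = "";
    public DateTime Time;
}

public class User
{
    public Dictionary<string, Activity> Activities = new Dictionary<string, Activity>();
}

public class Account
{
    public Dictionary<string, User> Users = new Dictionary<string, User>();
}

public static class FlatIndexSketch
{
    public static Dictionary<string, int> BuildIndices(Dictionary<string, Account> accounts)
    {
        // Pure query: flattens the three levels into one sequence of keys.
        IEnumerable<string> keys =
            from account in accounts
            from user in account.Value.Users
            from activity in user.Value.Activities
            select account.Key + user.Key + activity.Key;

        // Materialize once; ToDictionary throws ArgumentException if two keys collide.
        return keys.ToDictionary(key => key, key => 0);
    }
}
```

The three from clauses compile to SelectMany calls, so there are no nested loops in the source and no side effects inside the query. Note that plain concatenation can make different key triples collide (for example "a" + "bc" and "ab" + "c"); adding a separator between the parts would avoid that.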

Remove N items from IList where match predicate

I would like to remove N items from an IList collection. Here's what I've got:
public void RemoveSubcomponentsByTemplate(int templateID, int countToRemove)
{
// TaskDeviceSubcomponents is an IList
var subcomponents = TaskDeviceSubcomponents.Where(tds => tds.TemplateID == templateID).ToList();
if (subcomponents.Count < countToRemove)
{
string message = string.Format("Attempted to remove more subcomponents than found. Found: {0}, attempted: {1}", subcomponents.Count, countToRemove);
throw new ApplicationException(message);
}
subcomponents.RemoveRange(0, countToRemove);
}
Unfortunately, this code does not work as advertised. TaskDeviceSubcomponents is an IList, so it doesn't have the RemoveRange method. So, I call .ToList() to instantiate an actual List, but this gives me a duplicate collection with references to the same collection items. This is no good because calling RemoveRange on subcomponents does not affect TaskDeviceSubcomponents.
Is there a simple way to achieve this? I'm just not seeing it.
Unfortunately, I think you need to remove each item individually. I would change your code to this:
public void RemoveSubcomponentsByTemplate(int templateID, int countToRemove)
{
// TaskDeviceSubcomponents is an IList
var subcomponents = TaskDeviceSubcomponents
.Where(tds => tds.TemplateID == templateID)
.Take(countToRemove)
.ToList();
foreach (var item in subcomponents)
{
TaskDeviceSubcomponents.Remove(item);
}
}
Note that it is important to use ToList here so you are not iterating TaskDeviceSubcomponents while removing some of its items. This is because LINQ uses lazy evaluation, so it doesn't iterate over TaskDeviceSubcomponents until you iterate over subcomponents.
Edit: I neglected to only remove the number of items contained in countToRemove, so I added a Take call after the Where.
Edit 2: Specification for the Take()-Method: http://msdn.microsoft.com/en-us/library/bb503062(v=vs.110).aspx
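As an alternative sketch, the removal can also be done in a single pass by index, which avoids building the intermediate list and avoids the second search that IList.Remove(item) performs. The extension method name RemoveFirstMatches is hypothetical, and this assumes the list is not fixed-size (RemoveAt throws on arrays).

```csharp
using System;
using System.Collections.Generic;

public static class ListRemovalSketch
{
    // Removes up to `count` items matching `predicate`, starting from the front.
    // Returns how many items were actually removed.
    public static int RemoveFirstMatches<T>(this IList<T> list, Func<T, bool> predicate, int count)
    {
        if (list == null) throw new ArgumentNullException(nameof(list));
        if (predicate == null) throw new ArgumentNullException(nameof(predicate));

        int removed = 0;
        int i = 0;
        while (i < list.Count && removed < count)
        {
            if (predicate(list[i]))
            {
                list.RemoveAt(i); // the next item slides into index i, so do not advance
                removed++;
            }
            else
            {
                i++;
            }
        }
        return removed;
    }
}
```

In the question's method this could be called as TaskDeviceSubcomponents.RemoveFirstMatches(tds => tds.TemplateID == templateID, countToRemove), and the return value compared against countToRemove if removing too few should be treated as an error.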

find and delete tuple from list of tuples in C# 4.0

I have created a list of tuples:
static List<Tuple<string, string>> Alt;
The user adds to this list:
Alt.Add(new Tuple<string, string>(tbAlt.Text, ""));
What is the best way to find a Tuple based on the first string (i.e. the tbAlt.Text ) and either delete it or modify the second string?
I am new to using Tuples and lists :)
Many thanks for any help!
It appears the first string must be unique, or you would not find a (singular) Tuple.
Why are you using List<Tuple<string, string>>?
Why not Dictionary<string,string>?
Dictionary<TKey, TValue>.ContainsKey is very very fast.
Your list of tuples looks much like a dictionary. Consider using Dictionary<string,string> instead. It already has methods for retrieving a value by key, deleting it, etc.
If there could be multiple values for the same key, you can use the Lookup class.
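A minimal sketch of the dictionary approach, assuming the first string is unique. The class name and the sample keys are made up for illustration; in the question's code the key would be tbAlt.Text.

```csharp
using System.Collections.Generic;

public static class AltDictionarySketch
{
    public static Dictionary<string, string> Run()
    {
        var alt = new Dictionary<string, string>();

        // Add: the indexer inserts the key if it is missing.
        alt["first"] = "";
        alt["second"] = "";

        // Find by key and modify the second string in place.
        if (alt.ContainsKey("first"))
        {
            alt["first"] = "NewText";
        }

        // Delete by key; Remove returns false if the key was not present.
        alt.Remove("second");

        return alt;
    }
}
```

After Run(), the dictionary holds a single entry, "first" mapped to "NewText". Unlike the tuple list, no search through the whole collection and no tuple re-creation is needed.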
You could use List<T>.FindIndex to find the matching index, then replace as needed.
int index = Alt.FindIndex(t => t.Item1 == tbAlt.Text);
if (index != -1)
{
// Modify: tuples are immutable, so replace the element with a new tuple
Alt[index] = Tuple.Create(tbAlt.Text, "NewText");
// Or remove it instead:
Alt.RemoveAt(index);
}
Find tuple based on value of first string (Item1):
var t = Alt.FirstOrDefault(i => i.Item1 == "SomeString");
if(t != null)
{
// delete
Alt.Remove(t);
}
From comments:
You can't modify the value of the second item (Item2), because tuples are immutable, so you'll have to remove the tuple and add a new one.
If you are sure the list contains your item:
Alt.Remove(Alt.First(i => i.Item1 == tbAlt.Text));
However, in order to modify a Tuple, you must create a new one.
if (Alt.Any(i => i.Item1 == tbAlt.Text))
{
Alt.Remove(Alt.First(i => i.Item1 == tbAlt.Text));
Alt.Add(new Tuple<string, string>(tbAlt.Text, "Something New"));
}

How to retrieve first row from dictionary in C#?

How can I retrieve the first row from this dictionary? I have put around 15 records in it.
Dictionary<string, Tuple<string, string>> headINFO =
new Dictionary<string, Tuple<string, string>> { };
You can use headINFO.First(), but unless you know that there always will be at least one entry in headINFO I would recommend that you use headINFO.FirstOrDefault().
More information on the differences between the two available here.
Edit: Added simple examples
Here is a quick example on how to use FirstOrDefault()
var info = headINFO.FirstOrDefault().Value;
Console.WriteLine(info.Item1); // Prints Tuple item1
Console.WriteLine(info.Item2); // Prints Tuple item2
If you want to print multiple items you can use Take(x). In this case we will loop through three dictionary items, but you can easily modify the number to grab more items.
foreach (var info in headINFO.Take(3))
{
Console.WriteLine(info.Value.Item1);
Console.WriteLine(info.Value.Item2);
}
You should also keep in mind that the above foreach does not allow you to modify the values of your Dictionary entries directly.
Edit2: Clarified usage of First() and added clean foreach example
Keep in mind that while First() and FirstOrDefault() will provide you with a single item, they in no way guarantee that it will be the first item added.
Also, if you simply want to loop through all the items you can remove the Take(3) in the foreach loop mentioned above.
foreach (var info in headINFO)
{
Console.WriteLine(info.Key); // Print Dictionary Key
Console.WriteLine(info.Value.Item1); // Prints Tuple Value 1
Console.WriteLine(info.Value.Item2); // Prints Tuple Value 2
}
Dictionaries are unordered, so there is no way to retrieve the first key-value pair you inserted.
You can retrieve an item by calling the LINQ .First() method.
The easiest way is just to use Linq's First extension method:
var firstHead = headINFO.First();
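Or, if you want to be safer, use the FirstOrDefault method. Note that KeyValuePair is a struct, so for an empty dictionary it returns a default KeyValuePair (with a null Key and Value) rather than null: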
var firstHead = headINFO.FirstOrDefault();
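Because KeyValuePair is a struct, FirstOrDefault() on an empty dictionary yields a default pair instead of a null reference, so the result cannot be null-checked directly. A small sketch of one way to handle that follows; the helper name TryGetFirst is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstEntrySketch
{
    // Returns true and the first enumerated entry, or false for an empty dictionary.
    public static bool TryGetFirst(
        Dictionary<string, Tuple<string, string>> headINFO,
        out KeyValuePair<string, Tuple<string, string>> first)
    {
        // For an empty dictionary this is default(KeyValuePair<,>):
        // both Key and Value are null.
        first = headINFO.FirstOrDefault();
        return headINFO.Count > 0;
    }
}
```

The caller can then read first.Value.Item1 only when the method returns true. As noted elsewhere in this thread, "first" here means first in enumeration order, which is not guaranteed to be insertion order.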
If you'd like to loop through all items in the dictionary, try this:
foreach(var head in headINFO)
{
...
}
Try this code.
headINFO.FirstOrDefault();
Try this:
int c=0;
foreach(var item in myDictionary)
{
c++;
if(c==1)
{
myFirstVar = item.Value;
}
else if(c==2)
{
mySecondVar = item.Value;
}
......
}

IEnumerable<T>.Union(IEnumerable<T>) overwrites contents instead of unioning

I've got a collection of items (ADO.NET Entity Framework), and need to return a subset as search results based on a couple different criteria. Unfortunately, the criteria overlap in such a way that I can't just take the collection Where the criteria are met (or drop Where the criteria are not met), since this would leave out or duplicate valid items that should be returned.
I decided I would do each check individually, and combine the results. I considered using AddRange, but that would result in duplicates in the results list (and my understanding is it would enumerate the collection every time - am I correct/mistaken here?). I realized Union does not insert duplicates, and defers enumeration until necessary (again, is this understanding correct?).
The search is written as follows:
IEnumerable<MyClass> Results = Enumerable.Empty<MyClass>();
IEnumerable<MyClass> Potential = db.MyClasses.Where(x => x.Y); //Precondition
int parsed_id;
//For each searchable value
foreach(var selected in SelectedValues1)
{
IEnumerable<MyClass> matched = Potential.Where(x => x.Value1 == selected);
Results = Results.Union(matched); //This is where the problem is
}
//Ellipsed....
foreach(var selected in SelectedValuesN) //Happens to be integer
{
if(!int.TryParse(selected, out parsed_id))
continue;
IEnumerable<MyClass> matched = Potential.Where(x => x.ValueN == parsed_id);
Results = Results.Union(matched); //This is where the problem is
}
It seems, however, that Results = Results.Union(matched) is working more like Results = matched. I've stepped through with some test data and a test search. The search asks for results where the first field is -1, 0, 1, or 3. This should return 4 results (two 0s, a 1 and a 3). The first iteration of the loops works as expected, with Results still being empty. The second iteration also works as expected, with Results containing two items. After the third iteration, however, Results contains only one item.
Have I just misunderstood how .Union works, or is there something else going on here?
Because of deferred execution, by the time you eventually consume Results, it is the union of many Where queries all of which are based on the last value of selected.
So you have
Results = Potential.Where(selected)
.Union(Potential.Where(selected))
.Union(Potential.Where(selected))...
and all the selected values are the same.
You need to create a var currentSelected = selected inside your loop and pass that to the query. That way each value of selected will be captured individually and you won't have this problem.
You can do this much more simply:
Results = SelectedValues.SelectMany(s => Potential.Where(x => x.Value == s));
(this may return duplicates)
Or
Results = Potential.Where(x => SelectedValues.Contains(x.Value));
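The capture behaviour described above can be reproduced in isolation. The sketch below uses hypothetical names and an in-memory array rather than Entity Framework. It deliberately uses a for loop, because since C# 5 the foreach loop variable is a fresh variable on every iteration, whereas a for loop variable is still shared across iterations.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ClosureCaptureSketch
{
    private static readonly int[] Numbers = { 0, 1, 2, 3 };

    // All three filters capture the same loop variable i.
    public static List<int> SharedVariable()
    {
        IEnumerable<int> results = Enumerable.Empty<int>();
        for (int i = 0; i < 3; i++)
        {
            results = results.Union(Numbers.Where(n => n == i));
        }
        // Nothing has run yet; by now i is 3, so every filter tests n == 3.
        return results.ToList();
    }

    // Each filter captures its own copy of the value.
    public static List<int> LocalCopy()
    {
        IEnumerable<int> results = Enumerable.Empty<int>();
        for (int i = 0; i < 3; i++)
        {
            int current = i; // a fresh variable per iteration
            results = results.Union(Numbers.Where(n => n == current));
        }
        return results.ToList();
    }
}
```

SharedVariable() returns a list containing only 3, while LocalCopy() returns 0, 1, 2. The only difference between the two methods is the local copy inside the loop.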
As pointed out by others, your LINQ expression is a closure. This means your variable selected is captured by the LINQ expression in each iteration of your foreach-loop. The same variable is used in each iteration of the foreach, so it will end up having whatever the last value was. To get around this, you will need to declare a local variable within the foreach-loop, like so:
//For each searchable value
foreach(var selected in SelectedValues1)
{
var localSelected = selected;
Results = Results.Union(Potential.Where(x => x.Value1 == localSelected));
}
It is much shorter to just use .Contains():
Results = Results.Union(Potential.Where(x => SelectedValues1.Contains(x.Value1)));
Since you need to query multiple SelectedValues collections, you could put them all inside their own collection and iterate over that as well, although you'd need some way of matching the correct field/property on your objects.
You could possibly do this by storing your lists of selected values in a Dictionary with the name of the field/property as the key. You would use Reflection to look up the correct field and perform your check. You could then shorten the code to the following:
// Store each of your searchable lists here, keyed by field/property name
Dictionary<string, IEnumerable<object>> DictionaryOfSelectedValues = ...;
Type t = typeof(MyClass);
// For each list of searchable values
foreach(var selectedValues in DictionaryOfSelectedValues) // Returns KeyValuePair<TKey, TValue>
{
    // Try to get a property for this key
    PropertyInfo prop = t.GetProperty(selectedValues.Key);
    IEnumerable<object> localSelected = selectedValues.Value;
    if( prop != null )
    {
        Results = Results.Union(Potential.Where(x =>
            localSelected.Contains(prop.GetValue(x, null))));
    }
    else // If it's not a property, check if the entry is for a field
    {
        FieldInfo field = t.GetField(selectedValues.Key);
        if( field != null )
        {
            // FieldInfo.GetValue takes only the target object
            Results = Results.Union(Potential.Where(x =>
                localSelected.Contains(field.GetValue(x))));
        }
    }
}
No, your use of Union is absolutely correct.
The only thing to keep in mind is that it excludes duplicates based on the equality comparer. Do you have sample data?
Okay, I think you are having a problem because Union uses deferred execution.
What happens if you do,
var unionResults = Results.Union(matched).ToList();
Results = unionResults;
