How to validate if a collection contains all unique objects - c#

I have a C# collection of objects that do not implement IEquatable or IComparable. I want to check whether the collection contains duplicate objects, i.e. I want to know whether Object.ReferenceEquals(x, y) is false for every pair of distinct elements x and y in my list.
How would I do that efficiently?
It would be nice with both a C# and a LINQ method.

Non-LINQ, when your collection implements ICollection<T> or ICollection:
bool allItemsUnique =
new HashSet<YourType>(yourCollection).Count == yourCollection.Count;
Non-LINQ, when your collection doesn't implement ICollection<T> or ICollection. (This version has slightly better theoretical performance than the first because it will break out early as soon as a duplicate is found.)
bool allItemsUnique = true;
var tempSet = new HashSet<YourType>();
foreach (YourType obj in yourCollection)
{
if (!tempSet.Add(obj))
{
allItemsUnique = false;
break;
}
}
LINQ. (This version's best case performance -- when your collection implements ICollection<T> or ICollection -- will be roughly the same as the first non-LINQ solution. If your collection doesn't implement ICollection<T> or ICollection then the LINQ version will be less efficient.)
bool allItemsUnique =
yourCollection.Distinct().Count() == yourCollection.Count();
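Since the question asks specifically about Object.ReferenceEquals, here is a minimal sketch of the early-exit loop above with an identity-only comparer. It assumes the elements are reference types; ReferenceComparer and AllReferencesUnique are hypothetical names introduced for illustration. Without such a comparer, HashSet<T> falls back to Equals/GetHashCode, which is reference equality only for types that don't override them.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical helper: compares objects by identity only, ignoring any Equals override.
public sealed class ReferenceComparer<T> : IEqualityComparer<T> where T : class
{
    public bool Equals(T x, T y) => ReferenceEquals(x, y);

    // RuntimeHelpers.GetHashCode ignores overridden GetHashCode implementations.
    public int GetHashCode(T obj) => RuntimeHelpers.GetHashCode(obj);
}

public static class UniquenessCheck
{
    // Returns true when no object instance appears twice in the sequence.
    public static bool AllReferencesUnique<T>(IEnumerable<T> source) where T : class
    {
        var seen = new HashSet<T>(new ReferenceComparer<T>());
        foreach (T item in source)
        {
            if (!seen.Add(item))
                return false; // the same instance was already seen
        }
        return true;
    }
}
```

With this comparer, two distinct string instances holding the same text count as unique, while the same instance listed twice does not.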

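I would suggest using
collection.GroupBy(x=>x).Any(x=>x.Count() != 1)
Note that this expression is true when the collection contains a duplicate, so negate it to get an "all unique" flag. GroupBy has to read the whole collection before it yields its first group, so only the Any over the groups stops early, at the first group with more than one element.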

Related

System.InvalidOperationException in my c# project

My project is about a class Account and two child classes (current account and deposit account).
In Main I created an ArrayList of accounts,
but I'm trying to delete an object in this method:
public static void Remove(ArrayList L, int accnb)
{
foreach(Account obj in L)
{
if(obj.AccN == accnb)
L.Remove(obj);
}
}
but I got an error: Collection was modified; enumeration operation may not execute.
All other methods, like add or return string, worked fine.
Don't remove elements while iterating over the collection with a foreach.
Also, I'd recommend using List<T> rather than ArrayList.
An easier way to solve the task at hand is to simply do:
public static void Remove(List<Account> L, int accnb) =>
L.RemoveAll(obj => obj.AccN == accnb);
foreach does not actually work on collections directly, but on enumerators. Every collection can hand out an enumerator through GetEnumerator(), and the enumerator's rules still apply.
So basically your code is interpreted as:
IEnumerator temp = L.GetEnumerator();
while (temp.MoveNext())
{ Account obj = (Account)temp.Current; /* loop body */ }
All enumerators have the rule that you cannot change the underlying collection. Doing so will (has to) invalidate the enumerator, which then throws said exception on the next call to MoveNext().
Since you cannot change a collection while an enumerator is active on it, and foreach uses only enumerators under the hood, you cannot change the collection (including removing elements) inside a foreach. You need one of the much wordier loops to do this.
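One way to keep the foreach syntax is to enumerate a snapshot of the list and remove from the original. This is a different technique from the loops discussed in this thread, sketched here under the assumption that the list is a List<Account>; the Account class below is a minimal stand-in for the one in the question.

```csharp
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for the Account class from the question.
public class Account
{
    public int AccN { get; set; }
}

public static class AccountOps
{
    public static void Remove(List<Account> accounts, int accnb)
    {
        // ToList() copies the references, so the enumerator walks the copy
        // while Remove changes only the original list.
        foreach (Account acc in accounts.ToList())
        {
            if (acc.AccN == accnb)
                accounts.Remove(acc);
        }
    }
}
```

The copy costs extra memory and each Remove is a linear search, so for large lists RemoveAll is likely the better choice.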
The items in an ArrayList are typed as object. Therefore, C# does not know that they have a member named AccN.
There is a strongly typed, generic equivalent of ArrayList named List<T>. Here you specify the type of the list items explicitly when you create the list with
List<Account> accounts = new List<Account>();
This list can also contain objects of the derived classes CurrentAccount and DepositAccount. Use it like this
public static void Remove(List<Account> L, int accnb)
{
foreach(Account acc in L)
{
if(acc.AccN == accnb)
L.Remove(acc);
}
}
Note: in C# 1.0 and C# 1.1 there were no generics. Therefore, the weakly typed collection ArrayList was implemented. Since generics were introduced in C# 2.0, this type is mostly obsolete.
With ArrayList you would have to cast the object to the right type to make it work
if(((Account)acc).AccN == accnb)
You also have another problem. You cannot change the very collection you are enumerating with foreach, because this confuses foreach. Use a for-loop instead and make sure you loop in reverse order to not change the indexes of the items ahead when removing items.
for (int i = L.Count - 1; i >= 0; i--) {
if (L[i].AccN == accnb) {
L.RemoveAt(i);
}
}
The C# Reference says:
The foreach statement is used to iterate through the collection to get the information that you want, but can not be used to add or remove items from the source collection to avoid unpredictable side effects. If you need to add or remove items from the source collection, use a for loop.

ICollection - check if a collection contains an object

Knowing that the non-generic ICollection doesn't offer a Contains method, what's the best way to check if a given object already is in a collection?
If I had two ICollections: A and B and wanted to check if B has all elements of A, what would be the best way to accomplish that? My first thought is adding all elements of A to a HashSet and then checking if all B's elements are in the set using Contains.
If I had two ICollections A and B and wanted to check if B has all elements of A, what would be the best way to accomplish that?
Let me rephrase your question in the language of sets.
If I had two sets A and B and wanted to check if A is a subset of B, what would be the best way to accomplish that?
Now it becomes easy to see the answer:
https://msdn.microsoft.com/en-us/library/bb358446%28v=vs.110%29.aspx?f=255&MSPPError=-2147217396
Construct a HashSet<T> from A and then use the IsSubsetOf method to see if A is a subset of B.
I note that if these are the sorts of operations you must perform frequently, then you should keep your data in HashSet<T> collections to begin with. The IsSubsetOf operation is possibly more efficient if both collections are hash sets.
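A minimal sketch of the IsSubsetOf approach for two non-generic ICollections, assuming the elements compare correctly through their own Equals and GetHashCode. SubsetCheck and ContainsAll are hypothetical names introduced for illustration.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class SubsetCheck
{
    // True when every element of a also occurs in b, i.e. A is a subset of B.
    public static bool ContainsAll(ICollection b, ICollection a)
    {
        // Cast<object>() bridges the non-generic ICollection to IEnumerable<object>.
        var setA = new HashSet<object>(a.Cast<object>());
        return setA.IsSubsetOf(b.Cast<object>());
    }
}
```

An empty A is a subset of every B, so the method returns true in that case.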
A and B and wanted to check if B has all elements of A
I think you have it backwards. Add the B to the HashSet.
HashSet.Contains is O(1)
Overall it will be O(n + m)
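Assuming the elements are strings (Cast<string>() needs using System.Linq):
HashSet<string> hashSetB = new HashSet<string>(iCollectionB.Cast<string>());
foreach (string s in iCollectionA.Cast<string>())
{
if (hashSetB.Contains(s))
{
// s is in B
}
else
{
// s is missing from B, so B does not contain all of A
}
}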
Boolean ICollectionContains(ICollection collection, Object item)
{
foreach (Object o in collection)
{
if (o == item)
return true;
}
return false;
}
Or in extension form:
public static class CollectionExtensions
{
public static Boolean Contains(this ICollection collection, Object item)
{
foreach (Object o in collection)
{
if (o == item)
return true;
}
return false;
}
}
With usage:
ICollection turboEncabulators = GetSomeTrunnions();
if (turboEncabulators.Contains(me))
Environment.FailFast("How did you find me!");

Does IEnumerable always imply a collection?

Just a quick question regarding IEnumerable:
Does IEnumerable always imply a collection? Or is it legitimate/viable/okay/whatever to use on a single object?
The IEnumerable and IEnumerable<T> interfaces suggest a sequence of some kind, but that sequence doesn't need to be a concrete collection.
For example, where's the underlying concrete collection in this case?
foreach (int i in new EndlessRandomSequence().Take(5))
{
Console.WriteLine(i);
}
// ...
public class EndlessRandomSequence : IEnumerable<int>
{
public IEnumerator<int> GetEnumerator()
{
var rng = new Random();
while (true) yield return rng.Next();
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
It is always and mandatory that IEnumerable is used on a single object - the single object is always the holder or producer of zero or more other objects that do not necessarily have any relation to IEnumerable.
It's usual, but not mandatory, that IEnumerable represents a collection.
Enumerables can be collections, as well as generators, queries, and even computations.
Generator:
IEnumerable<int> Generate(
int initial,
Func<int, bool> condition,
Func<int, int> iterator)
{
var i = initial;
while (true)
{
yield return i;
i = iterator(i);
if (!condition(i))
{
yield break;
}
}
}
Query:
IEnumerable<Process> GetProcessesWhereNameContains(string text)
{
// Could be web-service or database call too
var processes = System.Diagnostics.Process.GetProcesses();
foreach (var process in processes)
{
if (process.ProcessName.Contains(text))
{
yield return process;
}
}
}
Computation:
IEnumerable<double> Average(IEnumerable<double> values)
{
var sum = 0.0;
var count = 0;
foreach (var value in values)
{
sum += value;
yield return sum/++count;
}
}
LINQ is itself a series of operators that produce objects that implement IEnumerable<T> that don't have any underlying collections.
Good question, BTW!
NB: Any reference to IEnumerable also applies to IEnumerable<T> as the latter inherits the former.
Yes, IEnumerable implies a collection, or possible collection, of items.
The name is derived from enumerate, which means to:
Mention (a number of things) one by one.
Establish the number of.
According to the docs, it exposes the enumerator over a collection.
You can certainly use it on a single object, but this object will then just be exposed as an enumeration containing a single object, i.e. you could have an IEnumerable<int> with a single integer:
IEnumerable<int> items = new[] { 42 };
IEnumerable represents a collection that can be enumerated, not a single item. Look at MSDN; the interface exposes GetEnumerator(), which
...[r]eturns an enumerator that iterates through a collection.
Yes, IEnumerable always implies a collection, that is what enumerate means.
What is your use case for a single object?
I don't see a problem with using it on a single object, but why do you want to do this?
I'm not sure whether you mean a "collection" or a .NET "ICollection" but since other people have only mentioned the former I will mention the latter.
http://msdn.microsoft.com/en-us/library/92t2ye13.aspx
By that definition, all ICollections are IEnumerable, but not the other way around.
But most data structures (even Array) simply implement both interfaces.
Going on this train of thought: you could have a car depot (a single object) that does not expose an internal data structure, and put IEnumerable on it. I suppose.

How can I add an item to a IEnumerable<T> collection?

My question as title above. For example
IEnumerable<T> items = new T[]{new T("msg")};
items.ToList().Add(new T("msg2"));
but after all it only has 1 item inside. Can we have a method like items.Add(item) like the List<T>?
You cannot, because IEnumerable<T> does not necessarily represent a collection to which items can be added. In fact, it does not necessarily represent a collection at all! For example:
IEnumerable<string> ReadLines()
{
string s;
do
{
s = Console.ReadLine();
yield return s;
} while (!string.IsNullOrEmpty(s));
}
IEnumerable<string> lines = ReadLines();
lines.Add("foo") // so what is this supposed to do??
What you can do, however, is create a new IEnumerable object (of unspecified type), which, when enumerated, will provide all items of the old one, plus some of your own. You use Enumerable.Concat for that:
items = items.Concat(new[] { "foo" });
This will not change the array object (you cannot insert items into arrays, anyway). But it will create a new object that will list all items in the array, and then "foo". Furthermore, that new object will keep track of changes in the array (i.e. whenever you enumerate it, you'll see the current values of items).
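The "keeps track of changes" behaviour described above follows from Concat being evaluated lazily each time the result is enumerated. A small sketch, with ConcatDemo and Run as hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConcatDemo
{
    // Enumerates the concatenated sequence twice, changing the source array in between.
    public static List<string> Run()
    {
        string[] array = { "a" };
        IEnumerable<string> items = array.Concat(new[] { "foo" });

        List<string> firstPass = items.ToList();   // a, foo
        array[0] = "changed";                      // modify the source array afterwards
        List<string> secondPass = items.ToList();  // changed, foo

        return firstPass.Concat(secondPass).ToList();
    }
}
```

Calling ToList() or ToArray() on the result takes a snapshot if the later changes should not show through.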
The type IEnumerable<T> does not support such operations. The purpose of the IEnumerable<T> interface is to allow a consumer to view the contents of a collection. Not to modify the values.
When you do operations like .ToList().Add() you are creating a new List<T> and adding a value to that list. It has no connection to the original list.
What you can do is use the Add extension method to create a new IEnumerable<T> with the added value.
items = items.Add("msg2");
Even in this case it won't modify the original IEnumerable<T> object. This can be verified by holding a reference to it. For example
var items = new string[]{"foo"};
var temp = items;
items = items.Add("bar");
After this set of operations the variable temp will still only reference an enumerable with a single element "foo" in the set of values while items will reference a different enumerable with values "foo" and "bar".
EDIT
I constantly forget that Add is not a typical extension method on IEnumerable<T> because it's one of the first ones that I end up defining. Here it is
public static IEnumerable<T> Add<T>(this IEnumerable<T> e, T value) {
foreach ( var cur in e) {
yield return cur;
}
yield return value;
}
Have you considered using the ICollection<T> or IList<T> interfaces instead? They exist for the very reason that you want to have an Add method on an IEnumerable<T>.
IEnumerable<T> is used to 'mark' a type as being...well, enumerable or just a sequence of items without necessarily making any guarantees of whether the real underlying object supports adding/removing of items. Also remember that these interfaces inherit from IEnumerable<T>, so you get all the extension methods that you get with IEnumerable<T> as well.
In .NET Core, there is a method Enumerable.Append that does exactly that.
The source code of the method is available on GitHub. The implementation (more sophisticated than the suggestions in other answers) is worth a look :).
A couple short, sweet extension methods on IEnumerable and IEnumerable<T> do it for me:
public static IEnumerable Append(this IEnumerable first, params object[] second)
{
return first.OfType<object>().Concat(second);
}
public static IEnumerable<T> Append<T>(this IEnumerable<T> first, params T[] second)
{
return first.Concat(second);
}
public static IEnumerable Prepend(this IEnumerable first, params object[] second)
{
return second.Concat(first.OfType<object>());
}
public static IEnumerable<T> Prepend<T>(this IEnumerable<T> first, params T[] second)
{
return second.Concat(first);
}
Elegant (well, except for the non-generic versions). Too bad these methods are not in the BCL.
No, the IEnumerable doesn't support adding items to it. The alternative solution is
var myList = new List<T>(items);
myList.Add(otherItem);
To add second message you need to -
IEnumerable<T> items = new T[]{new T("msg")};
items = items.Concat(new[] {new T("msg2")});
I just came here to say that, aside from the Enumerable.Concat extension method, there seems to be another method named Enumerable.Append in .NET Core 1.1.1. The latter allows you to append a single item to an existing sequence. So Aamol's answer can also be written as
IEnumerable<T> items = new T[]{new T("msg")};
items = items.Append(new T("msg2"));
Still, please note that this function will not change the input sequence; it just returns a wrapper that puts the given sequence and the appended item together.
Not only can you not add items like you state, but if you add an item to a List<T> (or pretty much any other non-read only collection) that you have an existing enumerator for, the enumerator is invalidated (throws InvalidOperationException from then on).
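The invalidation described above can be observed directly. In this sketch (InvalidationDemo and MoveNextThrowsAfterAdd are hypothetical names), the list is modified while an enumerator is positioned on it, and the next MoveNext() call throws:

```csharp
using System;
using System.Collections.Generic;

public static class InvalidationDemo
{
    // Returns true if MoveNext throws InvalidOperationException after the list was modified.
    public static bool MoveNextThrowsAfterAdd()
    {
        var list = new List<int> { 1, 2, 3 };
        using (IEnumerator<int> e = list.GetEnumerator())
        {
            e.MoveNext();   // positioned on the first element
            list.Add(4);    // changes the list behind the enumerator's back
            try
            {
                e.MoveNext();
                return false;
            }
            catch (InvalidOperationException)
            {
                return true;
            }
        }
    }
}
```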
If you are aggregating results from some type of data query, you can use the Concat extension method:
Edit: I originally used the Union extension in the example, which is not really correct. My application uses it extensively to make sure overlapping queries don't duplicate results.
IEnumerable<T> itemsA = ...;
IEnumerable<T> itemsB = ...;
IEnumerable<T> itemsC = ...;
return itemsA.Concat(itemsB).Concat(itemsC);
Others have already given great explanations of why you cannot (and should not!) be able to add items to an IEnumerable. I will only add that if you want to keep coding to an interface that represents a collection and want an Add method, you should code to ICollection<T> or IList<T>. As an added bonus, these interfaces inherit from IEnumerable.
You can do this:
//Create IEnumerable
IEnumerable<T> items = new T[]{new T("msg")};
//Convert to list.
List<T> list = items.ToList();
//Add new item to list.
list.Add(new T("msg2"));
//Assign the list back; List<T> converts implicitly to IEnumerable<T>
items = list;
The easiest way to do that is simply
IEnumerable<T> items = new T[]{new T("msg")};
List<string> itemsList = new List<string>();
itemsList.AddRange(items.Select(y => y.ToString()));
itemsList.Add("msg2");
Then you can return list as IEnumerable also because it implements IEnumerable interface
Instances implementing IEnumerable and IEnumerator (returned from IEnumerable) don't have any APIs that allow altering the collection; the interfaces provide read-only APIs.
There are two ways to actually alter the collection:
If the instance happens to be some collection with a write API (e.g. List), you can try casting to this type:
IList<string> list = enumerableInstance as IList<string>;
Create a list from the IEnumerable (e.g. via the LINQ extension method ToList()):
var list = enumerableInstance.ToList();
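The two options above can be combined into one helper. This is only a sketch, and AddHelper and AddItem are hypothetical names. It also checks IsReadOnly, because an array casts to IList<T> successfully but throws NotSupportedException on Add.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AddHelper
{
    // Adds to the existing collection when it is writable; otherwise adds to a new list copy.
    public static IList<T> AddItem<T>(IEnumerable<T> source, T item)
    {
        // Arrays implement IList<T> but report IsReadOnly == true through that interface.
        if (source is IList<T> list && !list.IsReadOnly)
        {
            list.Add(item);
            return list;
        }

        List<T> copy = source.ToList();
        copy.Add(item);
        return copy;
    }
}
```

The caller has to use the returned list, because in the fallback case the original sequence is left untouched.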
IEnumerable<T> items = Enumerable.Empty<T>();
List<T> someValues = new List<T>();
// ToList() returns a new list, so keep a reference to it instead of discarding it
List<T> list = items.ToList();
list.AddRange(someValues);
Sorry for reviving really old question but as it is listed among first google search results I assume that some people keep landing here.
Among a lot of answers, some of them really valuable and well explained, I would like to add a different point of view because, to me, the problem has not been well identified.
You are declaring a variable that stores data, and you need to be able to change it by adding items to it? Then you shouldn't declare it as IEnumerable.
As proposed by #NightOwl888
For this example, just declare IList<T> instead of IEnumerable<T> and back it with a List<T>, because Add on an array throws NotSupportedException: IList<T> items = new List<T>{new T("msg")}; items.Add(new T("msg2"));
Trying to bypass the declared interface's limitations only shows that you made the wrong choice.
Beyond this, all proposals that reimplement things which already exist in other implementations should be disregarded.
Classes and interfaces that let you add items already exist. Why always recreate things that are already done elsewhere?
This kind of consideration is the very purpose of abstracting a variable's capabilities behind interfaces.
TL;DR : IMO these are cleanest ways to do what you need :
// 1st choice : Changing declaration (backed by a List<T>; Add on an array throws NotSupportedException)
IList<T> variable = new List<T> { };
variable.Add(new T());
// 2nd choice : Changing instantiation, letting the framework take care of declaration
var variable = new List<T> { };
variable.Add(new T());
When you need to use variable as an IEnumerable, you'll be able to. When you need to use it as an array, you'll be able to call 'ToArray()'; it really always should be that simple. No extension method needed, casts only when really needed, ability to use LINQ on your variable, etc.
Stop doing weird and/or complex things because you only made a mistake when declaring/instantiating.
Maybe I'm too late but I hope it helps anyone in the future.
You can use the Insert method of List<T> to add an item at a specific index.
list.Insert(0, item);
Sure, you can (I am leaving your T-business aside):
public IEnumerable<string> tryAdd(IEnumerable<string> items)
{
List<string> list = items.ToList();
string obj = "";
list.Add(obj);
return list.Select(i => i);
}

ICollection - Get single value

What is the best way to get a value from a ICollection?
We know the Collection is empty apart from that.
You can use LINQ for this:
var foo = myICollection.OfType<YourType>().FirstOrDefault();
// or use a query
var bar = (from x in myICollection.OfType<YourType>() where x.SomeProperty == someValue select x)
.FirstOrDefault();
The simplest way to do this is:
foreach(object o in collection) {
return o;
}
But this isn't particularly efficient if it's actually a generic collection because IEnumerator<T> implements IDisposable, so the compiler has to put in a try/finally, with a Dispose() call in the finally block.
If it's a non-generic collection, or you know the generic collection implements nothing in its Dispose() method, then the following can be used:
IEnumerator en = collection.GetEnumerator();
en.MoveNext();
return en.Current;
If you know it may implement IList, you can do this:
IList iList = collection as IList;
if (iList != null) {
// Implements IList, so can use indexer
return iList[0];
}
// Use the slower way
foreach (object o in collection) {
return o;
}
Likewise, if it's likely it'll be of a certain type of your own definition that has some kind of indexed access, you can use the same technique.
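collection.Cast<object>().ToArray()[i]
This way is slow, but very simple to use. (The non-generic ICollection has no ToArray of its own, hence the Cast<object>() from System.Linq.)
Without generics, because ICollection inherits from IEnumerable, you can do as in example 1. With generics you simply need to do as in example 2: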
List<string> l = new List<string>();
l.Add("astring");
ICollection col1 = (ICollection)l;
ICollection<string> col2 = (ICollection<string>)l;
//example 1
IEnumerator e1 = col1.GetEnumerator();
if (e1.MoveNext())
Console.WriteLine(e1.Current);
//example 2
if (col2.Count != 0)
Console.WriteLine(col2.Single());
If you know your collection has only one item, should only ever have one item, you can use the Linq extension method Single().
This converts a ICollection<T> into a T object containing the single item of that collection. If the length of the collection is 0, or more than one, this will throw an InvalidOperationException.
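A short sketch of how Single() behaves for the three cases mentioned above. SingleDemo and Describe are hypothetical names, and the fallback string is only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SingleDemo
{
    // Returns the only element, or a fallback text when Single() throws.
    public static string Describe(ICollection<string> col)
    {
        try
        {
            return col.Single();
        }
        catch (InvalidOperationException)
        {
            // Thrown both for an empty collection and for more than one element.
            return "not exactly one item";
        }
    }
}
```

If an empty collection is a legitimate case, SingleOrDefault() returns the default value instead of throwing, but it still throws when there is more than one element.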
