ICollection - check if a collection contains an object - c#

Knowing that the non-generic ICollection doesn't offer a Contains method, what's the best way to check if a given object already is in a collection?
If I had two ICollections: A and B and wanted to check if B has all elements of A, what would be the best way to accomplish that? My first thought is adding all elements of A to a HashSet and then checking if all B's elements are in the set using Contains.

If I had two ICollections A and B and wanted to check if B has all elements of A, what would be the best way to accomplish that?
Let me rephrase your question in the language of sets.
If I had two sets A and B and wanted to check if A is a subset of B, what would be the best way to accomplish that?
Now it becomes easy to see the answer:
https://msdn.microsoft.com/en-us/library/bb358446%28v=vs.110%29.aspx?f=255&MSPPError=-2147217396
Construct a HashSet<T> from A and then use the IsSubsetOf method to see if A is a subset of B.
I note that if these are the sorts of operations you must perform frequently, then you should keep your data in HashSet<T> collections to begin with. The IsSubsetOf operation is possibly more efficient if both collections are hash sets.
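As a minimal sketch of the suggestion above for non-generic collections, assuming the elements have sensible Equals and GetHashCode implementations. The names CollectionSetChecks and ContainsAll are hypothetical, not part of any library:

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class CollectionSetChecks
{
    // Returns true when every element of a also occurs in b,
    // i.e. when a (viewed as a set) is a subset of b.
    public static bool ContainsAll(ICollection b, ICollection a)
    {
        // The non-generic ICollection is only an IEnumerable,
        // so Cast<object>() is used to view it as IEnumerable<object>.
        var setOfA = new HashSet<object>(a.Cast<object>());

        // IsSubsetOf accepts any IEnumerable<object>; duplicates in b are ignored.
        return setOfA.IsSubsetOf(b.Cast<object>());
    }
}
```

If both inputs are already HashSet<T> instances, calling IsSubsetOf on them directly avoids building the temporary set.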

A and B and wanted to check if B has all elements of A
I think you have it backwards. Add the elements of B to the HashSet.
HashSet.Contains is O(1)
Overall it will be O(n + m)
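I'm going to assume the elements are strings:
// Requires using System.Collections.Generic and System.Linq.
// Cast<string>() views the non-generic ICollection as an IEnumerable<string>.
HashSet<string> hashSetB = new HashSet<string>(iCollectionB.Cast<string>());
foreach (string s in iCollectionA)
{
    if (hashSetB.Contains(s))
    {
        // s is in B
    }
    else
    {
        // s is not in B, so B does not contain every element of A
    }
}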

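Boolean ICollectionContains(ICollection collection, Object item)
{
    foreach (Object o in collection)
    {
        // Object.Equals handles nulls and uses the element's own Equals,
        // so boxed values and strings are compared by value.
        if (Object.Equals(o, item))
            return true;
    }
    return false;
}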
Or in extension form:
public static class CollectionExtensions
{
    public static Boolean Contains(this ICollection collection, Object item)
    {
        foreach (Object o in collection)
        {
            // Object.Equals handles nulls and uses the element's own Equals.
            if (Object.Equals(o, item))
                return true;
        }
        return false;
    }
}
With usage:
ICollection turboEncabulators = GetSomeTrunnions();
if (turboEncabulators.Contains(me))
    Environment.FailFast("How did you find me!");

Related

How do I verify a collection of values is unique (contains no duplicates) in C#

Surely there is an easy way to verify a collection of values has no duplicates [using the default Comparison of the collection's Type] in C#/.NET ? Doesn't have to be directly built in but should be short and efficient.
I've looked a lot but I keep hitting examples of using collection.Count() == collection.Distinct().Count() which for me is inefficient. I'm not interested in the result and want to bail out as soon as I detect a duplicate, should that be the case.
(I'd love to delete this question and/or its answer if someone can point out the duplicates)
Okay, if you just want to get out as soon as the duplicate is found, it's simple:
// TODO: add an overload taking an IEqualityComparer<T>
public static bool AllUnique<T>(this IEnumerable<T> source)
{
    if (source == null)
    {
        throw new ArgumentNullException("source");
    }
    var distinctItems = new HashSet<T>();
    foreach (var item in source)
    {
        // Add returns false when the item is already in the set.
        if (!distinctItems.Add(item))
        {
            return false;
        }
    }
    return true;
}
... or use All, as you've already shown. I'd argue that this is slightly simpler to understand in this case... or if you do want to use All, I'd at least separate the creation of the set from the method group conversion, for clarity:
public static bool IsUnique<T>(this IEnumerable<T> source)
{
    // TODO: validation
    var distinctItems = new HashSet<T>();
    // Add will return false if the element already exists. If
    // every element is actually added, then they must all be unique.
    return source.All(distinctItems.Add);
}
Doing it inline, you can replace:
collection.Count() == collection.Distinct().Count()
with
collection.All( new HashSet<T>().Add );
(where T is the type of your collection's elements)
Or you can extract the above to a helper extension method[1] so you can say:
collection.IsUnique()
[1]
static class EnumerableUniquenessExtensions
{
    public static bool IsUnique<T>(this IEnumerable<T> that)
    {
        return that.All( new HashSet<T>().Add );
    }
}
(and as Jon has pointed out in his answer, one really should separate and comment the two lines as such 'cuteness' is generally Not A Good Idea)
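The TODO in the first answer mentions an overload taking an IEqualityComparer<T>. A minimal sketch of what that overload might look like follows; the class name UniquenessExtensions is hypothetical, and passing null for the comparer is assumed to mean "use the default comparer", which is how the HashSet<T> constructor treats it:

```csharp
using System;
using System.Collections.Generic;

public static class UniquenessExtensions
{
    // Returns true when no two items are equal according to the given comparer.
    // A null comparer falls back to EqualityComparer<T>.Default.
    public static bool AllUnique<T>(this IEnumerable<T> source, IEqualityComparer<T> comparer)
    {
        if (source == null)
        {
            throw new ArgumentNullException(nameof(source));
        }
        var seen = new HashSet<T>(comparer);
        foreach (var item in source)
        {
            // Add returns false when an equal item is already in the set,
            // so the loop exits at the first duplicate.
            if (!seen.Add(item))
            {
                return false;
            }
        }
        return true;
    }
}
```

For example, new[] { "a", "A" }.AllUnique(StringComparer.OrdinalIgnoreCase) treats the two strings as duplicates, while StringComparer.Ordinal does not.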

List contains in List check

I have an IEnumerable<Object> a with 6 items in chronological order in it.
I want to test whether IEnumerable<Object> b, with 3 items in chronological order, is contained in a in the same order.
IEnumerable<Object> a item values: a,b,c,d,f,g
IEnumerable<Object> b item values: b,d,f
Is it possible to be done with LINQ ?
You can use the following:
bool AContainsEverythingInBInTheSameOrder =
    a.Intersect(b).SequenceEqual(b);
a.Intersect(b) returns everything that is in both a and b, in the same order in which it appears in a.
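To make the behaviour concrete, here is a small sketch that wraps the expression in a helper. The names OrderedContainment and ContainsInOrder are hypothetical. One caveat worth keeping in mind: Intersect yields each distinct value only once, so this check assumes b contains no repeated values.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class OrderedContainment
{
    // True when the elements of b all occur in a, in the same relative order.
    // Assumes b has no repeated values, because Intersect returns distinct elements.
    public static bool ContainsInOrder<T>(IEnumerable<T> a, IEnumerable<T> b)
    {
        return a.Intersect(b).SequenceEqual(b);
    }
}
```

With the values from the question (a, b, c, d, f, g and b, d, f) the helper returns true. It returns false for d, b (wrong order) and also for b, b, d, even when a itself contains b twice.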
The one liner approach of Rawling and Tim is very nice, but it has one little gotcha: b is iterated twice.
If that is a problem for you, you could use an iterator based approach. This can be created as an extension method:
public static bool IsContainedWithinInOrder<T>(this IEnumerable<T> values,
                                               IEnumerable<T> reference)
{
    using(var iterator = reference.GetEnumerator())
    {
        foreach(var item in values)
        {
            do
            {
                if(!iterator.MoveNext())
                    return false;
            } while(!Equals(iterator.Current, item));
        }
        return true;
    }
}
This would iterate both sequences only once and overall is more lightweight. You would call it like this:
b.IsContainedWithinInOrder(a);
Please forgive the name of the method...
I assume that you have two lists and you want to check whether the items of the second list appear in the same order as the same items in the first list.
Perhaps:
var allSameOrder = list1.Intersect(list2).SequenceEqual(list2);

Does IEnumerable always imply a collection?

Just a quick question regarding IEnumerable:
Does IEnumerable always imply a collection? Or is it legitimate/viable/okay/whatever to use on a single object?
The IEnumerable and IEnumerable<T> interfaces suggest a sequence of some kind, but that sequence doesn't need to be a concrete collection.
For example, where's the underlying concrete collection in this case?
foreach (int i in new EndlessRandomSequence().Take(5))
{
    Console.WriteLine(i);
}
// ...
public class EndlessRandomSequence : IEnumerable<int>
{
    public IEnumerator<int> GetEnumerator()
    {
        var rng = new Random();
        while (true) yield return rng.Next();
    }
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
IEnumerable is always, and necessarily, implemented by a single object: that object is the holder or producer of zero or more other objects, which need not have any relation to IEnumerable themselves.
It's usual, but not mandatory, that IEnumerable represents a collection.
Enumerables can be collections, as well as generators, queries, and even computations.
Generator:
IEnumerable<int> Generate(
    int initial,
    Func<int, bool> condition,
    Func<int, int> iterator)
{
    var i = initial;
    while (true)
    {
        yield return i;
        i = iterator(i);
        if (!condition(i))
        {
            yield break;
        }
    }
}
Query:
IEnumerable<Process> GetProcessesWhereNameContains(string text)
{
    // Could be web-service or database call too
    var processes = System.Diagnostics.Process.GetProcesses();
    foreach (var process in processes)
    {
        if (process.ProcessName.Contains(text))
        {
            yield return process;
        }
    }
}
Computation:
IEnumerable<double> Average(IEnumerable<double> values)
{
    var sum = 0.0;
    var count = 0;
    foreach (var value in values)
    {
        sum += value;
        // Yields the running average after each value.
        yield return sum/++count;
    }
}
LINQ is itself a series of operators that produce objects that implement IEnumerable<T> that don't have any underlying collections.
Good question, BTW!
NB: Any reference to IEnumerable also applies to IEnumerable<T> as the latter inherits the former.
Yes, IEnumerable implies a collection, or possible collection, of items.
The name is derived from enumerate, which means to:
Mention (a number of things) one by one.
Establish the number of.
According to the docs, it exposes the enumerator over a collection.
You can certainly use it on a single object, but this object will then just be exposed as an enumeration containing a single object, i.e. you could have an IEnumerable<int> with a single integer:
IEnumerable<int> items = new[] { 42 };
IEnumerable represents a collection that can be enumerated, not a single item. Look at MSDN; the interface exposes GetEnumerator(), which
...[r]eturns an enumerator that iterates through a collection.
Yes, IEnumerable always implies a collection, that is what enumerate means.
What is your use case for a single object?
I don't see a problem with using it on a single object, but why do you want to do this?
I'm not sure whether you mean a "collection" or a .NET "ICollection" but since other people have only mentioned the former I will mention the latter.
http://msdn.microsoft.com/en-us/library/92t2ye13.aspx
By that definition, all ICollections are IEnumerable, but not the other way around.
But most data structures (even Array) implement both interfaces.
Going on this train of thought: you could have a car depot (a single object) that does not expose an internal data structure, and put IEnumerable on it. I suppose.

Best way to compare two Dictionary<T> for equality

Is this the best way to create a comparer for the equality of two dictionaries? This needs to be exact. Note that Entity.Columns is a dictionary of KeyValuePair(string, object) :
public class EntityColumnCompare : IEqualityComparer<Entity>
{
    public bool Equals(Entity a, Entity b)
    {
        var aCol = a.Columns.OrderBy(kvp => kvp.Key);
        var bCol = b.Columns.OrderBy(kvp => kvp.Key);
        return aCol.SequenceEqual(bCol);
    }
    public int GetHashCode(Entity obj)
    {
        return obj.Columns.GetHashCode();
    }
}
Also not too sure about the GetHashCode implementation.
Thanks!
Here's what I would do:
public bool Equals(Entity a, Entity b)
{
    if (a.Columns.Count != b.Columns.Count)
        return false; // Different number of items
    foreach(var kvp in a.Columns)
    {
        object bValue;
        if (!b.Columns.TryGetValue(kvp.Key, out bValue))
            return false; // key missing in b
        if (!Equals(kvp.Value, bValue))
            return false; // value is different
    }
    return true;
}
That way you don't need to order the entries (which is an O(n log n) operation): you only need to enumerate the entries in the first dictionary (O(n)) and try to retrieve values by key in the second dictionary (O(1) per lookup), so the overall complexity is O(n).
Also, note that your GetHashCode method is incorrect: in most cases it will return different values for different dictionary instances, even if they have the same content. And if the hashcode is different, Equals will never be called... You have several options to implement it correctly, none of them ideal:
build the hashcode from the content of the dictionary: would be the best option, but it's slow, and GetHashCode needs to be fast
always return the same value, that way Equals will always be called: very bad if you want to use this comparer in a hashtable/dictionary/hashset, because all instances will fall in the same bucket, resulting in O(n) access instead of O(1)
return the Count of the dictionary (as suggested by digEmAll): it won't give a great distribution, but still better than always returning the same value, and it satisfies the constraint for GetHashCode (i.e. objects that are considered equal should have the same hashcode; two "equal" dictionaries have the same number of items, so it works)
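As an illustration of the first option (building the hash code from the dictionary's content), here is a minimal sketch. The names DictionaryHashing and ContentHash are hypothetical, the dictionary is assumed to map string to object as in the question, and the multiplier 397 is just a commonly used odd constant, not something the answer prescribes:

```csharp
using System.Collections.Generic;

public static class DictionaryHashing
{
    // Order-independent hash of a dictionary's content. XOR does not depend on
    // enumeration order, so dictionaries with equal entries get equal hash codes.
    // The cost is O(n) per call, which is the drawback mentioned above.
    public static int ContentHash(IDictionary<string, object> columns)
    {
        int hash = columns.Count;
        foreach (var kvp in columns)
        {
            int valueHash = kvp.Value == null ? 0 : kvp.Value.GetHashCode();
            // Mix key and value into one per-entry hash, then fold it into the total.
            int entryHash = unchecked(kvp.Key.GetHashCode() * 397) ^ valueHash;
            hash ^= entryHash;
        }
        return hash;
    }
}
```

This only satisfies the GetHashCode contract if the values themselves implement GetHashCode consistently with Equals.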
Something like this comes to mind, but there might be something more efficient:
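public static bool Equals<TKey, TValue>(IDictionary<TKey, TValue> x,
                                        IDictionary<TKey, TValue> y)
{
    // Same number of entries, every key of x present in y, and equal values per key.
    return x.Count == y.Count &&
           x.Keys.Intersect(y.Keys).Count() == x.Keys.Count &&
           x.Keys.All(key => Object.Equals(x[key], y[key]));
}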
It seems good to me, perhaps not the fastest but working.
You just need to change the GetHashCode implementation that is wrong.
For example you could return obj.Columns.Count.GetHashCode()

How to validate if a collection contains all unique objects

I have a C# collection of objects that do not implement IEquatable or IComparable. I want to check if the collection contains duplicate objects. I.e. I want to know if Object.ReferenceEquals(x, y) is false for any x and y in my list.
How would I do that efficiently?
It would be nice with both a C# and a LINQ method.
Non-LINQ, when your collection implements ICollection<T> or ICollection:
bool allItemsUnique =
    new HashSet<YourType>(yourCollection).Count == yourCollection.Count;
Non-LINQ, when your collection doesn't implement ICollection<T> or ICollection. (This version has slightly better theoretical performance than the first because it will break out early as soon as a duplicate is found.)
bool allItemsUnique = true;
var tempSet = new HashSet<YourType>();
foreach (YourType obj in yourCollection)
{
    if (!tempSet.Add(obj))
    {
        allItemsUnique = false;
        break;
    }
}
LINQ. (This version's best case performance -- when your collection implements ICollection<T> or ICollection -- will be roughly the same as the first non-LINQ solution. If your collection doesn't implement ICollection<T> or ICollection then the LINQ version will be less efficient.)
bool allItemsUnique =
    yourCollection.Distinct().Count() == yourCollection.Count();
I would suggest you to use
collection.GroupBy(x=>x).Any(x=>x.Count() != 1)
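The benefit is that Any stops at the first group that holds more than one element; the expression is true when the collection contains a duplicate. Note that GroupBy still has to read the whole collection to build its groups before Any sees the first one.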
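Since the question asks specifically about Object.ReferenceEquals, a sketch that makes reference comparison explicit and stops at the first duplicate could look like this. It assumes .NET 5 or later, where System.Collections.Generic.ReferenceEqualityComparer is available; the names ReferenceUniqueness and AllReferencesUnique are hypothetical:

```csharp
using System.Collections.Generic;

public static class ReferenceUniqueness
{
    // Returns true when no object instance occurs twice in the sequence.
    // Assumes .NET 5+: ReferenceEqualityComparer compares by reference identity,
    // regardless of any Equals override on the element type.
    public static bool AllReferencesUnique<T>(IEnumerable<T> items) where T : class
    {
        var seen = new HashSet<object>(ReferenceEqualityComparer.Instance);
        foreach (T item in items)
        {
            // Add returns false when this exact instance was already seen.
            if (!seen.Add(item))
            {
                return false;
            }
        }
        return true;
    }
}
```

For types that do not override Equals, a plain HashSet<T> as in the answers above behaves the same way, because the default Object.Equals is already a reference comparison.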
