LINQ's ForEach on HashSet? - c#

I am curious as to what restrictions necessitated the design decision not to let HashSets use LINQ's ForEach query.
What's really going on differently behind the scenes for these two implementations:
var myHashSet = new HashSet<T>();
foreach (var item in myHashSet) { DoStuff(item); }
vs
var myHashSet = new HashSet<T>();
myHashSet.ForEach(item => DoStuff(item));
I'm (pretty) sure that this is just because HashSet does not implement IEnumerable -- but what is a normal ForEach loop doing differently that makes it more supported by a HashSet?
Thanks

LINQ doesn't have ForEach. The instance ForEach method you are thinking of belongs to the List<T> class (arrays also have a static Array.ForEach).
It's also important to note that HashSet does implement IEnumerable<T>.
Remember, LINQ stands for Language INtegrated Query. It is meant to query collections of data. ForEach has nothing to do with querying. It simply loops over the data. Therefore it really doesn't belong in LINQ.

LINQ is meant to query data. I'm guessing it avoided ForEach() because there's a chance it could mutate data in a way that affects how the data can be queried (e.g. if you changed a field that affected the hash code or equality).
Perhaps you are thinking of List<T>, which does have a ForEach()?
It's easy enough to write one yourself, of course, but it should be used with caution because of the concerns above:
public static class EnumerableExtensions
{
    public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (action == null) throw new ArgumentNullException("action");
        foreach (var item in source)
        {
            action(item);
        }
    }
}
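As an illustration, here is how an extension like the one above might be used on a HashSet<T>. This is only a sketch: `Demo` and `SumWithForEach` are hypothetical names invented for the example, and the extension is repeated so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableExtensions
{
    // Same shape as the extension above: run an action on every element.
    public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (action == null) throw new ArgumentNullException(nameof(action));
        foreach (var item in source)
        {
            action(item);
        }
    }
}

public static class Demo
{
    // Hypothetical helper: sums a set through the extension method.
    public static int SumWithForEach(HashSet<int> set)
    {
        int total = 0;
        set.ForEach(x => total += x);
        return total;
    }

    public static void Main()
    {
        var set = new HashSet<int> { 1, 2, 3 };

        // HashSet<T> implements IEnumerable<T>, so a plain foreach works as well.
        int viaLoop = 0;
        foreach (var x in set) viaLoop += x;

        Console.WriteLine(SumWithForEach(set) == viaLoop); // True
    }
}
```

Both forms walk the set through its enumerator; the extension only hides the loop behind a delegate call.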

var myHashSet = new HashSet<T>();
myHashSet.ToList().ForEach(x => x.Stuff());

The first uses HashSet's GetEnumerator method.
The second uses the ForEach method.
Maybe the second uses GetEnumerator behind the scenes, but I'm not sure.

Related

Is the IEnumerable<T> function with a yield more efficient than the List<T> function?

I am coding a C# forms application, and would like to know if the following two functions achieve the same result:
public List<object> Method1(int parentId)
{
    List<object> allChildren = new List<object>();
    foreach (var item in list.Where(c => c.parentHtmlNodeForeignKey == parentId))
    {
        allChildren.Add(item);
        allChildren.AddRange(Method1(item.id));
    }
    return allChildren;
}
public IEnumerable<object> Method2(int parentId)
{
    foreach (var item in list.Where(c => c.parentHtmlNodeForeignKey == parentId))
    {
        yield return item;
        foreach (var itemy in Method2(item.id))
        {
            yield return itemy;
        }
    }
}
Am I correct in saying that the Method1 function is more efficient than the Method2?
Also, can either of the above functions be coded to be more efficient?
EDIT
I am using the function to return some objects that are then displayed in a ListView. I am then looping through these same objects to check if a string occurs.
Thanks.
This depends heavily on what you want to do. For example, if you call FirstOrDefault(p => ....), the yield method can be faster because it does not have to store everything in a list first; if the first element is the right one, the list method has done all that work for nothing. (Of course the yield method has overhead too, but as I said, it depends.)
If you want to iterate over the data again and again, then you should go with the list.
It depends on lots of things.
Here are some reasons to use IEnumerable<T> over List<T>:
When you are iterating a part of a collection (e.g. using FirstOrDefault, Any, Take etc.).
When you have a large (or even infinite) collection that you can't ToList() (e.g. the Fibonacci series).
When you shouldn't use IEnumerable<T> over List<T>:
When you are enumerating a DB query multiple times with different conditions (You may want the results in memory).
When you want to iterate the whole collection more than once - There is no need to create iterators each time.
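The difference between the two approaches can be made visible by counting how many elements are actually produced. The following is a minimal sketch, not the asker's code: `LazyDemo`, `Lazy`, `Eager` and the `Produced` counter are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDemo
{
    // Counts how many elements were actually produced.
    public static int Produced;

    // Iterator version: elements are produced only when the caller asks for them.
    public static IEnumerable<int> Lazy(int count)
    {
        for (int i = 0; i < count; i++)
        {
            Produced++;
            yield return i;
        }
    }

    // List version: every element is produced before the method returns.
    public static List<int> Eager(int count)
    {
        var result = new List<int>();
        for (int i = 0; i < count; i++)
        {
            Produced++;
            result.Add(i);
        }
        return result;
    }

    public static void Main()
    {
        Produced = 0;
        Lazy(1000).FirstOrDefault(x => x == 2);
        Console.WriteLine(Produced); // 3

        Produced = 0;
        Eager(1000).FirstOrDefault(x => x == 2);
        Console.WriteLine(Produced); // 1000
    }
}
```

The flip side is that enumerating the result of `Lazy` a second time runs the loop again, which is why a list tends to be the better choice when the whole collection is walked more than once.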

Which is more efficient in this case? LINQ Query or FOREACH loop?

In my project, I implemented a service class which has a function named GetList(), as follows:
IList<SUB_HEAD> GetList(string u)
{
    var collection = (from s in context.DB.SUB_HEAD
                      where s.head_code.Equals(u)
                      select s);
    return collection.ToList();
}
which can also be implemented as
ArrayList unitlist = new ArrayList();
ObjectSet<SUB_HEAD> List = subheadService.GetAll();
foreach (SUB_HEAD unit in List)
{
    unitlist.Add(unit.sub_head_code);
}
The purpose of this is to populate a dropdown menu.
My question is: which of the above methods is more efficient in terms of processing? My project has a lot of places where I have to use a dropdown menu.
Please, just use the LINQ version. You can perform optimizations later if you profile and determine this is too slow (by the way, it won't be). Also, you can use the functional-style LINQ to make a single expression that I think reads better.
IList<SUB_HEAD> GetList(string u)
{
    return context.DB.SUB_HEAD.Where(s => s.head_code == u).ToList();
}
The ToList() method is going to do exactly the same thing as you're doing manually. The implementation in the .NET framework looks something like this:
public static class Enumerable
{
    public static List<T> ToList<T>(this IEnumerable<T> source)
    {
        var list = new List<T>();
        foreach (var item in source)
        {
            list.Add(item);
        }
        return list;
    }
}
If you can express these 4 lines of code with the characters "ToList()" then you should do so. Code duplication is bad, even when it's for something this simple.

How to collect a single property in a list of objects?

Is it possible to create an extension method to return a single property or field in a list of objects?
Currently I have a lot of functions like the following.
public static List<int> GetSpeeds(this List<ObjectMotion> motions) {
    List<int> speeds = new List<int>();
    foreach (ObjectMotion motion in motions) {
        speeds.Add(motion.Speed);
    }
    return speeds;
}
This is "hard coded" and only serves a single property in a single object type. It's tedious, and I'm sure there's a way using LINQ / Reflection to create an extension method that can do this in a generic and reusable way. Something like this:
public static List<TProp> GetProperties<T, TProp>(this List<T> objects, Property prop){
    List<TProp> props = new List<TProp>();
    foreach (T obj in objects) {
        props.Add(obj.prop??);
    }
    return props;
}
Apart from the easiest method using LINQ, I'm also looking for the fastest method. Is it possible to use code generation (and Lambda expression trees) to create such a method at runtime? I'm sure that would be faster than using Reflection.
You could do:
public static List<TProp> GetProperties<T, TProp>(this IEnumerable<T> seq, Func<T, TProp> selector)
{
    return seq.Select(selector).ToList();
}
and use it like:
List<int> speeds = motions.GetProperties(m => m.Speed);
It's questionable whether this method is better than just using Select and ToList directly, though.
It is, no reflection needed:
List<int> values = motions.Select(m=>m.Speed).ToList();
A for loop would be the fastest, I think, followed closely by LINQ (minimal overhead if you don't use closures). I can't imagine any other mechanism being better than that.
You could replace the List<int> with an int[] or initialize the list with a known capacity. That would probably do more to speed up your code than anything else (though still not much).
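The two suggestions in the last answer (an int[] result, or a list created with a known capacity) could look roughly like this. It is a sketch under assumptions: `ObjectMotion` here is a minimal stand-in for the asker's type, and `GetSpeedsArray` and `GetSpeedsList` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-in for the asker's type (assumed to expose an int Speed).
public class ObjectMotion
{
    public int Speed { get; set; }
}

public static class MotionExtensions
{
    // Array variant: the result size is known up front, so nothing ever resizes.
    public static int[] GetSpeedsArray(this List<ObjectMotion> motions)
    {
        var speeds = new int[motions.Count];
        for (int i = 0; i < motions.Count; i++)
        {
            speeds[i] = motions[i].Speed;
        }
        return speeds;
    }

    // List variant: passing the capacity avoids repeated growth of the backing array.
    public static List<int> GetSpeedsList(this List<ObjectMotion> motions)
    {
        var speeds = new List<int>(motions.Count);
        for (int i = 0; i < motions.Count; i++)
        {
            speeds.Add(motions[i].Speed);
        }
        return speeds;
    }
}

public static class MotionDemo
{
    public static void Main()
    {
        var motions = new List<ObjectMotion>
        {
            new ObjectMotion { Speed = 10 },
            new ObjectMotion { Speed = 20 },
            new ObjectMotion { Speed = 30 },
        };
        Console.WriteLine(string.Join(",", motions.GetSpeedsArray())); // 10,20,30
        Console.WriteLine(string.Join(",", motions.GetSpeedsList()));  // 10,20,30
    }
}
```

Whether either variant is measurably faster than Select(...).ToList() would have to be confirmed by profiling on the real data.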

Duplicate IEnumerable, List and Cast

After reading this very interesting thread on duplicate removal, I ended up with this:
public static IEnumerable<T> deDuplicateCollection<T>(IEnumerable<T> input)
{
    var hs = new HashSet<T>();
    foreach (T t in input)
        if (hs.Add(t))
            yield return t;
}
By the way, as I'm brand new to C# and coming from Python, I'm a bit lost between casting and this kind of thing. I was able to compile and build with:
foreach (KeyValuePair<long, List<string>> kvp in d)
{
    d[kvp.Key] = (List<string>) deDuplicateCollection(kvp.Value);
}
But I must have missed something here, as I get a "System.InvalidCastException" at runtime. Could you point out what I should know about casting and where I'm going wrong? Thank you in advance.
First, about the usage of the method.
Drop the cast and invoke ToList() on the result of the method. The result of the method is an IEnumerable<string>, not a List<string>. The fact that the source was originally a List<string> is irrelevant: you don't return the list, you yield return a sequence.
d[kvp.Key] = deDuplicateCollection(kvp.Value).ToList();
Second, your deDuplicateCollection method is redundant, Distinct() already exists in the library and performs the same function.
d[kvp.Key] = kvp.Value.Distinct().ToList();
Just be sure you have a using System.Linq; in the directives so you can use these Distinct() and ToList() extension methods.
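To see why the cast fails while ToList() works, a small check along these lines may help. It is a sketch only: `CastDemo` and `DeDuplicate` are hypothetical names, with `DeDuplicate` mirroring the asker's iterator method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CastDemo
{
    // Mirrors the asker's method: an iterator that yields each distinct item once.
    public static IEnumerable<T> DeDuplicate<T>(IEnumerable<T> input)
    {
        var seen = new HashSet<T>();
        foreach (T t in input)
            if (seen.Add(t))
                yield return t;
    }

    public static void Main()
    {
        var source = new List<string> { "a", "b", "a" };
        IEnumerable<string> result = DeDuplicate(source);

        // The object behind 'result' is a compiler-generated iterator, not a list,
        // so a (List<string>) cast on it would throw InvalidCastException.
        Console.WriteLine(result is List<string>); // False

        // ToList() copies the sequence into a real List<string>.
        List<string> copy = result.ToList();
        Console.WriteLine(string.Join(",", copy)); // a,b

        // The built-in Distinct() removes the same duplicate.
        Console.WriteLine(source.Distinct().Count()); // 2
    }
}
```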
Finally, you'll notice that with this change alone you run into a new exception when trying to change the dictionary inside the loop. You cannot update the collection in a foreach over it. The simplest way to do what you want is to omit the explicit loop entirely. Consider
d = d.ToDictionary(kvp => kvp.Key, kvp => kvp.Value.Distinct().ToList());
This uses another Linq extension method, ToDictionary(). Note: this creates a new dictionary in memory and updates d to reference it. If you need to preserve the original dictionary as referenced by d, then you would need to approach this another way. A simple option here is to build a dictionary to shadow d, and then update d with it.
var shadow = new Dictionary<long, List<string>>();
foreach (var kvp in d)
{
    shadow[kvp.Key] = kvp.Value.Distinct().ToList();
}
foreach (var kvp in shadow)
{
    d[kvp.Key] = kvp.Value;
}
These two loops are safe, but you see you need to loop twice to avoid the problem of updating the original collection while enumerating over it while also preserving the original collection in memory.
d[kvp.Key] = kvp.Value.Distinct().ToList();
There is already a Distinct extension method to remove duplicates!

Pushing Items into stack with LINQ

How can I programmatically push an array of strings onto a generic stack?
string array
string[] array=new string[]{"Liza","Ana","Sandra","Diya"};
Stack Setup
public class stack<T>
{
    private int index;
    List<T> list;
    public stack()
    {
        list = new List<T>();
        index = -1;
    }
    public void Push(T obj)
    {
        list.Add(obj);
        index++;
    }
    ...........
}
What change do I need to make here?
stack<string> slist = new stack<string>();
var v = from vals in array select (p => slist.Push(p));
Error Report :
The type of the expression in the select clause is incorrect.
LINQ is a query language/framework. What you want to perform here is a modification to a collection object rather than a query (selection) - this is certainly not what LINQ is designed for (or even capable of).
What you might like to do, however, is define an extension method for the Stack<T> class. Note that it also makes sense here to use the BCL Stack<T> type, which is exactly what you need, instead of reinventing the wheel using List<T>.
public static void PushRange<T>(this Stack<T> source, IEnumerable<T> collection)
{
    foreach (var item in collection)
        source.Push(item);
}
Which would then allow you to do the following:
myStack.PushRange(myCollection);
And if you're not already convinced, another philosophical reason: LINQ was created to bring functional paradigms to C#/.NET, and at the core of functional programming is side-effect free code. Combining LINQ with state-modifying code would thus be quite inconsistent.
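The PushRange extension from this answer can be tried out with the asker's array as follows. This is a sketch: `StackExtensions` and `StackDemo` are hypothetical class names, and the last lines use the Stack<T> constructor that accepts a sequence as an alternative.

```csharp
using System;
using System.Collections.Generic;

public static class StackExtensions
{
    // Pushes every item of the sequence, in order, onto the stack.
    public static void PushRange<T>(this Stack<T> source, IEnumerable<T> collection)
    {
        foreach (var item in collection)
            source.Push(item);
    }
}

public static class StackDemo
{
    public static void Main()
    {
        string[] array = { "Liza", "Ana", "Sandra", "Diya" };

        var stack = new Stack<string>();
        stack.PushRange(array);
        Console.WriteLine(stack.Count);  // 4
        Console.WriteLine(stack.Peek()); // Diya (the last item pushed is on top)

        // Stack<T> also has a constructor that takes a sequence and pushes it in order.
        var direct = new Stack<string>(array);
        Console.WriteLine(direct.Peek()); // Diya
    }
}
```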
The first issue is that your Push returns void. Select expects an expression that produces a value.
You are just doing a loop and don't need to use LINQ.
Since your stack internally stores a list, you can create that list by passing it an array.
So in your case
List<string> myList = new List<string>(array);
creates the list.
Change
public void Push(T obj)
to
public T Push(T obj)
and ignore the return values.
Disclaimer: I would not recommend mutation like this.
Try this
string[] arr = new string[]{"a","f"};
var stack = new Stack<string>();
arr.ToList().ForEach(stack.Push);
While this is "cool", it isn't any better than a for loop.
Push needs a return type for you to be able to use it in a select clause. As it is, it returns void. Your example is, I think, a horrible abuse of LINQ. Even if it worked, you'd be using a side-effect of the function in the select clause to accomplish something totally unrelated to the task that the select is intended for. If there was a "ForEach" extension, then it would be reasonable to use LINQ here, but I'd avoid it and stick with a foreach loop.
foreach (var val in array)
{
    slist.Push(val);
}
This is much clearer in intent and doesn't leave one scratching their head over what you're trying to accomplish.
