How can I make ComponentTraversal.GetDescendants() better using LINQ?
Question
public static class ComponentTraversal
{
    public static IEnumerable<Component> GetDescendants(this Composite composite)
    {
        //How can I do this better using LINQ?
        var descendants = new List<Component>();
        foreach(var child in composite.Children)
        {
            descendants.Add(child);
            if(child is Composite)
            {
                descendants.AddRange((child as Composite).GetDescendants());
            }
        }
        return descendants;
    }
}
public class Component
{
    public string Name { get; set; }
}
public class Composite: Component
{
    public IEnumerable<Component> Children { get; set; }
}
public class Leaf: Component
{
    public object Value { get; set; }
}
Answer
I edited Chris's answer to provide a generic extension method that I've added to my Common library. I can see this being helpful for other people as well so here it is:
public static IEnumerable<T> GetDescendants<T>(this T component, Func<T,bool> isComposite, Func<T,IEnumerable<T>> getCompositeChildren)
{
var children = getCompositeChildren(component);
return children
.Where(isComposite)
.SelectMany(x => x.GetDescendants(isComposite, getCompositeChildren))
.Concat(children);
}
Thanks Chris!
Also, please look at LukeH's answer (http://blogs.msdn.com/b/wesdyer/archive/2007/03/23/all-about-iterators.aspx). His answer provides a better way to approach this problem in general, but I did not select it because it was not a direct answer to my question.
There are often good reasons to avoid (1) recursive method calls, (2) nested iterators, and (3) lots of throwaway allocations. This method avoids all of those potential pitfalls:
public static IEnumerable<Component> GetDescendants(this Composite composite)
{
var stack = new Stack<Component>();
do
{
if (composite != null)
{
// this will currently yield the children in reverse order
// use "composite.Children.Reverse()" to maintain original order
foreach (var child in composite.Children)
{
stack.Push(child);
}
}
if (stack.Count == 0)
break;
Component component = stack.Pop();
yield return component;
composite = component as Composite;
} while (true);
}
And here's the generic equivalent:
public static IEnumerable<T> GetDescendants<T>(this T component,
Func<T, bool> hasChildren, Func<T, IEnumerable<T>> getChildren)
{
var stack = new Stack<T>();
do
{
if (hasChildren(component))
{
// this will currently yield the children in reverse order
// use "getChildren(component).Reverse()" to maintain original order
// or let the "getChildren" delegate handle the ordering
foreach (var child in getChildren(component))
{
stack.Push(child);
}
}
if (stack.Count == 0)
break;
component = stack.Pop();
yield return component;
} while (true);
}
var result = composite.Children.OfType<Composite>().SelectMany(child => child.GetDescendants()).Concat(composite.Children);
return result.ToList();
When translating from imperative syntax to LINQ, it is usually pretty easy to take the translation one step at a time. Here is how this works:
This is looping over composite.Children, so that will be the collection we apply LINQ to.
There are two general operations occurring in the loop, so let's do one of them at a time.
The "if" statement is performing a filter. Normally, we would use "Where" to perform a filter, but in this case the filter is based on type. LINQ has "OfType" built in for this.
For each child composite, we want to recursively call GetDescendants and add the results to a single list. Whenever we want to transform an element into something else, we use either Select or SelectMany. Since we want to transform each element into a list and merge them all together, we use SelectMany.
Finally, to add in the composite.Children themselves, we concatenate those results to the end.
I don't know about better, but I think this performs the same logic:
public static IEnumerable<Component> GetDescendants(this Composite composite)
{
    return composite.Children
        .Concat(composite.Children
            .OfType<Composite>() // GetDescendants is defined on Composite, so filter and cast in one step
            .SelectMany(x => x.GetDescendants())
        );
}
It might be shorter, but there is nothing wrong with what you have. As I said above, this is supposed to perform the same thing and I doubt that the performance of the function is improved.
This is a good example of when you might want to implement an iterator. It has the advantage of lazy evaluation in a slightly more readable syntax. Also, if you need to add custom logic, this form is more extensible.
public static IEnumerable<Component> GetDescendants(this Composite composite)
{
foreach(var child in composite.Children)
{
yield return child;
if(!(child is Composite))
continue;
foreach (var subChild in ((Composite)child).GetDescendants())
yield return subChild;
}
}
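To make the lazy-evaluation point concrete, here is a minimal sketch. The `Visited` counter and the `Demo` class are additions for illustration only, and the `Component`/`Composite` classes are trimmed copies of the ones in the question. The iterator is the same shape as above, written with a pattern-matching `is`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Component { public string Name { get; set; } }

public class Composite : Component
{
    public IEnumerable<Component> Children { get; set; } = new List<Component>();
}

public static class ComponentTraversal
{
    public static int Visited; // hypothetical counter, used only to observe laziness

    public static IEnumerable<Component> GetDescendants(this Composite composite)
    {
        foreach (var child in composite.Children)
        {
            Visited++;
            yield return child;
            if (child is Composite nested)
                foreach (var sub in nested.GetDescendants())
                    yield return sub;
        }
    }
}

public static class Demo
{
    public static void Main()
    {
        var root = new Composite
        {
            Name = "root",
            Children = new List<Component>
            {
                new Composite
                {
                    Name = "a",
                    Children = new List<Component> { new Component { Name = "a1" } }
                },
                new Component { Name = "b" }
            }
        };

        var query = root.GetDescendants();             // nothing has run yet
        Console.WriteLine(ComponentTraversal.Visited); // 0

        Console.WriteLine(query.First().Name);         // a
        Console.WriteLine(ComponentTraversal.Visited); // 1 (stopped after the first child)

        Console.WriteLine(string.Join(",", query.Select(c => c.Name))); // a,a1,b
    }
}
```

Note that each enumeration of `query` walks the tree again; call `ToList()` once if the result is needed several times.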
Related
This is my code (extension method)
public static IEnumerable<uint> GetFieldVals(this DataSource rs, IEnumerable<string> columnNames, Predicate<uint> shouldRun)
{
var rList = new List<uint>();
if (columnNames.Any())
foreach (var name in columnNames)
{
rs.GetFieldVal(name, out uint temp);
if (shouldRun(temp))
{
rList.Add(temp);
}
}
return rList;
}
This works. However, if I change it to this, every result is the final item of the generated collection (although the Count is correct).
public static IEnumerable<uint> GetFieldVals(this DataSource rs, IEnumerable<string> columnNames, Predicate<uint> shouldRun)
{
if (!columnNames.Any()) yield break;
foreach (var name in columnNames)
{
rs.GetFieldVal(name, out uint temp);
if (shouldRun(temp))
{
yield return temp;
}
}
}
What gives?
EDIT
Thank you everyone for your comments. I had written this in a bit of a hurry and then had a busy weekend, so I was unable to properly address this. I will do so now. You're all 100% correct that I left out too much.
I am attempting to take a clunky DataSource api and create an IEnumerable of valueobject items with it (which is easier and more flexible to work with). I am implementing this using a factory to keep it portable; my factory method implementation calls the code I wrote in my original post. This is a sample of what my valueobject looks like:
public class MyTableDataObject : IDataObject<uint>
{
public uint ID { get; set; }
public string Name { get; set; }
//MOAR properties
public IEnumerable<uint> SomeCollection { get; set; }
//MOAR properties
}
The issue I spoke about occurs when I have a collection of some type as a property in my valueobject (i.e., "SomeCollection" in the snippet above).
FWIW, here's my code for the collection of column names that I pass to the extension method from my original post.
public static IEnumerable<string> ColumnNames
{
get
{
yield return "COLUMNNAME00";
yield return "COLUMNNAME01";
yield return "COLUMNNAME02";
yield return "COLUMNNAME03";
yield return "COLUMNNAME04";
yield return "COLUMNNAME05";
yield return "COLUMNNAME06";
yield return "COLUMNNAME07";
yield return "COLUMNNAME08";
yield return "COLUMNNAME09";
yield return "COLUMNNAME10";
yield return "COLUMNNAME11";
yield return "COLUMNNAME12";
yield return "COLUMNNAME13";
yield return "COLUMNNAME14";
yield return "COLUMNNAME15";
}
}
Here is the calling code.
var rs = new DataSource();
rs.Open("Select * From MyTable");
//The Generic type on the enumerable indicates the type of the identifier of the items, not that the Enumerable is itself a list of uints. Do not get confused by this!
var dse = new DataSourceEnumerable<uint>(rs, new MyTableDataObjectFactory());
using (var writer = new MyWriterFacade("MyOutput.json"))
{
var json = new JsonSerializer(); //Newtonsoft.Json lib
var str = JsonConvert.SerializeObject(dse, Formatting.Indented);
writer.Write(str);
}
While the output json file's values are mostly correct, each "SomeCollection" has the same items in it (I believe it's the last item's SomeCollection values) when I use the yield keyword. When I don't use yield and use more traditional code, though, the json output illustrates the correct values for each SomeCollection in the file.
This is the code in the actual Enumerable:
public DataSourceEnumerable(DataSource ds, DataObjectFactory<T, DataSource> factory)
{
ds.MoveFirst();
innerList = new List<IDataObject<T>>();
_enumerator = Create(ds, factory, innerList);
}
public static IEnumerator<IDataObject<T>> Create(DataSource ds, DataObjectFactory<T, DataSource> factory,
IList<IDataObject<T>> innerList)
{
while (!ds.Eof)
{
innerList.Add(factory.InitializeDataObject<object, object>(ds));
ds.MoveNext();
}
return new DataSourceEnumerator(innerList);
}
I hope that sheds some light on it, if anyone can break that down for me a bit better. Appreciate it!
The only real difference in the code as shown relates to timing. With the list version, the operations are performed when the method is called. With the yield version, it is called later, when the result of the method is actually iterated.
Now: things can sometimes change between calling a method that returns a sequence, and iterating that sequence. For example, the content of the data source or field sequence parameters could change. Or the logic of the predicate could change, usually due to "captured variables". So: the difference is in the code that calls this, which we can't see. But: look for timing between calling the method, and actually iterating over it (foreach etc).
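A minimal sketch of that timing difference follows. `Source`, `Eager` and `Lazy` are hypothetical stand-ins, not the asker's `DataSource` API: both methods read a mutable field, but only the iterator version reads it at iteration time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a mutable data source (e.g. a cursor that moves).
public class Source
{
    public uint Current;
}

public static class TimingDemo
{
    // Eager: the loop runs, and Current is read, before the method returns.
    public static IEnumerable<uint> Eager(Source s, int count)
    {
        var list = new List<uint>();
        for (int i = 0; i < count; i++)
            list.Add(s.Current);
        return list;
    }

    // Lazy: nothing runs until the caller enumerates the result,
    // so Current is read at iteration time, not at call time.
    public static IEnumerable<uint> Lazy(Source s, int count)
    {
        for (int i = 0; i < count; i++)
            yield return s.Current;
    }

    public static void Main()
    {
        var s = new Source { Current = 1 };
        var eager = Eager(s, 2);
        var lazy = Lazy(s, 2);

        s.Current = 99; // the source changes between the call and the iteration

        Console.WriteLine(string.Join(",", eager)); // 1,1
        Console.WriteLine(string.Join(",", lazy));  // 99,99

        // Materializing right at the call site pins the values down.
        s.Current = 5;
        var pinned = Lazy(s, 2).ToList();
        s.Current = 7;
        Console.WriteLine(string.Join(",", pinned)); // 5,5
    }
}
```

If a factory stores the lazy sequence in a property such as `SomeCollection` and the data source moves on before serialization, a symptom like the one described would be plausible, though the code shown in the question is not enough to confirm it. Calling `ToList()` where the sequence is created is the usual way to rule this out.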
I'm busy updating an entity using Entity Framework and Web API (in the PUT method of the controller). For each collection property on the updated object, I loop through and check whether each item exists in the corresponding collection on the existing object. If not, I add it.
The trouble is I have a lot of collections on the object and I find myself repeating the following code many times over.
Is there a way for me to wrap this into a generic method and pass that method the 2 collections to compare? Maybe by specifying the name of the property to check and primary key? How would I be able to specify the type for the foreach loop for example?
foreach (HBGender gender in updated.HBGenders)
{
HBGender _gender = existing.HBGenders.FirstOrDefault(o => o.GenderID == gender.GenderID);
if (_gender == null)
{
//do some stuff here like attach and add
}
}
return existing; //return the modified object
Thanks in advance. I hope this makes sense.
In its simplest form you could write an extension method as such:
public static class IEnumerableExtensionMethods
{
public static ICollection<T> ForEachAndAdd<T>(this IEnumerable<T> self,
ICollection<T> other,
Func<T, T, bool> predicate) where T : class
{
foreach(var h1 in self)
{
if(other.FirstOrDefault(h2 => predicate(h1, h2)) == null)
other.Add(h1);
}
return other;
}
}
Usage:
List<HBGender> updated = new List<HBGender>();
List<HBGender> existing = new List<HBGender>();
return updated.ForEachAndAdd(existing, (h1, h2) => h1.GenderID == h2.GenderID);
Note that if there is extra logic needed during an add, you could add an additional Action<T> parameter to do so.
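A rough sketch of that variant is below. The `onAdd` parameter name and the `Gender` class are hypothetical, introduced only for the example. It uses `Any` instead of `FirstOrDefault(...) == null`, so the `class` constraint is no longer required.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableExtensions
{
    // Adds each item of 'self' that has no match in 'other', and calls
    // 'onAdd' (hypothetical hook, e.g. attaching to a context) for each added item.
    public static ICollection<T> ForEachAndAdd<T>(this IEnumerable<T> self,
        ICollection<T> other,
        Func<T, T, bool> predicate,
        Action<T> onAdd = null)
    {
        // ToList() snapshots 'self' so the loop is safe even if
        // 'self' and 'other' happen to be the same collection.
        foreach (var item in self.ToList())
        {
            if (!other.Any(existing => predicate(item, existing)))
            {
                other.Add(item);
                onAdd?.Invoke(item);
            }
        }
        return other;
    }
}

// Hypothetical entity used only for this example.
public class Gender
{
    public int GenderID { get; set; }
}

public static class Demo
{
    public static void Main()
    {
        var updated = new List<Gender> { new Gender { GenderID = 1 }, new Gender { GenderID = 2 } };
        var existing = new List<Gender> { new Gender { GenderID = 1 } };

        updated.ForEachAndAdd(existing,
            (u, e) => u.GenderID == e.GenderID,
            g => Console.WriteLine("added " + g.GenderID)); // prints "added 2"

        Console.WriteLine(existing.Count); // 2
    }
}
```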
I don't know what you are trying to do, but you can play with this example:
List<object> a = new List<object>();
a.Add("awgf");
a.Add('v');
a.Add(4);
foreach (object b in a)
{
    Type type = b.GetType(); // the runtime type; select whatever is more useful to you
    object converted = Convert.ChangeType(b, type);
}
Just pass your existing check function as an extra parameter. Because the check compares an existing item with the item currently being processed, the delegate takes both:
public List<Class1> Find(List<Class1> updated, List<Class1> existing, Func<Class1, Class1, bool> predicate)
{
    foreach (Class1 gender in updated)
    {
        Class1 _gender = existing.FirstOrDefault(o => predicate(o, gender)); //predicate for quoted example will be (o, gender) => o.GenderID == gender.GenderID
        if (_gender == null)
        {
            //do some stuff here like attach and add
        }
    }
    return existing;
}
I'm trying to maintain a list of unique models from a variety of queries. Unfortunately, the Equals method of our models is not defined, so I couldn't easily use a hash map.
As a quick fix I used the following code:
public void AddUnique(
List<Model> source,
List<Model> result)
{
if (result != null)
{
if (result.Count > 0
&& source != null
&& source.Count > 0)
{
source.RemoveAll(
s => result.Contains(
r => r.ID == s.ID));
}
result.AddRange(source);
}
}
Unfortunately, this does not work. When I step through the code, I find that even though I've checked to make sure that there was at least one Model with the same ID in both source and result, the RemoveAll(Predicate<Model>) line does not change the number of items in source. What am I missing?
The above code shouldn't even compile, as Contains expects a Model, not a predicate.
You can use Any() instead:
source.RemoveAll(s => result.Any(r => r.ID == s.ID));
This will remove the items from source correctly.
I might opt to tackle the problem a different way.
You said you do not have suitable implementations of equality inside the class. Maybe you can't change that. However, you can define an IEqualityComparer<Model> implementation that allows you to specify appropriate Equals and GetHashCode implementations external to the actual Model class itself.
var comparer = new ModelComparer();
var addableModels = newSourceOfModels.Except(modelsThatAlreadyExist, comparer);
// you can then add the result to the existing
Where you might define the comparer as
class ModelComparer : IEqualityComparer<Model>
{
public bool Equals(Model x, Model y)
{
// validations omitted
return x.ID == y.ID;
}
public int GetHashCode(Model m)
{
return m.ID.GetHashCode();
}
}
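Putting the pieces together, a minimal end-to-end sketch might look like this. The `Model` class with an `int ID` is an assumption, since the question does not show it, and the null checks stand in for the "validations omitted" comment above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model; only the ID matters here.
public class Model
{
    public int ID { get; set; }
}

public class ModelComparer : IEqualityComparer<Model>
{
    public bool Equals(Model x, Model y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.ID == y.ID;
    }

    public int GetHashCode(Model m) => m.ID.GetHashCode();
}

public static class Demo
{
    public static void Main()
    {
        var comparer = new ModelComparer();
        var result = new List<Model> { new Model { ID = 1 }, new Model { ID = 2 } };
        var source = new List<Model> { new Model { ID = 2 }, new Model { ID = 3 } };

        // Except yields the distinct elements of 'source' that have no equal in 'result'.
        // ToList() finishes the query before 'result' is modified.
        result.AddRange(source.Except(result, comparer).ToList());
        Console.WriteLine(string.Join(",", result.Select(m => m.ID))); // 1,2,3

        // Alternative: a HashSet with the same comparer rejects duplicates on Add.
        var set = new HashSet<Model>(result, comparer);
        Console.WriteLine(set.Add(new Model { ID = 3 })); // False
        Console.WriteLine(set.Add(new Model { ID = 4 })); // True
    }
}
```

The `HashSet<Model>` variant is the closest match to the "hash map" the question mentions: uniqueness is enforced by the external comparer, so `Model` itself stays untouched.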
source.RemoveAll(s => result.Select(r => r.ID).Contains(s.ID));
This statement builds an enumeration of the IDs in result and, for each element of source, checks whether its ID appears in that enumeration. RemoveAll then removes every element for which the check returns true.
Your code is removing all the models which are the same between the two lists, not those which have the same ID. Unless they're actually the same instances of the model, it won't work like you're expecting.
Sometimes I use these extension methods for that sort of thing:
public static class CollectionHelper
{
public static void RemoveWhere<T>(this IList<T> list, Func<T, bool> selector)
{
var itemsToRemove = list.Where(selector).ToList();
foreach (var item in itemsToRemove)
{
list.Remove(item);
}
}
public static void RemoveWhere<TKey, TValue>(this IDictionary<TKey, TValue> dictionary, Func<KeyValuePair<TKey, TValue>, bool> selector)
{
var itemsToRemove = dictionary.Where(selector).ToList();
foreach (var item in itemsToRemove)
{
dictionary.Remove(item);
}
}
}
Are multiple iterators (for a single class or object) possible in C# .NET? If they are, please give me some simple examples.
Sorry if the question is unclear; please ask and I will clarify.
You could certainly create different iterators to traverse in different ways. For example, you could have:
public class Tree<T>
{
public IEnumerable<T> IterateDepthFirst()
{
// Iterate, using yield return
...
}
public IEnumerable<T> IterateBreadthFirst()
{
// Iterate, using yield return
...
}
}
Is that the kind of thing you were asking?
You could also potentially write:
public class Foo : IEnumerable<int>, IEnumerable<string>
but that would cause a lot of confusion, and the foreach loop would pick whichever one had the non-explicitly-implemented GetEnumerator call.
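For illustration, here is a small sketch of what such a class could look like. The element values are made up, and `Foo` simply mirrors the declaration above. The `int` enumerator is implemented publicly, so a plain `foreach` picks it; the `string` one is explicit and needs a cast.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public class Foo : IEnumerable<int>, IEnumerable<string>
{
    // Public (non-explicit) implementation: this is what a plain foreach uses.
    public IEnumerator<int> GetEnumerator()
    {
        yield return 1;
        yield return 2;
    }

    // Explicit implementation: only reachable through IEnumerable<string>.
    IEnumerator<string> IEnumerable<string>.GetEnumerator()
    {
        yield return "a";
        yield return "b";
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class Demo
{
    public static void Main()
    {
        var foo = new Foo();

        foreach (var i in foo)
            Console.WriteLine(i); // 1, then 2

        foreach (var s in (IEnumerable<string>)foo)
            Console.WriteLine(s); // a, then b
    }
}
```

One practical cost of this design: LINQ calls such as `foo.Select(...)` can no longer infer the element type, so callers have to cast to the interface they want first.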
You can also iterate multiple times over the same collection at the same time:
foreach (Person person1 in party)
{
foreach (Person person2 in party)
{
if (person1 != person2)
{
person1.SayHello(person2);
}
}
}
It's not really clear if you mean that you can implement more than one iterator for a class, or if you can use more than one iterator for a class at a time. Either is possible.
You can have as many iterators as you like for a class:
public class OddEvenList<T> : List<T> {
public IEnumerable<T> GetOddEnumerator() {
return this.Where((x, i) => i % 2 == 0);
}
public IEnumerable<T> GetEvenEnumerator() {
return this.Where((x, i) => i % 2 == 1);
}
}
You can have as many instances of an iterator for a class active at the same time as you like:
foreach (int x in list) {
foreach (int y in list) {
foreach (int z in list) {
...
}
}
}
One option would be to implement the Strategy pattern:
Create separate IEnumerator classes for each traversal strategy.
Create a private attribute in the collection that stores the current strategy (with a default).
Create a SetStrategy() method that changes that private attribute to the selected concrete strategy.
Override GetEnumerator() to return an instance of the current strategy.
Of course, this means two threads trying to set the strategy at the same time could interfere, so if sharing the collection between threads is important, this isn't the best solution.
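The steps above could be sketched roughly as follows. All type names here (`ITraversalStrategy<T>`, `ForwardStrategy<T>`, `BackwardStrategy<T>`, `StrategyList<T>`) are hypothetical, and the strategies use iterator blocks rather than hand-written IEnumerator classes to keep the example short.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical strategy interface: turns the backing list into an enumerator.
public interface ITraversalStrategy<T>
{
    IEnumerator<T> GetEnumerator(IList<T> items);
}

public class ForwardStrategy<T> : ITraversalStrategy<T>
{
    public IEnumerator<T> GetEnumerator(IList<T> items)
    {
        for (int i = 0; i < items.Count; i++)
            yield return items[i];
    }
}

public class BackwardStrategy<T> : ITraversalStrategy<T>
{
    public IEnumerator<T> GetEnumerator(IList<T> items)
    {
        for (int i = items.Count - 1; i >= 0; i--)
            yield return items[i];
    }
}

public class StrategyList<T> : IEnumerable<T>
{
    private readonly List<T> _items = new List<T>();

    // The private field holding the current strategy, with a default.
    private ITraversalStrategy<T> _strategy = new ForwardStrategy<T>();

    public void Add(T item) => _items.Add(item);

    public void SetStrategy(ITraversalStrategy<T> strategy) => _strategy = strategy;

    // foreach ends up here and gets whatever the current strategy produces.
    public IEnumerator<T> GetEnumerator() => _strategy.GetEnumerator(_items);

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class Demo
{
    public static void Main()
    {
        var list = new StrategyList<int> { 1, 2, 3 };
        Console.WriteLine(string.Join(",", list)); // 1,2,3

        list.SetStrategy(new BackwardStrategy<int>());
        Console.WriteLine(string.Join(",", list)); // 3,2,1
    }
}
```

Because the strategy is shared mutable state on the collection, the threading caveat above applies to this sketch as written.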
A straight Iterator pattern would also work, which is what I believe Jon Skeet is suggesting in his first example, but you lose the syntactic sugar of being able to use foreach.
I have a private List and I want to expose the ability to query the List and return a new List with new cloned items. I can pass in delegates for filtering and sorting, which works fine, but being able to use LINQ expressions would be much more elegant.
I've added a simplified example of what I'm trying to do, which might help as I don't think I've explained what I want to do very well.
public class Repository
{
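private readonly List<SomeModel> _models = new List<SomeModel>();
private readonly object _lock = new object(); // lock() throws on a null reference, so initialize it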
public List<SomeModel> GetModels(Func<SomeModel, bool> predicate)
{
List<SomeModel> models;
lock (_lock)
{
models = _models.Where(m => predicate(m))
.Select(m => new SomeModel(m))
.ToList();
}
return models;
}
}
Why does your code involve locking? Assuming your "SomeModel" class has a copy constructor as your example suggests, this should work:
public List<SomeModel> GetModels(Predicate<SomeModel> predicate)
{
return _models.Where(m => predicate(m))
.Select(m => new SomeModel(m))
.ToList();
}
You could expose the private collection as an iterator block by doing something like this:
public IEnumerable<Model> Models
{
get
{
foreach (Model mod in this._models)
yield return new Model(mod);
// equivalent to:
// return _models.Select(m => new Model(m));
// as per Jon's comment
}
}
Which would give you the ability to then write queries against it like any other IEnumerable datasource.