This is my code (extension method)
public static IEnumerable<uint> GetFieldVals(this DataSource rs, IEnumerable<string> columnNames, Predicate<uint> shouldRun)
{
var rList = new List<uint>();
if (columnNames.Any())
foreach (var name in columnNames)
{
rs.GetFieldVal(name, out uint temp);
if (shouldRun(temp))
{
rList.Add(temp);
}
}
return rList;
}
This works. However, if I change it to this, every result is the final item in the generated collection (although the Count is correct).
public static IEnumerable<uint> GetFieldVals(this DataSource rs, IEnumerable<string> columnNames, Predicate<uint> shouldRun)
{
if (!columnNames.Any()) yield break;
foreach (var name in columnNames)
{
rs.GetFieldVal(name, out uint temp);
if (shouldRun(temp))
{
yield return temp;
}
}
}
What gives?
EDIT
Thank you everyone for your comments. I had written this in a bit of a hurry and then had a busy weekend, so I was unable to properly address this. I will do so now. You're all 100% correct that I left out too much.
I am attempting to take a clunky DataSource api and create an IEnumerable of valueobject items with it (which is easier and more flexible to work with). I am implementing this using a factory to keep it portable; my factory method implementation calls the code I wrote in my original post. This is a sample of what my valueobject looks like:
public class MyTableDataObject : IDataObject<uint>
{
public uint ID { get; set; }
public string Name { get; set; }
//MOAR properties
public IEnumerable<uint> SomeCollection { get; set; }
//MOAR properties
}
The issue I spoke about occurs when I have a collection of some type as a property in my valueobject (i.e. "SomeCollection" in the snippet above).
FWIW, here's my code for the collection of column names that I pass to the extension method from my original post.
public static IEnumerable<string> ColumnNames
{
get
{
yield return "COLUMNNAME00";
yield return "COLUMNNAME01";
yield return "COLUMNNAME02";
yield return "COLUMNNAME03";
yield return "COLUMNNAME04";
yield return "COLUMNNAME05";
yield return "COLUMNNAME06";
yield return "COLUMNNAME07";
yield return "COLUMNNAME08";
yield return "COLUMNNAME09";
yield return "COLUMNNAME10";
yield return "COLUMNNAME11";
yield return "COLUMNNAME12";
yield return "COLUMNNAME13";
yield return "COLUMNNAME14";
yield return "COLUMNNAME15";
}
}
Here is the calling code.
var rs = new DataSource();
rs.Open("Select * From MyTable");
//The Generic type on the enumerable indicates the type of the identifier of the items, not that the Enumerable is itself a list of uints. Do not get confused by this!
var dse = new DataSourceEnumerable<uint>(rs, new MyTableDataObjectFactory());
using (var writer = new MyWriterFacade("MyOutput.json"))
{
var json = new JsonSerializer(); //Newtonsoft.Json lib
var str = JsonConvert.SerializeObject(dse, Formatting.Indented);
writer.Write(str);
}
While the output json file's values are mostly correct, each "SomeCollection" has the same items in it (I believe it's the last item's SomeCollection values) when I use the yield keyword. When I don't use yield and use more traditional code, though, the json output illustrates the correct values for each SomeCollection in the file.
This is the code in the actual Enumerable:
public DataSourceEnumerable(DataSource ds, DataObjectFactory<T, DataSource> factory)
{
ds.MoveFirst();
innerList = new List<IDataObject<T>>();
_enumerator = Create(ds, factory, innerList);
}
public static IEnumerator<IDataObject<T>> Create(DataSource ds, DataObjectFactory<T, DataSource> factory,
IList<IDataObject<T>> innerList)
{
while (!ds.Eof)
{
innerList.Add(factory.InitializeDataObject<object, object>(ds));
ds.MoveNext();
}
return new DataSourceEnumerator(innerList);
}
I hope that sheds some light on it, if anyone can break that down for me a bit better. Appreciate it!
The only real difference in the code as shown relates to timing. With the list version, the operations are performed when the method is called. With the yield version, they are performed later, when the result of the method is actually iterated.
Now: things can sometimes change between calling a method that returns a sequence, and iterating that sequence. For example, the content of the data source or field sequence parameters could change. Or the logic of the predicate could change, usually due to "captured variables". So: the difference is in the code that calls this, which we can't see. But: look for timing between calling the method, and actually iterating over it (foreach etc).
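To make the timing point concrete, here is a minimal sketch. FakeCursor, ReadLazy and ReadEager are hypothetical stand-ins (not the asker's DataSource API), and the clamping in Read is an assumption made only to mimic the "every item shows the last row" symptom. The lazy version reads from wherever the cursor happens to be when the sequence is finally enumerated:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical cursor-style source: reads always hit the current row.
public class FakeCursor
{
    private readonly uint[][] _rows;
    public int Position { get; private set; }
    public FakeCursor(uint[][] rows) { _rows = rows; }
    public bool Eof { get { return Position >= _rows.Length; } }
    public void MoveNext() { Position++; }

    // Assumption: reads past the end fall back to the last row.
    public uint Read(int column)
    {
        int row = Math.Min(Position, _rows.Length - 1);
        return _rows[row][column];
    }
}

public static class TimingDemo
{
    // Deferred: nothing is read until the result is enumerated.
    public static IEnumerable<uint> ReadLazy(FakeCursor cursor, int columns)
    {
        for (int c = 0; c < columns; c++)
            yield return cursor.Read(c);
    }

    // Eager: values are copied while the cursor is still on the intended row.
    public static List<uint> ReadEager(FakeCursor cursor, int columns)
    {
        return ReadLazy(cursor, columns).ToList();
    }

    public static string Run(bool eager)
    {
        var cursor = new FakeCursor(new[]
        {
            new uint[] { 1, 2 }, new uint[] { 3, 4 }, new uint[] { 5, 6 }
        });
        var perRow = new List<IEnumerable<uint>>();
        while (!cursor.Eof)
        {
            perRow.Add(eager ? ReadEager(cursor, 2) : ReadLazy(cursor, 2));
            cursor.MoveNext();
        }
        // Enumeration happens here, after the cursor has moved past every row.
        return string.Join(" | ", perRow.Select(r => string.Join(",", r)));
    }

    public static void Main()
    {
        Console.WriteLine(Run(eager: true));   // 1,2 | 3,4 | 5,6
        Console.WriteLine(Run(eager: false));  // 5,6 | 5,6 | 5,6
    }
}
```

If the real code behaves like this, calling ToList() (or ToArray()) on the yielded sequence before the data source moves on should restore the per-row values.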
Related
I have a class that stores a string list, I would like to make this class usable in a foreach statement, so I found these two interfaces and I tried to implement them.
public class GroupCollection : IEnumerable, IEnumerator
{
public List<string> Groups { get; set; }
public int Count { get { return Groups.Count; } }
int position = -1;
}
public IEnumerator GetEnumerator()
{
return (IEnumerator)this;
}
public object Current
{
get
{
try
{
return new Group(Groups[position]);
}
catch (IndexOutOfRangeException)
{
throw new InvalidOperationException();
}
}
}
public bool MoveNext()
{
position++;
return position < Groups.Count;
}
public void Reset()
{
position = 0;
}
I'm iterating through a GroupCollection variable twice:
foreach (GroupCollection.Group g in groups) // where groups is a GroupCollection
{
}
foreach (GroupCollection.Group g in groups)
{
}
{
}
// where Group is a nested class in GroupCollection.
When it is at the first foreach it works well (count is 1 at this time). I don't modify anything, and when it goes to the second foreach it doesn't go into the loop. I went through the code line by line in debugging mode and found out that the reset is not called after the first foreach. So should I manually call reset after the foreach? Isn't there a nicer way to do this?
I don't modify anything
Yes you do - your MoveNext() modifies the state of the class. This is why you shouldn't implement both IEnumerable and IEnumerator in the same class. (The C# compiler does for iterator blocks, but that's a special case.) You should be able to call GetEnumerator() twice and get two entirely independent iterators. For example:
foreach (var x in collection)
{
foreach (var y in collection)
{
Console.WriteLine("{0}, {1}", x, y);
}
}
... should give you all possible pairs of items in a collection. But that only works when the iterators are independent.
I went through the code line by line in debugging mode and found out that the reset is not called after the first foreach.
Why would you expect it to? I don't believe the specification says anything about foreach calling Reset - and that's a good job, as many implementations don't really implement it (they throw an exception instead).
Basically, you should make your GetEnumerator() method return a new object which keeps the mutable state of the "cursor" over your data. Note that the simplest way of implementing an iterator in C# is usually to use an iterator block (yield return etc).
I'd also strongly encourage you to implement the generic interfaces rather than just the non-generic ones; that way your type can be used much more easily in LINQ code, the iterator variable in a foreach statement can be implicitly typed appropriately, etc.
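As a rough sketch of that advice (simplified to plain strings rather than the asker's nested Group class, and with an Add method that is my own addition), the collection can implement the generic interface and hand out a fresh enumerator on every call, so nested and repeated loops stay independent:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public class GroupCollection : IEnumerable<string>
{
    private readonly List<string> _groups = new List<string>();

    public void Add(string group) { _groups.Add(group); }
    public int Count { get { return _groups.Count; } }

    // Each call creates a new enumerator with its own position.
    public IEnumerator<string> GetEnumerator()
    {
        return _groups.GetEnumerator();
    }

    // Non-generic version just forwards to the generic one.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class GroupDemo
{
    // Counts all (x, y) pairs; only correct if the two iterators are independent.
    public static int CountPairs(GroupCollection groups)
    {
        int pairs = 0;
        foreach (var x in groups)
            foreach (var y in groups)
                pairs++;
        return pairs;
    }

    public static void Main()
    {
        var groups = new GroupCollection { "admins", "users" };
        Console.WriteLine(CountPairs(groups)); // 4
        Console.WriteLine(CountPairs(groups)); // 4 again, no Reset needed
    }
}
```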
Reset is not called at the end of a foreach loop - you could do that in the GetEnumerator call, or just return the enumerator for the List:
public IEnumerator GetEnumerator()
{
return Groups.GetEnumerator();
}
Note that with the yield keyword there is almost no need to implement IEnumerator or IEnumerable explicitly:
public IEnumerator<string> GetEnumerator()
{
foreach(string s in Groups)
yield return s;
}
Is there a way to use the IEnumerable extension Select, (or any other IEnumerable extension)
to perform the equivalent of a ForEach() that returns the enumeration? i.e., Can you already do, with a built-in provided IEnumerable extension, what would be the equivalent of this custom extension ?
public static class EnumerableExtensions
{
public static IEnumerable<T> Do<T>
(this IEnumerable<T> source, Action<T> action)
{
foreach (var elem in source)
{
action(elem);
yield return elem;
}
}
}
For those comments below, this method is not exactly the same as calling foreach. In the case of foreach, the enumeration is not returned. In this case, the purpose of the construction is simply to perform the action, so it can hardly be called a side-effect.
It allows chaining of Actions, as in
var Invoices = GetUnProcessedInvoices();
Invoices.Where(i=>i.Date > someDate)
.Do(i=>SendEmail(i.Customer))
.Do(i=>ShipProduct(i.Customer, i.Product, i.Quantity))
.Do(i=>Inventory.Adjust(i.Product, i.Quantity));
You could use:
var result = source.Select(x => { action(x); return x; });
But of course lazy evaluation means it won't get executed until you evaluate the result:
var result = source.Select(x => { action(x); return x; });
var list = result.ToList(); // action executed here for each element
And if the enumeration is not fully enumerated, action won't be called for all elements:
var result = source.Select(x => { action(x); return x; });
var first = result.First(); // action executed here only for first element
And if the user enumerates multiple times, your action will be enumerated multiple times, as in the following contrived example:
var result = source.Select(x => { action(x); return x; });
if (result.Count() > 0) // Count() enumerates and calls action() for each element
{
return result.ToList(); // ToList() enumerates and calls action() again for each element.
}
else
{
return null;
}
IMHO it's probably confusing to rely on the user of your enumeration to ensure your action is called exactly once, so I would generally avoid this design.
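A small sketch of the difference, using a counter as a stand-in for the real action (the class and method names here are made up for illustration). Enumerating the deferred query twice runs the side effect twice per element, while materializing it once with ToList() runs it once:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SideEffectDemo
{
    public static int CallsWhenEnumeratedTwice()
    {
        int calls = 0;
        var source = new[] { 1, 2, 3 };
        var result = source.Select(x => { calls++; return x; });
        result.ToList(); // first pass: 3 calls
        result.ToList(); // second pass: 3 more calls
        return calls;
    }

    public static int CallsWhenMaterializedOnce()
    {
        int calls = 0;
        var source = new[] { 1, 2, 3 };
        // The side effect runs here, exactly once per element.
        var list = source.Select(x => { calls++; return x; }).ToList();
        // Later passes walk the stored list and do not re-run the lambda.
        foreach (var item in list) { }
        foreach (var item in list) { }
        return calls;
    }

    public static void Main()
    {
        Console.WriteLine(CallsWhenEnumeratedTwice());  // 6
        Console.WriteLine(CallsWhenMaterializedOnce()); // 3
    }
}
```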
Microsoft has released the NuGet package Ix-Main which includes various useful extensions like Do().
Your example:
var Invoices = GetUnProcessedInvoices();
Invoices.Where(i=>i.Date > someDate)
.Do(i=>SendEmail(i.Customer))
.Do(i=>ShipProduct(i.Customer, i.Product, i.Quantity))
.Do(i=>Inventory.Adjust(i.Product, i.Quantity));
Will work out of the box.
The equivalent of your method is just to use Select:
var result = list.Select(x => { x.Something(); return x; });
You can put a side effect in several LINQ extension methods, like Where and Select:
var en = enumeration.Where(x => { action(x); return true; });
or:
var en = enumeration.Select(x => { action(x); return x; });
You get the shortest code if your action returns the value that it is called with, then you can use the method as delegate in the extension method:
var en = enumeration.Select(action);
or if it always returns true:
var en = enumeration.Where(action);
To have the method actually called, you also have to use something that will actually enumerate the collection, like the Last() method:
en.Last();
Well, I'll add my comment as an answer. I think if you want a code base to use chaining, then the code base should really support it. Otherwise, each developer may or may not write in the functional style that you want. So, I would do this:
public static class EnumerableExtensions {
public static IEnumerable<Invoice> SendEmail(this IEnumerable<Invoice> invoices) {
foreach (var invoice in invoices) {
SendEmail(invoice.Customer);
yield return invoice;
}
}
public static IEnumerable<Invoice> ShipProduct(this IEnumerable<Invoice> invoices) {
foreach (var invoice in invoices) {
ShipProduct(invoice.Customer, invoice.Product, invoice.Quantity);
yield return invoice;
}
}
public static IEnumerable<Invoice> AdjustInventory(this IEnumerable<Invoice> invoices) {
foreach (var invoice in invoices) {
AdjustInventory(invoice.Product, invoice.Quantity);
yield return invoice;
}
}
private static void SendEmail(object p) {
Console.WriteLine("Email!");
}
private static void AdjustInventory(object p1, object p2) {
Console.WriteLine("Adjust inventory!");
}
private static void ShipProduct(object p1, object p2, object p3) {
Console.WriteLine("Ship!");
}
}
Then your code becomes:
invoices.SendEmail().ShipProduct().AdjustInventory().ToList();
Of course, I'm not sure about the ToList() part. If you forget to call that then nothing happens :). I was just copying your current extension methods, but it might make more sense to not use yield so that you could just do:
invoices.SendEmail().ShipProduct().AdjustInventory();
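For what it's worth, a non-yield version could look roughly like the sketch below. DoNow is a hypothetical name, and strings stand in for invoices. Each step runs immediately over the whole sequence and returns the same items, so no trailing ToList() is needed:

```csharp
using System;
using System.Collections.Generic;

public static class EagerExtensions
{
    // Runs the action for every element right away, then hands back the same elements.
    public static IReadOnlyList<T> DoNow<T>(this IEnumerable<T> source, Action<T> action)
    {
        var items = new List<T>(source);
        foreach (var item in items)
            action(item);
        return items;
    }
}

public static class EagerDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        new[] { "A", "B" }
            .DoNow(x => log.Add("email " + x))
            .DoNow(x => log.Add("ship " + x));
        return log;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // email A, email B, ship A, ship B
    }
}
```

One design consequence: the eager chain finishes each step for all items before starting the next, whereas the yield-based chain interleaves the steps item by item.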
I was getting help with a previous question and was told to ask a new question about it, because I run into an error with the code I was given:
public void AddPersonToCommunity(string person, string communityName)
{
var result = communities.Where(n => String.Equals(n.CommunityName, communityName)).FirstOrDefault();
if (result != null)
{
result.Add(new Person() { PersonName = person }); //no definition for add?
}
}
You can see the previous question here for more specifics: relationships in rest?
If I do var result = communities; result will then have the definition for Add, so I'm not sure what's going on.
You're calling Where() which will return an IEnumerable<Community> (no Add method) and then FirstOrDefault() which returns a Community (also no Add method). Where would you expect an Add method to come from?
I suspect you really want:
if (result != null)
{
result.People.Add(new Person { PersonName = person });
}
... because Community.People is a list of the people in that community, right?
Note that if you do var result = communities; there will indeed be an Add method - but with a signature of Add(Community), not Add(Person).
It's important that you keep the types of everything straight. This actually has very little to do with LINQ. You'd have seen the same result if you'd tried:
Community community = new Community();
community.Add(new Person { PersonName = person });
Adding onto @Jon's answer
There is a Concat extension method for IEnumerable<T> which allows you to logically combine two IEnumerable<T> objects into a single IEnumerable<T> which spans both collections. You could extend this logic to appending a single element and then apply that to your current situation.
public static IEnumerable<T> Concat<T>(this IEnumerable<T> enumerable, T value) {
foreach (var cur in enumerable) {
yield return cur;
}
yield return value;
}
...
result = result.Concat(new Person() { PersonName = person });
How can I make ComponentTraversal.GetDescendants() better using LINQ?
Question
public static class ComponentTraversal
{
public static IEnumerable<Component> GetDescendants(this Composite composite)
{
//How can I do this better using LINQ?
var descendants = new List<Component>();
foreach(var child in composite.Children)
{
descendants.Add(child);
if(child is Composite)
{
descendants.AddRange((child as Composite).GetDescendants());
}
}
return descendants;
}
}
public class Component
{
public string Name { get; set; }
}
public class Composite: Component
{
public IEnumerable<Component> Children { get; set; }
}
public class Leaf: Component
{
public object Value { get; set; }
}
Answer
I edited Chris's answer to provide a generic extension method that I've added to my Common library. I can see this being helpful for other people as well so here it is:
public static IEnumerable<T> GetDescendants<T>(this T component, Func<T,bool> isComposite, Func<T,IEnumerable<T>> getCompositeChildren)
{
var children = getCompositeChildren(component);
return children
.Where(isComposite)
.SelectMany(x => x.GetDescendants(isComposite, getCompositeChildren))
.Concat(children);
}
Thanks Chris!
Also,
Please look at LukeH's answer at http://blogs.msdn.com/b/wesdyer/archive/2007/03/23/all-about-iterators.aspx . His answer provides a better way to approach this problem in general, but I did not select it because it was not a direct answer to my question.
There are often good reasons to avoid (1) recursive method calls, (2) nested iterators, and (3) lots of throwaway allocations. This method avoids all of those potential pitfalls:
public static IEnumerable<Component> GetDescendants(this Composite composite)
{
var stack = new Stack<Component>();
do
{
if (composite != null)
{
// this will currently yield the children in reverse order
// use "composite.Children.Reverse()" to maintain original order
foreach (var child in composite.Children)
{
stack.Push(child);
}
}
if (stack.Count == 0)
break;
Component component = stack.Pop();
yield return component;
composite = component as Composite;
} while (true);
}
And here's the generic equivalent:
public static IEnumerable<T> GetDescendants<T>(this T component,
Func<T, bool> hasChildren, Func<T, IEnumerable<T>> getChildren)
{
var stack = new Stack<T>();
do
{
if (hasChildren(component))
{
// this will currently yield the children in reverse order
// use "getChildren(component).Reverse()" to maintain original order
// or let the "getChildren" delegate handle the ordering
foreach (var child in getChildren(component))
{
stack.Push(child);
}
}
if (stack.Count == 0)
break;
component = stack.Pop();
yield return component;
} while (true);
}
var result = composite.Children.OfType<Composite>().SelectMany(child => child.GetDescendants()).Concat(composite.Children);
return result.ToList();
When doing a translation from imperative syntax to LINQ, it is usually pretty easy to take the translation one step at a time. Here is how this works:
This is looping over composite.Children, so that will be the collection we apply LINQ to.
There are two general operations occurring in the loop, so let's do one of them at a time.
The "if" statement is performing a filter. Normally, we would use "Where" to perform a filter, but in this case the filter is based on type. LINQ has "OfType" built in for this.
For each child composite, we want to recursively call GetDescendants and add the results to a single list. Whenever we want to transform an element into something else, we use either Select or SelectMany. Since we want to transform each element into a list and merge them all together, we use SelectMany.
Finally, to add in the composite.Children themselves, we concatenate those results to the end.
I don't know about better, but I think this performs the same logic:
public static IEnumerable<Component> GetDescendants(this Composite composite)
{
return composite.Children
.Concat(composite.Children
.Where(x => x is Composite)
.SelectMany(x => ((Composite)x).GetDescendants())
);
}
It might be shorter, but there is nothing wrong with what you have. As I said above, this is supposed to perform the same thing and I doubt that the performance of the function is improved.
This is a good example of when you might want to implement an iterator. This has the advantage of lazy evaluation in a slightly more readable syntax. Also, if you need to add additional custom logic then this form is more extensible:
public static IEnumerable<Component> GetDescendants(this Composite composite)
{
foreach(var child in composite.Children)
{
yield return child;
if(!(child is Composite))
continue;
foreach (var subChild in ((Composite)child).GetDescendants())
yield return subChild;
}
}
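Here is a small usage sketch of the iterator version, using the Component, Composite and Leaf classes from the question. The tree contents and the TraversalDemo helper are made up for illustration. It yields each child before that child's own descendants:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Component { public string Name { get; set; } }
public class Composite : Component
{
    public IEnumerable<Component> Children { get; set; }
}
public class Leaf : Component { public object Value { get; set; } }

public static class ComponentTraversal
{
    public static IEnumerable<Component> GetDescendants(this Composite composite)
    {
        foreach (var child in composite.Children)
        {
            yield return child;
            var nested = child as Composite;
            if (nested == null) continue;
            // Recurse lazily into composite children.
            foreach (var subChild in nested.GetDescendants())
                yield return subChild;
        }
    }
}

public static class TraversalDemo
{
    public static Composite BuildTree()
    {
        return new Composite
        {
            Name = "root",
            Children = new Component[]
            {
                new Composite
                {
                    Name = "a",
                    Children = new Component[]
                    {
                        new Leaf { Name = "a1" },
                        new Leaf { Name = "a2" }
                    }
                },
                new Leaf { Name = "b" }
            }
        };
    }

    public static string Names()
    {
        return string.Join(",", BuildTree().GetDescendants().Select(c => c.Name));
    }

    public static void Main()
    {
        Console.WriteLine(Names()); // a,a1,a2,b
    }
}
```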
Can someone share a simple example of using the foreach keyword with custom objects?
Given the tags, I assume you mean in .NET - and I'll choose to talk about C#, as that's what I know about.
The foreach statement (usually) uses IEnumerable and IEnumerator or their generic cousins. A statement of the form:
foreach (Foo element in source)
{
// Body
}
where source implements IEnumerable<Foo> is roughly equivalent to:
using (IEnumerator<Foo> iterator = source.GetEnumerator())
{
Foo element;
while (iterator.MoveNext())
{
element = iterator.Current;
// Body
}
}
Note that the IEnumerator<Foo> is disposed at the end, however the statement exits. This is important for iterator blocks.
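As a small sketch of why the disposal matters (the names here are invented for the example): when a foreach loop exits early, disposing the iterator runs any pending finally block inside the iterator block.

```csharp
using System;
using System.Collections.Generic;

public static class DisposeDemo
{
    public static IEnumerable<int> Numbers(List<string> log)
    {
        try
        {
            yield return 1;
            yield return 2;
        }
        finally
        {
            // Runs when the enumerator is disposed, even if iteration stops early.
            log.Add("cleanup");
        }
    }

    public static List<string> Run()
    {
        var log = new List<string>();
        foreach (int n in Numbers(log))
        {
            log.Add("got " + n);
            break; // leave after the first element
        }
        log.Add("after loop");
        return log;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // got 1, cleanup, after loop
    }
}
```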
To implement IEnumerable<T> or IEnumerator<T> yourself, the easiest way is to use an iterator block. Rather than write all the details here, it's probably best to just refer you to chapter 6 of C# in Depth, which is a free download. The whole of chapter 6 is on iterators. I have another couple of articles on my C# in Depth site, too:
Iterators, iterator blocks and data pipelines
Iterator block implementation details
As a quick example though:
public IEnumerable<int> EvenNumbers0To10()
{
for (int i=0; i <= 10; i += 2)
{
yield return i;
}
}
// Later
foreach (int x in EvenNumbers0To10())
{
Console.WriteLine(x); // 0, 2, 4, 6, 8, 10
}
To implement IEnumerable<T> for a type, you can do something like:
public class Foo : IEnumerable<string>
{
public IEnumerator<string> GetEnumerator()
{
yield return "x";
yield return "y";
}
// Explicit interface implementation for nongeneric interface
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator(); // Just return the generic version
}
}
(I assume C# here)
If you have a list of custom objects you can just use the foreach in the same way as you do with any other object:
List<MyObject> myObjects = // something
foreach(MyObject myObject in myObjects)
{
// Do something nifty here
}
If you want to create your own container you can use the yield keyword (from .NET 2.0 and upwards, I believe) together with the IEnumerable interface.
class MyContainer : IEnumerable<int>
{
private int max = 0;
public MyContainer(int max)
{
this.max = max;
}
public IEnumerator<int> GetEnumerator()
{
for(int i = 0; i < max; ++i)
yield return i;
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
And then use it with foreach:
MyContainer myContainer = new MyContainer(10);
foreach(int i in myContainer)
Console.WriteLine(i);
From MSDN Reference:
The foreach statement is not limited to IEnumerable types and can be applied to an instance of any type that satisfies the following conditions:
has the public parameterless GetEnumerator method whose return type is either class, struct, or interface type,
the return type of the GetEnumerator method has the public Current property and the public parameterless MoveNext method whose return type is Boolean.
If you declare those methods, you can use the foreach keyword without the IEnumerable overhead. To verify this, take this code snippet and see that it produces no compile-time error:
class Item
{
public Item Current { get; set; }
public bool MoveNext()
{
return false;
}
}
class Foreachable
{
Item[] items;
int index;
public Item GetEnumerator()
{
return items[index];
}
}
Foreachable foreachable = new Foreachable();
foreach (Item item in foreachable)
{
}
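The snippet above only demonstrates that the pattern compiles. For a version that also produces values at run time, here is a minimal sketch (Countdown and its nested Enumerator are hypothetical names) that implements neither IEnumerable nor IEnumerator:

```csharp
using System;

public class Countdown
{
    private readonly int _start;
    public Countdown(int start) { _start = start; }

    // foreach only needs a public parameterless GetEnumerator method.
    public Enumerator GetEnumerator() { return new Enumerator(_start); }

    public struct Enumerator
    {
        private int _next;
        private int _current;

        public Enumerator(int start) { _next = start; _current = 0; }

        public int Current { get { return _current; } }

        public bool MoveNext()
        {
            if (_next <= 0) return false;
            _current = _next--;
            return true;
        }
    }
}

public static class PatternDemo
{
    public static string Run()
    {
        var text = "";
        foreach (int n in new Countdown(3))
            text += n;
        return text;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // 321
    }
}
```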