Efficiency of creating delegate instance inline of LINQ query? - c#

Following are two examples that do the same thing in different ways. I'm comparing them.
Version 1
For the sake of an example, define any method to create and return an ExpandoObject from an XElement based on business logic:
var ToExpando = new Func<XElement, ExpandoObject>(xClient =>
{
dynamic o = new ExpandoObject();
o.OnlineDetails = new ExpandoObject();
o.OnlineDetails.Password = xClient.Element(XKey.onlineDetails).Element(XKey.password).Value;
o.OnlineDetails.Roles = xClient.Element(XKey.onlineDetails).Element(XKey.roles).Elements(XKey.roleId).Select(xroleid => xroleid.Value);
// More fields TBD.
return o;
});
Call the above delegate from a LINQ to XML query:
var qClients =
from client in xdoc.Root.Element(XKey.clients).Elements(XKey.client)
select ToExpando(client);
Version 2
Do it all in the LINQ query, including creation and call to Func delegate.
var qClients =
from client in xdoc.Root.Element(XKey.clients).Elements(XKey.client)
select (new Func<ExpandoObject>(() =>
{
dynamic o = new ExpandoObject();
o.OnlineDetails = new ExpandoObject();
o.OnlineDetails.Password = client.Element(XKey.onlineDetails).Element(XKey.password).Value;
o.OnlineDetails.Roles = client.Element(XKey.onlineDetails).Element(XKey.roles).Elements(XKey.roleId).Select(xroleid => xroleid.Value);
// More fields TBD.
return o;
}))();
Considering delegate creation is in the select part, is Version 2 inefficient? Is it managed or optimized by either the C# compiler or runtime so it won't matter?
I like Version 2 for its tightness (keeping the object creation logic in the query), but am aware it might not be viable depending on what the compiler or runtime does.

The latter approach looks pretty horrible to me. I believe it will have to genuinely create a new delegate each time as you're capturing a different client each time, but personally I wouldn't do it that way at all. Given that you've got real statements in there, why not write a normal method?
private static ExpandoObject ToExpando(XElement client)
{
// Possibly use an object initializer instead?
dynamic o = new ExpandoObject();
o.OnlineDetails = new ExpandoObject();
o.OnlineDetails.Password = client.Element(XKey.onlineDetails)
.Element(XKey.password).Value;
o.OnlineDetails.Roles = client.Element(XKey.onlineDetails)
.Element(XKey.roles)
.Elements(XKey.roleId)
.Select(xroleid => xroleid.Value);
return o;
}
and then query it with:
var qClients = xdoc.Root.Element(XKey.clients)
.Elements(XKey.client)
.Select(ToExpando);
I would be much more concerned about the readability of the code than the performance of creating delegates, which is generally pretty quick. I don't think there's any need to use nearly as many lambdas as you currently seem keen to do. Think about when you come back to this code in a year's time. Are you really going to find the nested lambda easier to understand than a method?
(By the way, separating the conversion logic into a method makes that easy to test in isolation...)
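As a rough illustration of the testing point, here is a minimal sketch of the method together with a stand-in for XKey. The XKey class and its element names are assumptions (the question never shows them), and ClientConverter is just a hypothetical home for the method.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical stand-in for the question's XKey class; the real element names are not shown.
static class XKey
{
    public static readonly XName onlineDetails = "onlineDetails";
    public static readonly XName password = "password";
    public static readonly XName roles = "roles";
    public static readonly XName roleId = "roleId";
}

static class ClientConverter
{
    // Same body as the method above, with an explicit return type.
    public static ExpandoObject ToExpando(XElement client)
    {
        dynamic o = new ExpandoObject();
        o.OnlineDetails = new ExpandoObject();
        o.OnlineDetails.Password = client.Element(XKey.onlineDetails)
                                         .Element(XKey.password).Value;
        o.OnlineDetails.Roles = client.Element(XKey.onlineDetails)
                                      .Element(XKey.roles)
                                      .Elements(XKey.roleId)
                                      .Select(xroleid => xroleid.Value);
        return o;
    }
}
```

A test can then build a single client element with XElement.Parse, pass it to ClientConverter.ToExpando, and check OnlineDetails.Password and OnlineDetails.Roles without needing a whole document or a query.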
EDIT: Even if you do want to do it all in the LINQ expression, why are you so keen to create another level of indirection? Just because query expressions don't allow statement lambdas? Given that you're doing nothing but a simple select, that's easy enough to cope with:
var qClients = xdoc.Root
.Element(XKey.clients)
.Elements(XKey.client)
.Select(client => {
dynamic o = new ExpandoObject();
o.OnlineDetails = new ExpandoObject();
o.OnlineDetails.Password = client.Element(XKey.onlineDetails)
.Element(XKey.password).Value;
o.OnlineDetails.Roles = client.Element(XKey.onlineDetails)
.Element(XKey.roles)
.Elements(XKey.roleId)
.Select(xroleid => xroleid.Value);
return o;
});

It is true that your second version creates a new Func instance repeatedly - however, this just means allocating a small object (the closure) and a delegate that points to a function. I don't think this is a large overhead compared to the dynamic lookups that you need to perform in the body of the delegate (to work with dynamic objects).
Alternatively, you could declare a local lambda function like this:
Func<XElement, ExpandoObject> convert = client => {
dynamic o = new ExpandoObject();
o.OnlineDetails = new ExpandoObject();
o.OnlineDetails.Password =
client.Element(XKey.onlineDetails).Element(XKey.password).Value;
o.OnlineDetails.Roles = client.Element(XKey.onlineDetails).
Element(XKey.roles).Elements(XKey.roleId).
Select(xroleid => xroleid.Value);
// More fields TBD.
return o;
};
var qClients =
from client in xdoc.Root.Element(XKey.clients).Elements(XKey.client)
select convert(client);
This way, you can create just a single delegate, but keep the code that does the conversion close to the code that implements the query.
Another option would be to use anonymous types instead - what are the reasons for using ExpandoObject in your scenario? The only limitation of anonymous types would be that you may not be able to access them from other assemblies (they are internal), but working with them using dynamic should be fine...
Your select could look like:
select new { OnlineDetails = new { Password = ..., Roles = ... }}
Finally, you could also use Reflection to convert anonymous type to ExpandoObject, but that would probably be even more inefficient (i.e. very difficult to write efficiently)
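For what it's worth, a naive version of that reflection-based conversion might look like the sketch below. ExpandoConverter is a hypothetical helper name, the copy is shallow (nested anonymous objects are stored as-is rather than converted), and it calls GetProperties on every invocation, which is exactly the kind of cost hinted at above.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;
using System.Reflection;

// Hypothetical helper: copies the public properties of any object into an ExpandoObject.
static class ExpandoConverter
{
    public static ExpandoObject ToExpando(object source)
    {
        var expando = new ExpandoObject();
        // ExpandoObject implements IDictionary<string, object>, which allows adding members by name.
        var members = (IDictionary<string, object>)expando;
        foreach (PropertyInfo property in source.GetType().GetProperties())
        {
            // Shallow copy: nested objects are stored as-is, not converted recursively.
            members[property.Name] = property.GetValue(source, null);
        }
        return expando;
    }
}
```

Calling ExpandoConverter.ToExpando(new { Name = "Nathan", Id = 1 }) gives an object whose Name and Id members can be read through dynamic.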

Related

Serialize dynamic Dapper result to CSV

I'm trying to serialize a dynamic Dapper result to CSV using ServiceStack.Text, but I'm getting a collection of line breaks. According to ServiceStack.Text, it can handle both anonymous and IDictionary<string, object> types.
using (var conn = new SqlConnection(...))
{
var data = conn.Query("select * from data");
var output = CsvSerializer.SerializeToCsv(data);
Console.WriteLine(output);
Console.Read();
}
When I use the same type, it works.
IEnumerable<dynamic> list = new List<dynamic>
{
new
{
Name = "Nathan",
Id = 1,
Created = DateTime.UtcNow
}
};
Console.WriteLine(CsvSerializer.SerializeToString(list));
Console.Read();
What am I missing about Dapper's return type?
I know I can solve this by projecting onto a model class, but the beauty of my approach lies in the use of dynamics. Do I have any options?
Dapper's Query method returns IEnumerable<dynamic>, which is basically: IEnumerable<object> - where each row happens to implement IDictionary<string,object>. I wonder whether SS is looking at the T in IEnumerable<T>: in which case, yeah, that won't work well. You could try:
var typed = data.Select(x => (IDictionary<string,object>)x);
var output = CsvSerializer.SerializeToCsv(typed);
This does the cast in the projection, so that if SerializeToCsv is a generic method, it'll know about the interface support.
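To see why the cast can matter to a generic method, here is a small sketch that needs neither Dapper nor ServiceStack. RowTypeDemo, ElementTypeSeen and FakeRows are made-up names, and an ExpandoObject stands in for a Dapper row only because it also implements IDictionary<string, object>.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;
using System.Linq;

static class RowTypeDemo
{
    // Reports the T that a generic method is bound to for a given sequence.
    public static Type ElementTypeSeen<T>(IEnumerable<T> rows) => typeof(T);

    // Stand-in for a dynamic query result (an assumption, not Dapper's actual row type).
    public static IEnumerable<dynamic> FakeRows()
    {
        dynamic row = new ExpandoObject();
        row.Id = 1;
        row.Name = "Nathan";
        return new List<dynamic> { row };
    }

    public static Type BeforeCast() => ElementTypeSeen(FakeRows());

    public static Type AfterCast() =>
        ElementTypeSeen(FakeRows().Select(x => (IDictionary<string, object>)x));
}
```

BeforeCast() yields typeof(object), because dynamic is just object as far as the generic type argument is concerned, while AfterCast() yields typeof(IDictionary<string, object>). A serializer that inspects T therefore only learns about the dictionary interface in the second case.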
Dapper does not return anonymous types.
Dynamic generic IDictionary results, like those Dapper returns, should now be supported as of this commit.
This change is available from v4.0.43+ that's now available on MyGet.

What is the specificity of var?

So, as far as a question to a real problem goes, this probably isn't a very good question, but it's bugging me and I can't find an answer, so I consider that to be a problem.
What is the specificity of var? The MSDN reference on it states the following:
An implicitly typed local variable is strongly typed just as if you had declared the type yourself
But it doesn't seem to say anywhere what type it is strongly typed for. For example, if I have the following:
var x = new Tree();
But I then don't call any methods of Tree, is x still strongly typed to Tree? Or could I have something like the following?
var x = new Tree();
x = new Object();
I'm guessing this isn't allowed, but I don't have access to a compiler right now, and I'm really wondering if there are any caveats that allow unexpected behaviour like the above.
It's strongly typed to the type of the expression on the right side:
The var keyword instructs the compiler to infer the type of the variable from the expression on the right side of the initialization statement.
From here.
It's tied to the type on the right-side of the equals-sign, so in this case, it is equivalent to:
Tree x = new Tree();
Regardless of whatever interfaces or base classes are tied to Tree. If you need x to be of a less specific type (a base class or an interface), you have to declare it specifically, like:
Plant x = new Tree();
// or
IHasLeaves x = new Tree();
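One way to observe the compile-time type is to pass the variable to a generic method and look at typeof(T). The sketch below does that; Tree, Plant and IHasLeaves are empty placeholder types and VarDemo is a made-up name.

```csharp
using System;

interface IHasLeaves { }
class Plant { }
class Tree : Plant, IHasLeaves { }

static class VarDemo
{
    // typeof(T) reflects the static (compile-time) type of the argument, not its runtime type.
    public static Type StaticTypeOf<T>(T value) => typeof(T);

    public static Type[] Run()
    {
        var a = new Tree();         // inferred as Tree
        Plant b = new Tree();       // declared as Plant
        IHasLeaves c = new Tree();  // declared as IHasLeaves
        return new[] { StaticTypeOf(a), StaticTypeOf(b), StaticTypeOf(c) };
    }
}
```

Run() returns Tree, Plant and IHasLeaves in that order, even though all three variables refer to Tree instances at run time.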
Yes, in your example x is strongly typed to Tree just as if you had declared the type yourself.
Your second example would not compile because an Object cannot be implicitly converted to Tree.
No, it is exactly the same as if you had typed Tree x = new Tree();. Obviously the only unambiguous inference the compiler can do is the exact type of the right hand side expression, so it won't suddenly become ITree x
So this doesn't work:
Tree x = new Tree();
x = new Object(); //cannot convert implicitly
If you are curious, the dynamic is closer to the behavior you expect.
dynamic x = new Tree();
x = new Object();
In the example:
var x = new Tree();
is the same as
Tree x = new Tree();
I've found it is always better to use "var" since it facilitates code re-factoring.
Also, adding,
var x = new Object();
in the same scope would break compilation due to the fact that you cannot declare a variable twice.
var is neither a type nor does it make the variable something special. It tells the compiler to infer the type of the variable AT COMPILE TIME by analyzing the initialization expression on the right hand side of the assignment operator.
These two expressions are equivalent:
Tree t = new Tree();
and
var t = new Tree();
Personally I prefer to use var when the type name is mentioned explicitly on the right hand side or when the exact type is complicated and not really relevant as for results returned from LINQ queries. These LINQ results are often just intermediate results that are processed further:
var x = new Dictionary<string, List<int>>();
is easier to read than the following statement and yet very clear:
Dictionary<string, List<int>> x = new Dictionary<string, List<int>>();
var query = someSource
.Where(x => x.Name.StartsWith("A"))
.GroupBy(x => x.State)
.OrderBy(g => g.Key);
Here query is of type IOrderedEnumerable<IGrouping<string, SomeType>>. Who cares?
When the type name does not appear on the right hand side and is simple, then I prefer to write it explicitly as it doesn't simplify anything to use var:
int y = 7;
string s = "hello";
And of course, if you create anonymous types, you must use var because you have no type name:
var z = new { Name = "Coordinate", X = 5.343, Y = 76.04 };
The var keyword was introduced together with LINQ in order to simplify its use and to allow creating types on the fly, simulating the way you would work with SQL:
SELECT Name, Date FROM Person
var result = DB.Persons.Select(p => new { p.Name, p.Date });

Parallel.ForEach and IGrouping source item issue

I am trying to parallelize a query with a groupby statement in it. The query is similar to
var colletionByWeek = (
from item in objectCollection
group item by item.WeekStartDate into weekGroups
select weekGroups
).ToList();
If I use Parallel.ForEach with shared variable like below, it works fine. But I don't want to use shared variables in parallel query.
var pSummary=new List<object>();
Parallel.ForEach(colletionByWeek, week =>
{
pSummary.Add(new object()
{
p1 = week.First().someprop,
p2= week.key,
.....
});
}
);
So, I have changed the above parallel statement to use local variables. But the compiler complains that the source type IEnumerable<IGrouping<DateTime, object>> cannot be converted into System.Collections.Concurrent.OrderablePartitioner<IEnumerable<IGrouping<DateTime, object>>>.
Am I giving the wrong source type? Or is the IGrouping type handled differently? Any help would be appreciated. Thanks!
Parallel.ForEach<IEnumerable<IGrouping<DateTime, object>>, IEnumerable<object>>
(spotColletionByWeek,
() => new List<object>(),
(week, loop, summary) =>
{
summary.Add(new object()
{
p1 = week.First().someprop,
p2= week.key,
.....
});
return new List<object>();
},
(finalResult) => pSummary.AddRange(finalResult)
);
The type parameter TSource is the element type, not the collection type. And the second type parameter represents the local storage type, so it should be List<T>, if you want to Add() to it. This should work:
Parallel.ForEach<IGrouping<DateTime, object>, List<object>>
That's assuming you don't actually have objects there, but some specific type.
Although explicit type parameters are not even necessary here. The compiler should be able to infer them.
But there are other problems in the code:
you shouldn't return new List from the main delegate, but summary
the delegate that processes finalResult might be executed concurrently on multiple threads, so you should use locks or a concurrent collection there.
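Putting those points together, a corrected call might look roughly like the sketch below. Item, WeekSummary and WeekSummarizer are hypothetical types standing in for the ones elided in the question, and a plain lock is only one of the possible ways to protect the shared list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical element and result types; the question does not show the real ones.
class Item
{
    public DateTime WeekStartDate { get; set; }
    public string SomeProp { get; set; }
}

class WeekSummary
{
    public string P1 { get; set; }
    public DateTime P2 { get; set; }
}

static class WeekSummarizer
{
    public static List<WeekSummary> Summarize(IEnumerable<Item> items)
    {
        var collectionByWeek = items.GroupBy(item => item.WeekStartDate).ToList();
        var summary = new List<WeekSummary>();
        var gate = new object();

        // Type arguments are inferred: TSource = IGrouping<DateTime, Item>, TLocal = List<WeekSummary>.
        Parallel.ForEach(
            collectionByWeek,
            () => new List<WeekSummary>(),          // one local list per task
            (week, loopState, local) =>
            {
                local.Add(new WeekSummary { P1 = week.First().SomeProp, P2 = week.Key });
                return local;                       // hand the same local list to the next iteration
            },
            local =>
            {
                // May run concurrently on several threads, so guard the shared list.
                lock (gate) { summary.AddRange(local); }
            });

        return summary;
    }
}
```

The order of the resulting summaries is not guaranteed, since the local lists are merged in whatever order the tasks finish.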
I'm going to skip the 'Are you sure you even need to optimize this' stage, and assume you have a performance issue which you hope to solve by parallelizing.
First of all, you're not doing yourself any favors trying to use Parallel.ForEach for this task. I'm pretty sure you will get a more readable and more efficient result using PLINQ:
var random = new Random();
var weeks = new List<Week>();
for (int i=0; i<1000000; i++)
{
weeks.Add(
new Week {
WeekStartDate = DateTime.Now.Date.AddDays(7 * random.Next(0, 100))
});
}
var parallelCollectionByWeek =
(from item in weeks.AsParallel()
group item by item.WeekStartDate into weekGroups
select new
{
p1 = weekGroups.First().WeekStartDate,
p2 = weekGroups.Key,
}).ToList();
It's worth noting that there is some overhead associated with parallelizing the GroupBy operator, so the benefit will be marginal at best. (Some crude benchmarks hint at a 10-20% speed up)
Apart from that, the reason you're getting a compile error is because the first Type parameter is supposed to be an IGrouping<DateTime, object> and not an IE<IG<..,..>>.

Member by Member copy

In one of our applications we have a set of ORM objects and a set of business objects. Most of the time we're simply doing a member by member copy. Other times we process the data slightly. For instance:
tEmployee emp = new tEmployee();
emp.Name = obj.Name;
emp.LastName = obj.LastName;
emp.Age = obj.Age;
emp.LastEdited = obj.LastEdited.ToGMT();
Now this works just fine, and is rather fast, but not exactly terse when it comes to coding. Some of our objects have up to 40 members, so doing a copy like this can get rather tedious. Granted you only need 2 methods for the to/from conversion, but I'd like to find a better way to do this.
Reflection is a natural choice, but on a benchmark I found that execution time was about 100x slower when using reflection.
Is there a better way to go about this?
Clarification:
I'm converting from one type to another. In the above example obj is of type BLogicEmployee and emp is of type tEmployee. They share member names, but that is it.
You might want to check out AutoMapper.
If you don't mind it being a bit slow the first time you can compile a lambda expression:
public static class Copier<T>
{
private static readonly Action<T, T> _copier;
static Copier()
{
var x = Expression.Parameter(typeof(T), "x");
var y = Expression.Parameter(typeof(T), "y");
var expressions = new List<Expression>();
foreach (var property in typeof(T).GetProperties())
{
if (property.CanWrite)
{
var xProp = Expression.Property(x, property);
var yProp = Expression.Property(y, property);
expressions.Add(Expression.Assign(yProp, xProp));
}
}
var block = Expression.Block(expressions);
var lambda = Expression.Lambda<Action<T, T>>(block, x, y);
_copier = lambda.Compile();
}
public static void CopyTo(T from, T to)
{
_copier(from, to);
}
}
Reflection can be sped up an awful lot if you use delegates. Basically, you can create a pair of delegates for each getter/setter pair, and then execute those - it's likely to go very fast. Use Delegate.CreateDelegate to create a delegate given a MethodInfo etc. Alternatively, you can use expression trees.
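A minimal sketch of the delegate approach follows. PropertyDelegates and EmployeeCopier are made-up names, and the two employee classes are cut-down guesses at the types from the question. The reflection lookup happens once, when the static fields are initialized; after that each copy is just a handful of delegate calls.

```csharp
using System;
using System.Reflection;

static class PropertyDelegates
{
    // Builds an open-instance delegate for a property getter of a reference type.
    public static Func<TObj, TProp> Getter<TObj, TProp>(string propertyName) where TObj : class
    {
        MethodInfo getter = typeof(TObj).GetProperty(propertyName).GetGetMethod();
        return (Func<TObj, TProp>)Delegate.CreateDelegate(typeof(Func<TObj, TProp>), getter);
    }

    // Builds an open-instance delegate for a property setter of a reference type.
    public static Action<TObj, TProp> Setter<TObj, TProp>(string propertyName) where TObj : class
    {
        MethodInfo setter = typeof(TObj).GetProperty(propertyName).GetSetMethod();
        return (Action<TObj, TProp>)Delegate.CreateDelegate(typeof(Action<TObj, TProp>), setter);
    }
}

// Hypothetical, cut-down versions of the types mentioned in the question.
class BLogicEmployee { public string Name { get; set; } public int Age { get; set; } }
class tEmployee { public string Name { get; set; } public int Age { get; set; } }

static class EmployeeCopier
{
    // Delegates are created once and reused for every copy.
    static readonly Func<BLogicEmployee, string> GetName = PropertyDelegates.Getter<BLogicEmployee, string>("Name");
    static readonly Action<tEmployee, string> SetName = PropertyDelegates.Setter<tEmployee, string>("Name");
    static readonly Func<BLogicEmployee, int> GetAge = PropertyDelegates.Getter<BLogicEmployee, int>("Age");
    static readonly Action<tEmployee, int> SetAge = PropertyDelegates.Setter<tEmployee, int>("Age");

    public static tEmployee Copy(BLogicEmployee source)
    {
        var target = new tEmployee();
        SetName(target, GetName(source));
        SetAge(target, GetAge(source));
        return target;
    }
}
```

In a real mapper the getter/setter pairs would be discovered by matching property names across the two types rather than listed by hand; they are spelled out here only to keep the example short.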
If you're creating a new object, I already have a bunch of code to do this in MiscUtil. (It's in the MiscUtil.Reflection.PropertyCopy class.) That uses reflection for properties to copy into existing objects, but a delegate to convert objects into new ones. Obviously you can adapt it to your needs. I'm sure if I were writing it now I'd be able to avoid the reflection for copying using Delegate.CreateDelegate, but I'm not about to change it :)
Consider using AutoMapper. From its documentation:
.. AutoMapper works best as long as the names of the members match up to the source type's members. If you have a source member called "FirstName", this will automatically be mapped to a destination member with the name "FirstName".
This will save you a great deal of explicit mapping, and AutoMapper of course allows for the customization of particular mappings along the lines of:
Mapper.CreateMap<Model.User, Api.UserInfo>()
.ForMember(s => s.Address, opt => opt.Ignore())
.ForMember(s => s.Uri, opt => opt.MapFrom(c => HttpEndpoint.GetURI(c)));
Object.MemberwiseClone might be useful if all you need is a shallow clone. Not sure how well it performs though, and obviously any complex objects would need additional handling to ensure a proper copy.
See if you can use this. Note that the class must be marked Serializable for this to work.
public static T DeepClone<T>(T obj)
{
using (var ms = new MemoryStream())
{
var formatter = new BinaryFormatter();
formatter.Serialize(ms, obj);
ms.Position = 0;
return (T) formatter.Deserialize(ms);
}
}
Look at AutoMapper; it can automatically map your objects if your fields match...
http://automapper.codeplex.com/

C# Lists: How to copy elements from one list to another, but only certain properties

So I have a list of objects with a number of properties. Among these properties are name and id. Let's call this object ExtendedObject. I've also declared a new List of different objects that have only the properties name and id. Let's call this object BasicObject.
What I'd like to do is convert or copy (for lack of better words) the List of ExtendedObject objects to a list of BasicObject objects. I know C# Lists have a lot of interesting methods that can be useful, so I wondered if there were an easy way to say something to the effect of:
basicObjectList = extendedObjectList.SomeListMethod<BasicObject>(some condition here);
But I realize it may end up looking nothing like that. I also realize that I could just loop through the list of ExtendedObjects, create a new BasicObject from each ExtendedObject's name and id, and push it onto a list of BasicObjects. But I was hoping for something a little more elegant than that.
Does anyone have any ideas? Thanks very much.
It depends on exactly how you'd construct your BasicObject from an ExtendedObject, but you could probably use the ConvertAll method:
List<BasicObject> basicObjectList =
extendedObjectList.ConvertAll(x => new BasicObject
{
id = x.id,
name = x.name
});
Or, if you prefer, you could use the LINQ Select method and then convert back to a list:
List<BasicObject> basicObjectList =
extendedObjectList.Select(x => new BasicObject
{
id = x.id,
name = x.name
}).ToList();
If you are on .NET 3.5 or greater this could be done by using LINQ projections:
basicObjectList = extendedObjectList.Select(x => new BasicObject { Id=x.Id, Name=x.Name});
var basicObjectList = extendedObjectList.Select(eo => new BasicObject { name = eo.name, id = eo.id });
I think that the OP's suggestion of "BasicObject" was just a pseudonym for a resulting object with a specific subset of properties from the original set. Anonymous types are your friend (as indicated by @mumtaz).
Assuming the following extendedObjectList is of type IEnumerable<T> (including a List):
// "var" used so that the compiler infers the type automatically
var subset = extendedObjectList
// you can use any Linq based clause for filtering
.Where(a => <conditions>)
// where the compiler generates an anonymous type to describe your "BasicObject"
.Select(a => new { a.Property1, a.Property2, a.Property3 })
// conversion to a List collection of your anonymous type
.ToList();
At this point, subset contains a List of an anonymous (compiler-generated) type that contains three properties - Property1, Property2, Property3.
You can manipulate this resulting list as follows:
// execute an anonymous delegate (method) for each of the new anonymous objects
subset.ForEach
(
basicObject =>
{
Console.WriteLine("Property1 - {0}", basicObject.Property1);
Console.WriteLine("Property2 - {0}", basicObject.Property2);
Console.WriteLine("Property3 - {0}", basicObject.Property3);
}
);
// grab the first object off the list
var firstBasicObject = subset.First();
// sort the anonymously typed list
var sortedSubset = subset.OrderBy(a => a.Property1).ToList();
Once the compiler has generated the new type (from any combination of properties of the source object), you can use it virtually any way that you wish.
For Linq-to-Sql applications (using IQueryable<T>), the Select statement can be used to obtain specific column data (instead of the entire row), thereby creating an anonymous type to describe a subset of column data for a given row.
