Running a simple LINQ query in parallel - c#

I'm still very new to LINQ and PLINQ. I generally just use loops and List.BinarySearch in a lot of cases, but I'm trying to get out of that mindset where I can.
public class Staff
{
// ...
public bool Matches(string searchString)
{
// ...
}
}
Using "normal" LINQ - sorry, I'm unfamiliar with the terminology - I can do the following:
var matchedStaff = from s
in allStaff
where s.Matches(searchString)
select s;
But I'd like to do this in parallel:
var matchedStaff = allStaff.AsParallel().Select(s => s.Matches(searchString));
When I check the type of matchedStaff, it's a list of bools, which isn't what I want.
First of all, what am I doing wrong here, and secondly, how do I return a List<Staff> from this query?
public List<Staff> Search(string searchString)
{
return allStaff.AsParallel().Select(/* something */).AsEnumerable();
}
returns IEnumerable<type>, not List<type>.

For your first question, you should just replace Select with Where :
var matchedStaff = allStaff.AsParallel().Where(s => s.Matches(searchString));
Select is a projection operator, not a filtering one, that's why you are getting an IEnumerable<bool> corresponding to the projection of all your Staff objects from the input sequence to bools returned by your Matches method call.
I understand it can be counter intuitive for you not to use select at all as it seems you are more familiar with the "query syntax" where select keyword is mandatory which is not the case using the "lambda syntax" (or "fluent syntax" ... whatever the naming), but that's how it is ;)
Projections operators, such a Select, are taking as input an element from the sequence and transform/projects this element somehow to another type of element (here projecting to bool type). Whereas filtering operators, such as Where, are taking as input an element from the sequence and either output the element as such in the output sequence or are not outputing the element at all, based on a predicate.
As for your second question, AsEnumerable returns an IEnumerable as it's name indicates ;)
If you want to get a List<Staff> you should rather call ToList() (as it's name indicates ;)) :
return allStaff.AsParallel().Select(/* something */).ToList();
Hope this helps.

There is no need to abandon normal LINQ syntax to achieve parallelism. You can rewrite your original query:
var matchedStaff = from s in allStaff
where s.Matches(searchString)
select s;
The parallel LINQ (“PLINQ”) version would be:
var matchedStaff = from s in allStaff.AsParallel()
where s.Matches(searchString)
select s;
To understand where the bools are coming from, when you write the following:
var matchedStaff = allStaff.AsParallel().Select(s => s.Matches(searchString));
That is equivalent to the following query syntax:
var matchedStaff = from s in allStaff.AsParallel() select s.Matches(searchString);
As stated by darkey, if you want to use the C# syntax instead of the query syntax, you should use Where():
var matchedStaff = allStaff.AsParallel().Where(s => s.Matches(searchString));

Related

How to query an array of objects in C#

I want to query an array of objects "sortedData", where each object has two values (ItemId, Sort), for a specific ItemId and set the 'Sort' value. Like this below but this isn't the correct linq syntax.
var sortedData = db.Fetch<object>("SELECT ItemId, Sort FROM CollectionItems WHERE CollectionId = #0", collectionId);
dataWithSort = db.Fetch<OrganizationForExportWithSort>(TpShared.DAL.StoredProcedures.GetOrganizationsForTargetListUI(clientId, organizationIdList));
foreach(OrganizationForExportWithSort export in dataWithSort)
{
export.Sort = sortedData.Select("Sort").Where(sortedData.ItemId == export.Id);
}
As I understand it, you want the Sort property from the item that matches that particular ID. This being the case, you have a few problems with what you've written:
"Where" and "Select" both take Lambda expressions, not property names and expressions, so the code snippet you provide shouldn't compile.
"Where" and "Select" both return collections (even if there's only one item that actually matches the "Where" filter; in fact, even if no items in the collection match the condition in the "Where" clause it'll still return a collection, albeit an empty one). Think of LINQ Select more in terms of running a transform on a collection and LINQ "Where" as applying a filter to one.
As a general rule for LINQ queries, if possible, you should actually run "where" before "select" (filter first, then apply some kind of transform to the remaining items).
In this case, I think you actually just want one item, so you can actually use "FirstOrDefault" instead of "Where." This will leave you with a single .NET object. This is analogous to the TOP 1 restriction in SQL. Once you have the .NET object you can retrieve the property from the object itself.
Try this:
foreach(OrganizationForExportWithSort export in dataWithSort)
{
export.Sort = sortedData.FirstOrDefault(data => data.ItemId == export.Id)?.Sort;
}
The "?" is a new C# feature that will try to call .Sort on the object if (and only if) the query succeeded in finding an item with that ID. If it doesn't it'll just return null.
Have you tried Linq sorting?
var sortedData = db.Fetch<object>("SELECT ItemId, Sort FROM CollectionItems WHERE CollectionId = #0", collectionId);
dataWithSort = db.Fetch<OrganizationForExportWithSort>(TpShared.DAL.StoredProcedures.GetOrganizationsForTargetListUI(clientId, organizationIdList));
// create a list ordered by fields
var sorted = dataWithSort.OrderBy(o => o.SomeField).ThenBy(o => o.OtherField);
The o in the lambda stands for object...
I will add my voice to the chorus of folks saying to read up on some good linq tutorials. Start Here

Buffering an expression via property in LINQ (vs Lambda)

I have a large select expression to reuse in several classes. For the DRY principle I have chosen to create a property that returns the Expression to the caller code
protected virtual Expression<Func<SezioneJoin, QueryRow>> Select
{
get
{
return sj => new QueryRow
{
A01 = sj.A.A01,
A01a = sj.A.A01a,
A01b = sj.A.A01b,
A02 = sj.A.A02,
A03 = sj.A.A03,
A11 = sj.A.A11,
A12 = sj.A.A12,
A12a = sj.A.A12a,
A12b = sj.A.A12b,
A12c = sj.A.A12c,
A21 = sj.A.A21,
A22 = sj.A.A22,
..............
Lots of assignements
};
}
}
Now I can successfully use the property if I do
var query = dataContext.entity.Join(...).Where(x => ...).Select(Select);
But the following will not compile:
from SezioneJoin sj in (
from A a in ...
join D d in ... on new { ... } equals new { ... }
where
d.D13 == "086" &&
!String.IsNullOrEmpty(a.A32) && a.A32 != "086"
orderby a.A21
orderby a.prog
select new SezioneJoin{...})
select Select
Error is
Unable to cast 'System.Linq.IQueryable<System.Linq.Expressions.Expression<System.Func<DiagnosticoSite.Data.Query.SezioneJoin,DiagnosticoSite.Data.Query.QueryRow>>>' into 'System.Linq.IQueryable<DiagnosticoSite.Data.Query.QueryRow>'
I can understand that the LINQ syntax requires the body of the select statement to be the inner type of the IQueryable that it returns, so the compiler is fooled into returning a list of expressions. With the Lambda syntax, the expression is a parameter that is either compiled in-line or returned by some other method (even dynamically!).
I would like to ask if there is any way to circumvent this and avoid defining large select expressions inline
protected virtual Expression> Select
I'd avoid using the names of any of the Linq-mapped methods (Select, Where, GroupBy, OrderBy, OrderByDescending) as member names. It works in this case, but when it causes problems by matching the definitions for those it can be confusing if you aren't in the habit of just not using those names unless you deliberately want to override Linq.
On a related note. Consider that:
from var item in source select item.Something
is equivalent to:
source.Select(item => item.Something);
Therefore:
from SezioneJoin sj in (/*…*/) select Select;
is equivalent to:
(/*…*/).Select(sj => Select);
That is, you arent' creating a query that executes the expression in Select, but one that returns the expression itself.
You should either just use the form .Select(Select) or use select sj => (Select)(sj) but that second one will (if I even have the parentheses correct to stop it clashing with Queryable.Select, I haven't tested that) call the Select property every time so is at best wasteful and at worse not going to be something a query provider can make use of, so it will fail with most linq-providers. In all, use the .Select(Select) form (and change the name).
(On a separate note, if you're going to buffer an expression, actually buffer it; create a private Expression<Func<SezioneJoin, QueryRow>> once and return it in the property's getter, rather than creating it every time).
Simply use extension method in place of last LINQ select statement:
var query = from SezioneJoin sj ... select new SezioneJoin{...});
var projection = query.Select(Select);

What's the point of lambda expression in OrderBy?

I have a list of strings and I'd like to order them.
IEnumerable<String> strings = ...;
strings = strings.OrderBy(a => a);
What I don't get is the point of the lambda expression a => a in there. First I thought that I can pull out a property and order at the same like like this.
IEnumerable<Something> somethings = ...;
IEnumerable<String> strings = somethings.OrderBy(a => a.StringProperty);
But that doesn't compile. So I'll have to go like this.
IEnumerable<Something> somethings = ...;
IEnumerable<String> strings = somethings.Select(a
=> a.StringProperty).OrderBy(a => a);
So why am I enforced to use the lambda expression in the OrderBy command?!
The lambda indicates the "what you want to order by".
If you take a set of people, and order them by their birthday, you still have a set of people - not a set of birthdays; i.e.
IEnumerable<Person> people = ...;
IEnumerable<Person> sorted = people.OrderBy(a => a.DateOfBirth);
so similarly, ordering a set of Somethings by StringProperty still results in a set of Somethings:
IEnumerable<Something> somethings = ...;
IEnumerable<Something> sorted = somethings.OrderBy(a => a.StringProperty);
In some (very few) cases, you do actually mean "and order it by the thing itself". This usually applies only to things like IEnumerable<string> or IEnumerable<int> - so the minor inconvenience of .OrderBy(x => x) is trivial. If it bothers you, you could always write an extension method to hide this detail.
When you order a collection it doesn't change it's type, hence
IEnumerable<Something> somethings = ...;
var strings = somethings.OrderBy(a => a.StringProperty);
results in an IEnumerable<Something>, you have to select the property to change the type:
IEnumerable<String> strings = somethings
.OrderBy(s => s.StringProperty)
.Select(s => s.StringProperty);
So why am I enforced to use the lambda expression in the OrderBy
command?!
Because Enumerable.OrderBy is a method that needs an argument.
Because you're not selecting it, you're ordering by it.
Try this:
Console.WriteLine(string.Join(", ",
new[] { new { Int = 1 }, new { Int = 2 }, new { Int = 0 }
.OrderBy(a => a.Int));
This will give you the lists, ordered by the Int property, not just randomly ordered!
This means that you can order by any property of the object, instead of just the object itself.
the structure of the .OrderBy(TSource, TKey) method has a requirement for both the Source item and the item to sort by. the lambda is saying "Order TSource using TKey", or in your case, "Order a using a"
The purpose of the parameter lambda in OrderBy is precisely tell the criteria using for ordering. It takes an object you're sorting, and returns another "thing" (same type or not) which will be sorted, sort of extracting a key to be sorted from the original source.
Your first sample is really trivial, and your rant is somewhat justified there, since if you start from a list of strings, you most likely will want to sort by those strings precisely. Which makes me wonder too, why we can't have a parameterless OrderBy for those trivial cases.
For the second snippet:
IEnumerable<Something> somethings = ...;
IEnumerable<Something> strings = somethings.OrderBy(a => a.StringProperty);
Here is when the "sorting criteria" makes sense, as you order the objects by some property value derived from them, and not just for the objects themselves (which generally aren't comparable). The reason it doesn't compiles is in the second enumerable declaration, it should be an IEnumerable<Something> instead of IEnumerable<string>, because the ordering will return another list of the very same type as it received, but in a different order, regardless of sorting criteria.
In the third snippet, you solve that by Selecting the string property, that effectively yields a list of strings, but you lose all the input objects in the process. The lambda parameter is more or less pointless and trivial here, as you're starting from a plain string to begin with, the very same as the first sample.
Another way to use it would be to specify some different sorting criteria other than the trivial for strings. Say you want to sort not alphabetically, but by the third letter instead:
IEnumerable<String> strings = ...;
strings = strings.OrderBy(a => a.Substring(2, 1));

How to insert a lambda into a Linq declarative query expression

Let's say you have the following code:
string encoded="9,8,5,4,9";
// Parse the encoded string into a collection of numbers
var nums=from string s in encoded.Split(',')
select int.Parse(s);
That's easy, but what if I want to apply a lambda expression to s in the select, but still keep this as a declarative query expression, in other words:
string encoded="9,8,5,4,9";
// Parse the encoded string into a collection of numbers
var nums=from string s in encoded.Split(',')
select (s => {/* do something more complex with s and return an int */});
This of course does not compile. But, how can I get a lambda in there without switching this to fluent syntax.
Update: Thanks to guidance from StriplingWarrior, I have a convoluted but compilable solution:
var result=from string s in test.Split(',')
select ((Func<int>)
(() => {string u="1"+s+"2"; return int.Parse(u);}))();
The key is in the cast to a Func<string,int> followed by evaluation of the lambda for each iteration of the select with (s). Can anyone come up with anything simpler (i.e., without the cast to Func followed by its evaluation or perhaps something less verbose that achieves the same end result while maintaining the query expression syntax)?
Note: The lambda content above is trivial and exemplary in nature. Please don't change it.
Update 2: Yes, it's me, crazy Mike, back with an alternate (prettier?) solution to this:
public static class Lambda
{
public static U Wrap<U>(Func<U> f)
{
return f();
}
}
...
// Then in some function, in some class, in a galaxy far far away:
// Look what we can do with no casts
var res=from string s in test.Split(',')
select Lambda.Wrap(() => {string u="1"+s+"2"; return int.Parse(u);});
I think this solves the problem without the ugly cast and parenarrhea. Is something like the Lambda.Wrap generic method already present somewhere in the .NET 4.0 Framework, so that I do not have to reinvent the wheel? Not to overburden this discussion, I have moved this point into its own question: Does this "Wrap" generic method exist in .NET 4.0.
Assuming you're using LINQ to Objects, you could just use a helper method:
select DoSomethingComplex(s)
If you don't like methods, you could use a Func:
Func<string, string> f = s => { Console.WriteLine(s); return s; };
var q = from string s in new[]{"1","2"}
select f(s);
Or if you're completely hell-bent on putting it inline, you could do something like this:
from string s in new[]{"1","2"}
select ((Func<string>)(() => { Console.WriteLine(s); return s; }))()
You could simply do:
var nums = from string s in encoded.Split(',')
select (s => { DoSomething(); return aValueBasedOnS; });
The return tells the compiler the type of the resulting collection.
How about this:
var nums= (from string s in encoded.Split(',') select s).Select( W => ...);
Can anyone come up with anything
simpler?
Yes. First, you could rewrite it like this
var result = from s in encoded.Split(',')
select ((Func<int>)(() => int.Parse("1" + s + "2")))();
However, that's not really readable, particularly for a query expression. For this particular query and projection, the let keyword could be used.
var result = from s in encoded.Split(',')
let t = "1" + s + "2"
select int.Parse(t);
IEnumerable integers = encoded.Split(',').Select(s => int.Parse(s));
Edit:
IEnumerable<int> integers = from s in encoded.Split(',') select int.Parse(string.Format("1{0}2",s));

C# - AsEnumerable Example

What is the exact use of AsEnumerable? Will it change non-enumerable collection to enumerable
collection?.Please give me a simple example.
From the "Remarks" section of the MSDN documentation:
The AsEnumerable<TSource> method has no effect
other than to change the compile-time
type of source from a type that
implements IEnumerable<T> to
IEnumerable<T> itself.
AsEnumerable<TSource> can be used to choose
between query implementations when a
sequence implements IEnumerable<T> but also has a different set
of public query methods available. For
example, given a generic class Table
that implements IEnumerable<T> and has its own methods such
as Where, Select, and SelectMany, a
call to Where would invoke the public
Where method of Table. A Table type
that represents a database table could
have a Where method that takes the
predicate argument as an expression
tree and converts the tree to SQL for
remote execution. If remote execution
is not desired, for example because
the predicate invokes a local method,
the AsEnumerable<TSource>
method can be used to hide the custom
methods and instead make the standard
query operators available.
If you take a look in reflector:
public static IEnumerable<TSource> AsEnumerable<TSource>(this IEnumerable<TSource> source)
{
return source;
}
It basically does nothing more than down casting something that implements IEnumerable.
Nobody has mentioned this for some reason, but observe that something.AsEnumerable() is equivalent to (IEnumerable<TSomething>) something. The difference is that the cast requires the type of the elements to be specified explicitly, which is, of course, inconvenient. For me, that's the main reason to use AsEnumerable() instead of the cast.
AsEnumerable() converts an array (or list, or collection) into an IEnumerable<T> of the collection.
See http://msdn.microsoft.com/en-us/library/bb335435.aspx for more information.
From the above article:
The AsEnumerable<TSource>(IEnumerable<TSource>) method has no
effect other than to change the compile-time type of source from a type
that implements IEnumerable<T> to IEnumerable<T> itself.
After reading the answers, i guess you are still missing a practical example.
I use this to enable me to use linq on a datatable
var mySelect = from table in myDataSet.Tables[0].AsEnumerable()
where table["myColumn"].ToString() == "Some text"
select table;
AsEnumerable can only be used on enumerable collections. It just changes the type of the collection to IEnumerable<T> to access more easily the IEnumerable extensions.
No it doesn't change a non-enumerable collection to an enumerable one. What is does it return the collection back to you as an IEnumerable so that you can use it as an enumerable. That way you can use the object in conjunction with IEnumerable extensions and be treated as such.
Here's example code which may illustrate LukeH's correct explanation.
IEnumerable<Order> orderQuery = dataContext.Orders
.Where(o => o.Customer.Name == "Bob")
.AsEnumerable()
.Where(o => MyFancyFilterMethod(o, MyFancyObject));
The first Where is Queryable.Where, which is translated into sql and run in the database (o.Customer is not loaded into memory).
The second Where is Enumerable.Where, which calls an in-memory method with an instance of something I don't want to send into the database.
Without the AsEnumerable method, I'd have to write it like this:
IEnumerable<Order> orderQuery =
((IEnumerable<Order>)
(dataContext.Orders.Where(o => o.Customer.Name == "Bob")))
.Where(o => MyFancyFilterMethod(o, MyFancyObject));
Or
IEnumerable<Order> orderQuery =
Enumerable.Where(
dataContext.Orders.Where(o => o.Customer.Name == "Bob"),
(o => MyFancyFilterMethod(o, MyFancyObject));
Neither of which flow well at all.
static void Main()
{
/*
"AsEnumerable" purpose is to cast an IQueryable<T> sequence to IEnumerable<T>,
forcing the remainder of the query to execute locally instead of on database as below example so it can hurt performance. (bind Enumerable operators instead of Queryable).
In below example we have cars table in SQL Server and are going to filter red cars and filter equipment with some regex:
*/
Regex wordCounter = new Regex(#"\w");
var query = dataContext.Cars.Where(car=> article.Color == "red" && wordCounter.Matches(car.Equipment).Count < 10);
/*
SQL Server doesn’t support regular expressions therefore the LINQ-to-db providers will throw an exception: query cannot be translated to SQL.
TO solve this firstly we can get all cars with red color using a LINQ to SQL query,
and secondly filtering locally for Equipment of less than 10 words:
*/
Regex wordCounter = new Regex(#"\w");
IEnumerable<Car> sqlQuery = dataContext.Cars
.Where(car => car.Color == "red");
IEnumerable<Car> localQuery = sqlQuery
.Where(car => wordCounter.Matches(car.Equipment).Count < 10);
/*
Because sqlQuery is of type IEnumerable<Car>, the second query binds to the local query operators,
therefore that part of the filtering is run on the client.
With AsEnumerable, we can do the same in a single query:
*/
Regex wordCounter = new Regex(#"\w");
var query = dataContext.Cars
.Where(car => car.Color == "red")
.AsEnumerable()
.Where(car => wordCounter.Matches(car.Equipment).Count < 10);
/*
An alternative to calling AsEnumerable is ToArray or ToList.
*/
}
The Enumerable.AsEnumerable method can be used to hide a type's custom implementation of a standard query operator
Consider the following example. we have a custom List called MyList
public class MyList<T> : List<T>
{
public string Where()
{
return $"This is the first element {this[0]}";
}
}
MyList has a method called Where which is Enumerable.Where() exact same name. when I use it, actually I am calling my version of Where, not Enumerable's version
MyList<int> list = new MyList<int>();
list.Add(4);
list.Add(2);
list.Add(7);
string result = list.Where();
// the result is "This is the first element 4"
Now how can I find the elements which are less than 5 with the Enumerable's version of Where?
The answer is: Use AsEnumerable() method and then call Where
IEnumerable<int> result = list.AsEnumerable().Where(e => e < 5);
This time the result contains the list of elements that are less than 5

Categories