I have a large, lazily yielded collection and I would like to get the distinct values of each property independently:
IEnumerable<MyClass> collection = ...;
var prop1Values = collection.Select(i => i.Prop1).Distinct();
var prop2Values = collection.Select(i => i.Prop2).Distinct();
var prop3Values = collection.Select(i => i.Prop3).Distinct();
How can I get them without enumerating the collection multiple times? I'm looking for the most intuitive solution :)
You can do it in a single foreach with the help of HashSet<T>s:
//TODO: put the right types for TypeOfProp1, TypeOfProp2, TypeOfProp3
var prop1Values = new HashSet<TypeOfProp1>();
var prop2Values = new HashSet<TypeOfProp2>();
var prop3Values = new HashSet<TypeOfProp3>();
foreach (var item in collection) {
prop1Values.Add(item.Prop1);
prop2Values.Add(item.Prop2);
prop3Values.Add(item.Prop3);
}
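As a rough check that the HashSet<T> approach really enumerates the source only once, here is a minimal sketch. The `Source` iterator, its tuple items, and the `enumerations` counter are made up for illustration; it assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical lazily yielded source; it counts how often it is enumerated.
int enumerations = 0;
IEnumerable<(int Prop1, string Prop2)> Source()
{
    enumerations++;
    yield return (1, "a");
    yield return (1, "b");
    yield return (2, "a");
}

var prop1Values = new HashSet<int>();
var prop2Values = new HashSet<string>();

// One pass: HashSet<T>.Add ignores values that are already present.
foreach (var item in Source())
{
    prop1Values.Add(item.Prop1);
    prop2Values.Add(item.Prop2);
}

Console.WriteLine($"{prop1Values.Count} {prop2Values.Count} {enumerations}"); // prints "2 2 1"
```

Each `Select(...).Distinct()` query, by contrast, would restart `Source()` when it is enumerated.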
modelData has 100,000 items in the list.
I am doing two "Selects" within two nested loops.
Could it be structured differently? It currently takes a long time, about 10 minutes.
public class ModelData
{
public string name;
public DateTime DT;
public int real;
public int trade;
public int position;
public int dayPnl;
}
List<ModelData> modelData;
var dates = modelData.Select(x => x.DT.Date).Distinct();
var names = modelData.Select(x => x.name).Distinct();
foreach (var aDate in dates)
{
var dateRealTrades = modelData.Select(x => x)
.Where(x => x.DT.Date.Equals(aDate) && x.real.Equals(1));
foreach (var aName in names)
{
var namesRealTrades = dateRealTrades.Select(x => x)
.Where(x => x.name.Equals(aName));
// DO MY PROCESSING
}
}
I believe what you want can be achieved with two queries using group by. One to create a lookup by the date and the other to give you the name-date grouped items.
var data = modelData.Where(x => x.real.Equals(1))
.GroupBy(x => new { x.DT.Date, x.name });
var byDate = modelData.Where(x => x.real.Equals(1))
.ToLookup(x => x.DT.Date);
foreach(var item in data)
{
var aDate = item.Key.Date;
var aName = item.Key.name;
var namesRealTrades = item.ToList();
var dateRealTrades = byDate[aDate].ToList();
// DO MY PROCESSING
}
The first query will give you items grouped by the name and date to iterate over and the second will give you a lookup to get all the items associated with a given date. The second uses a lookup so that the list is iterated once and gives you fast access to the resulting list of items.
This should greatly reduce the number of times you iterate over modelData from what you currently have.
You could rewrite your inner foreach loop like this:
foreach (var namesRealTrades in names.Select(aName => dateRealTrades.Where(x => x.name.Equals(aName))))
{
//DO STUFF
}
Depending on your data, this could reduce the number of queries you have to make.
Have you tried compiling your query, as suggested on the MSDN website?
When you have an application that executes structurally similar
queries many times, you can often increase performance by compiling
the query one time and executing it several times with different
parameters. For example, an application might have to retrieve all the
customers who are in a particular city, where the city is specified at
runtime by the user in a form. LINQ to SQL supports the use of
compiled queries for this purpose.
https://msdn.microsoft.com/en-us/library/bb399335(v=vs.110).aspx
A couple of things:
Use .ToList() to calculate a sequence once, so you can keep it for later.
Use .GroupBy() to avoid re-searching modelData for things you have already found.
// Collections of models having the same Date or Name.
var dates = modelData.GroupBy(x => x.DT.Date);
var names = modelData.GroupBy(x => x.name);
foreach (var modelsWithDate in dates)
{
var aDate = modelsWithDate.Key;
var dateRealTrades = modelsWithDate.Where(x => x.real == 1).ToList();
foreach (var modelsWithName in names)
{
var aName = modelsWithName.Key;
var namesRealTrades = modelsWithName.ToList();
// DO MY PROCESSING
}
}
There are two ways in which the code is inefficient.
names has deferred evaluation. Every time you iterate over it, it has to go through the whole data set to find all the distinct names again. You should save the result.
You find the distinct values in the collection, and then you go through the collection again for every distinct value to look for its occurrences. You should use grouping.
The rewritten code can look like this:
var dates = modelData.GroupBy(x => x.DT.Date);
var names = modelData.Select(x => x.name).Distinct().ToArray();
foreach (var date in dates)
{
var dateRealTrades = date.Where(x => x.real.Equals(1)).ToArray();
var namesRealTradesLookup = dateRealTrades.ToLookup(x => x.name);
foreach (var aName in names)
{
var namesRealTrades = namesRealTradesLookup[aName];
// DO MY PROCESSING
// var aDate = date.Key;
}
}
If you are not interested in date/name combinations with no real trades, it can be done in a much more straightforward way:
var realModelData = modelData.Where(x => x.real.Equals(1));
foreach (var dateRealTrades in realModelData.ToLookup(x => x.DT.Date))
{
foreach (var namesRealTrades in dateRealTrades.ToLookup(x => x.name))
{
// DO MY PROCESSING
//var aDate = dateRealTrades.Key;
//var aName = namesRealTrades.Key;
//foreach(var trade in namesRealTrades) { ...
//foreach(var trade in dateRealTrades) { ...
}
}
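To see what the nested ToLookup version yields, here is a small self-contained sketch. The sample rows are invented, and a tuple stands in for ModelData (only `name`, `DT` and `real` are used); it assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample rows standing in for ModelData (values are made up).
var modelData = new List<(string name, DateTime DT, int real)>
{
    ("A", new DateTime(2015, 1, 1, 9, 0, 0), 1),
    ("A", new DateTime(2015, 1, 1, 15, 0, 0), 1),
    ("B", new DateTime(2015, 1, 1, 10, 0, 0), 0),
    ("B", new DateTime(2015, 1, 2, 10, 0, 0), 1),
};

var counts = new List<string>();

// The source list is scanned once; each date group is then split by name.
foreach (var dateRealTrades in modelData.Where(x => x.real == 1).ToLookup(x => x.DT.Date))
{
    foreach (var namesRealTrades in dateRealTrades.ToLookup(x => x.name))
    {
        // Record: day of month, name, number of real trades in that group.
        counts.Add($"{dateRealTrades.Key.Day} {namesRealTrades.Key} {namesRealTrades.Count()}");
    }
}

var summary = string.Join("; ", counts);
Console.WriteLine(summary); // prints "1 A 2; 2 B 1"
```

Only date/name combinations that actually have a real trade show up, which is the trade-off mentioned above.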
I have a variable number of "filters" to apply to an entity collection, and these filters are stored in a List. Right now, I'm doing the following:
IQueryable<Item> items = SharedContext.Context.Items.GetAll();
//This list is dynamic, but shown static here for simplicity
IEnumerable<string> filterList = new List<string>(){"new", "old", "current"};
IEnumerable<Item> items_temp = new List<Item>();
foreach (string type in filterList)
{
var temp = items.Where(i => i.Type.ToLower().Trim() == type.ToLower().Trim());
items_temp = items_temp.Union(temp);
}
items = items_temp.AsQueryable();
Unfortunately, this causes a huge performance issue. I know somebody out there has a better solution... What do you guys think?
EDIT
Running my application with the code above takes around 30 seconds to execute, but if I do the following:
items.Where(item => item.Type.ToLower().Trim() == "new" ||
item.Type.ToLower().Trim() == "old" ||
item.Type.ToLower().Trim() == "current");
my application executes in about 4 seconds. Can anyone think of a solution that can match this performance or at least fill me in on why the results are so drastically different? FYI, I'm binding my data to a grid with multiple grids nested inside... a small improvement can go a long way.
Sounds like what you want is this:
var items = SharedContext.Context.Items.GetAll();
IEnumerable<string> filterList = new List<string>(){"new", "old", "current"};
var filteredItems = items.Where(i => filterList.Contains(i.Type.ToLower()));
If you're having issues with this, you might want to try using an array instead:
string[] filterList = new string[] {"new", "old", "current"};
var filteredItems = items.Where(i => filterList.Contains(i.Type.ToLower()));
Update: Here's another strategy which generates a filter expression dynamically:
var filterList = new[] { "new", "old", "current" };
var param = Expression.Parameter(typeof(Item));
var left =
Expression.Call(
Expression.Call(
Expression.PropertyOrField(param, "Type"),
typeof(string).GetMethod("ToLower", Type.EmptyTypes)),
typeof(string).GetMethod("Trim", Type.EmptyTypes));
var filterExpr = (Expression<Func<Item, bool>>)Expression.Lambda(
filterList
.Select(f => Expression.Equal(left, Expression.Constant(f)))
.Aggregate((l, r) => Expression.OrElse(l, r)),
param);
var filteredItems = items.Where(filterExpr);
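The same idea can be tried against an in-memory source. This is only a sketch: the `types` array is a made-up stand-in for the Type column, queried through AsQueryable() rather than a real database provider, so it shows the shape of the generated predicate but says nothing about SQL translation or timing. It assumes C# 9 or later.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

var filterList = new[] { "new", "old", "current" };

// In-memory stand-in for the Type values of the real query source.
var types = new[] { " New ", "old", "archived", "CURRENT" }.AsQueryable();

// t => t.ToLower().Trim()
var param = Expression.Parameter(typeof(string), "t");
var normalized = Expression.Call(
    Expression.Call(param, typeof(string).GetMethod("ToLower", Type.EmptyTypes)!),
    typeof(string).GetMethod("Trim", Type.EmptyTypes)!);

// Fold the filters into a single predicate body:
// normalized == "new" || normalized == "old" || normalized == "current"
Expression body = filterList
    .Select(f => (Expression)Expression.Equal(normalized, Expression.Constant(f)))
    .Aggregate((l, r) => Expression.OrElse(l, r));

var predicate = Expression.Lambda<Func<string, bool>>(body, param);
var matches = types.Where(predicate).ToList();

var joined = string.Join(",", matches);
Console.WriteLine(joined); // prints " New ,old,CURRENT"
```

Because the filters end up in one Where call, the source is queried once instead of once per filter followed by a Union.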
You could do a join to filter the items:
IEnumerable<string> filterList = new List<string>(){"new", "old", "current"};
IQueryable<Item> items = SharedContext.Context.Items.GetAll();
var filteredItems = from i in items
join f in filterList
on i.Type.ToLower().Trim() equals f.ToLower().Trim()
select i;
I have a LINQ query that worked when I had a list of single values. Now that I have changed to a List whose elements have several properties, I need to change the where clause.
So this works:
List<string> etchList = new List<string>();
etchList.Add("24");
var etchVect = (from vio in AddPlas
where etchList.Any(v => vio.Key.Formatted.Equals(v))
let firstOrDefault = vio.Shapes.FirstOrDefault()
where firstOrDefault != null
select new
{
EtchVectors = firstOrDefault.Formatted
}).ToList();
However, I now have a new hard-coded list (which will represent incoming data):
List<ExcelViolations> excelViolations = new List<ExcelViolations>();
excelViolations.Add(new ExcelViolations
{
VioID = 24,
RuleType = "SPACING",
VioType = "Line-Line",
XCoordinate = 6132,
YCoordinate = 10031.46
});
So the new LINQ query looks like this, but it obviously will not work. AddPlas is a List, and using this other list of excelViolations, I want the where clause to match on the properties of each element in the excelViolations list:
var etchVect = (from vio in AddPlas
where excelViolations.Any(vioId => vio.Key.Formatted.Equals(vioId))
let firstOrDefault = vio.Shapes.FirstOrDefault()
select new
{
EtchVectors = firstOrDefault.Formatted
}).ToList();
Now, since this is a list within a list, I would like to do something like reference each of the properties, for example:
where excelViolations.VioID.Any(vioId => vio.Key.Formatted.Equals(vioId))
However, that is not possible. You can see that I'm trying to access the VioID property of each item in excelViolations and match it to the Key of each item in the vio list.
Just change this line
where excelViolations.Any(vioId => vio.Key.Formatted.Equals(vioId))
to
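where excelViolations.Any(excelVio => vio.Key.Formatted.Equals(excelVio.VioID.ToString()))
That should work. Note that VioID is an int, so it is converted with ToString() here, on the assumption that Formatted is a string as in your first query; comparing a string to an int with Equals always returns false.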
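A minimal sketch of matching against a property of each list element. The tuples stand in for ExcelViolations and the key array stands in for the Formatted values; both are invented for illustration, and Formatted being a string is an assumption. It assumes C# 9 or later.

```csharp
using System;
using System.Linq;

// Formatted keys as strings, as in the first (working) query.
var formattedKeys = new[] { "24", "31", "7" };

// Tuples stand in for ExcelViolations; only VioID matters here.
var excelViolations = new[]
{
    (VioID: 24, RuleType: "SPACING"),
    (VioID: 7, RuleType: "WIDTH"),
};

// Compare against the VioID of each element, converted to a string.
var matched = formattedKeys
    .Where(key => excelViolations.Any(v => key.Equals(v.VioID.ToString())))
    .ToList();

// Passing the int directly binds to Equals(object) and is never true.
bool intComparison = formattedKeys.Any(key => excelViolations.Any(v => key.Equals(v.VioID)));

var matchedText = string.Join(",", matched);
Console.WriteLine($"{matchedText} {intComparison}"); // prints "24,7 False"
```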
I have the code below in C# 4.0.
//Dictionary object with Key as string and Value as List of Component type object
Dictionary<String, List<Component>> dic = new Dictionary<String, List<Component>>();
//Here I am looping over each List<Component>
foreach (List<Component> lstComp in dic.Values.ToList())
{
// Below I am trying to get first component from the lstComp object.
// Can we achieve same thing using LINQ?
// Which one will give more performance as well as good object handling?
Component depCountry = lstComp[0].ComponentValue("Dep");
}
Try:
var firstElement = lstComp.First();
You can also use FirstOrDefault() just in case lstComp does not contain any items.
http://msdn.microsoft.com/en-gb/library/bb340482(v=vs.100).aspx
Edit:
To get the Component Value:
var firstElement = lstComp.First().ComponentValue("Dep");
This would assume there is an element in lstComp. An alternative and safer way would be...
var firstOrDefault = lstComp.FirstOrDefault();
if (firstOrDefault != null)
{
var firstComponentValue = firstOrDefault.ComponentValue("Dep");
}
[0] or .First() will give you the same performance whatever happens.
But your Dictionary could contain IEnumerable<Component> instead of List<Component>, and then you can't use the [] operator. That is where the difference is huge.
So for your example it doesn't really matter, but for this code you have no choice but to use First():
var dic = new Dictionary<String, IEnumerable<Component>>();
foreach (var components in dic.Values)
{
// you can't use [0] because components is an IEnumerable<Component>
var firstComponent = components.First(); // be aware that it will throw an exception if components is empty.
var depCountry = firstComponent.ComponentValue("Dep");
}
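The difference between First() and FirstOrDefault() on an empty sequence can be checked with a short sketch. Strings stand in for Component here, purely for illustration; it assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var lstComp = new List<string>(); // empty list; strings stand in for Component

// FirstOrDefault returns default(T), which is null for reference types.
var firstOrDefault = lstComp.FirstOrDefault();

// First throws InvalidOperationException when the sequence has no elements.
bool firstThrew = false;
try
{
    lstComp.First();
}
catch (InvalidOperationException)
{
    firstThrew = true;
}

Console.WriteLine($"{firstOrDefault == null} {firstThrew}"); // prints "True True"
```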
You also can use this:
var firstOrDefault = lstComp.FirstOrDefault();
if(firstOrDefault != null)
{
//doSmth
}
With a LINQ query expression, you can write it like this:
List<int> list = new List<int>() {1,2,3 };
var result = (from l in list
select l).FirstOrDefault();
With method (lambda) syntax, you can write it like this:
List<int> list = new List<int>() { 1, 2, 3 };
int x = list.FirstOrDefault();
You can do
Component depCountry = lstComp
.Select(x => x.ComponentValue("Dep"))
.FirstOrDefault();
Alternatively, if you want this for the entire dictionary of values, you can even tie it back to the key:
var newDictionary = dic.Select(x => new
{
Key = x.Key,
Value = x.Value.Select(y => new
{
depCountry = y.ComponentValue("Dep")
}).FirstOrDefault()
})
.Where(x => x.Value != null)
.ToDictionary(x => x.Key, x => x.Value);
This will give you a new dictionary. You can access the values like this:
var myTest = newDictionary[key1].depCountry;
Try this to build the whole list first, then take your desired element (the first one, in your case):
var desiredElementCompoundValueList = new List<YourType>();
// Flatten the lists of components before reading each value
dic.Values.SelectMany(list => list).ToList().ForEach( elem =>
{
desiredElementCompoundValueList.Add(elem.ComponentValue("Dep"));
});
var x = desiredElementCompoundValueList.FirstOrDefault();
To get the first element's value directly, without the extra ForEach iteration and variable assignment:
var desiredCompoundValue = dic.Values.SelectMany(list => list).Select( elem => elem.ComponentValue("Dep")).FirstOrDefault();
Note the difference between the two approaches: in the first one you build the list through a ForEach and then take your element. In the second you get your value directly.
Same result, different computation ;)
There are a bunch of such methods:
.First .FirstOrDefault .Single .SingleOrDefault
Choose which suits you best.
var firstObjectsOfValues = (from d in dic select d.Value[0].ComponentValue("Dep"));
I would do it like this:
//Dictionary object with Key as string and Value as List of Component type object
Dictionary<String, List<Component>> dic = new Dictionary<String, List<Component>>();
//from each element of the dictionary select first component if any
IEnumerable<Component> components = dic.Where(kvp => kvp.Value.Any()).Select(kvp => (kvp.Value.First() as Component).ComponentValue("Dep"));
But only if you are sure that the list contains only objects of the Component class or its subclasses.
I parse some data:
var result = xml.Descendants("record").Select(x => new F130Account
{
Account = x.Descendants("Account").First().Value,
});
Then I try to update it:
foreach (var item in result)
item.Quantity = 1;
After this, result.Sum(a => a.Quantity) is zero... Why?
That's because your result collection is evaluated again each time you start enumerating it, so Sum runs on a new set of F130Account objects, different from the ones the foreach loop modified. That's how LINQ's lazy (deferred) evaluation works.
Materialize result into a List<F130Account> first:
var result = xml.Descendants("record").Select(x => new F130Account
{
Account = x.Descendants("Account").First().Value,
}).ToList();
And after that both foreach and Sum will run on the same collection of objects.
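The effect can be reproduced without XML. In this sketch a one-element int array stands in for an F130Account and its Quantity, and the `created` counter is added only to show how often the projection runs; all names are illustrative. It assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var records = new[] { "r1", "r2" };
int created = 0;

// Deferred query: the projection runs again on every enumeration.
IEnumerable<int[]> lazy = records.Select(r => { created++; return new int[1]; });

foreach (var item in lazy)
    item[0] = 1;                      // mutates objects that are then discarded

int lazySum = lazy.Sum(a => a[0]);    // fresh objects are created here, so the sum is 0

// Materialized list: foreach and Sum see the same objects.
List<int[]> materialized = records.Select(r => new int[1]).ToList();

foreach (var item in materialized)
    item[0] = 1;

int listSum = materialized.Sum(a => a[0]);

Console.WriteLine($"{lazySum} {created} {listSum}"); // prints "0 4 2"
```

The counter reaches 4 because the two records are projected once by the foreach and once more by Sum.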