C# LINQ - Comparing an IEnumerable<string> against an anonymous list?

The basic question
I have:
IEnumerable<string> listA
var listB (this is an anonymous list generated by a LINQ query)
I want to query a list of objects that each contain a listA, to find those whose listA matches listB:
someObjectList.Where(x => x.listA == listB)
The comparison doesn't work - so how do I ensure that both lists are the same type for comparison?
The detailed question
I am grouping a larger list into a subset that contains a name and related date(s).
var listGroup = from n in list
group n by new { n.NAME } into d
select new
{
NAME = d.Key.NAME,
listOfDates = from x in d select new { Date = x.DATE }
};
I have an object to hold the values for further processing:
class SomeObject
{
public SomeObject()
{
_listOfDates = new List<DateTime>();
}
private IEnumerable<DateTime> _listOfDates;
public IEnumerable<DateTime> ListOfDates
{
get { return _listOfDates; }
set { _listOfDates = value; }
}
}
I am then iterating over the listGroup and adding into a generic List<> of SomeObject:
foreach(var item in listGroup)
{
SomeObject so = new SomeObject();
// ...do some stuff
// if some match occurs, add so to the List<SomeObject>
}
As I iterate, I want to check the existing List<SomeObject> for matches:
var record = someObjectList.Where(x => x.NAME == item.NAME &&
x.ListOfDates == item.listOfDates)
.SingleOrDefault();
The problem is that comparing x.ListOfDates against item.listOfDates doesn't work.
There is no compiler error, but I suspect that the two lists are of different types. How do I get the lists into a common form so they can be compared?
Update #1
This seems to work to get the listOfDates into a similar format:
IEnumerable<DateTime> tempList = item.listOfDates.Select(x => x.Date).ToList();
Then I followed the 'SequenceEqual' suggestion from @Matt Burland

You can't just compare one IEnumerable<DateTime> to another with ==; that only checks whether both refer to the same object. You need to compare the sequences element by element. Luckily, there's Enumerable.SequenceEqual (callable as a static method or as an extension method), which should work here.
So something like:
var record = someObjectList
.Where(x => x.NAME == item.NAME &&
// item.listOfDates holds anonymous objects, so project out the Date first
x.ListOfDates.SequenceEqual(item.listOfDates.Select(d => d.Date)))
.SingleOrDefault();
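To make the difference between reference comparison and sequence comparison concrete, here is a minimal sketch. The class and method names (SequenceCompareDemo, SameDates, SameDatesAnyOrder) are made up for illustration; only Enumerable.SequenceEqual and OrderBy come from LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceCompareDemo
{
    // True when both sequences contain equal dates in the same order.
    public static bool SameDates(IEnumerable<DateTime> left, IEnumerable<DateTime> right)
        => left.SequenceEqual(right);

    // Order-insensitive variant: sorts both sides before comparing.
    public static bool SameDatesAnyOrder(IEnumerable<DateTime> left, IEnumerable<DateTime> right)
        => left.OrderBy(d => d).SequenceEqual(right.OrderBy(d => d));
}
```

Note that SequenceEqual is order-sensitive, so if the two date lists can arrive in different orders, something like the second method is probably what you want.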

Related

Joining two lists from each statement

if (Settings.Default.All)
{
List = new ObservableCollection<LexisNexis>(UnitOfWork.Query.Lexis.LexisForApprove2().OrderBy(x => x.TxnID).Reverse());
}
if (Settings.Default.MLhuillier)
{
List = new ObservableCollection<LexisNexis>(UnitOfWork.Query.Lexis.LexisForApprove2().Where(x => x.ServiceMode == "MLhuillier").OrderBy(x => x.TxnID).Reverse());
}
if (Settings.Default.BPI)
{
List = new ObservableCollection<LexisNexis>(UnitOfWork.Query.Lexis.LexisForApprove2().Where(x => x.ServiceMode == "BPI").OrderBy(x => x.TxnID).Reverse());
}
I want to combine the lists from every if statement whose condition is true, but my program just returns the last list. TYIA
Simplifying the code
The following should do what you want with little duplication and with at most one traversal through LexisForApprove2.
var orFilters = Settings.Default.All ? null : new List<string>();
if (!Settings.Default.All)
{
if (Settings.Default.MLhuillier) orFilters.Add("MLhuillier");
if (Settings.Default.BPI) orFilters.Add("BPI");
}
var l = orFilters == null
? UnitOfWork.Query.Lexis.LexisForApprove2() // Everything
: orFilters.Any()
? UnitOfWork.Query.Lexis.LexisForApprove2().Where(x => orFilters.Contains(x.ServiceMode))
: new List<LexisNexis>(); // Not 'All' but no others allowed
List = new ObservableCollection<LexisNexis>(l.OrderByDescending(y => y.TxnID));
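The same idea can be tried out in isolation. The sketch below is only an illustration under assumptions: Txn is a hypothetical stand-in for the LexisNexis entity (just the two members used here), and TxnFilter.Apply is a made-up helper, not part of the original code.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the LexisNexis entity.
public sealed class Txn
{
    public int TxnID { get; set; }
    public string ServiceMode { get; set; } = "";
}

public static class TxnFilter
{
    // When all is true every row is returned; otherwise only rows whose
    // ServiceMode is in allowedModes. The result is ordered by TxnID, highest first.
    public static List<Txn> Apply(IEnumerable<Txn> source, bool all, IEnumerable<string> allowedModes)
    {
        var modes = new HashSet<string>(allowedModes);
        return source
            .Where(t => all || modes.Contains(t.ServiceMode))
            .OrderByDescending(t => t.TxnID)
            .ToList();
    }
}
```

Because the allowed modes are collected first and applied in a single Where, adding another setting only means adding one more string to the set.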
Distinct
Just for the record, and not recommended for this case, you could use List's AddRange or Linq's Union followed by Distinct, which would work if the LexisNexis objects are good at comparing themselves with others :)

C# How to split a List in two using LINQ [duplicate]

I am trying to split a List into two Lists using LINQ without iterating the 'master' list twice. One List should contain the elements for which the LINQ condition is true, and the other should contain all the other elements. Is this at all possible?
Right now I just use two LINQ queries, thus iterating the (huge) master List twice.
Here's the (pseudo) code I am using right now:
List<EventModel> events = GetAllEvents();
List<EventModel> openEvents = events.Where(e => e.Closer_User_ID == null).ToList();
List<EventModel> closedEvents = events.Where(e => e.Closer_User_ID != null).ToList();
Is it possible to yield the same results without iterating the original List twice?
You can use ToLookup extension method as follows:
List<Foo> items = new List<Foo> { new Foo { Name="A",Condition=true},new Foo { Name = "B", Condition = true },new Foo { Name = "C", Condition = false } };
var lookupItems = items.ToLookup(item => item.Condition);
var lstTrueItems = lookupItems[true];
var lstFalseItems = lookupItems[false];
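A small self-contained sketch of the ToLookup approach, using plain integers instead of Foo or EventModel so it can be run as-is. LookupSplitDemo and SplitByParity are illustrative names, not part of the answer above.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LookupSplitDemo
{
    // Splits numbers into evens and odds with a single pass over the source.
    public static (List<int> Evens, List<int> Odds) SplitByParity(IEnumerable<int> numbers)
    {
        // ToLookup enumerates the source immediately, exactly once.
        ILookup<bool, int> byParity = numbers.ToLookup(n => n % 2 == 0);

        // Indexing a key that has no entries yields an empty sequence, not an exception.
        return (byParity[true].ToList(), byParity[false].ToList());
    }
}
```

The empty-sequence behaviour of the lookup indexer is what makes this safe when every element falls on one side of the condition.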
You can do this in one statement by converting it into a lookup table:
var splitTables = events.ToLookup(e => e.Closer_User_ID == null);
This will return a sequence of at most two elements, where every element is an IGrouping<bool, EventModel>. The Key says whether the group is the one with a null Closer_User_ID or not.
However this looks rather mystical. My advice would be to extend LINQ with a new function.
This function takes a sequence of any kind, and a predicate that divides the sequence into two groups: the group that matches the predicate and the group that doesn't match the predicate.
This way you can use the function to divide all kinds of IEnumerable sequences into two sequences.
See Extension methods demystified
public static IEnumerable<IGrouping<bool, TSource>> Split<TSource>(
this IEnumerable<TSource> source,
Func<TSource,bool> predicate)
{
return source.ToLookup(predicate);
}
Usage:
IEnumerable<Person> persons = ...
// divide the persons into adults and non-adults:
var result = persons.Split(person => person.IsAdult);
Result has two elements: the one with Key true has all Adults.
Although usage has now become easier to read, you still have the problem that the complete sequence is processed, while in fact you might only want to use a few of the resulting items.
Let's return an IEnumerable<KeyValuePair<bool, TSource>>, where the Boolean value indicates whether the item matches or doesn't match:
public static IEnumerable<KeyValuePair<bool, TSource>> Audit<TSource>(
this IEnumerable<TSource> source,
Func<TSource,bool> predicate)
{
foreach (var sourceItem in source)
{
yield return new KeyValuePair<bool, TSource>(predicate(sourceItem), sourceItem);
}
}
Now you get a sequence, where every element says whether it matches or not. If you only need a few of them, the rest of the sequence is not processed:
IEnumerable<EventModel> eventModels = ...
EventModel firstOpenEvent = eventModels.Audit(e => e.Closer_User_ID == null)
.Where(splitEvent => splitEvent.Key)
.Select(splitEvent => splitEvent.Value)
.FirstOrDefault();
The Where says that you only want those audited items that passed the audit (Key is true); the Select unwraps the original item from the pair.
Because you only need the first element, the rest of the sequence is not audited.
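The lazy behaviour can be checked by counting how often the predicate runs. This is a sketch under assumptions: AuditDemo and CountTestsUntilFirstMatch are hypothetical names, and Audit is re-declared here only so the block is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AuditDemo
{
    // Pairs every item with the result of the predicate, one item at a time.
    public static IEnumerable<KeyValuePair<bool, T>> Audit<T>(
        this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (var item in source)
        {
            yield return new KeyValuePair<bool, T>(predicate(item), item);
        }
    }

    // Returns how many items were tested before the first match was found.
    public static int CountTestsUntilFirstMatch(IEnumerable<int> numbers, Func<int, bool> predicate)
    {
        int tests = 0;
        numbers.Audit(n => { tests++; return predicate(n); })
               .Where(pair => pair.Key)
               .Select(pair => pair.Value)
               .FirstOrDefault();
        return tests;
    }
}
```

With the input 1, 3, 4, 5, 6 and an "is even" predicate, only the first three items are tested; the iterator stops as soon as FirstOrDefault has its answer.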
GroupBy and Single should accomplish what you're looking for:
var groups = events.GroupBy(e => e.Closer_User_ID == null).ToList(); // As others mentioned this needs to be materialized to prevent `events` from being iterated twice.
var openEvents = groups.SingleOrDefault(grp => grp.Key == true)?.ToList() ?? new List<EventModel>();
var closedEvents = groups.SingleOrDefault(grp => grp.Key == false)?.ToList() ?? new List<EventModel>();
One line solution by using ForEach method of List:
List<EventModel> events = GetAllEvents();
List<EventModel> openEvents = new List<EventModel>();
List<EventModel> closedEvents = new List<EventModel>();
events.ForEach(x => (x.Closer_User_ID == null ? openEvents : closedEvents).Add(x));
You can do without LINQ. Switch to conventional loop approach.
List<EventModel> openEvents = new List<EventModel>();
List<EventModel> closedEvents = new List<EventModel>();
foreach(var e in events)
{
if(e.Closer_User_ID == null)
{
openEvents.Add(e);
}
else
{
closedEvents.Add(e);
}
}
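If this split is needed in several places, the loop can be wrapped in a reusable helper. The following is one possible sketch; Partition and PartitionExtensions are made-up names, not an existing LINQ method.

```csharp
using System;
using System.Collections.Generic;

public static class PartitionExtensions
{
    // Splits source into items that satisfy the predicate and items that do not,
    // enumerating source exactly once.
    public static (List<T> Matches, List<T> Rest) Partition<T>(
        this IEnumerable<T> source, Func<T, bool> predicate)
    {
        var matches = new List<T>();
        var rest = new List<T>();
        foreach (var item in source)
        {
            (predicate(item) ? matches : rest).Add(item);
        }
        return (matches, rest);
    }
}
```

For the events in the question, usage would look something like var (openEvents, closedEvents) = events.Partition(e => e.Closer_User_ID == null);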

Compare two lists and replace elements if the id is equal

I have two lists of this class:
public class Product
{
int id;
string url;
// etc.
}
I need to compare a new list (10 elements) against the old list (10k+ elements) by ID,
and if an id is the same, replace the data in the old list with the data from the new list.
I think LINQ would be a good fit.
Can you help me do this with LINQ, or is there a better library?
Do you need to modify the collection in place or return a new collection?
If you are returning a new collection you could
var query = from x in oldItems
join y in newItems on x.Id equals y.Id into g
from z in g.DefaultIfEmpty()
select z ?? x;
var newList = query.ToList();
This method will ignore entries in newItems that do not exist in oldItems.
If you are going to be modifying the collection in place you would be better off working with a dictionary and referencing that everywhere.
You can create a dictionary from the list by doing
var collection = items.ToDictionary(x => x.Id, x => x);
Note modifying the dictionary doesn't alter the source collection, the idea is to replace your collection with the dictionary object.
If you are using the dictionary you can then iterate over new collection and check the key.
foreach (var item in newItems.Where(x => collection.ContainsKey(x.Id))) {
collection[item.Id] = item;
}
Dictionaries are iterable so you can loop over the Values collection if you need to. Adds and removes are fast because you can reference by key. The only problem I can think you may run into is if you rely on the ordering of the collection.
If you are stuck needing to use the original collection type then you could use the ToDictionary method on your newItems collection. This makes your update code look like this.
var converted = newItems.ToDictionary(x => x.Id, x => x);
for (var i = 0; i < oldItems.Count(); i++) {
if (converted.ContainsKey(oldItems[i].Id)) {
oldItems[i] = converted[oldItems[i].Id];
}
}
This has the advantage that you only need to loop over the newItems collection once; from then on it's key lookups, so it's less CPU intensive. The downside is that you've created a new collection of keys for newItems, so it consumes more memory.
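A runnable sketch of that in-place update, using TryGetValue so each old item needs only one dictionary lookup. Item and ListUpdater.ReplaceMatching are hypothetical names chosen for the example; the real class in the question is Product.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Product class from the question.
public sealed class Item
{
    public int Id { get; set; }
    public string Url { get; set; } = "";
}

public static class ListUpdater
{
    // Replaces entries of oldItems in place with the newItems entry that has the same Id.
    // Returns the number of replaced entries.
    public static int ReplaceMatching(IList<Item> oldItems, IEnumerable<Item> newItems)
    {
        // Assumes Ids in newItems are unique; ToDictionary throws on duplicate keys.
        Dictionary<int, Item> byId = newItems.ToDictionary(x => x.Id);
        int replaced = 0;
        for (int i = 0; i < oldItems.Count; i++)
        {
            if (byId.TryGetValue(oldItems[i].Id, out var replacement))
            {
                oldItems[i] = replacement;
                replaced++;
            }
        }
        return replaced;
    }
}
```

The old list keeps its order and length; new items whose Id does not occur in the old list are simply ignored.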
Here is a sample function that joins the two lists on the id property and then updates the original Product's url with the newer one:
void ChangeItems(IList<Product> original, IList<Product> newer){
original.Join(newer, o => o.id, n => n.id, (o, n) => new { original = o, newer = n })
.ToList()
.ForEach(j => j.original.url = j.newer.url);
}
Solution: the LINQ you're looking for will be something like this:
oldList = oldList.Select(ele => { return (newList.Any(i => i.id == ele.id) ? newList.FirstOrDefault(newObj => newObj.id == ele.id) : ele); }).ToList();
Note: here we rebuild oldList from newList and oldList, i.e. we replace each matching oldList object with the newList object. If you only want some of the new object's properties, you can add a copy method to your class.
Example using a copy method:
oldList = oldList.Select(ele => { return (newList.Any(i => i.id == ele.id) ? ele.Copy(newList.FirstOrDefault(newObj => newObj.id == ele.id)) : ele); }).ToList();
// Changes in your class
public Product Copy(Product prod)
{
// copy the required properties of prod onto this (old) object
this.id = prod.id;
// return this object so the call can be used as an expression inside Select
return this;
}
Read
It is not a good idea to iterate over 10k+ elements, even using LINQ, as it will still affect your CPU performance.
Online sample for 1st solution
As you have class
public class Product
{
public int id;
public string url;
public string otherData;
public Product(int id, string url, string otherData)
{
this.id = id;
this.url = url;
this.otherData = otherData;
}
public Product ChangeProp(Product newProd)
{
this.url = newProd.url;
this.otherData = newProd.otherData;
return this;
}
}
Note that we now have a ChangeProp method in the data class. It accepts the new object, copies its properties onto the old object, and returns the modified old object (since you want the old object's data replaced with the new object's data). This keeps the LINQ readable and clean.
You already have oldList with lots of entries and have to replace the data in oldList with the data from newList where the id is the same; you can do it as shown below.
Suppose they hold data like this:
List<Product> oldList = new List<Product>();
for (int i = 0; i < 10000; i++)
{
oldList.Add(new Product(i, "OldData" + i.ToString(), "OldData" + i.ToString() + "-other"));
}
List<Product> newList = new List<Product>();
for (int i = 0; i < 5; i++)
{
newList.Add(new Product(i, "NewData" + i.ToString(), "NewData" + i.ToString() + "-other"));
}
This LINQ will do the work:
oldList.Where(x => newList.Any(y => y.id == x.id))
.Select(z => oldList[oldList.IndexOf(z)].ChangeProp(newList.Where(a => a.id == z.id).FirstOrDefault())).ToList();
foreach(var product in newList)
{
int index = oldList.FindIndex(x => x.id == product.id);
if (index != -1)
{
oldList[index].url = product.url;
}
}
This will work, and I think it's a better solution too.
All the above solutions create new objects in memory, and creating a new list with 10k+ records is definitely a bad idea.
Please make the fields in Product public, otherwise they won't be accessible.

how do I make this LINQ query faster?

modelData has 100,000 items in the list.
I am doing 2 "Selects" within 2 loops.
Could it be structured differently? It takes a long time (10 minutes).
public class ModelData
{
public string name;
public DateTime DT;
public int real;
public int trade;
public int position;
public int dayPnl;
}
List<ModelData> modelData;
var dates = modelData.Select(x => x.DT.Date).Distinct();
var names = modelData.Select(x => x.name).Distinct();
foreach (var aDate in dates)
{
var dateRealTrades = modelData.Select(x => x)
.Where(x => x.DT.Date.Equals(aDate) && x.real.Equals(1));
foreach (var aName in names)
{
var namesRealTrades = dateRealTrades.Select(x => x)
.Where(x => x.name.Equals(aName));
// DO MY PROCESSING
}
}
I believe what you want can be achieved with two queries using group by. One to create a lookup by the date and the other to give you the name-date grouped items.
var data = modelData.Where(x => x.real.Equals(1))
.GroupBy(x => new { x.DT.Date, x.name });
var byDate = modelData.Where(x => x.real.Equals(1))
.ToLookup(x => x.DT.Date);
foreach(var item in data)
{
var aDate = item.Key.Date;
var aName = item.Key.name;
var namesRealTrades = item.ToList();
var dateRealTrades = byDate[aDate].ToList();
// DO MY PROCESSING
}
The first query will give you items grouped by the name and date to iterate over and the second will give you a lookup to get all the items associated with a given date. The second uses a lookup so that the list is iterated once and gives you fast access to the resulting list of items.
This should greatly reduce the number of times you iterate over modelData from what you currently have.
You could rewrite your for loop like this:
foreach (var namesRealTrades in names.Select(aName => dateRealTrades.Where(x => x.name.Equals(aName))))
{
//DO STUFF
}
Depending on your data, this could reduce the number of queries you have to make.
Did you try to compile your query as suggested on MSDN WebSite?
When you have an application that executes structurally similar
queries many times, you can often increase performance by compiling
the query one time and executing it several times with different
parameters. For example, an application might have to retrieve all the
customers who are in a particular city, where the city is specified at
runtime by the user in a form. LINQ to SQL supports the use of
compiled queries for this purpose.
https://msdn.microsoft.com/en-us/library/bb399335(v=vs.110).aspx
A couple of things:
use .ToList() to calculate a sequence once, so you can keep it for later.
use .GroupBy() to avoid re-searching modelData for things you have already found.
// Collections of models having the same Date or Name.
var dates = modelData.GroupBy(x => x.DT.Date);
var names = modelData.GroupBy(x => x.name);
foreach (var modelsWithDate in dates)
{
var aDate = modelsWithDate.Key;
var dateRealTrades = modelsWithDate.Where(x => x.real == 1).ToList();
foreach (var modelsWithName in names)
{
var aName = modelsWithName.Key;
var namesRealTrades = modelsWithName.ToList();
// DO MY PROCESSING
}
}
There are two ways the code is inefficient.
names has deferred evaluation. Every time you iterate over it, it has to go through the whole data set to find all the distinct names again. You should save the result.
You find the distinct values in the collection and then go through the collection again for every distinct value to look for its occurrences. You should use grouping.
The rewritten code can look like this:
var dates = modelData.GroupBy(x => x.DT.Date);
var names = modelData.Select(x => x.name).Distinct().ToArray();
foreach (var date in dates)
{
var dateRealTrades = date.Where(x => x.real.Equals(1)).ToArray();
var namesRealTradesLookup = dateRealTrades.ToLookup(x => x.name);
foreach (var aName in names)
{
var namesRealTrades = namesRealTradesLookup[aName];
// DO MY PROCESSING
// var aDate = date.Key;
}
}
In case you are not interested in date/name combinations with no real trades, it can be done in a much more straightforward way:
var realModelData = modelData.Where(x => x.real.Equals(1));
foreach (var dateRealTrades in realModelData.ToLookup(x => x.DT.Date))
{
foreach (var namesRealTrades in dateRealTrades.ToLookup(x => x.name))
{
// DO MY PROCESSING
//var aDate = dateRealTrades.Key;
//var aName = namesRealTrades.Key;
//foreach(var trade in namesRealTrades) { ...
//foreach(var trade in dateRealTrades) { ...
}
}
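As a concrete illustration of the grouping idea, the sketch below groups once by the (date, name) pair and computes one value per group. It rests on assumptions: Trade is a hypothetical, trimmed-down version of ModelData, and summing DayPnl merely stands in for the unspecified "DO MY PROCESSING" step.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down version of ModelData.
public sealed class Trade
{
    public string Name = "";
    public DateTime DT;
    public int Real;
    public int DayPnl;
}

public static class TradeGrouping
{
    // Groups the real trades by (date, name) in a single pass and sums DayPnl per group.
    public static Dictionary<(DateTime Date, string Name), int> PnlByDateAndName(
        IEnumerable<Trade> trades)
    {
        return trades
            .Where(t => t.Real == 1)
            .GroupBy(t => (t.DT.Date, t.Name))
            .ToDictionary(g => g.Key, g => g.Sum(t => t.DayPnl));
    }
}
```

Here the source is enumerated once, instead of once per date and again per name, which is where the original nested loops spend their time.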

LINQ: Collapsing a series of strings into a set of "ranges"

I have an array of strings similar to this (shown on separate lines to illustrate the pattern):
{ "aa002","aa003","aa004","aa005","aa006","aa007", // note that aa008 is missing
"aa009",
"ba023","ba024","ba025",
"bb025",
"ca002","ca003",
"cb004",
...}
...and the goal is to collapse those strings into this comma-separated string of "ranges":
"aa002-aa007,aa009,ba023-ba025,bb025,ca002-ca003,cb004, ... "
I want to collapse them so I can construct a URL. There are hundreds of elements, but I can still convey all the information if I collapse them this way - putting them all into a URL "longhand" (it has to be a GET, not a POST) isn't feasible.
I've had the idea to separate them into groups using the first two characters as the key - but does anyone have any clever ideas for collapsing those sequences (without gaps) into ranges? I'm struggling with it, and everything I've come up with looks like spaghetti.
So the first thing that you need to do is parse the strings. It's important to have the alphabetic prefix and the integer value separately.
Next you want to group the items on the prefix.
For each of the items in that group, you want to order them by number, and then group items while the previous value's number is one less than the current item's number. (Or, put another way, while the previous item plus one is equal to the current item.)
Once you've grouped all of those items you want to project that group out to a value based on that range's prefix, as well as the first and last number. No other information from these groups is needed.
We then flatten the list of strings for each group into just a regular list of strings, since once we're all done there is no need to separate out ranges from different groups. This is done using SelectMany.
When that's all said and done, that, translated into code, is this:
public static IEnumerable<string> Foo(IEnumerable<string> data)
{
return data.Select(item => new
{
Prefix = item.Substring(0, 2),
Number = int.Parse(item.Substring(2))
})
.GroupBy(item => item.Prefix)
.SelectMany(group => group.OrderBy(item => item.Number)
.GroupWhile((prev, current) =>
prev.Number + 1 == current.Number)
.Select(range =>
RangeAsString(group.Key,
range.First().Number,
range.Last().Number)));
}
The GroupWhile method can be implemented like so:
public static IEnumerable<IEnumerable<T>> GroupWhile<T>(
this IEnumerable<T> source, Func<T, T, bool> predicate)
{
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
yield break;
List<T> list = new List<T>() { iterator.Current };
T previous = iterator.Current;
while (iterator.MoveNext())
{
if (!predicate(previous, iterator.Current))
{
yield return list;
list = new List<T>();
}
list.Add(iterator.Current);
previous = iterator.Current;
}
yield return list;
}
}
And then the simple helper method to convert each range into a string:
private static string RangeAsString(string prefix, int start, int end)
{
if (start == end)
return prefix + start;
else
return string.Format("{0}{1}-{0}{2}", prefix, start, end);
}
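One thing to watch: RangeAsString concatenates the parsed integer, so leading zeros are lost ("aa002" would come out as "aa2"). A possible way around that, sketched here under assumptions, is to keep the original strings for the range endpoints and parse the numbers only to detect consecutive items. RangeCollapser, Collapse, IsNext and Format are hypothetical names; the sketch assumes every item is a two-character prefix followed by decimal digits, with no duplicates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RangeCollapser
{
    // Collapses e.g. aa002, aa003, aa004, aa009 into "aa002-aa004,aa009".
    public static string Collapse(IEnumerable<string> items)
    {
        var sorted = items
            .OrderBy(s => s.Substring(0, 2), StringComparer.Ordinal)
            .ThenBy(s => int.Parse(s.Substring(2)))
            .ToList();

        var ranges = new List<string>();
        int start = 0; // index of the first item of the current run
        for (int i = 1; i <= sorted.Count; i++)
        {
            // Close the current run at the end of input or when item i does not continue it.
            if (i == sorted.Count || !IsNext(sorted[i - 1], sorted[i]))
            {
                ranges.Add(Format(sorted[start], sorted[i - 1]));
                start = i;
            }
        }
        return string.Join(",", ranges);
    }

    // True when current has the same prefix as previous and the next number.
    private static bool IsNext(string previous, string current)
        => previous.Substring(0, 2) == current.Substring(0, 2)
           && int.Parse(previous.Substring(2)) + 1 == int.Parse(current.Substring(2));

    // A single item is emitted as-is; longer runs as "first-last".
    private static string Format(string first, string last)
        => first == last ? first : first + "-" + last;
}
```

Because the endpoints are the original strings, whatever zero padding the input uses is carried through to the output unchanged.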
Here's a LINQ version without the need to add new extension methods:
var data2 = data.Skip(1).Zip(data, (d1, d0) => new
{
value = d1,
jump = d1.Substring(0, 2) == d0.Substring(0, 2)
? int.Parse(d1.Substring(2)) - int.Parse(d0.Substring(2))
: -1,
});
var agg = new { f = data.First(), t = data.First(), };
var query2 =
data2
.Aggregate(new [] { agg }.ToList(), (a, x) =>
{
var last = a.Last();
if (x.jump == 1)
{
a.RemoveAt(a.Count() - 1);
a.Add(new { f = last.f, t = x.value, });
}
else
{
a.Add(new { f = x.value, t = x.value, });
}
return a;
});
var query3 =
from q in query2
select (q.f) + (q.f == q.t ? "" : "-" + q.t);
I get these results:
