Why does IEnumerable.ToList produce new objects


public class TestClass
{
public int Num { get; set; }
public string Name { get; set; }
}
IEnumerable<TestClass> originalEnumerable
= Enumerable.Range(1, 1).
Select(x => new TestClass() { Num = x, Name = x.ToString() });
List<TestClass> listFromEnumerable = originalEnumerable.ToList();
// false
bool test = ReferenceEquals(listFromEnumerable[0], originalEnumerable.ElementAt(0));
If I modify the copied object, the original does not change.
I understand from this that IEnumerable<T> collections will always produce new objects when calling ToList()/ToArray(), even though the elements are reference types. Am I missing something? Other SO questions insist that only references are copied.

The reason this is false is that your enumerable gets enumerated twice: once in the call to originalEnumerable.ToList(), and a second time when you invoke originalEnumerable.ElementAt(0).
Each enumeration runs the Select projection again, so each one creates new objects.
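To make the double enumeration visible, here is a minimal sketch that counts how often the projection runs. The DeferredDemo class, the BuildQuery method and the ProjectionCalls counter are names made up for this illustration; TestClass is the class from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TestClass
{
    public int Num { get; set; }
    public string Name { get; set; }
}

public static class DeferredDemo
{
    // Hypothetical counter: how many times the Select projection has run.
    public static int ProjectionCalls;

    public static IEnumerable<TestClass> BuildQuery()
    {
        // Nothing is created here; this only defines the query.
        return Enumerable.Range(1, 1).Select(x =>
        {
            ProjectionCalls++;
            return new TestClass { Num = x, Name = x.ToString() };
        });
    }

    public static void Main()
    {
        IEnumerable<TestClass> query = BuildQuery();
        Console.WriteLine(ProjectionCalls);   // 0: the query has not been enumerated yet

        List<TestClass> list = query.ToList();   // first enumeration
        TestClass again = query.ElementAt(0);    // second enumeration
        Console.WriteLine(ProjectionCalls);      // 2
        Console.WriteLine(ReferenceEquals(list[0], again));   // False

        // ToList on an already materialized list copies references only.
        List<TestClass> copy = list.ToList();
        Console.WriteLine(ReferenceEquals(list[0], copy[0])); // True
    }
}
```

The last two lines are the case the other SO questions talk about: once the objects exist in a list or array, ToList() copies only the references.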

You're missing the fact that Enumerable.Range and Select are lazy, so this line produces nothing but an in-memory query definition:
IEnumerable<TestClass> originalEnumerable
= Enumerable.Range(1, 1).
Select(x => new TestClass() { Num = x, Name = x.ToString() });
Calling ToList() executes the query and creates a list with new items:
List<TestClass> listFromEnumerable = originalEnumerable.ToList();
Calling originalEnumerable.ElementAt(0) here
bool test = ReferenceEquals(listFromEnumerable[0], originalEnumerable.ElementAt(0));
executes the query again and produces yet another set of new items.
Update
So you should say that the query (Enumerable.Range with the Select projection) produces the new items, not ToList().
If your source collection is already evaluated (e.g. a TestClass[] array), ToList() won't create new items:
IEnumerable<TestClass> originalEnumerable
= Enumerable.Range(1, 1).
Select(x => new TestClass() { Num = x, Name = x.ToString() })
.ToArray();
List<TestClass> listFromEnumerable = originalEnumerable.ToList();
// true
bool test = ReferenceEquals(listFromEnumerable[0], originalEnumerable.ElementAt(0));

Related

Compare two List elements and replace if id is equals

I have two lists of objects of this class:
public class Product
{
int id;
string url;
// etc.
}
I need to compare a new list (10 elements) against the old list (10k+ elements) by ID,
and if an ID is the same, replace the data in the old list with the data from the new list.
I think LINQ would be a good fit for this.
Can you help me with how to use LINQ here, or is there a better library?

Do you need to modify the collection in place or return a new collection?
If you are returning a new collection, you could use a left join:
var query = from x in oldItems
join y in newItems on x.Id equals y.Id into g
from z in g.DefaultIfEmpty()
select z ?? x;
var newList = query.ToList();
This method ignores entries in newItems that do not exist in oldItems.
If you are going to be modifying the collection in place you would be better off working with a dictionary and referencing that everywhere.
You can create a dictionary from the list by doing
var collection = items.ToDictionary(x => x.Id, x => x);
Note modifying the dictionary doesn't alter the source collection, the idea is to replace your collection with the dictionary object.
If you are using the dictionary you can then iterate over new collection and check the key.
foreach (var item in newItems.Where(x => collection.ContainsKey(x.Id))) {
collection[item.Id] = item;
}
Dictionaries are iterable so you can loop over the Values collection if you need to. Adds and removes are fast because you can reference by key. The only problem I can think you may run into is if you rely on the ordering of the collection.
If you are stuck needing to use the original collection type, you could use the ToDictionary method on your newItems collection instead. This makes your update code look like this:
var converted = newItems.ToDictionary(x => x.Id, x => x);
for (var i = 0; i < oldItems.Count(); i++) {
if (converted.ContainsKey(oldItems[i].Id)) {
oldItems[i] = converted[oldItems[i].Id];
}
}
This has the advantage that you only need to loop over the newItems collection once; from then on it's key lookups, so it's less CPU intensive. The downside is that you've created a new collection of keys for newItems, so it consumes more memory.
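The same in-place update can be sketched as a self-contained helper. This is only one way to write it: the ProductMerge class and the ReplaceById method are names invented for the example, Product is reduced to two properties, and it assumes the ids in the new list are unique (ToDictionary throws on duplicate keys).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public int Id { get; set; }
    public string Url { get; set; }
}

public static class ProductMerge
{
    // Replaces entries of oldItems in place when newItems has the same Id.
    // Returns the number of replaced entries.
    public static int ReplaceById(IList<Product> oldItems, IEnumerable<Product> newItems)
    {
        // Assumption: Ids are unique within newItems.
        Dictionary<int, Product> byId = newItems.ToDictionary(p => p.Id);
        int replaced = 0;
        for (int i = 0; i < oldItems.Count; i++)
        {
            // TryGetValue does a single lookup instead of ContainsKey plus indexer.
            if (byId.TryGetValue(oldItems[i].Id, out Product replacement))
            {
                oldItems[i] = replacement;
                replaced++;
            }
        }
        return replaced;
    }

    public static void Main()
    {
        var oldItems = Enumerable.Range(0, 10000)
            .Select(i => new Product { Id = i, Url = "old" + i })
            .ToList();
        var newItems = Enumerable.Range(0, 10)
            .Select(i => new Product { Id = i, Url = "new" + i })
            .ToList();

        Console.WriteLine(ReplaceById(oldItems, newItems)); // 10
        Console.WriteLine(oldItems[3].Url);                  // new3
        Console.WriteLine(oldItems[10].Url);                 // old10
    }
}
```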

Here is a sample function that joins the two lists on the id field and then updates the url of each original Product with the newer one:
void ChangeItems(IList<Product> original, IList<Product> newer){
original.Join(newer, o => o.id, n => n.id, (o, n) => new { original = o, newer = n })
.ToList()
.ForEach(j => j.original.url = j.newer.url);
}

Solution: the LINQ you're looking for will be something like this:
oldList = oldList.Select(ele => { return (newList.Any(i => i.id == ele.id) ? newList.FirstOrDefault(newObj => newObj.id == ele.id) : ele); }).ToList();
Note: here we build a new oldList from newList and oldList, i.e. we replace each oldList object with the matching newList object. If you only want some of the new object's properties, you can add a Copy method to your class.
For example, with a Copy method:
oldList = oldList.Select(ele => { return (newList.Any(i => i.id == ele.id) ? ele.Copy(newList.FirstOrDefault(newObj => newObj.id == ele.id)) : ele); }).ToList();
// Changes in your class
public Product Copy(Product prod)
{
// copy the required properties of prod onto this (old) instance
this.id = prod.id;
// return this so the method can be used inside the Select expression
return this;
}
Note that it is not a good idea to iterate over 10k+ elements this way, even with LINQ: newList is scanned for every element of oldList, so it will still cost CPU time.

Suppose your class looks like this:
public class Product
{
public int id;
public string url;
public string otherData;
public Product(int id, string url, string otherData)
{
this.id = id;
this.url = url;
this.otherData = otherData;
}
public Product ChangeProp(Product newProd)
{
this.url = newProd.url;
this.otherData = newProd.otherData;
return this;
}
}
Note that the data class now has a ChangeProp method. It accepts the new instance, copies its properties onto the old instance, and returns the modified old instance (since you want the old object's data replaced by the new object's data). This keeps the LINQ readable and clean.
You already have oldList with lots of entries and need to replace its data with the data from newList wherever the id matches; you can do it as shown below.
Suppose the lists hold data like this:
List<Product> oldList = new List<Product>();
for (int i = 0; i < 10000; i++)
{
oldList.Add(new Product(i, "OldData" + i.ToString(), "OldData" + i.ToString() + "-other"));
}
List<Product> newList = new List<Product>();
for (int i = 0; i < 5; i++)
{
newList.Add(new Product(i, "NewData" + i.ToString(), "NewData" + i.ToString() + "-other"));
}
This LINQ will do the work:
oldList.Where(x => newList.Any(y => y.id == x.id))
.Select(z => oldList[oldList.IndexOf(z)].ChangeProp(newList.Where(a => a.id == z.id).FirstOrDefault())).ToList();

foreach(var product in newList)
{
int index = oldList.FindIndex(x => x.id == product.id);
if (index != -1)
{
oldList[index].url = product.url;
}
}
This works, and I think it is a better solution too.
The solutions above create new objects in memory, and creating a new list with 10k+ records is definitely a bad idea.
Make the fields in Product public; otherwise they won't be accessible.

Strange syntax in ef

var items = context.Items.Where(x => x.IsActive=true).ToList();
Why is this correct syntax and a working query?

This is a very subtle bug in the code. The Where Func needs to return a bool to be valid, but you are setting the value, rather than comparing it, so there's nothing to return, yes?
General Explanation
The code compiles because an assignment in C#, e.g. x = 1, is itself an expression: it is evaluated and yields the value that was assigned (1).
People sometimes use this to lazily instantiate a readonly property, e.g.
private Foo myFoo;
public Foo FooInstance
{
// set myFoo to the existing instance or a new instance
// and return the result of the "myFoo ?? new Foo()" expression
get { return myFoo = myFoo ?? new Foo(); }
}
or assign the same value to multiple variables:
// set z to 1
// set y to result of "z = 1"
// set x to result of "y = z = 1"
x = y = z = 1;
Your Scenario
So what you are doing for each entry in the list is set IsActive to true and return that same true from the function. So then you end up with a new List containing all entries, and all of them have been changed to Active.
If you were using a property in the Where which is not a bool, such as an int, you would get a compilation error:
Cannot implicitly convert type 'int' to 'bool'.
See this as an example (https://dotnetfiddle.net/9S9NAV)
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
public static void Main()
{
var foos = new List<Foo>()
{
new Foo(true, 1),
new Foo(false, 2)
};
// works
var actives = foos.Where(x => x.IsActive=true).ToList();
// returns 2, not 1!
Console.WriteLine(actives.Count);
// compile error
var ages = foos.Where(x => x.Age = 1).ToList();
}
}
public class Foo {
public Foo(bool active, int age)
{
this.IsActive = active;
this.Age = age;
}
public bool IsActive { get; set; }
public int Age { get; set; }
}

Your example works because you are assigning the value rather than comparing it, which is syntactically correct, so it compiles and executes. Note that = is used for assignment and == is used for comparison.
When you say:
var items = context.Items.Where(x => x.IsActive=true).ToList();
you are not comparing but assigning true to IsActive, which is fine for the compiler, and hence you find it working.

Why it is working?
I think it works because 'x.IsActive = true' always evaluates to true, so it is not syntactically incorrect.
In other words the code is saying:
Give me all Items
Set IsActive to true for each Item
Return all Items Where(x => x.IsActive)
So as all items are set to true it will return everything.

Use == to check the condition:
var items = context.Items.Where(x => x.IsActive==true).ToList();
Or
var items = context.Items.Where(x => x.IsActive).ToList();

Sort IEnumerable<Object> by property and ordered array of those properties

Given the following:
public class Foo
{
/* other properties */
public Int32 Id { get; set; }
}
var listOfFoo = new[]{
new Foo { Id = 1 },
new Foo { Id = 2 },
new Foo { Id = 3 }
};
var sortOrderIds = new[]{
2, 3, 1
};
If I wanted to sort listOfFoo to have the Ids end up in the same order as presented in sortOrderIds, what's the best way? I assume I could sort using something like:
Int32 SortUsingIdArrayAsReference(Foo x, Foo y)
{
// compare the positions of the two Ids within sortOrderIds
return Array.IndexOf(sortOrderIds, x.Id).CompareTo(Array.IndexOf(sortOrderIds, y.Id));
}
But is that really the best way to do this? I was hoping LINQ may have something better I could use, but if not oh well. Just looking for other input and see if anyone else has a better way.

If sortOrderIds were a List<int>, you could use List.IndexOf:
var ordered = listOfFoo.OrderBy(o => sortOrderIds.IndexOf(o.Id));
Edit: Since sortOrderIds is an array:
var ordered = listOfFoo.OrderBy(o => Array.IndexOf(sortOrderIds, o.Id));
Or, if you want to use the same code for lists and arrays, cast it to IList<int>:
var ordered = listOfFoo.OrderBy(o => ((IList<int>)sortOrderIds).IndexOf(o.Id));

You could use something like this:
var ordered = listOfFoo.OrderBy(x => Array.IndexOf(sortOrderIds, x.Id));
This would sort them according to the order of the IDs in sortOrderIds. Foo objects whose IDs are not found will be at the very top of the resulting list.
If you want them to be at the bottom, change the code like this:
var ordered = listOfFoo.OrderBy(x =>
{
var idx = Array.IndexOf(sortOrderIds, x.Id);
return idx == -1 ? int.MaxValue : idx;
});
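For long reference arrays, each Array.IndexOf call is a linear scan. One possible variation, sketched here under the assumption that building a lookup first is acceptable, maps every id to its position once and then sorts by that position. OrderByReference and SortByIdOrder are hypothetical names; ids missing from the reference array go to the bottom, as in the snippet above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Foo
{
    public int Id { get; set; }
}

public static class OrderByReference
{
    // Sorts items so that their ids follow the order given in sortOrderIds.
    public static List<T> SortByIdOrder<T>(
        IEnumerable<T> items, Func<T, int> getId, IList<int> sortOrderIds)
    {
        // Map each id to its position; keep the first position if an id repeats.
        var position = new Dictionary<int, int>();
        for (int i = 0; i < sortOrderIds.Count; i++)
        {
            if (!position.ContainsKey(sortOrderIds[i]))
            {
                position[sortOrderIds[i]] = i;
            }
        }

        // Unknown ids get int.MaxValue, so they sort to the end.
        // OrderBy is stable, so they keep their relative order there.
        return items
            .OrderBy(x => position.TryGetValue(getId(x), out int idx) ? idx : int.MaxValue)
            .ToList();
    }

    public static void Main()
    {
        var listOfFoo = new[]
        {
            new Foo { Id = 1 }, new Foo { Id = 2 }, new Foo { Id = 3 }, new Foo { Id = 4 }
        };
        var sortOrderIds = new[] { 2, 3, 1 };

        var ordered = SortByIdOrder(listOfFoo, f => f.Id, sortOrderIds);
        Console.WriteLine(string.Join(", ", ordered.Select(f => f.Id))); // 2, 3, 1, 4
    }
}
```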

How to access to a field of a dynamic type?

I'm trying to wrap the results of a query in a class called QueryResultViewModel, starting from a list of dynamic objects retrieved by LINQ. Each of these contains a numeric field called Worked. I cannot use a non-dynamic type because, depending on the query, the objects have other fields. I tried this:
var query = new HoursQuery( .. parameters .. );
this.Result = new ObservableCollection<QueryResultViewModel>(
query.Execute().Select( x => new QueryResultViewModel( x.Worked )));
But I got "'object' does not contain a definition for 'Worked'" and I don't know if it can be fixed without changing the query's return type.
The Execute code may be useful too:
var res = some_list.GroupBy(a => new { a.Employee, a.RelatedTask, a.Start.Month })
.Select(g => new { K = g.Key, Worked = g.Sum(s => s.Duration.TotalHours) });
EDIT: This worked great but maybe it's not very elegant.
public class HQueryDTO
{
public double Worked;
public object K;
}
public IEnumerable<dynamic> Execute()
{
var list = base.Execute();
return list.GroupBy(a => new { a.Employee, a.RelatedTask } )
.Select(g => new HQueryDTO { K = g.Key, Worked = g.Sum(s => s.Duration.TotalHours) });
}
Now that the result has a named type, it can be returned as dynamic.

I'm assuming you get that error at compile-time, in which case simply introduce dynamic via a cast:
.Select(x => new QueryResultViewModel( ((dynamic)x).Worked ))

I assume that the signature of Execute is something like object Execute(). If you return dynamic, it should work.
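As a rough illustration of returning dynamic, the sketch below has a stand-in Execute method that returns IEnumerable<dynamic> over anonymous objects. DynamicDemo, WorkedValues and the sample rows are invented for the example and are not the asker's HoursQuery. One caveat to keep in mind: anonymous types are internal, so dynamic member access like this only works when the consuming code lives in the same assembly as the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DynamicDemo
{
    // Hypothetical stand-in for the query's Execute(); the rows are made up.
    public static IEnumerable<dynamic> Execute()
    {
        var rows = new[]
        {
            new { Employee = "Ann", Hours = 2.5 },
            new { Employee = "Ann", Hours = 1.5 },
            new { Employee = "Bob", Hours = 3.0 },
        };

        // The anonymous result converts to IEnumerable<dynamic> implicitly.
        return rows
            .GroupBy(r => r.Employee)
            .Select(g => new { K = g.Key, Worked = g.Sum(r => r.Hours) });
    }

    // Because each element is dynamic, x.Worked is resolved at run time.
    public static List<double> WorkedValues()
    {
        return Execute().Select(x => (double)x.Worked).ToList();
    }

    public static void Main()
    {
        foreach (double worked in WorkedValues())
        {
            Console.WriteLine(worked);
        }
    }
}
```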

LINQ Combine Queries

I have two collections of objects of different types. Let's call them type ALPHA and type BRAVO. Each of these types has a property that is the "ID" for the object. No ID is duplicated within a class, so for any given ID there is at most one ALPHA and one BRAVO instance. What I need to do is divide them into 3 categories:
Instances of the ID in ALPHA which do not appear in the BRAVO collection;
Instances of the ID in BRAVO which do not appear in the ALPHA collection;
Instances of the ID which appear in both collections.
In all 3 cases, I need to have the actual objects from the collections at hand for subsequent manipulation.
I know for the #3 case, I can do something like:
var myCorrelatedItems = myAlphaItems.Join(myBravoItems, alpha => alpha.Id, bravo => bravo.Id, (alpha, bravo) => new
{
alpha = alpha,
beta = bravo
});
I can also write code for the #1 and #2 cases which look something like
var myUnmatchedAlphas = myAlphaItems.Where(alpha=>!myBravoItems.Any(bravo=>alpha.Id==bravo.Id));
And similarly for unMatchedBravos. Unfortunately, this would result in iterating the collection of alphas (which may be very large!) many times, and the collection of bravos (which may also be very large!) many times as well.
Is there any way to unify these query concepts so as to minimize iteration over the lists? These collections can have thousands of items.

If you are only interested in the IDs,
var alphaIds = myAlphaItems.Select(alpha => alpha.ID);
var bravoIds = myBravoItems.Select(bravo => bravo.ID);
var alphaIdsNotInBravo = alphaIds.Except(bravoIds);
var bravoIdsNotInAlpha = bravoIds.Except(alphaIds);
If you want the alphas and bravos themselves,
var alphaIdsSet = new HashSet<int>(alphaIds);
var bravoIdsSet = new HashSet<int>(bravoIds);
var alphasNotInBravo = myAlphaItems
.Where(alpha => !bravoIdsSet.Contains(alpha.ID));
var bravosNotInAlpha = myBravoItems
.Where(bravo => !alphaIdsSet.Contains(bravo.ID));
EDIT:
A few other options:
The ExceptBy method from MoreLinq.
The Enumerable.ToDictionary method.
If both types inherit from a common type (e.g. an IHasId interface), you could write your own IEqualityComparer<T> implementation; Enumerable.Except has an overload that accepts an equality-comparer as a parameter.
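The last option could look roughly like the sketch below. It assumes both classes can implement a shared IHasId interface (the name comes from the suggestion above; the interface itself, IdComparer and Partition are hypothetical and written for this example). Except then compares elements by id only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shared interface; Alpha and Bravo are reduced to their id.
public interface IHasId { int Id { get; } }
public class Alpha : IHasId { public int Id { get; set; } }
public class Bravo : IHasId { public int Id { get; set; } }

// Treats two objects as equal when their ids match, regardless of type.
public sealed class IdComparer : IEqualityComparer<IHasId>
{
    public bool Equals(IHasId x, IHasId y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Id == y.Id;
    }

    public int GetHashCode(IHasId obj)
    {
        return obj.Id.GetHashCode();
    }
}

public static class Partition
{
    // Except<IHasId> relies on IEnumerable<T> covariance to accept both sequences.
    // Its result only contains elements of the first sequence, so the cast is safe.
    public static List<Alpha> AlphasOnly(IEnumerable<Alpha> alphas, IEnumerable<Bravo> bravos)
    {
        return alphas.Except<IHasId>(bravos, new IdComparer()).Cast<Alpha>().ToList();
    }

    public static List<Bravo> BravosOnly(IEnumerable<Alpha> alphas, IEnumerable<Bravo> bravos)
    {
        return bravos.Except<IHasId>(alphas, new IdComparer()).Cast<Bravo>().ToList();
    }

    public static void Main()
    {
        var alphas = new List<Alpha> { new Alpha { Id = 1 }, new Alpha { Id = 2 }, new Alpha { Id = 3 } };
        var bravos = new List<Bravo> { new Bravo { Id = 2 }, new Bravo { Id = 3 }, new Bravo { Id = 4 } };

        Console.WriteLine(string.Join(", ", AlphasOnly(alphas, bravos).Select(a => a.Id))); // 1
        Console.WriteLine(string.Join(", ", BravosOnly(alphas, bravos).Select(b => b.Id))); // 4
    }
}
```

Except builds a hash set from the second sequence internally, so each collection is walked once per call rather than once per element.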

Sometimes LINQ is not the answer. This is the kind of problem where I would consider using a HashSet<T> with a custom comparer to reduce the work of performing set operations. HashSets are much more efficient at performing set operations than lists - and (depending on the data) can reduce the work considerably:
// wrapper class that can accommodate either an Alpha or a Bravo
class ABItem {
public object Instance { get; private set; }
public int Id { get; private set; }
public ABItem( Alpha a ) { Instance = a; Id = a.Id; }
public ABItem( Bravo b ) { Instance = b; Id = b.Id; }
}
// equality comparer that treats two ABItems as equal when their ids match
// (HashSet<T> needs an IEqualityComparer<T>, not an IComparer)
class ABItemComparer : IEqualityComparer<ABItem> {
public bool Equals( ABItem a, ABItem b ) {
if( a == null || b == null ) return a == null && b == null;
return a.Id == b.Id;
}
public int GetHashCode( ABItem x ) {
return x.Id.GetHashCode();
}
}
// create a comparer based on comparing the IDs of ABItems
var comparer = new ABItemComparer();
var hashAlphas =
new HashSet<ABItem>(myAlphaItems.Select(x => new ABItem(x)), comparer);
var hashBravos =
new HashSet<ABItem>(myBravoItems.Select(x => new ABItem(x)), comparer);
// items with common IDs in the Alpha and Bravo sets;
// IntersectWith modifies the set in place, so work on a copy
// (the copy holds the Alpha-side wrappers of the common IDs)
var hashCommon = new HashSet<ABItem>(hashAlphas, comparer);
hashCommon.IntersectWith( hashBravos );
hashAlphas.ExceptWith( hashCommon ); // items only in Alpha
hashBravos.ExceptWith( hashCommon ); // items only in Bravo

Dictionary<int, Alpha> alphaDictionary = myAlphaItems.ToDictionary(a => a.Id);
Dictionary<int, Bravo> bravoDictionary = myBravoItems.ToDictionary(b => b.Id);
ILookup<string, int> keyLookup = alphaDictionary.Keys
.Union(bravoDictionary.Keys)
.ToLookup(x => alphaDictionary.ContainsKey(x) ?
(bravoDictionary.ContainsKey(x) ? "both" : "alpha") :
"bravo");
List<Alpha> alphaBoth = keyLookup["both"].Select(x => alphaDictionary[x]).ToList();
List<Bravo> bravoBoth = keyLookup["both"].Select(x => bravoDictionary[x]).ToList();
List<Alpha> alphaOnly = keyLookup["alpha"].Select(x => alphaDictionary[x]).ToList();
List<Bravo> bravoOnly = keyLookup["bravo"].Select(x => bravoDictionary[x]).ToList();

Here is one possible LINQ solution that performs a full outer join on both sets and appends a property to them showing which group they belong to. This solution might lose its luster, however, when you try to separate the groups into different variables. It all really depends on what kind of actions you need to perform on these objects. At any rate this ran at (I thought) an acceptable speed (.5 seconds) for me on lists of 5000 items:
var q =
from g in
(from id in myAlphaItems.Select(a => a.ID).Union(myBravoItems.Select(b => b.ID))
join a in myAlphaItems on id equals a.ID into ja
from a in ja.DefaultIfEmpty()
join b in myBravoItems on id equals b.ID into jb
from b in jb.DefaultIfEmpty()
select (a == null ?
new { ID = b.ID, Group = "Bravo Only" } :
(b == null ?
new { ID = a.ID, Group = "Alpha Only" } :
new { ID = a.ID, Group = "Both" }
)
)
)
group g.ID by g.Group;
You can remove the 'group by' query or create a dictionary from this (q.ToDictionary(x => x.Key, x => x.Select(y => y))), or whatever! This is simply a way of categorizing your items. I'm sure there are better solutions out there, but this seemed like a truly interesting question so I thought I might as well give it a shot!

I think LINQ is not the best answer to this problem if you want to traverse and compare the minimum amount of times. I think the following iterative solution is more performant. And I believe that code readability doesn't suffer.
var dictUnmatchedAlphas = myAlphaItems.ToDictionary(a => a.Id);
var myCorrelatedItems = new List<AlphaAndBravo>();
var myUnmatchedBravos = new List<Bravo>();
foreach (Bravo b in myBravoItems)
{
var id = b.Id;
if (dictUnmatchedAlphas.ContainsKey(id))
{
var a = dictUnmatchedAlphas[id];
dictUnmatchedAlphas.Remove(id); //to get just the unmatched alphas
myCorrelatedItems.Add(new AlphaAndBravo { a = a, b = b});
}
else
{
myUnmatchedBravos.Add(b);
}
}
Definition of AlphaAndBravo:
public class AlphaAndBravo {
public Alpha a { get; set; }
public Bravo b { get; set; }
}
