I have a list of tuples in which both items are of the same type. The list-of-tuples structure is needed until we get to error handling. To optimize error handling, I would like to flatten the tuples into a single list so that I can check for duplicates:
For instance if I had List<Tuple<string,string>>() (my types are more complex but the idea should hold):
[<"Tom","Dick">, <"Dick","Harry">, <"Bob","John">]
I would like to end up with:
["Tom", "Dick", "Harry", "Bob", "John"]
I know I could do this with:
List<string> stringList = new List<string>();
foreach (var item in tupleList) {
    stringList.Add(item.Item1);
    stringList.Add(item.Item2);
}
// Distinct() returns an IEnumerable<string>, so convert back to a list
stringList = stringList.Distinct().ToList();
But I am hoping for a more efficient way, perhaps something built into Linq. There is no guarantee of duplicates, but due to the performance cost of error handling, I would rather handle each only once.
If you need distinct items without order - use HashSet:
HashSet<string> stringList = new HashSet<string>();
foreach(var item in tupleList){
stringList.Add(item.Item1);
stringList.Add(item.Item2);
}
You can write similar code with LINQ, but it will not be faster (and probably not better looking, as you need to convert each Tuple to an enumerable for most operations). You can try Aggregate if you are really looking for LINQ.
It's simple using the LINQ SelectMany method:
var tupleList = Enumerable.Range(0, 10).Select(i => new Tuple<string, string>(i.ToString(), "just for example")); // tuples
var trg = tupleList.SelectMany(t => new[] { t.Item1, t.Item2 }).Distinct();
One line, however still not a tetris in one line Ж)
As a slight variation on the above you could also do this:
HashSet<string> hash= new HashSet<string>();
tupleList.ForEach(l => hash.UnionWith(new string[] { l.Item1, l.Item2 }));
Although I like the SelectMany example in the comments of the question
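As a rough side-by-side sketch of the two approaches suggested above (SelectMany plus Distinct, and a HashSet filled in a loop): the class and method names below are hypothetical and only there for illustration. Note that Distinct() is documented as returning an unordered sequence, so neither version should be relied on for ordering.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TupleFlattening
{
    // Hypothetical helper: flattens both items of every tuple
    // and removes repeated values.
    public static List<string> DistinctItems(IEnumerable<Tuple<string, string>> tuples)
    {
        return tuples
            .SelectMany(t => new[] { t.Item1, t.Item2 }) // two values per tuple
            .Distinct()                                  // drop repeats
            .ToList();
    }

    // Hypothetical helper: same values collected into a set,
    // for when order does not matter.
    public static HashSet<string> DistinctItemSet(IEnumerable<Tuple<string, string>> tuples)
    {
        var set = new HashSet<string>();
        foreach (var t in tuples)
        {
            set.Add(t.Item1); // Add returns false and does nothing for a duplicate
            set.Add(t.Item2);
        }
        return set;
    }
}
```

With the example data from the question, both helpers should produce the five names Tom, Dick, Harry, Bob and John.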
When I want to get an IEnumerable to eagerly materialize/yield all its results I usually use ToList() like this:
var myList= new List<int>();
IEnumerable<int> myXs = myList.Select(item => item.x).ToList();
I do this usually when locking a method returning the result of a Linq query.
In these kinds of cases I am not actually interested in the collection becoming a list, and I often don't want to know its type. I am just using ToList() for its side effect - yielding all the elements.
If, for example, I change the type from List to Array, I will also have to remember to change ToList() to ToArray() or suffer some performance hit.
I can do foreach( var e in myList ) { } but I am not sure whether this will be optimized away at some point?
I am looking for something like myList.Select(item => item.x).yield()
What is the best way to do it? Is there a way to simply tell a LINQ result to yield all its elements that is better than ToList()?
If the point is just to exercise the list, and you don't want to construct or allocate an array of any kind, you can use Last(), which will simply iterate over all the elements until it gets to the last one (see source).
If you are actually interested in the results, in most cases you should simply use ToList() and don't overthink it.
There is no way to avoid allocating some sort of storage if you want to retrieve the results later. There is no magic IEnumerable<T> container that has no concrete type; you have to choose one, and ToList() is the most obvious choice with low overhead.
Don't forget ToListAsync() if you'd rather not wait for it to finish.
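A minimal sketch of forcing a lazy query to run without storing its results, assuming a hypothetical extension method named Drain (it is not part of LINQ). An explicit loop is used rather than Last() because Last() can take shortcuts for some sources (for example, it can index directly into an IList<T>), in which case not every element is necessarily visited.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EagerEnumeration
{
    // Hypothetical helper: runs the sequence to completion and
    // discards every element, so no collection is allocated.
    public static void Drain<T>(this IEnumerable<T> source)
    {
        foreach (var item in source) { }
    }

    // Counts how often the projection runs when the query is drained.
    public static int CountProjectionCalls(IEnumerable<int> input)
    {
        int calls = 0;
        IEnumerable<int> query = input.Select(i => { calls++; return i * 2; });
        // Select is lazy: the lambda has not run at this point.
        query.Drain();
        return calls;
    }
}
```

If the results are needed afterwards, this does not help: the query would run a second time on the next enumeration, which is why ToList() remains the usual answer.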
Just an FYI, since maybe that is the issue:
You don't have to write LINQ operations as a one-liner; you can build the query up step by step.
For example:
var myList = new List<T>();
var result = myList.Select(x => x.Foo).Where(x => x.City == "Vienna").Where(x => x.Big == true).ToList();
Could be re-written to:
var myList = new List<T>();
//get an IEnumerable<Foo>
var foos = myList.Select(x => x.Foo);
//get an IEnumerable<Foo> which is filtered by the City Vienna
var foosByCity = foos.Where(x => x.City == "Vienna");
//get an IEnumerable<Foo> which is further filtered by Big == true
var foosByCityByBig = foosByCity.Where(x => x.Big == true);
//now you could call ToList() on the last IEnumerable, but you don't have to
var result = foosByCityByBig.ToList();
So whatever your real goal is, maybe you can change your line
var myList= new List<int>();
IEnumerable<int> myXs = myList.Select(item => item.x).ToList();
To this:
var myList= new List<int>();
IEnumerable<int> myXs = myList.Select(item => item.x);
And continue your work with myXs as an IEnumerable<int>.
I have list of objects of a class for example:
class MyClass
{
    public string id;
    public string name;
    public string lastname;
}
so for example: List<MyClass> myClassList;
and also I have list of string of some ids, so for example:
List<string> myIdList;
Now I am looking for a method that accepts these two as parameters and returns a List<MyClass> containing the objects whose id appears in myIdList.
NOTE: Always the bigger list is myClassList and always myIdList is a smaller subset of that.
How can we find this intersection?
So you're looking to find all the elements in myClassList where myIdList contains the ID? That suggests:
var query = myClassList.Where(c => myIdList.Contains(c.id));
Note that if you could use a HashSet<string> instead of a List<string>, each Contains test will potentially be more efficient - certainly if your list of IDs grows large. (If the list of IDs is tiny, there may well be very little difference at all.)
It's important to consider the difference between a join and the above approach in the face of duplicate elements in either myClassList or myIdList. A join will yield every matching pair - the above will yield either 0 or 1 element per item in myClassList.
Which of those you want is up to you.
EDIT: If you're talking to a database, it would be best if you didn't use a List<T> for the entities in the first place - unless you need them for something else, it would be much more sensible to do the query in the database than fetching all the data and then performing the query locally.
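To make the duplicate-handling difference between the two approaches concrete, here is a small sketch. The IdMatching class and its method names are hypothetical; MyClass mirrors the class from the question, with public fields assumed so the code compiles.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyClass
{
    public string id;
    public string name;
    public string lastname;
}

public static class IdMatching
{
    // Where + Contains: at most one result per item in items,
    // no matter how often an id is repeated in ids.
    public static List<MyClass> FilterByIds(List<MyClass> items, List<string> ids)
    {
        var idSet = new HashSet<string>(ids); // hash lookup instead of a linear scan
        return items.Where(c => idSet.Contains(c.id)).ToList();
    }

    // Join: one result per matching (item, id) pair,
    // so a repeated id repeats the matching item.
    public static List<MyClass> JoinByIds(List<MyClass> items, List<string> ids)
    {
        return items.Join(ids, item => item.id, id => id, (item, id) => item).ToList();
    }
}
```

For items with ids "1", "2", "3" and an id list of "1", "1", "3", FilterByIds returns two objects while JoinByIds returns three, because the item with id "1" is paired with both copies of that id.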
That isn't strictly an intersection (unless the ids are unique), but you can simply use Contains, i.e.
var sublist = myClassList.Where(x => myIdList.Contains(x.id));
You will, however, get significantly better performance if you create a HashSet<T> first:
var hash = new HashSet<string>(myIdList);
var sublist = myClassList.Where(x => hash.Contains(x.id));
You can use a join between the two lists:
return myClassList.Join(
    myIdList,
    item => item.id,
    id => id,
    (item, id) => item)
    .ToList();
This is a kind of intersection between the two lists, so read it as "I want the items from one list that are also present in the second list". The ToList() call executes the query immediately.
var lst = myClassList.Where(x => myIdList.Contains(x.id)).ToList();
You can use the code below:
var samedata = myClassList.Where(p => myIdList.Any(q => q == p.id));
myClassList.Where(x => myIdList.Contains(x.id));
Try
List<MyClass> GetMatchingObjects(List<MyClass> classList, List<string> idList)
{
return classList.Where(myClass => idList.Any(x => myClass.id == x)).ToList();
}
var q = myClassList.Where(x => myIdList.Contains(x.id));
In the following code:
var cats = new List<string>() {"cat1", "cat2"};
var dogs = new List<string>() {"dog1", "dog2"};
var animals = new List<Animal>();
animals = (cats.Select(x => new Animal() {Type = "Cat", Name = x}).ToList().AddRange(
dogs.Select(x => new Animal() {Type = "Dog", Name = x}).ToList())).ToList();
Calling the ToList() at the end is an error, because AddRange() returns void. This doesn't seem right nowadays when using Linq type queries.
I found I could change it to .Union() or .Concat() to fix the issue, but shouldn't AddRange() be updated, or is there a reason for it returning void?
AddRange changes the underlying List object. No LINQ method does that. So it's fundamentally different and should not be used in a LINQ concatenation. Its return value of void reflects that.
You've answered your own question. If you want distinct values from the two lists use Union, if you want to just join the two lists use Concat. Once the two enumerables have been joined you can call ToList().
AddRange is a method on the List itself and has nothing to do with LINQ.
AddRange is a method on List<T> that pre-dates LINQ. It mutates the current list in situ and so doesn't need to return it (nor does it follow the fluent syntax style you find a lot these days). List<T> is not immutable, so mutating method calls are fine.
There are linq methods that can join lists together (as you've noted in the question). I would tend to not have mutating actions embedded in a linq method chain as it goes against the general idea that linq is just a query / projection set-up and doesn't "update" things.
In your case, it is better to use Enumerable.Concat:
animals = cats.Select(x => new Animal() {Type = "Cat", Name = x})
.Concat(dogs.Select(x => new Animal() {Type = "Dog", Name = x})).ToList();
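A sketch contrasting the two styles discussed above, assuming a minimal Animal class with Type and Name properties (the question does not show its definition) and a hypothetical AnimalLists helper class. Concat builds a new sequence and leaves its inputs alone, while AddRange mutates the list it is called on and returns nothing, so it has to be a statement of its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the Animal class from the question.
public class Animal
{
    public string Type { get; set; }
    public string Name { get; set; }
}

public static class AnimalLists
{
    // Query style: nothing is mutated, the chain ends in ToList().
    public static List<Animal> BuildWithConcat(List<string> cats, List<string> dogs)
    {
        return cats.Select(x => new Animal { Type = "Cat", Name = x })
                   .Concat(dogs.Select(x => new Animal { Type = "Dog", Name = x }))
                   .ToList();
    }

    // Mutating style: AddRange returns void, so it cannot be chained.
    public static List<Animal> BuildWithAddRange(List<string> cats, List<string> dogs)
    {
        var animals = cats.Select(x => new Animal { Type = "Cat", Name = x }).ToList();
        animals.AddRange(dogs.Select(x => new Animal { Type = "Dog", Name = x }));
        return animals;
    }
}
```

Both methods return the cats followed by the dogs; the difference is only in whether an existing list is modified along the way.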
I have, for example, 5 Lists, all of the same type. Can I simply do
List<T> newset = List1.Concat(List2).Concat(List3).Concat(List4).....
You can do this (although you need .ToList() at the end).
However, it would be (slightly) more efficient to generate a single list, and use AddRange to add in each list. Just initialize the list with the total size of all of your lists, then call AddRange repeatedly.
You might want to consider doing something like:
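// Extension methods must be static and declared inside a static class
public static List<T> ConcatMultiple<T>(this List<T> list, params ICollection<T>[] others)
{
    // Allocate the final capacity once, up front
    List<T> results = new List<T>(list.Count + others.Sum(i => i.Count));
    results.AddRange(list);
    foreach (var l in others)
        results.AddRange(l);
    return results;
}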
Then calling via:
List<MyClass> newset = List1.ConcatMultiple(List2, List3, List4);
Yes, you can do that.
List<Thing> newSet = List1.Concat(List2).Concat(List3).Concat(List4).ToList();
If you want to concatenate an arbitrary (previously unknown) number of lists, then you may need to concatenate a collection of lists. Probably the easiest way to do this would be to use the SelectMany operator (or nested from clauses in LINQ query):
IEnumerable<List<int>> lists = /* get list of lists */;
List<int> result = lists.SelectMany(e => e).ToList();
The SelectMany operator calls the given function for every element of the input list (which is a list) and then concatenates all the resulting lists (the actual lists from your input list of lists). Alternatively using the LINQ query syntax:
List<int> result = (from l in lists
from e in l select e).ToList();
The C# compiler translates this query expression into a SelectMany call, so it does the same thing as the explicit version above. If you have a known number of lists, you can of course write:
List<int> result = (from l in new[] { list1, list2, list3, list4 }
from e in l select e).ToList();
It is not as elegant as defining your own method exactly for this purpose, but it shows how powerful the LINQ query syntax is.
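For comparison, the method-syntax and query-syntax forms can be wrapped in two small generic helpers. The ListFlattening class and both method names are hypothetical; this is only a sketch of the technique described above, not a recommendation over AddRange where performance matters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListFlattening
{
    // Method syntax: SelectMany with an identity selector
    // concatenates the inner sequences in order.
    public static List<T> Flatten<T>(IEnumerable<IEnumerable<T>> lists)
    {
        return lists.SelectMany(list => list).ToList();
    }

    // Query syntax: the nested from clauses express the same flattening.
    public static List<T> FlattenQuery<T>(IEnumerable<IEnumerable<T>> lists)
    {
        return (from list in lists
                from element in list
                select element).ToList();
    }
}
```

Either helper accepts any number of lists, including empty ones, and returns their elements in the order the lists were supplied.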
You can, but do not forget to append .ToList() at the end.
You can also call newset.AddRange(ListX); I think it is better in terms of performance.
For variable list count:
IEnumerable<T> enumerable = Enumerable.Empty<T>();
foreach (List<T> list in [whatever])
enumerable = enumerable.Concat(list);
At the end you could add a "ToList()" if you want a real List:
List<T> list = enumerable.ToList();
However, this might not be needed.
You certainly can do that, though it may not be incredibly efficient.
As stated by other answers, don't forget to add .ToList() to the end of your line of code, or use List1.AddRange(List2); List1.AddRange(List3); ... for added efficiency.
Or you can use Union in LINQ, if a real union is what you want, of course...
What am I missing here?
I want to do a simple call to Select() like this:
List<int> list = new List<int>();
//fill the list
List<int> selections = (List<int>)list.Select(i => i*i); //for example
And I keep having trouble casting it. What am I missing?
Select() will return you an IEnumerable<int> type, you have to use the ToList() operator:
List<int> selections = list.Select(i => i*i).ToList();
Select() doesn't return a List so of course you can't cast it to a list.
You can use the ToList method instead:
list.Select(i => i*i).ToList();
As others have said, Select returns an IEnumerable<T> which isn't actually a list - it's the result of a lazily-evaluated iterator block.
However, if you're dealing with lists and you want a list back out with nothing other than a projection, using List<T>.ConvertAll will be more efficient as it's able to create the new list with the right size immediately:
List<int> selections = list.ConvertAll(i => i*i);
Unless you particularly care about the efficiency, however, I'd probably stick to Select as it'll give you more consistency with other LINQ code.
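The points made in these answers can be checked with a short sketch. The Projections class and its method names are hypothetical. The cast from the question compiles, because an explicit conversion from IEnumerable<int> to List<int> is permitted, but it fails at run time since the object returned by Select is not a List<int>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Projections
{
    // LINQ style: lazy projection, then materialized into a new list.
    public static List<int> SquaresWithSelect(List<int> list)
    {
        return list.Select(i => i * i).ToList();
    }

    // List<T>.ConvertAll: builds the new list directly from the old one.
    public static List<int> SquaresWithConvertAll(List<int> list)
    {
        return list.ConvertAll(i => i * i);
    }

    // Returns true if the cast from the question throws at run time.
    public static bool CastThrows(List<int> list)
    {
        try
        {
            _ = (List<int>)list.Select(i => i * i);
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```

Both projection methods return a new list and leave the input unchanged.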