C# Collections for Distinct Data - c#

I have a bunch of string data and I can loop through it one by one. What's a good collection (and how to implement it) so that I only get the distinct strings?
The client I am doing this for doesn't even use .NET 3.5 so .Distinct is out. They use .NET framework 2.0.
And I am reading the list one at a time and don't know how many records it will have until I'm done.

One way is using Distinct to make your strings unique:
List<string> a = new List<string>();
a.AddRange(new string[] { "a", "b", "a", "c", "d", "b" });
List<string> b = new List<string>();
b.AddRange(a.Distinct());
Another resource on LINQ's Distinct: http://blogs.msdn.com/b/charlie/archive/2006/11/19/linq-farm-group-and-distinct.aspx
Another way: use a HashSet<string>, as others suggested (note that HashSet<T> was only introduced in .NET 3.5):
HashSet<string> hash = new HashSet<string>(inputStrings);
Have a look at this link to see how to implement it in .NET 2.0: https://stackoverflow.com/a/687042/284240
If you're not on 3.5, you also can do it manually:
List<string> newList = new List<string>();
foreach (string s in list)
{
if (!newList.Contains(s))
newList.Add(s);
}
// newList contains the unique values
Another solution (maybe a little faster):
Dictionary<string,bool> dic = new Dictionary<string,bool>();
foreach (string s in list)
{
dic[s] = true;
}
List<string> newList = new List<string>(dic.Keys);
// newList contains the unique values
https://stackoverflow.com/a/1205813/284240

If you're using .NET 3.5 or above, put the strings in a List<> and use the LINQ method Distinct().
using System.Linq;
IEnumerable<string> strs = new List<string>(new[] { "one", "two", "three", "one" });
var distinct = strs.Distinct();
In .NET 2.0 you have no choice but to do it manually.

Perhaps I'm being dense and not fully understanding the question but can't you just use a regular List and just use the .Contains method to check if each string exists in the list before adding it in the loop? You might need to keep an eye on performance if you have a lot of strings.
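To illustrate the performance point, here is a minimal sketch that stays within .NET 2.0 (no HashSet<T>, no LINQ). It keeps a Dictionary next to the result list so that the "have I seen this?" check does not scan the whole list each time. The names DistinctCollector and Collect are made up for this example, and it assumes the input contains no null strings (a null key would throw).

```csharp
using System;
using System.Collections.Generic;

public static class DistinctCollector
{
    // Returns the distinct strings from 'source' in first-seen order.
    // Uses only types available in .NET 2.0 (hypothetical helper).
    public static List<string> Collect(IEnumerable<string> source)
    {
        // The dictionary acts as a set; the bool value is just a placeholder.
        Dictionary<string, bool> seen = new Dictionary<string, bool>();
        List<string> result = new List<string>();

        foreach (string s in source)
        {
            if (!seen.ContainsKey(s))
            {
                seen.Add(s, true);
                result.Add(s);
            }
        }
        return result;
    }
}

public static class Demo
{
    public static void Main()
    {
        string[] input = { "a", "b", "a", "c", "d", "b" };
        foreach (string s in DistinctCollector.Collect(input))
        {
            Console.WriteLine(s); // prints a, b, c, d on separate lines
        }
    }
}
```

Because the helper only needs a foreach over the source, it also fits the case where the strings arrive one at a time and the total count is unknown.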

Related

Get unique elements in two lists where items don't match 100% (just partially)

My idea is to compare one list (List1) with another list (List2) and create a new list (List3) that excludes the elements common to both lists, leaving only the non-common elements. The difficult thing (for me) is that the List1 and List2 elements are not exact matches: a List1 element might be part of a List2 element without being equal to it. Except does not seem to allow using IndexOf to compare the elements of the two lists.
Does anyone have an idea how to achieve this?
Thanks in advance.
Assuming you have List1 and List2, below is the simplest way to compare the elements of the two lists.
IList<string> List3 = new List<string>();
foreach (var item1 in List1)
{
    // Compare against List2 (List3 is the output and starts out empty).
    foreach (var item2 in List2)
    {
        if (item1 == item2)
        {
            // Exact match found in both lists.
            List3.Add(item1);
        }
    }
}
My idea is to use a new list (List1) and compare it with another list
(List2) and create a new list (List3) that exclude all common elements
in both lists and results on the non common elements.
From Comments
I need to compare each element in both lists List1 element exists in
List2 element (both strings).
One of the easiest ways to find unique from two lists
var List1 = new List<string>() { "a", "b", "c", "d" };
var List2 = new List<string>() { "a", "e", "f", "g", "c","z" };
var List3 = new List<string>();
List3.AddRange(List1.Except(List2));
List3.AddRange(List2.Except(List1));
List3.ForEach(l=>Console.WriteLine(l));
How about this:
List<string> commonElements = new List<string>();
foreach (var smallString in SmallList)
{
if (large.Any(x => x.Contains(smallString)))
{
// Add to common elements
commonElements.Add(smallString);
}
}
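Building on the Contains idea above, here is a rough sketch for the non-common elements the question asks about. It assumes that "partial match" means a longer string contains a shorter one (ordinal, case-sensitive); that rule, and the names PartialMatch and NonCommon, are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PartialMatch
{
    // Returns the elements that have no partial match in the other list.
    // A pair "matches" when the long string contains the short one.
    public static List<string> NonCommon(List<string> shortItems, List<string> longItems)
    {
        var result = new List<string>();

        // Short items that do not occur inside any long item.
        result.AddRange(shortItems.Where(s => !longItems.Any(l => l.Contains(s))));

        // Long items that do not contain any short item.
        result.AddRange(longItems.Where(l => !shortItems.Any(s => l.Contains(s))));

        return result;
    }
}

public static class Demo
{
    public static void Main()
    {
        var list1 = new List<string> { "apple", "kiwi" };
        var list2 = new List<string> { "green apple pie", "banana split" };

        foreach (string item in PartialMatch.NonCommon(list1, list2))
        {
            Console.WriteLine(item); // prints kiwi, then banana split
        }
    }
}
```

This is quadratic in the list sizes, which is usually fine for small lists; for large ones a different matching strategy would be needed.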

Get part of list after certain value

I have a simple question. I have a list:
List<string> test = new List<string> {"one", "two", "three", "four"};
Now I want to take, for example, the value "three" and get it together with all the elements after it, so the result would look like:
List<string> test = new List<string> {"three", "four"};
But we do not know where the list ends, so it can hold any number of elements and we cannot define the end as a constant.
Is it possible?
It sounds like you're looking for SkipWhile from LINQ:
test = test.SkipWhile(x => x != "three").ToList();
That will skip everything until (but not including) the "three" value, then include everything else. It then converts it to a list again.
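A small sketch of how SkipWhile behaves in the edge cases, wrapped in a helper whose names (Tail, From) are invented for this example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Tail
{
    // Returns 'marker' and everything after its first occurrence.
    // If 'marker' is absent, SkipWhile skips every element and the result is empty.
    public static List<string> From(List<string> source, string marker)
    {
        return source.SkipWhile(x => x != marker).ToList();
    }
}

public static class Demo
{
    public static void Main()
    {
        var test = new List<string> { "one", "two", "three", "four" };

        Console.WriteLine(string.Join(",", Tail.From(test, "three").ToArray())); // three,four
        Console.WriteLine(Tail.From(test, "five").Count);                        // 0
    }
}
```

Note that only the first occurrence of the marker matters: once SkipWhile stops skipping, later elements are kept even if they would match the predicate again.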
Since you assign the filtered list back to the initial one, you can just remove the leading items up to the "three" item:
int count = test.IndexOf("three");
test.RemoveRange(0, count < 0 ? test.Count : count);
This implementation doesn't create additional list, but modifies existing one.
This might do the trick for you
var list2 = test.Skip(2).Take(test.Count).ToList();
or better
var list3 = test.Skip(2).ToList();
Without LINQ it could be done something like this
List<string> outtest = new List<string>();
bool drty = false;
foreach(string st in test)
{
if(st == "three") //or whatever is the input.
drty = true;
if(drty)
outtest.Add(st);
}

eliminating duplicated lists from a nested list c#

I have a nested list that contains a set of lists. Some of these lists are duplicates, and I just want to make a second list without the duplicated lists. I tried this:
List<List<string>> liste1 = new List<List<string>>();
List<List<string>> liste2 = new List<List<string>>();
List<string> l1 = new List<string> { "a", "b", "c" };
List<string> l2 = new List<string> { "h", "x", "g" };
List<string> l3 = new List<string> { "a", "b", "c" };
List<string> l4 = new List<string> { "z", "t", "n" };
liste1.Add(l1);
liste1.Add(l2);
liste1.Add(l3);
liste1.Add(l4);
foreach (List<string> lis in liste1)
{
if(!liste2.Contains(lis))
{
liste2.Add(lis);
}
}
It seems easy, but it's not working. Any help will be appreciated. Thanks.
Using Linq, you could achieve this.
You could use the LINQ extension methods and compare two lists with SequenceEqual. If the order is not important, use the Except extension (something like ...s.Except(x).Any()).
var liste2= liste1.Where((x,i)=> !liste1.Skip(i+1).Any(s=>s.SequenceEqual(x)));
You are checking for reference equality. Instead of using Contains try this substitution
//if (!liste2.Contains(lis))
if(!liste2.Any(subList => subList.SequenceEqual(lis)))
SequenceEqual is an extension method on IEnumerable<T>, so you will need a using directive that imports the System.Linq namespace.
If you want to test the child lists for set equality rather than sequence equality (i.e. the order is not important), then consider using an implementation of ISet<T> such as HashSet<string> instead of List<string>, and compare with SetEquals.
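For the set-equal case, one possible sketch uses HashSet<T>.CreateSetComparer(), which compares sets by content. The helper name DistinctBySet is hypothetical, and the sketch assumes that repeated elements inside one child list can be ignored.

```csharp
using System;
using System.Collections.Generic;

public static class NestedDedup
{
    // Keeps the first list of each group of lists that hold the same set of
    // strings, ignoring order and repeated elements inside a list.
    public static List<List<string>> DistinctBySet(List<List<string>> lists)
    {
        // CreateSetComparer compares HashSet<string> instances by content.
        var seen = new HashSet<HashSet<string>>(HashSet<string>.CreateSetComparer());
        var result = new List<List<string>>();

        foreach (List<string> list in lists)
        {
            // Add returns false when an equal set was already recorded.
            if (seen.Add(new HashSet<string>(list)))
            {
                result.Add(list);
            }
        }
        return result;
    }
}

public static class Demo
{
    public static void Main()
    {
        var liste1 = new List<List<string>>
        {
            new List<string> { "a", "b", "c" },
            new List<string> { "h", "x", "g" },
            new List<string> { "c", "b", "a" }, // same set as the first list
            new List<string> { "z", "t", "n" }
        };

        Console.WriteLine(NestedDedup.DistinctBySet(liste1).Count); // 3
    }
}
```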

Merge two (or more) lists into one, in C# .NET

Is it possible to convert two or more lists into one single list, in .NET using C#?
For example,
public static List<Product> GetAllProducts(int categoryId){ .... }
.
.
.
var productCollection1 = GetAllProducts(CategoryId1);
var productCollection2 = GetAllProducts(CategoryId2);
var productCollection3 = GetAllProducts(CategoryId3);
You can use the LINQ Concat and ToList methods:
var allProducts = productCollection1.Concat(productCollection2)
.Concat(productCollection3)
.ToList();
Note that there are more efficient ways to do this - the above will basically loop through all the entries, creating a dynamically sized buffer. As you can predict the size to start with, you don't need this dynamic sizing... so you could use:
var allProducts = new List<Product>(productCollection1.Count +
productCollection2.Count +
productCollection3.Count);
allProducts.AddRange(productCollection1);
allProducts.AddRange(productCollection2);
allProducts.AddRange(productCollection3);
(AddRange is special-cased for ICollection<T> for efficiency.)
I wouldn't take this approach unless you really have to though.
Assuming you want a list containing all of the products for the specified category-Ids, you can treat your query as a projection followed by a flattening operation. There's a LINQ operator that does that: SelectMany.
// implicitly List<Product>
var products = new[] { CategoryId1, CategoryId2, CategoryId3 }
.SelectMany(id => GetAllProducts(id))
.ToList();
In C# 4, you can shorten the SelectMany to: .SelectMany(GetAllProducts)
If you already have lists representing the products for each Id, then what you need is a concatenation, as others point out.
you can combine them using LINQ:
list = list1.Concat(list2).Concat(list3).ToList();
the more traditional approach of using List.AddRange() might be more efficient though.
List.AddRange will change (mutate) an existing list by adding additional elements:
list1.AddRange(list2); // list1 now also has list2's items appended to it.
Alternatively, in modern immutable style, you can project out a new list without changing the existing lists:
Concat, which yields list1's items followed by list2's items, in their original order and without removing duplicates:
var concatenated = list1.Concat(list2).ToList();
Not quite the same, Union projects a distinct sequence of items:
var distinct = list1.Union(list2).ToList();
Note that for the value-based 'distinct' behaviour of Union to work on reference types, you will need to define equality comparisons for your classes (or alternatively use the built-in equality of record types).
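As a sketch of that last point, one option is to pass an IEqualityComparer<T> to Union. The Product class and ProductIdComparer below are hypothetical, and treating two products as equal when their Id matches is an assumption made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type for illustration; only Id decides equality here.
public sealed class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public sealed class ProductIdComparer : IEqualityComparer<Product>
{
    public bool Equals(Product x, Product y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Id == y.Id;
    }

    public int GetHashCode(Product obj)
    {
        return obj == null ? 0 : obj.Id.GetHashCode();
    }
}

public static class Demo
{
    public static void Main()
    {
        var list1 = new List<Product> { new Product { Id = 1, Name = "Pen" }, new Product { Id = 2, Name = "Ink" } };
        var list2 = new List<Product> { new Product { Id = 2, Name = "Ink" }, new Product { Id = 3, Name = "Pad" } };

        // Without a comparer, the two Id = 2 objects are different references.
        Console.WriteLine(list1.Union(list2).Count());                          // 4
        Console.WriteLine(list1.Union(list2, new ProductIdComparer()).Count()); // 3
    }
}
```

Overriding Equals and GetHashCode on the class itself would have the same effect without passing a comparer at each call site.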
You could use the Concat extension method:
var result = productCollection1
.Concat(productCollection2)
.Concat(productCollection3)
.ToList();
I know this is an old question I thought I might just add my 2 cents.
If you have a List<Something>[] you can join them using Aggregate
public List<TType> Concat<TType>(params List<TType>[] lists)
{
var result = lists.Aggregate(new List<TType>(), (x, y) => x.Concat(y).ToList());
return result;
}
Hope this helps.
list4 = list1.Concat(list2).Concat(list3).ToList();
// I would make it a little bit more simple
var products = new List<List<Product>> { item1, item2, item3 }.SelectMany(list => list).ToList();
This way you have a nested list (a list of lists), and .SelectMany() flattens it into an IEnumerable<Product>, which .ToList() then turns into a list.
I've already commented it but I still think is a valid option, just test if in your environment is better one solution or the other. In my particular case, using source.ForEach(p => dest.Add(p)) performs better than the classic AddRange but I've not investigated why at the low level.
You can see an example code here: https://gist.github.com/mcliment/4690433
So the option would be:
var allProducts = new List<Product>(productCollection1.Count +
productCollection2.Count +
productCollection3.Count);
productCollection1.ForEach(p => allProducts.Add(p));
productCollection2.ForEach(p => allProducts.Add(p));
productCollection3.ForEach(p => allProducts.Add(p));
Test it to see if it works for you.
Disclaimer: I'm not advocating for this solution, I find Concat the most clear one. I just stated -in my discussion with Jon- that in my machine this case performs better than AddRange, but he says, with far more knowledge than I, that this does not make sense. There's the gist if you want to compare.
To merge or combine two lists into one list, one thing must be true: both lists must have the same element type.
For example, if we have a list of strings, we can add another list of strings to it; a list of any other type cannot be added.
Example:
class Program
{
static void Main(string[] args)
{
List<string> CustomerList_One = new List<string>
{
"James",
"Scott",
"Mark",
"John",
"Sara",
"Mary",
"William",
"Broad",
"Ben",
"Rich",
"Hack",
"Bob"
};
List<string> CustomerList_Two = new List<string>
{
"Perter",
"Parker",
"Bond",
"been",
"Bilbo",
"Cooper"
};
// Adding all contents of CustomerList_Two to CustomerList_One.
CustomerList_One.AddRange(CustomerList_Two);
// Creating another list and copying all contents of CustomerList_One into it.
List<string> AllCustomers = new List<string>();
foreach (var item in CustomerList_One)
{
AllCustomers.Add(item);
}
// Dropping the references to CustomerList_One and CustomerList_Two.
CustomerList_One = null;
CustomerList_Two = null;
// Both lists are now eligible for garbage collection; the explicit
// GC.Collect() call is only for demonstration.
GC.Collect();
Console.WriteLine("Total No. of Customers : " + AllCustomers.Count);
Console.WriteLine("-------------------------------------------------");
foreach (var customer in AllCustomers)
{
Console.WriteLine("Customer : " + customer);
}
Console.WriteLine("-------------------------------------------------");
}
}
In the special case: "All elements of List1 goes to a new List2": (e.g. a string list)
List<string> list2 = new List<string>(list1);
In this case, list2 is generated with all elements from list1.
You need to use Concat operation
When you have several lists but don't know exactly how many, use this (listsOfProducts is a list of lists filled with objects):
List<Product> productListMerged = new List<Product>();
listsOfProducts.ForEach(q => q.ForEach(e => productListMerged.Add(e)));
If you have an empty list and you want to merge it with a filled list, do not use Concat, use AddRange instead.
List<MyT> finalList = new ();
List<MyT> list = new List<MyT> { a, b, c }; // a, b and c are existing MyT instances
finalList.AddRange(list);

Adding only new items from one list to another

I have two lists:
List<string> _list1;
List<string> _list2;
I need to add all the items of _list2 that are not already in _list1 to _list1...
How can I do that using LINQ?
Thanks
// Add all items from list2 except those already in list1
list1.AddRange(list2.Except(list1));
You could use the Enumerable.Union extension method:
var _list1 = new List<string>(new[] { "one", "two", "three", "four" });
var _list2 = new List<string>(new[] { "three", "four", "five" });
_list1 = _list1.Union(_list2).ToList();
// _list1 now contains one, two, three, four, five
EDIT
You could also use the method the other post uses:
_list1.AddRange(_list2.Where(i => !_list1.Contains(i)));
Both of these methods are going to have added overhead.
The first method uses a new List to store the Union results (and then assigns those back to _list1).
The second method evaluates the Where filter, calling _list1.Contains for every item of _list2, and then adds the surviving items to the original List.
Pick your poison. Union makes the code a bit clearer in my opinion (and thus worth the added overhead, at least until you can prove that it is becoming an issue).
_list1.AddRange( _list2.Where(x => !_list1.Contains(x) ) );
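If the Contains scan becomes a concern on larger lists, a possible variation is to track the existing items in a HashSet<string>. This is only a sketch; the names ListMerge and AddNew are made up for the example.

```csharp
using System;
using System.Collections.Generic;

public static class ListMerge
{
    // Appends to 'target' every item of 'source' that 'target' does not already hold.
    public static void AddNew(List<string> target, IEnumerable<string> source)
    {
        var seen = new HashSet<string>(target);
        foreach (string item in source)
        {
            // HashSet<T>.Add returns false for items already present,
            // so duplicates inside 'source' are skipped as well.
            if (seen.Add(item))
            {
                target.Add(item);
            }
        }
    }
}

public static class Demo
{
    public static void Main()
    {
        var _list1 = new List<string> { "one", "two", "three", "four" };
        var _list2 = new List<string> { "three", "four", "five", "five" };

        ListMerge.AddNew(_list1, _list2);
        Console.WriteLine(string.Join(",", _list1.ToArray())); // one,two,three,four,five
    }
}
```

Unlike Union, this mutates _list1 in place and leaves any duplicates that _list1 already contained untouched.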
