Find duplicates in a collection of lists - c#

If I have a dictionary containing 2 or more lists, how can I quickly find shared items between these lists and add these shared items to a list external to the dictionary?
For example:
list1:
eng;English
lir;Liberian English
list2:
eng;English
bav;Vengo
list3:
lat;Latin
extList:
eng;English
This shared item is then removed from the lists inside the dictionary.
I have added list3 to show that a superfluous item may be ignored, and that I have specified 2 or more lists.

As I understand it, you have two lists and need to find their intersection and add it to a third list:
var list1 = new[] { "eng;English", "lir;Liberian", "English" };
var list2 = new[] { "eng;English", "bav;Vengo", "English" };
var extList = new List<string>();
extList.AddRange(list1.Intersect(list2));

Suppose we have a list of lists (or a dictionary, which would add a Key):
List<List<string>> lists = new List<List<string>>()
{
    new List<string> {"Hello", "World", "7"},
    new List<string> {"Hello", "7", "Person"},
    new List<string> {"7", "7", "Hello"}
};
You can find items that are present in all lists:
List<string> extList = lists.Cast<IEnumerable<string>>()
    .Aggregate((a, b) => a.Intersect(b)).ToList();
If you want to get strings that are common to just a few lists, you can use:
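var counts = from str in lists.SelectMany(list => list)
             group str by str into g
             where g.Count() > 1
             select new { Value = g.Key, Count = g.Count() };
If you don't care how many times each item appears, replace the last line with select g.Key. Note that an item repeated within a single list (such as "7" in the third list) is counted once per occurrence, so it can pass the filter without being shared, and the result does not tell you which lists contain the item.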
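If you also need to know which lists an item came from, one possible approach is to tag each item with the index of its list before grouping. This is only a sketch, written as C# 9+ top-level statements; the names shared, Item and ListIndex are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var lists = new List<List<string>>
{
    new List<string> { "Hello", "World", "7" },
    new List<string> { "Hello", "7", "Person" },
    new List<string> { "7", "7", "Hello" }
};

// Pair every item with the index of the list it came from.
// Distinct() collapses repeats inside one list, so "7", "7" counts once.
var shared = lists
    .SelectMany((list, index) =>
        list.Distinct().Select(item => new { Item = item, ListIndex = index }))
    .GroupBy(x => x.Item, x => x.ListIndex)
    .Where(g => g.Count() > 1)                 // present in at least two lists
    .ToDictionary(g => g.Key, g => g.ToList());

foreach (var pair in shared)
    Console.WriteLine($"{pair.Key}: lists {string.Join(", ", pair.Value)}");
```

With the sample data, only "Hello" and "7" survive the filter, each mapped to the list indices 0, 1 and 2.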

Here's a function that will take a dictionary, remove any string that is in more than one list in the dictionary, and return the list of strings it removed:
static List<string> FindAndRemoveDuplicates(Dictionary<string, List<string>> data)
{
    // find duplicates
    var dupes = new HashSet<string>(
        from list1 in data.Values
        from list2 in data.Values
        where list1 != list2
        from item in list1.Intersect(list2)
        select item);
    // remove dupes from lists in the dictionary
    foreach (var list in data.Values)
        list.RemoveAll(str => dupes.Contains(str));
    // return a list of the duplicates
    return dupes.ToList();
}
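The function above intersects every pair of lists, which gets slower as the number of lists grows. A possible alternative, sketched here under the assumption that a single counting pass is acceptable, records how many lists contain each item. RemoveShared is a hypothetical name, and the sample data is taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant: counts, per item, how many lists contain it,
// instead of intersecting every pair of lists.
static List<string> RemoveShared(Dictionary<string, List<string>> source)
{
    var listCount = new Dictionary<string, int>();
    foreach (var list in source.Values)
        foreach (var item in list.Distinct())   // count each list at most once
            listCount[item] = listCount.TryGetValue(item, out var n) ? n + 1 : 1;

    // Items seen in more than one list are the shared ones.
    var shared = new HashSet<string>(
        listCount.Where(kv => kv.Value > 1).Select(kv => kv.Key));

    // Remove the shared items from every list in the dictionary.
    foreach (var list in source.Values)
        list.RemoveAll(s => shared.Contains(s));

    return shared.ToList();
}

var data = new Dictionary<string, List<string>>
{
    ["list1"] = new List<string> { "eng;English", "lir;Liberian English" },
    ["list2"] = new List<string> { "eng;English", "bav;Vengo" },
    ["list3"] = new List<string> { "lat;Latin" }
};

List<string> extList = RemoveShared(data);
Console.WriteLine(string.Join(", ", extList));
```

After the call, extList holds the single shared entry and list3 is left untouched, matching the behaviour asked for in the question.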

Related

Find string from a list

I have a list with several ingredients:
List<string> vegList = new List<string>();
where there are several strings like: "cherry", "butter", "bread".
My idea was to have a dictionary or something similar that holds recipe strings, and to return the name of a recipe if it contains one or more elements from the list.
I tried a dictionary, and an array too, but I didn't get what I wanted.
Can anyone help?
I tried this, but I don't know what to do next:
myString = "receit pineapple cake, cherry, bread ..."
foreach(string item in vegList )
{
if(item.Contains(myString))
return item;
}
Try Linq Any:
List<string> vegList = new List<string>()
{ "cherry", "butter", "bread"};
string recipe = "butterbread";
bool showRecipe = vegList.Any(s => recipe.Contains(s));
showRecipe is true.
Edit 1:
The following code searches the recipes list and returns a matchedRecipes list containing the recipes that match any of the words from vegList.
List<string> vegList = new List<string>()
{ "cherry", "butter", "bread"};
List<string> recipes = new List<string>()
{ "steak", "butterbread", "cherrycake"};
List<string> matchedRecipes = recipes.Where(x => vegList.Any(s => x.Contains(s))).ToList();
matchedRecipes list contains "butterbread" and "cherrycake"
Edit 2: It would also be helpful to ignore letter case.
List<string> vegList = new List<string>()
{ "cherry", "butter", "bread"};
List<string> recipes = new List<string>()
{ "steak", "Butterbread", "Cherrycake"};
List<string> matchedRecipes = recipes.Where(x => vegList
    .Any(s => x.ToLower().Contains(s.ToLower())))
    .ToList();
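As an alternative to lowering both strings, a case-insensitive search can be expressed with String.IndexOf and a StringComparison argument. The following is a minimal sketch written as C# 9+ top-level statements; the sample recipe "CHERRYcake" is invented for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var vegList = new List<string> { "cherry", "butter", "bread" };
var recipes = new List<string> { "steak", "Butterbread", "CHERRYcake" };

// IndexOf with OrdinalIgnoreCase compares without creating lowered copies
// of either string; a result >= 0 means the ingredient was found.
List<string> matchedRecipes = recipes
    .Where(r => vegList.Any(v => r.IndexOf(v, StringComparison.OrdinalIgnoreCase) >= 0))
    .ToList();

Console.WriteLine(string.Join(", ", matchedRecipes));
```

This also sidesteps culture-specific surprises that ToLower can produce, since the comparison is ordinal.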
If each recipe has its own vegList, or more specifically, you have a Recipe class with a List<string> vegList field, and let's say you have a List<Recipe> recipes somewhere:
var ingredient = "cherry";
var matches = recipes.FindAll(r => r.vegList.Contains(ingredient));
matches will be a List<Recipe> with the recipes that have "cherry" as an ingredient.
For the updated question, first make sure the ingredients in your string are separated by single spaces (to keep things simple).
var myIngredients = myString.Split(' ');
Then to find the recipes:
var matches = recipes.FindAll(r => r.vegList.Any(l => myIngredients.Contains(l)));
In this case a recipe is selected if its vegList contains at least one of the ingredients in myIngredients.

C#: How do I get 2 lists into one 2-tuple list in

I have 2 lists. The first is of type string, the second of type object. Now I want to combine both lists into a list of Tuple<string,object>,
like this:
var List = new(string list1, object list2)[]
How do I do this?
I had to create 2 separate lists because I am serializing the object, and the serialization wouldn't work if I had a List<Tuple<string,object>> in the first place.
Both lists can get big, so maybe with a foreach loop?
You can use the Zip method to create one list of the two:
var lst = new List<string>() { "a", "b" };
var obj = new List<object>() { 1, 2 };
var result = lst.Zip(obj, (x, y) => new Tuple<string, object>(x, y))
.ToList();
You can use the Linq Zip method:
List<string> list1 = GetStrings();
List<object> list2 = GetObjects();
var merged = list1.Zip(list2, (a, b) => Tuple.Create(a, b));
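One detail worth checking before relying on Zip: it pairs elements by position and stops at the end of the shorter sequence, so a length mismatch is silently dropped. The sketch below shows this with C# 7 value tuples in place of Tuple<string, object>; the variable names and sample values are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var names = new List<string> { "a", "b", "c" };
var values = new List<object> { 1, 2 };   // deliberately one element short

// Zip pairs elements by position and stops at the end of the shorter input,
// so "c" has no partner and is left out.
List<(string Name, object Value)> pairs =
    names.Zip(values, (n, v) => (Name: n, Value: v)).ToList();

foreach (var (name, value) in pairs)
    Console.WriteLine($"{name} -> {value}");
```

If the two lists must always line up, comparing their Count values before zipping is a cheap guard.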

Intersect 2 list if first list contains part of another

I have 2 lists of strings, list A and list B. List A contains paths, and list B contains folder names. Example:
List<string> listA = new List<string> { @"c:\myPath\FolderA\blabla\", @"c:\myPath\FolderB\blabla2\", @"c:\myPath\FolderA\blabla3\", @"c:\myPath\FolderC\blabla\" };
List<string> listB = new List<string> { "FolderA", "FolderC" };
I want a method that compares the 2 lists: an entry of listA is valid if it contains any entry of listB; otherwise I don't want it. Based on this logic I'd get:
List<string> listReturn = new List<string> { @"c:\myPath\FolderA\blabla\", @"c:\myPath\FolderA\blabla3\", @"c:\myPath\FolderC\blabla\" };
So far all I've done is write a method that iterates through the first list and calls Contains on each string inside a LINQ Any call, like this:
private static List<string> FilterList(List<string> listA, List<string> listB)
{
    List<string> listReturn = new List<string>();
    foreach (string val in listA)
    {
        if (listB.Any(item => val.Contains(item)))
        {
            listReturn.Add(val);
        }
    }
    return listReturn;
}
It's not bad, but I want to use a Linq approach or a .NET approach if there's an Intersect method available for this. Thank you.
Use Where() against the listA to filter items in this list and Exists() on listB for the filter condition:
List<string> listA = new List<string> {@"c:\myPath\FolderA\blabla\", @"c:\myPath\FolderA\blabla2\", @"c:\myPath\Folder\blabla3\", @"c:\myPath\FolderC\blabla\"};
List<string> listB = new List<string> { "FolderA", "FolderC" };
var intersect = listA.Where(a => listB.Exists(b => a.Contains(b)));
Try this:
var result = listA.Where(i => listB.Any(y => i.Contains(y))).ToList();
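A caveat that applies to every Contains-based answer here: a substring test also accepts folders whose names merely start with the search term. If that matters, one option is to compare whole path segments instead. This is a sketch only; the "FolderAB" entry and the names folders and filtered are invented for the illustration, and it assumes backslash-separated paths.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var listA = new List<string>
{
    @"c:\myPath\FolderA\blabla\",
    @"c:\myPath\FolderAB\blabla2\",   // hypothetical: contains "FolderA" only as a substring
    @"c:\myPath\FolderC\blabla\"
};

// Case-insensitive set of the folder names to look for.
var folders = new HashSet<string>(new[] { "FolderA", "FolderC" },
                                  StringComparer.OrdinalIgnoreCase);

// Split each path into its segments and keep the path only if one whole
// segment is in the folder set.
List<string> filtered = listA
    .Where(path => path.Split(new[] { '\\' }, StringSplitOptions.RemoveEmptyEntries)
                       .Any(segment => folders.Contains(segment)))
    .ToList();

Console.WriteLine(string.Join(Environment.NewLine, filtered));
```

Here the FolderAB path is rejected, whereas path.Contains("FolderA") would have kept it.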

Concatenating strings in two lists to create a third list

I have two lists of items. How can I concatenate the values of both, pairwise, and add each concatenated value to a third list?
For example if List<string> From has A,B,C and List<string> To has 1,2,3 then List<string> All should have A1,B2,C3. I'd preferably like to use a lambda expression.
Use Linq's Zip extension method:
using System.Linq;
...
var list1 = new List<string> { "A", "B", "C" };
var list2 = new List<string> { "1", "2", "3" };
var list3 = list1.Zip(list2, (x, y) => x + y).ToList(); // { "A1", "B2", "C3" }
That's not concatenation - that's matching two sequences pairwise. You do it with LINQ's Zip method:
Zip applies a specified function to the corresponding elements of two sequences, producing a sequence of the results.
var res = from.Zip(to, (a,b) => a + b).ToList();
If both lists have the same number of items, you can do:
var list3 = list1.Select((item, index) => item + list2[index]).ToList();

how do I get a subset of strings from list 1 which are not present in list2?

I have two lists
list 1 = { "fred", "fox", "jumps", "rabbit"};
list2 ={"fred", "jumps"}
Now I need to get a list3 which contains elements of list1 which are not present in list2.
so list 3 should be
list3 = {"fox", "rabbit"};
I can do this manually by using loops but I was wondering if there is something like list3 = list1 - list2 or some other better way than using loops.
Thanks
If you are using .NET 3.5 or newer then you can use Enumerable.Except:
var result = list1.Except(list2);
If you want it as a list:
List<string> list3 = list1.Except(list2).ToList();
For older versions of .NET you could insert the strings from list1 as keys in a dictionary and then remove the strings from list2; the keys left in the dictionary are the result.
Dictionary<string, object> d = new Dictionary<string, object>();
foreach (string x in list1)
d[x] = null;
foreach (string x in list2)
d.Remove(x);
List<string> list3 = new List<string>(d.Keys);
list1.Except(list2);
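Keep in mind that Except is a set operation: it returns each remaining element only once, even if list1 contains it several times. If repeats need to be preserved, filtering against a HashSet is one way to do it. The sketch below adds a second "fox" to the question's data purely to show the difference.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var list1 = new List<string> { "fred", "fox", "fox", "jumps", "rabbit" };
var list2 = new List<string> { "fred", "jumps" };

// Except treats its inputs as sets, so the repeated "fox" appears once.
List<string> distinct = list1.Except(list2).ToList();

// Filtering against a HashSet keeps repeats and the original order.
var exclude = new HashSet<string>(list2);
List<string> withRepeats = list1.Where(s => !exclude.Contains(s)).ToList();

Console.WriteLine(string.Join(", ", distinct));
Console.WriteLine(string.Join(", ", withRepeats));
```

For the question's original data, which has no repeats, both approaches give the same result.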
