In C# I have a List<string> in which every string has a length of 8. I need to find the entries whose characters at positions 4 to 7 are the same, and fill a second list with those entries. How would I do that?
Example content of existing list:
tmr523fw
tmr5287g
tmx523fu
tmy4741g
The new list should now contain those entries:
tmr523fw
tmx523fu
LINQ GroupBy and Substring should do the job here:
List<string> items = new List<string>() { "tmr523fw", "tmr5287g", "tmx523fu", "tmy4741g" };
List<string> result = items.GroupBy(x => x.Substring(3, 4))
.Where(x => x.Count() > 1)
.SelectMany(x => x)
.ToList();
This assumes that "the same" means the substring occurs in more than one entry, i.e. x.Count() > 1.
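One thing worth noting about the approach above: Substring(3, 4) throws an ArgumentOutOfRangeException for any string shorter than 7 characters. The following is a minimal sketch, assuming such entries should simply be skipped (the question states that all entries are 8 characters long, so the extra "abc" entry and the length check are illustrative assumptions):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var items = new List<string> { "tmr523fw", "tmr5287g", "tmx523fu", "tmy4741g", "abc" };

// Substring(3, 4) reads the characters at 1-based positions 4 to 7,
// so it needs at least 7 characters; shorter entries are filtered out first.
List<string> result = items
    .Where(s => s.Length >= 7)
    .GroupBy(s => s.Substring(3, 4))
    .Where(g => g.Count() > 1)   // keep only keys shared by two or more entries
    .SelectMany(g => g)          // flatten the groups back into single entries
    .ToList();

Console.WriteLine(string.Join(", ", result)); // tmr523fw, tmx523fu
```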
I can find a bunch of different ways of removing duplicates, such as LINQ "Distinct().ToList()", but that still keeps one copy of each value that had duplicates. I want to remove all values which have duplicates: if you have a list of "1, 2, 3, 3, 4", I want "1, 2, 4" to be left. Thanks!
If you prefer a LINQ solution, you can try GroupBy instead of Distinct:
var result = list
.GroupBy(item => item) // group items
.Where(group => group.Count() == 1) // keep groups with single item only
.SelectMany(group => group) // flatten groups to sequence of items
.ToList();
If you are looking for an in-place solution (you want to remove the duplicates from the existing list):
HashSet<int> duplicates = new HashSet<int>(list
.GroupBy(item => item)
.Where(group => group.Count() > 1)
.Select(group => group.Key));
int p = 0;
for (int i = 0; i < list.Count; ++i)
if (!duplicates.Contains(list[i]))
list[p++] = list[i];
list.RemoveRange(p, list.Count - p);
Via GroupBy, you can find all duplicates. With Except, you find all entries that are not duplicates:
List<int> output = input.Except(input.GroupBy(i => i).Where(g => g.Count() > 1).Select(g => g.Key)).ToList();
Online demo: https://dotnetfiddle.net/vTo4Cx
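As an alternative to the GroupBy-based answers above, the same result can be sketched with an explicit occurrence count and List<T>.RemoveAll, which modifies the existing list and keeps the order of the remaining items. This is only an illustration on the sample values from the question, not a replacement for the answers above:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var list = new List<int> { 1, 2, 3, 3, 4 };

// Count how often each value occurs.
var counts = new Dictionary<int, int>();
foreach (int value in list)
{
    counts[value] = counts.TryGetValue(value, out int seen) ? seen + 1 : 1;
}

// Remove every value that occurs more than once, in place.
list.RemoveAll(value => counts[value] > 1);

Console.WriteLine(string.Join(", ", list)); // 1, 2, 4
```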
I have a list of strings containing some values. I want to compare the values at two positions in the list and remove the matching items.
Code :
var list = new List<string>();
list.Add("Employee1");
list.Add("Account");
list.Add("100.5600,A+ ,John");
list.Add("1.00000,A+ ,John");
list.Add("USA");
Now I want to compare the items at the 2nd and 3rd index:
list.Add("100.5600,A+ ,John");
list.Add("1.00000,A+ ,John");
Compare the above 2 records and remove the matching parts, like below:
Expected output :
list.Add("100.5600");
list.Add("1.00000");
This is how I am trying to do it:
var source = list[2].Split(',').Select(p => p.Trim());
var target = list[3].Split(',').Select(p => p.Trim());
var result = source.Except(target);
But the problem is that I only get 100.5600 as output.
Is it possible to compare the records and update the existing list with the non-matching parts?
How about this "beauty"
var list = new List<string>();
list.Add("Employee1");
list.Add("Account");
list.Add("100.5600,A+ ,John");
list.Add("1.00000,A+ ,John");
list.Add("USA");
//prepare the list: build a tuple of the original string and its split array
var preparedItems = list.Select(x => (x, x.Split(',')));
//group the prepared items by the 2nd and 3rd parts of the split; that is why .Skip(1) is applied to the split array
var groupedItems = preparedItems.GroupBy(x => string.Join(",", x.Item2.Skip(1).Select(y => y.Trim())));
//evaluate each group: if it has more than one item, keep only the first part of the split array; otherwise keep the original string
var evaluatedItems = groupedItems.SelectMany(x => x.Count() > 1 ? x.Select(y => y.Item2[0]) : x.Select(y => y.Item1));
//replace the original list with the new result
list = evaluatedItems.ToList();
Edit - preserve original order:
//extended the prepare step with a third element, the index, to keep track of the ordering of the original list
//so the tuple now consists of 3 parts instead of 2 - ([item], [index], [splitArray])
var preparedItems = list.Select((x, i) => (x, i, x.Split(',')));
//changed to use Item3 instead of Item2, since the array is now in third position
var groupedItems = preparedItems.GroupBy(x => string.Join(",", x.Item3.Skip(1).Select(y => y.Trim())));
//instead of returning the simple string here already, return a tuple with the index (y.Item2) and the correct string
var evaluatedItems = groupedItems.SelectMany(x => x.Count() > 1 ? x.Select(y => (y.Item2, y.Item3[0])) : x.Select(y => (y.Item2, y.Item1)));
//now order by the new tuple x.Item1 and only return x.Item2
var orderedItems = evaluatedItems.OrderBy(x => x.Item1).Select(x => x.Item2);
list = orderedItems.ToList();
//one-liner - isn't that a beauty
list = list.Select((x, i) => (x, i, x.Split(','))).GroupBy(x => string.Join(",", x.Item3.Skip(1).Select(y => y.Trim()))).SelectMany(x => x.Count() > 1 ? x.Select(y => (y.Item2, y.Item3[0])) : x.Select(y => (y.Item2, y.Item1))).OrderBy(x => x.Item1).Select(x => x.Item2).ToList();
You can get it easily by checking whether the items of one are not contained in the other:
var result = source.Where(x => !target.Contains(x));
To update your old list:
list[2] = string.Join(",", source.Where(x => !target.Contains(x)));
list[3] = string.Join(",", target.Where(x => !source.Contains(x)));
So far, I have this:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)));
Configuration folder will contain pairs of files:
abc.json
abc-input.json
def.json
def-input.json
GetReportName() method strips off the "-input" and title cases the filename, so you end up with a grouping of:
Abc
abc.json
abc-input.json
Def
def.json
def-input.json
I have a ReportItem class that has a constructor (Name, str1, str2). I want to extend the Linq to create the ReportItems in a single statement, so really something like:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
.Select(x => new ReportItem(x.Key, x[0], x[1]));
Obviously last line doesn't work because the grouping doesn't support array indexing like that. The item should be constructed as "Abc", "abc.json", "abc-input.json", etc.
If you know that each group of interest contains exactly two items, use First() to get the item at index 0, and Last() to get the item at index 1:
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
.Where(g => g.Count() == 2) // Make sure we have exactly two items
.Select(x => new ReportItem(x.Key, x.First(), x.Last()));
var v = Directory.EnumerateFiles(_strConfigurationFolder)
.GroupBy(x => GetReportName(Path.GetFileNameWithoutExtension(x)))
.Select(x => new ReportItem(x.Key, x.FirstOrDefault(), x.Skip(1).FirstOrDefault()));
But are you sure there will be exactly two items in each group? Maybe it would make sense for ReportItem to accept an IEnumerable instead of just two strings?
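Both answers above rely on the order of the files within each group, and Directory.EnumerateFiles does not document a guaranteed order. A sketch that picks the two files by their suffix instead might look like the following. The GetReportName and IsInput helpers, the hard-coded file names, and the tuple standing in for ReportItem are all assumptions made for illustration, not the asker's actual code:

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Linq;

// Hypothetical stand-in for the asker's GetReportName: strips "-input" and title-cases the name.
static string GetReportName(string name)
{
    const string suffix = "-input";
    string stem = name.EndsWith(suffix, StringComparison.Ordinal)
        ? name.Substring(0, name.Length - suffix.Length)
        : name;
    return CultureInfo.InvariantCulture.TextInfo.ToTitleCase(stem);
}

// Hypothetical helper: true for files such as "abc-input.json".
static bool IsInput(string file) =>
    Path.GetFileNameWithoutExtension(file).EndsWith("-input", StringComparison.Ordinal);

// Hard-coded names instead of Directory.EnumerateFiles, deliberately in mixed order.
var files = new[] { "abc-input.json", "abc.json", "def.json", "def-input.json" };

// A tuple stands in for ReportItem(Name, str1, str2).
var items = files
    .GroupBy(f => GetReportName(Path.GetFileNameWithoutExtension(f)))
    .Select(g => (
        Name: g.Key,
        Config: g.FirstOrDefault(f => !IsInput(f)),
        Input: g.FirstOrDefault(f => IsInput(f))))
    .Where(r => r.Config != null && r.Input != null) // skip incomplete pairs
    .ToList();

foreach (var item in items)
{
    Console.WriteLine($"{item.Name}: {item.Config}, {item.Input}");
}
```

With the sample names this yields ("Abc", "abc.json", "abc-input.json") and ("Def", "def.json", "def-input.json"), regardless of the order in which the files are listed.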
I have an array of objects, each with a property Number. I need to group them by value. For example, the objects contain these sample values:
1 2 3 3 3 4 5 6 6 6 7 7
I have to group them like this:
listOfUnique = {1,2,4,5}
listOfDuplicates1 = {3,3,3}
listOfDuplicates2 = {6,6,6}
listOfDuplicates3 = {7,7}
...
I tried a distinct approach using First(), but that keeps the first occurrence of each value and removes only the later duplicates. I also want to remove the first occurrence of an object if it had duplicates, and move all of them to another list.
List<Reports> distinct = new List<Reports>();
distinct = ArrayOfObjects.GroupBy(p => p.Number).Select(g => g.First()).ToList();
Any ideas how I could do this?
To get the elements of groups with just one element, use this:
distinct = ArrayOfObjects.GroupBy(p => p.Number)
.Where(g => g.Count() == 1)
.Select(g => g.Single())
.ToList();
And to get a list of the groups with more than one element, use this:
nonDistinct = ArrayOfObjects.GroupBy(p => p.Number)
.Where(g => g.Count() > 1)
.Select(g => g.ToList())
.ToList();
First group the items:
var groups = values.GroupBy(p => p.Number).ToList();
The unique ones are the ones with a group count of one:
var unique = groups.Where(g => g.Count() == 1).Select(g => g.Single()).ToList();
The ones with duplicates are the other ones:
var nonUnique = groups.Where(g => g.Count() > 1).ToList();
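To check the grouping against the sample values from the question, here is a small self-contained sketch. It uses plain integers in place of objects with a Number property, which is a simplification for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new[] { 1, 2, 3, 3, 3, 4, 5, 6, 6, 6, 7, 7 };

// Group equal values and materialize each group as a list.
List<List<int>> groups = numbers
    .GroupBy(n => n)
    .Select(g => g.ToList())
    .ToList();

// Values that occur exactly once.
List<int> unique = groups.Where(g => g.Count == 1).Select(g => g[0]).ToList();

// One list per value that occurs more than once.
List<List<int>> duplicates = groups.Where(g => g.Count > 1).ToList();

Console.WriteLine(string.Join(",", unique)); // 1,2,4,5
foreach (List<int> group in duplicates)
{
    Console.WriteLine(string.Join(",", group)); // 3,3,3 then 6,6,6 then 7,7
}
```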
I have a list of comma-separated strings of some data. I have another list of keyword strings that I want to search for in the first list. I want to get back the indices of all the elements in the first list that do not contain any of the keywords in the second list. For example:
List 1:
Student,101256,Active
Professor,597856,Active
Professor,697843,Inactive
Student,329741,Active
Student,135679,Inactive
Student,241786,Inactive
List 2:
697843
241786
My query on List 1 should return the indices of all the elements that do not contain any of the elements of List 2. Therefore, the returned list of indices should be 0,1,3,4. Is there any way to accomplish this?
Thanks in advance!
Edit: This is my try:
List<int> index = list1
.Select((s, i) => new { s, i })
.Where(e => !list2.Contains(e.s))
.Select(e => e.i).ToList();
You will need a using directive for System.Linq. This answer has been edited to include the !Student filter:
var list1 = new List<string> {
{"Student,101256,Active"},
{"Professor,597856,Active"},
{"Professor,697843,Inactive"},
{"Student,329741,Active"},
{"Student,135679,Inactive"},
{"Student,241786,Inactive"}
};
var list2 = new List<string> {{"697843"}, {"241786"}};
var result = list1
.Select((item,i)=> new {index=i,value=item})
.Where(item => !item.value.StartsWith("Student"))
.Where(item => !item.value.Split(',').Any(j => list2.Contains(j)))
.Select(item=>item.index)
.ToList();
The first Select captures each item's index before any filtering. The pre-edit version calculated the index after the filter and was therefore incorrect.
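For the sample data and the expected indices stated in the question (0, 1, 3, 4), a minimal sketch without the Student filter could look like this. Using a HashSet for the keywords and value tuples instead of an anonymous type are choices made here for illustration only:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var list1 = new List<string>
{
    "Student,101256,Active",
    "Professor,597856,Active",
    "Professor,697843,Inactive",
    "Student,329741,Active",
    "Student,135679,Inactive",
    "Student,241786,Inactive"
};
var keywords = new HashSet<string> { "697843", "241786" };

List<int> indices = list1
    .Select((value, index) => (value, index))   // capture the index before filtering
    .Where(e => !e.value.Split(',').Any(field => keywords.Contains(field)))
    .Select(e => e.index)
    .ToList();

Console.WriteLine(string.Join(",", indices)); // 0,1,3,4
```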