Find the count of duplicate items in a C# List - c#

I am using List in C#. Code is as mentioned below:
TestCase.cs
public class TestCase
{
private string scenarioID;
private string error;
public string ScenarioID
{
get
{
return this.scenarioID;
}
set
{
this.scenarioID = value;
}
}
public string Error
{
get
{
return this.error;
}
set
{
this.error = value;
}
}
public TestCase(string arg_scenarioName, string arg_error)
{
this.ScenarioID = arg_scenarioName;
this.Error = arg_error;
}
}
The list I am creating is:
private List<TestCase> GetTestCases()
{
List<TestCase> scenarios = new List<TestCase>();
TestCase scenario1 = new TestCase("Scenario1", string.Empty);
TestCase scenario2 = new TestCase("Scenario2", string.Empty);
TestCase scenario3 = new TestCase("Scenario1", string.Empty);
TestCase scenario4 = new TestCase("Scenario4", string.Empty);
TestCase scenario5 = new TestCase("Scenario1", string.Empty);
TestCase scenario6 = new TestCase("Scenario6", string.Empty);
TestCase scenario7 = new TestCase("Scenario7", string.Empty);
scenarios.Add(scenario1);
scenarios.Add(scenario2);
scenarios.Add(scenario3);
scenarios.Add(scenario4);
scenarios.Add(scenario5);
scenarios.Add(scenario6);
scenarios.Add(scenario7);
return scenarios;
}
Now I am iterating through the list. I want to find out how many duplicate test cases with the same ScenarioID there are in the list. Is there any way to solve this using LINQ or a built-in List method?
Regards,
Priyank

Try this:
var numberOfTestcasesWithDuplicates =
scenarios.GroupBy(x => x.ScenarioID).Count(x => x.Count() > 1);

As a first idea:
int dupes = list.Count() - list.Distinct(aTestCaseComparer).Count();
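The aTestCaseComparer in the line above is not defined in the answer. A minimal sketch of what it could look like, assuming it is meant to treat two test cases as equal when their ScenarioID matches. The names ScenarioIdComparer, DistinctDemo and CountSurplus are made up for illustration, and TestCase is a cut-down stand-in for the class in the question:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cut-down stand-in for the TestCase class from the question.
public class TestCase
{
    public string ScenarioID { get; set; }
    public string Error { get; set; }

    public TestCase(string scenarioID, string error)
    {
        ScenarioID = scenarioID;
        Error = error;
    }
}

// Hypothetical comparer: two test cases count as equal if their ScenarioID matches.
public class ScenarioIdComparer : IEqualityComparer<TestCase>
{
    public bool Equals(TestCase a, TestCase b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a is null || b is null) return false;
        return a.ScenarioID == b.ScenarioID;
    }

    public int GetHashCode(TestCase t)
    {
        return t?.ScenarioID?.GetHashCode() ?? 0;
    }
}

public static class DistinctDemo
{
    // Number of items that repeat a ScenarioID already present in the list.
    public static int CountSurplus(List<TestCase> list)
    {
        return list.Count - list.Distinct(new ScenarioIdComparer()).Count();
    }

    public static void Main()
    {
        var list = new List<TestCase>
        {
            new TestCase("Scenario1", ""), new TestCase("Scenario2", ""),
            new TestCase("Scenario1", ""), new TestCase("Scenario1", "")
        };
        Console.WriteLine(CountSurplus(list)); // 4 items, 2 distinct IDs: prints 2
    }
}
```

Without a comparer, Distinct() would compare TestCase references, so every item would count as distinct.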

To just get the duplicate count:
int duplicateCount = scenarios.GroupBy(x => x.ScenarioID)
.Sum(g => g.Count()-1);

var groups = scenarios.GroupBy(test => test.ScenarioID)
.Where(group => group.Skip(1).Any());
That will give you a group for each ScenarioID that has more than one item. The count of the groups is the number of duplicated ScenarioIDs, and the count within each group is the number of occurrences of that single ScenarioID.
As an additional note, .Skip(1).Any() is used because a .Count() on a general sequence has to enumerate every single item just to find out that there is more than one.
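To make the difference between "number of duplicated IDs" and "number of surplus items" concrete, here is a small sketch that applies the grouping approach to the ScenarioIDs from the question. It works on plain strings instead of TestCase objects to stay short; DuplicateGroupsDemo and DuplicateCounts are illustrative names, not part of any answer:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateGroupsDemo
{
    // Maps each ID that occurs more than once to its number of occurrences.
    public static Dictionary<string, int> DuplicateCounts(IEnumerable<string> ids)
    {
        return ids.GroupBy(id => id)
                  .Where(g => g.Skip(1).Any())   // group has at least two items
                  .ToDictionary(g => g.Key, g => g.Count());
    }

    public static void Main()
    {
        // Same IDs as in GetTestCases() from the question.
        var ids = new[] { "Scenario1", "Scenario2", "Scenario1", "Scenario4",
                          "Scenario1", "Scenario6", "Scenario7" };
        var dupes = DuplicateCounts(ids);

        Console.WriteLine(dupes.Count);                   // duplicated IDs: 1
        Console.WriteLine(dupes.Values.Sum(c => c - 1));  // surplus items: 2
        foreach (var pair in dupes)
            Console.WriteLine(pair.Key + ": " + pair.Value); // Scenario1: 3
    }
}
```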

Something like this maybe
var result= GetTestCases()
.GroupBy (x =>x.ScenarioID)
.Select (x =>new{x.Key,nbrof=x.Count ()} );

To get total number of duplicates, yet another:
var set = new HashSet<string>();
var result = scenarios.Count(x => !set.Add(x.ScenarioID));
To get distinct duplicates:
var result = scenarios.GroupBy(x => x.ScenarioID).Count(x => x.Skip(1).Any());

Related

Why is the return type List<char>?

I am trying to pull the file names that contain a substring, using the Contains method. However, the result seems to be a List<char>, while I expect a List<string>.
private void readAllAttribues()
{
using (var reader = new StreamReader(attribute_file))
{
//List<string> AllLines = new List<string>();
List<FileNameAttributeList> AllAttributes = new List<FileNameAttributeList>();
while (!reader.EndOfStream)
{
FileNameAttributeList Attributes = new FileNameAttributeList();
Attributes ImageAttributes = new Attributes();
Point XY = new Point();
string lineItem = reader.ReadLine();
//AllLines.Add(lineItem);
var values = lineItem.Split(',');
Attributes.ImageFileName = values[1];
XY.X = Convert.ToInt16(values[3]);
XY.Y = Convert.ToInt16(values[4]);
ImageAttributes.Location = XY;
ImageAttributes.Radius = Convert.ToInt16(values[5]);
ImageAttributes.Area = Convert.ToInt16(values[6]);
AllAttributes.Add(Attributes);
}
List<string> unique_raw_filenames = AllAttributes.Where(x => x.ImageFileName.Contains(@"non")).FirstOrDefault().ImageFileName.ToList();
List<string> unique_reference_filenames = AllAttributes.Where(x => x.ImageFileName.Contains(@"ref")).FirstOrDefault().ImageFileName.ToList();
foreach (var unique_raw_filename in unique_raw_filenames)
{
var raw_attributes = AllAttributes.Where(x => x.ImageFileName == unique_raw_filename).ToList();
}
}
}
Datatype class
public class FileNameAttributeList
{ // Do not change the order
public string ImageFileName { get; set; }
public List<Attributes> Attributes { get; set; }
public FileNameAttributeList()
{
Attributes = new List<Attributes>();
}
}
Why does FirstOrDefault() not work? It returns a List<char>, but I am expecting a List<string>, so the assignment fails.
The ToList() method converts collections that implement IEnumerable<SomeType> into lists.
Looking at the definition of String, you can see that it implements IEnumerable<Char>, and so ImageFileName.ToList() in the following code will return a List<char>.
AllAttributes.Where(x =>
x.ImageFileName.Contains(@"non")).FirstOrDefault().ImageFileName.ToList();
Although I'm guessing at what you want, it seems like you want to filter AllAttributes based on the ImageFileName, and then get a list of those file names. If that's the case, you can use something like this:
var unique_raw_filenames = AllAttributes.Where(x => x.ImageFileName.Contains(@"non")).Select(y=>y.ImageFileName).ToList();
In your code
List<string> unique_raw_filenames = AllAttributes.Where(x => x.ImageFileName.Contains(@"non")).FirstOrDefault().ImageFileName.ToList();
FirstOrDefault() returns the first, or default, FileNameAttributeList from the list AllAttributes where the ImageFileName contains the text non.
Calling ToList() on the ImageFileName then converts the string value into a list of chars because string is a collection of char.
I think that what you are intending can be achieved by switching out FirstOrDefault to Select. Select allows you to map one value onto another.
So your code could look like this instead.
List<string> unique_raw_filenames = AllAttributes.Where(x => x.ImageFileName.Contains(@"non")).Select(x => x.ImageFileName).ToList();
This then gives you a list of string.
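The difference between the two calls can be seen in a small self-contained sketch. The Entry class and the file names are made up for illustration; only the ImageFileName property mirrors the question:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for FileNameAttributeList; only the file name matters here.
public class Entry
{
    public string ImageFileName { get; set; }
}

public static class ToListDemo
{
    // What the question's code does: take ONE entry, then split its name into chars.
    public static List<char> FirstNameAsChars(List<Entry> entries, string part)
    {
        return entries.Where(e => e.ImageFileName.Contains(part))
                      .First()
                      .ImageFileName
                      .ToList();
    }

    // What was intended: project EVERY matching entry to its name.
    public static List<string> MatchingNames(List<Entry> entries, string part)
    {
        return entries.Where(e => e.ImageFileName.Contains(part))
                      .Select(e => e.ImageFileName)
                      .ToList();
    }

    public static void Main()
    {
        var entries = new List<Entry>
        {
            new Entry { ImageFileName = "non_01.png" },
            new Entry { ImageFileName = "ref_01.png" },
            new Entry { ImageFileName = "non_02.png" }
        };
        Console.WriteLine(FirstNameAsChars(entries, "non").Count); // 10 characters
        Console.WriteLine(MatchingNames(entries, "non").Count);    // 2 file names
    }
}
```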

How to remove duplicates from object list based on that object property in c#

I've got a problem with removing duplicates at runtime from my list of object.
I would like to remove duplicates from my list of object and then set counter=counter+1 of base object.
public class MyObject
{
public MyObject(string name)
{
this.counter = 0;
this.name = name;
}
public string name;
public int counter;
}
List<MyObject> objects_list = new List<MyObject>();
objects_list.Add(new MyObject("john"));
objects_list.Add(new MyObject("anna"));
objects_list.Add(new MyObject("john"));
foreach (MyObject my_object in objects_list)
{
foreach (MyObject my_second_object in objects_list)
{
if (my_object.name == my_second_object.name)
{
my_object.counter = my_object.counter + 1;
objects_list.Remove(my_second_object);
}
}
}
It returns an error, because objects_list is modified while it is being iterated. How can I get this working?
With the help of LINQ's GroupBy we can combine duplicates into a single group and process it (i.e. return one item which represents all the duplicates):
List<MyObject> objects_list = ...
objects_list = objects_list
.GroupBy(item => item.name)
.Select(group => { // given a group of duplicates we
var item = group.First(); // - take the 1st item
item.counter = group.Count(); // - set its counter to the number of occurrences
return item; // - and return it instead of group
})
.ToList();
The other answer seems to be correct, though I think it scans the whole list twice. Depending on your requirements this might or might not be good enough. Here is how you can do it in one pass:
var dictionary = new Dictionary<string, MyObject>();
foreach(var obj in objects_list)
{
if(!dictionary.ContainsKey(obj.name))
{
dictionary[obj.name] = obj;
obj.counter++;
}
else
{
dictionary[obj.name].counter++;
}
}
Then dictionary.Values will contain your collection.
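A self-contained sketch of the one-pass dictionary approach, assuming the goal is to keep the first object per name and have its counter hold the number of occurrences. DedupDemo and Deduplicate are illustrative names, and TryGetValue is used in place of ContainsKey to avoid a second lookup:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyObject
{
    public string name;
    public int counter;

    public MyObject(string name)
    {
        this.name = name;
        this.counter = 0;
    }
}

public static class DedupDemo
{
    // Keeps the first object seen for each name and counts occurrences in its counter.
    public static List<MyObject> Deduplicate(IEnumerable<MyObject> source)
    {
        var byName = new Dictionary<string, MyObject>();
        foreach (var obj in source)
        {
            if (!byName.TryGetValue(obj.name, out var first))
            {
                byName[obj.name] = obj; // first time this name is seen
                first = obj;
            }
            first.counter++;
        }
        return byName.Values.ToList();
    }

    public static void Main()
    {
        var objects_list = new List<MyObject>
        {
            new MyObject("john"), new MyObject("anna"), new MyObject("john")
        };
        foreach (var o in Deduplicate(objects_list))
            Console.WriteLine(o.name + ": " + o.counter);
    }
}
```

The source list is only read, never modified, so the "collection was modified" error from the question cannot occur.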

Sort a List in which each element contains 2 Values

I have a text file that contains Values in this Format: Time|ID:
180|1
60 |2
120|3
Now I want to sort them by Time. The output should then be:
60 |2
120|3
180|1
How can I solve this problem? I tried this:
var path = @"C:\Users\admin\Desktop\test.txt";
List<string> list = File.ReadAllLines(path).ToList();
list.Sort();
for (var i = 0; i < list.Count; i++)
{
Console.WriteLine(list[i]);
}
but I had no success.
3 steps are necessary to do the job:
1) split by the separator
2) convert to int because in a string comparison a 6 comes after a 1 or 10
3) use OrderBy to sort your collection
Here is a LINQ solution in one line doing all 3 steps:
list = list.OrderBy(x => Convert.ToInt32(x.Split('|')[0])).ToList();
Explanation
x => lambda expression, x denotes a single element in your list
x.Split('|')[0] splits each string and takes only the first part of it (time)
Convert.ToInt32(.. converts the time into a number so that the ordering will be done in the way you desire
list.OrderBy( sorts your collection
EDIT:
Just to understand why you got the result in the first place here is an example of comparison of numbers in string representation using the CompareTo method:
int res = "6".CompareTo("10");
res will have the value of 1 (meaning that 6 is larger than 10 or 6 follows 10)
According to the documentation->remarks:
The CompareTo method was designed primarily for use in sorting or alphabetizing operations.
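The effect on the sample data from the question can be reproduced with a short sketch. It uses StringComparer.Ordinal for the text sort so that the result does not depend on the current culture (the question's list.Sort() uses the default string comparer instead); SortDemo, SortAsText and SortByTime are illustrative names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortDemo
{
    // Text sort: compares character by character, so "120" sorts before "60".
    public static List<string> SortAsText(IEnumerable<string> lines)
    {
        var copy = lines.ToList();
        copy.Sort(StringComparer.Ordinal);
        return copy;
    }

    // Numeric sort on the part before '|'; int.Parse tolerates the trailing space in "60 ".
    public static List<string> SortByTime(IEnumerable<string> lines)
    {
        return lines.OrderBy(l => int.Parse(l.Split('|')[0])).ToList();
    }

    public static void Main()
    {
        var lines = new[] { "180|1", "60 |2", "120|3" };
        Console.WriteLine(string.Join(", ", SortAsText(lines))); // 120|3, 180|1, 60 |2
        Console.WriteLine(string.Join(", ", SortByTime(lines))); // 60 |2, 120|3, 180|1
    }
}
```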
You should parse each line of the file content and get values as numbers.
string[] lines = File.ReadAllLines("path");
// ID, time
var dict = new Dictionary<int, int>();
// Processing each line of the file content
foreach (var line in lines)
{
string[] splitted = line.Split('|');
int time = Convert.ToInt32(splitted[0]);
int ID = Convert.ToInt32(splitted[1]);
// Key = ID, Value = Time
dict.Add(ID, time);
}
var orderedListByID = dict.OrderBy(x => x.Key).ToList();
var orderedListByTime = dict.OrderBy(x => x.Value).ToList();
Note that I use your ID reference as Key of dictionary assuming that ID should be unique.
Short code version
// Key = ID Value = Time
var orderedListByID = lines.Select(x => x.Split('|')).ToDictionary(x => Convert.ToInt32(x[1]), x => Convert.ToInt32(x[0])).OrderBy(x => x.Key).ToList();
var orderedListByTime = lines.Select(x => x.Split('|')).ToDictionary(x => Convert.ToInt32(x[1]), x => Convert.ToInt32(x[0])).OrderBy(x => x.Value).ToList();
You need to convert them to numbers first. Sorting by string won't give you meaningful results.
var times = list.Select(l => l.Split('|')[0]).Select(Int32.Parse);
var ids = list.Select(l => l.Split('|')[1]).Select(Int32.Parse);
var pairs = times.Zip(ids, (t, id) => new{Time = t, Id = id})
.OrderBy(x => x.Time)
.ToList();
Thank you all, this is my Solution:
var path = @"C:\Users\admin\Desktop\test.txt";
List<string> list = File.ReadAllLines(path).ToList();
list = list.OrderBy(x => Convert.ToInt32(x.Split('|')[0])).ToList();
for(var i = 0; i < list.Count; i++)
{
Console.WriteLine(list[i]);
}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
public class TestClass {
public static void main(String[] args) {
List <LineItem> myList = new ArrayList<LineItem>();
myList.add(LineItem.getLineItem(500, 30));
myList.add(LineItem.getLineItem(300, 20));
myList.add(LineItem.getLineItem(900, 100));
System.out.println(myList);
Collections.sort(myList);
System.out.println("list after sort");
System.out.println(myList);
}
}
class LineItem implements Comparable<LineItem>{
int time;
int id ;
@Override
public String toString() {
return ""+ time + "|"+ id + " ";
}
@Override
public int compareTo(LineItem o) {
return this.time-o.time;
}
public static LineItem getLineItem( int time, int id ){
LineItem l = new LineItem();
l.time=time;
l.id=id;
return l;
}
}

Elegant way to check if a list contains an object where one property is the same, and replace only if the date of another property is later

I have a class as follows :
Object1{
int id;
DateTime time;
}
I have a list of Object1. I want to cycle through another list of Object1, search for an Object1 with the same ID and replace it in the first list if the time value is later than the time value in the list. If the item is not in the first list, then add it.
I'm sure there is an elegant way to do this, perhaps using linq? :
List<Object1> listOfNewestItems = new List<Object1>();
List<Object1> listToCycleThrough = MethodToReturnList();
foreach(Object1 object in listToCycleThrough){
if(listOfNewestItems.Contains(//object1 with same id as object))
{
//check date, replace if time property is > existing time property
} else {
listOfNewestItems.Add(object)
}
Obviously this is very messy (and that's without even doing the check of properties which is messier again...), is there a cleaner way to do this?
var finalList = list1.Concat(list2)
.GroupBy(x => x.id)
.Select(x => x.OrderByDescending(y=>y.time).First())
.ToList();
Here is the full code to test:
public class Object1
{
public int id;
public DateTime time;
}
List<Object1> list1 = new List<Object1>()
{
new Object1(){id=1,time=new DateTime(1991,1,1)},
new Object1(){id=2,time=new DateTime(1992,1,1)}
};
List<Object1> list2 = new List<Object1>()
{
new Object1(){id=1,time=new DateTime(2001,1,1)},
new Object1(){id=3,time=new DateTime(1993,1,1)}
};
and OUTPUT:
1 01.01.2001
2 01.01.1992
3 01.01.1993
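The answer does not show how the output was printed. A runnable sketch that wraps the query in a method and prints the merged list, assuming the dd.MM.yyyy date format seen in the output above; MergeDemo and MergeNewest are illustrative names:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class Object1
{
    public int id;
    public DateTime time;
}

public static class MergeDemo
{
    // For each id, keeps the object with the latest time from either list.
    public static List<Object1> MergeNewest(IEnumerable<Object1> list1, IEnumerable<Object1> list2)
    {
        return list1.Concat(list2)
                    .GroupBy(x => x.id)
                    .Select(g => g.OrderByDescending(y => y.time).First())
                    .ToList();
    }

    public static void Main()
    {
        var list1 = new List<Object1>
        {
            new Object1 { id = 1, time = new DateTime(1991, 1, 1) },
            new Object1 { id = 2, time = new DateTime(1992, 1, 1) }
        };
        var list2 = new List<Object1>
        {
            new Object1 { id = 1, time = new DateTime(2001, 1, 1) },
            new Object1 { id = 3, time = new DateTime(1993, 1, 1) }
        };
        foreach (var o in MergeNewest(list1, list2))
            Console.WriteLine(o.id + " " + o.time.ToString("dd.MM.yyyy", CultureInfo.InvariantCulture));
    }
}
```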
This is how to check:
foreach (var item in listToCycleThrough)
{
var currentObject = listOfNewestItems
.SingleOrDefault(obj => obj.id == item.id);
if (currentObject != null)
{
// Same id already present: keep the later time
if (currentObject.time < item.time)
currentObject.time = item.time;
}
else
listOfNewestItems.Add(item);
}
If you have a lot of data, consider using a Dictionary keyed by id for the newest items; a lookup then takes O(1) time instead of O(n).
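A minimal sketch of the Dictionary variant, assuming the newest items can be kept in a Dictionary<int, Object1> keyed by id instead of a List. NewestByIdDemo and Merge are illustrative names:

```csharp
using System;
using System.Collections.Generic;

public class Object1
{
    public int id;
    public DateTime time;
}

public static class NewestByIdDemo
{
    // Adds unknown ids and replaces known ones only when the incoming time is later.
    public static Dictionary<int, Object1> Merge(Dictionary<int, Object1> newest,
                                                 IEnumerable<Object1> incoming)
    {
        foreach (var item in incoming)
        {
            if (!newest.TryGetValue(item.id, out var current) || item.time > current.time)
                newest[item.id] = item;
        }
        return newest;
    }

    public static void Main()
    {
        var newest = new Dictionary<int, Object1>
        {
            { 1, new Object1 { id = 1, time = new DateTime(1991, 1, 1) } },
            { 2, new Object1 { id = 2, time = new DateTime(1992, 1, 1) } }
        };
        var incoming = new List<Object1>
        {
            new Object1 { id = 1, time = new DateTime(2001, 1, 1) }, // later: replaces
            new Object1 { id = 2, time = new DateTime(1980, 1, 1) }, // earlier: ignored
            new Object1 { id = 3, time = new DateTime(1993, 1, 1) }  // unknown: added
        };
        foreach (var pair in Merge(newest, incoming))
            Console.WriteLine(pair.Key + " " + pair.Value.time.Year);
    }
}
```

If a List is needed afterwards, newest.Values can be copied into one.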
You can use LINQ: Enumerable.Except to get the set difference (the newest), and a join to find the newer objects.
var listOfNewestIDs = listOfNewestItems.Select(o => o.id);
var listToCycleIDs = listToCycleThrough.Select(o => o.id);
var newestIDs = listOfNewestIDs.Except(listToCycleIDs);
var newestObjects = from obj in listOfNewestItems
join objID in newestIDs on obj.id equals objID
select obj;
var updateObjects = from newObj in listOfNewestItems
join oldObj in listToCycleThrough on newObj.id equals oldObj.id
where newObj.time > oldObj.time
select new { oldObj, newObj };
foreach (var updObject in updateObjects)
updObject.oldObj.time = updObject.newObj.time;
listToCycleThrough.AddRange(newestObjects);
Note that you need to add using System.Linq;.
Here's a demo: http://ideone.com/2ASli
I'd create a Dictionary to lookup the index for an Id and use that
var newItems = new List<Object1> { ...
IList<Object1> itemsToUpdate = ...
var lookup = itemsToUpdate.
Select((o, i) => new { Key = o.id, Value = i }). // Select passes (element, index)
ToDictionary(x => x.Key, x => x.Value);
foreach (var newItem in newItems)
{
if (lookup.ContainsKey(newItem.id))
{
var i = lookup[newItem.id];
if (newItem.time > itemsToUpdate[i].time)
{
itemsToUpdate[i] = newItem;
}
}
else
{
itemsToUpdate.Add(newItem);
// Remember the new index so a repeated id is found next time
lookup[newItem.id] = itemsToUpdate.Count - 1;
}
}
That way, you wouldn't need to re-enumerate the list for each new item, and you'd benefit from the hash lookup performance.
This should work however many times an Id is repeated in the list of new items.

Sort ObservableCollection by Date

I have a collection
private ObservableCollection<ContentItemViewModel> _contentTree;
public ObservableCollection<ContentItemViewModel> ContentTree
{
get { return _contentTree; }
}
class ContentItemViewModel has property:
private string _published;
public string Published
{
get
{
return _published;
}
set
{
_published = value;
NotifyPropertyChanged("Published");
}
}
The value is assigned like this: node.Published = Convert.ToDateTime(date.Value).ToString("dd MMM yyyy", new DateTimeFormatInfo());
I need to sort the ContentTree collection by date. How can I do this?
I do not know if there is a better way, but you can do the following. The idea is to create an ordered list and then, for each item in the ordered list, remove the item from the content tree and re-add it.
var tempList = _contentTree.OrderBy(p => DateTime.Parse(p.Published));
tempList.ToList().ForEach(q =>
{
_contentTree.Remove(q);
_contentTree.Add(q);
});
Or you can employ Comparison;
Comparison<ContentItemViewModel> comparison = new Comparison<ContentItemViewModel>(
(p,q) =>
{
DateTime first = DateTime.Parse(p.Published);
DateTime second = DateTime.Parse(q.Published);
if (first == second)
return 0;
if (first > second)
return 1;
return -1;
});
List<ContentItemViewModel> tempList = _contentTree.ToList();
tempList.Sort(comparison);
_contentTree = new ObservableCollection<ContentItemViewModel>(tempList);
You can use this:
ConceptItems = new ObservableCollection<DataConcept>(ConceptItems.OrderBy(i => i.DateColumn));
Convert the collection to a list with ToList().
Sort it using OrderBy or another sorting method.
Make a new ObservableCollection from that list.
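The three steps can be sketched as follows. Item is a cut-down stand-in for ContentItemViewModel, and the sketch assumes Published always holds a date in the "dd MMM yyyy" format from the question, parsed with the invariant culture; ObservableSortDemo, ParsePublished and SortedCopy are illustrative names:

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Globalization;
using System.Linq;

// Cut-down stand-in for ContentItemViewModel; only the Published string matters here.
public class Item
{
    public string Published { get; set; }
}

public static class ObservableSortDemo
{
    // Turns the "dd MMM yyyy" text back into a DateTime so it sorts chronologically.
    private static DateTime ParsePublished(string s)
    {
        return DateTime.ParseExact(s, "dd MMM yyyy", CultureInfo.InvariantCulture);
    }

    // Returns a new collection with the same items ordered by date.
    public static ObservableCollection<Item> SortedCopy(ObservableCollection<Item> source)
    {
        List<Item> sorted = source.OrderBy(i => ParsePublished(i.Published)).ToList();
        return new ObservableCollection<Item>(sorted);
    }

    public static void Main()
    {
        var tree = new ObservableCollection<Item>
        {
            new Item { Published = "05 Mar 2012" },
            new Item { Published = "17 Jan 2011" },
            new Item { Published = "01 Dec 2012" }
        };
        foreach (var item in SortedCopy(tree))
            Console.WriteLine(item.Published);
    }
}
```

Sorting the Published strings directly would order them by day of month, which is why they are parsed first. Note that replacing the collection instance means anything bound to the old instance has to be pointed at the new one.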
