Array sorting by two parameters - c#

I'm having a little difficulty with Array.Sort. I have a class with two fields: one is a random string, the other a random number. Sorting by one parameter works fine, but I would like to sort by two. The first key is the SUM of the numbers (from low to high); THEN, if the sums are equal, the random string assigned to each item (from low to high).
Can you give me some hints and tips on how I can "merge" these two kinds of sort?
Array.Sort(Phonebook, delegate(PBook user1, PBook user2)
{ return user1.Sum().CompareTo(user2.Sum()); });
Console.WriteLine("ORDER");
foreach (PBook user in Phonebook)
{
Console.WriteLine(user.name);
}
That's how I order it with one parameter.

I think this is what you are after:
sourcearray.OrderBy(a=> a.sum).ThenBy(a => a.random)

Here is the general algorithm that you'll use for comparing multiple fields in a CompareTo method:
public int compare(MyClass first, MyClass second)
{
    // Compare the primary key of both objects first.
    int firstComparison = first.FirstValue.CompareTo(second.FirstValue);
    if (firstComparison != 0)
    {
        return firstComparison;
    }
    else
    {
        // Primary keys are equal; fall back to the secondary key.
        return first.SecondValue.CompareTo(second.SecondValue);
    }
}
However, LINQ does make the syntax for doing this much easier, allowing you to only write:
Phonebook = Phonebook.OrderBy(book=> book.Sum())
.ThenBy(book => book.OtherProperty)
.ToArray();

You can do this in-place by using a custom IComparer<PBook>. The following should order your array as per your original code, but if two sums are equal it should fall back on the random string (which I've called RandomString):
public class PBookComparer : IComparer<PBook>
{
public int Compare(PBook x, PBook y)
{
// Sort null items to the top; you can drop this
// if you don't care about null items.
if (x == null)
return y == null ? 0 : -1;
else if (y == null)
return 1;
// Comparison of sums.
var sumCompare = x.Sum().CompareTo(y.Sum());
if (sumCompare != 0)
return sumCompare;
// Sums are the same; return comparison of strings
return String.Compare(x.RandomString, y.RandomString);
}
}
You call this as
Array.Sort(Phonebook, new PBookComparer());
You could just do this inline but it gets a bit hard to follow:
Array.Sort(Phonebook, (x, y) => {
int sc = x.Sum().CompareTo(y.Sum());
return sc != 0 ? sc : string.Compare(x.RandomString, y.RandomString); });
... Actually, that isn't too bad, although I have dropped the null checks.
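For anyone who wants to try the two-key comparison end to end, here is a minimal, self-contained sketch. The PBook shape below (a Name plus an array of numbers) is an assumption made for illustration, since the question doesn't show the class. The ordinal string comparison used as the tie-breaker is also a choice, not something the question specifies.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the question's PBook class.
public sealed class PBook
{
    public string Name { get; }
    private readonly int[] numbers;

    public PBook(string name, params int[] numbers)
    {
        Name = name;
        this.numbers = numbers;
    }

    public int Sum() => numbers.Sum();
}

public static class TwoKeySort
{
    // Sorts in place: by Sum() ascending, then by Name (ordinal) for equal sums.
    public static void SortBySumThenName(PBook[] phonebook)
    {
        Array.Sort(phonebook, (x, y) =>
        {
            int bySum = x.Sum().CompareTo(y.Sum());
            return bySum != 0 ? bySum : string.CompareOrdinal(x.Name, y.Name);
        });
    }
}
```

With the entries carol (1, 2), alice (3) and bob (1), the call orders them as bob, alice, carol: bob has the smallest sum, and the tie between alice and carol is settled by name.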

Why does removing objects with duplicate double-typed properties from a list in C# give inconsistent results with different methods?

I am trying to find the quickest way to remove duplicate entries in a list.
My list contains objects which have properties X and Y which are both of type double.
I need to remove any objects which contain the same X and Y values.
My first attempt is very slow.
It will take a list that contains 81403 objects and spit out a new list with 25900 but it takes over a minute to run. Had this run quickly I would have compared the difference in order to add some rounding but it's too slow.
private List<DelaunayPoint> DeleteDuplicatesSlowWay(List<DelaunayPoint> points)
{
List<DelaunayPoint> distinctPoints = new();
int i = 0;
foreach (DelaunayPoint p in points)
{
if (i == 0)
{
distinctPoints.Add(p);
}
else
{
if (distinctPoints.Any(pnt => pnt.X == p.X) == false ||
distinctPoints.Any(pnt => pnt.Y == p.Y) == false)
{
distinctPoints.Add(p);
}
}
i++;
}
return distinctPoints;
}
The following method will take the same list of 81403 objects but it will spit out a list containing 73385 objects, however, it takes less than a second to run.
private List<DelaunayPoint> DeleteDuplicatesFast(List<DelaunayPoint> points)
{
return points
.GroupBy(p => new { p.X, p.Y })
.Select(output => output.First())
.ToList();
}
Why do the above two methods give different results?
Assuming the difference is a rounding error between the two methods, how can I add rounding to the second DeleteDuplicatesFast method so I can compare the two?
The rounding should apply only to the comparison, not to the values in the output list.
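To answer the first part of your question: two points are equal only if both their X and Y values are equal. Your slow method tests X and Y independently, so it drops a point whenever some kept point shares its X and some (possibly different) kept point shares its Y.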
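The difference between the two checks is easy to see on three points. This is only a sketch: it uses value tuples instead of the DelaunayPoint class, and the method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DedupDemo
{
    // Mirrors the slow method's condition: X and Y are matched independently,
    // possibly against two different points that were already kept.
    public static List<(double X, double Y)> PerAxis(IEnumerable<(double X, double Y)> points)
    {
        var kept = new List<(double X, double Y)>();
        foreach (var p in points)
        {
            if (!kept.Any(k => k.X == p.X) || !kept.Any(k => k.Y == p.Y))
                kept.Add(p);
        }
        return kept;
    }

    // Treats a point as a duplicate only if one kept point matches both X and Y.
    public static List<(double X, double Y)> PerPoint(IEnumerable<(double X, double Y)> points)
    {
        var kept = new List<(double X, double Y)>();
        foreach (var p in points)
        {
            if (!kept.Any(k => k.X == p.X && k.Y == p.Y))
                kept.Add(p);
        }
        return kept;
    }
}
```

For the input (1, 1), (2, 2), (1, 2), PerAxis keeps only two points, because (1, 2) shares its X with the first point and its Y with the second. PerPoint keeps all three, which matches what GroupBy on both coordinates does.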
As for filtering the duplicates: a fast way is to make your DelaunayPoint class implement IEquatable<DelaunayPoint> and then add the collection to a HashSet:
class DelaunayPoint : IEquatable<DelaunayPoint>
{
public DelaunayPoint(double x, double y)
{
X = x;
Y = y;
}
public double X { get; }
public double Y { get; }
public bool Equals(DelaunayPoint other)
{
return other != null && this.X == other.X && this.Y == other.Y;
}
public override int GetHashCode()
{
return System.HashCode.Combine(X,Y);
}
}
var set = new HashSet<DelaunayPoint>(points);
Now set contains distinct points. I tested it to be approx. 7 times faster than GroupBy.
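On the rounding part of the question, one possible approach is to round only the grouping key and keep the first original point of each group. This is a sketch under assumptions: it uses value tuples in place of DelaunayPoint, and the digits parameter is hypothetical. Note that two values very close together can still land in different groups if they sit on either side of a rounding boundary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RoundedDedup
{
    // Groups by rounded coordinates but returns the original, unrounded points.
    public static List<(double X, double Y)> Distinct(
        IEnumerable<(double X, double Y)> points, int digits)
    {
        return points
            .GroupBy(p => (Math.Round(p.X, digits), Math.Round(p.Y, digits)))
            .Select(g => g.First())   // first original point in each group
            .ToList();
    }
}
```

Because only the key is rounded, the output list still contains the exact values from the input.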

SortedSet with element duplication - can't remove element

I'm working on an implementation of the A-star algorithm in C# in Unity.
I need to evaluate a collection of Node :
class Node
{
public Cell cell;
public Node previous;
public int f;
public int h;
public Node(Cell cell, Node previous = null, int f = 0, int h = 0)
{
this.cell = cell;
this.previous = previous;
this.f = f;
this.h = h;
}
}
I have a SortedSet which allows me to store several Node objects, sorted by the h property. However, I need to be able to store two nodes with the same h property. So I've implemented a specific IComparer that sorts by the h property and reports equality only when two nodes represent the exact same cell.
class ByHCost : IComparer<Node>
{
public int Compare(Node n1, Node n2)
{
int result = n1.h.CompareTo(n2.h);
result = (result == 0) ? 1 : result;
result = (n1.cell == n2.cell) ? 0 : result;
return result;
}
}
My problem: I have a hard time removing things from my SortedSet (I named it openSet). Here is an example:
At some point in the algorithm, I need to remove a node from the set based on some criteria (NB: I use the isCell127 variable to focus my debugging on a single cell)
int removedNodesNb = openSet.RemoveWhere((Node n) => {
    bool isSame = n.cell == candidateNode.cell;
    bool hasWorseCost = n.f > candidateNode.f;
    if(isCell127)
    {
        Debug.Log(isSame && hasWorseCost); // the predicate matches exactly once and Debug.Log prints true
    }
    return isSame && hasWorseCost;
});
if(isCell127)
{
    Debug.Log($"removed {removedNodesNb}"); // 0 nodes were removed
}
Here, the RemoveWhere method seems to find a match, but doesn't remove the node.
I tried another way:
Node worseNode = openSet.SingleOrDefault(n => {
bool isSame = n.cell == candidateNode.cell;
bool hasWorseCost = n.f > candidateNode.f;
return isSame && hasWorseCost;
});
if(isCell127)
{
Debug.Log($"does worseNode exists ? {worseNode != null}"); // Debug returns true, it does exist.
}
if(worseNode != null)
{
if(isCell127)
{
Debug.Log($"openSet length {openSet.Count}"); // 10
}
openSet.Remove(worseNode);
if(isCell127)
{
Debug.Log($"openSet length {openSet.Count}"); // 10 - It should have been 9.
}
}
I think the problem is related to my rather unusual IComparer, but I can't figure out what exactly the problem is.
Also, I would like to know whether there is a significant performance improvement in using an automatically sorted SortedSet instead of a manually sorted List, especially in the A-star use case.
If I write out the cases your comparer handles, you get:
1) n1.h < n2.h, n1.cell == n2.cell -> final result = 0
2) n1.h > n2.h, n1.cell == n2.cell -> final result = 0
3) n1.h == n2.h, n1.cell != n2.cell -> final result = 1
4) n1.h < n2.h, n1.cell != n2.cell -> final result = -1
5) n1.h > n2.h, n1.cell != n2.cell -> final result = 1
When the h values are equal (case 3) you always return 1, whichever order the two nodes are passed in. That's no good: the result is indistinguishable from case 5, so you need a further test on cell to settle the position.
I haven't tested this with a sample, but I'm pretty sure you are breaking the sort.
Once the comparison is consistent, I suggest using LINQ with a List, which should perform best.
I'll answer my own question because I now have a fairly complete answer.
Comparison
The comparison of the IComparer interface needs to follow some rules. As @frenchy said, my own comparison was broken. Here are the mathematical fundamentals of a comparison that I had totally forgotten (I found them here):
1) A.CompareTo(A) must return zero.
2) If A.CompareTo(B) returns zero, then B.CompareTo(A) must return zero.
3) If A.CompareTo(B) returns zero and B.CompareTo(C) returns zero, then A.CompareTo(C) must return zero.
4) If A.CompareTo(B) returns a value other than zero, then B.CompareTo(A) must return a value of the opposite sign.
5) If A.CompareTo(B) returns a value x not equal to zero, and B.CompareTo(C) returns a value y of the same sign as x, then A.CompareTo(C) must return a value of the same sign as x and y.
6) By definition, any object compares greater than (or follows) null, and two null references compare equal to each other.
In my case, rule 4) - symmetry - was broken.
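Rule 4 can be checked mechanically. The sketch below reproduces the original ByHCost logic on a hypothetical DemoNode type (an int stands in for the Cell reference) and tests whether swapping the arguments flips the sign of the result.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, reduced stand-in for the question's Node class.
public sealed class DemoNode
{
    public int Cell;   // stands in for the Cell reference
    public int H;
}

// Same logic as the question's original ByHCost comparer.
public sealed class OriginalByHCost : IComparer<DemoNode>
{
    public int Compare(DemoNode n1, DemoNode n2)
    {
        int result = n1.H.CompareTo(n2.H);
        result = (result == 0) ? 1 : result;
        result = (n1.Cell == n2.Cell) ? 0 : result;
        return result;
    }
}

public static class SymmetryCheck
{
    // Rule 4 (together with rule 2): Compare(a, b) and Compare(b, a)
    // must have opposite signs, or both be zero.
    public static bool HoldsFor<T>(IComparer<T> comparer, T a, T b)
    {
        return Math.Sign(comparer.Compare(a, b)) == -Math.Sign(comparer.Compare(b, a));
    }
}
```

For two nodes on different cells with the same h, both Compare(a, b) and Compare(b, a) return 1, so the check fails. For nodes with different h values it passes.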
I needed to store multiple nodes with the same h property, but also to sort by that h property. So I needed to avoid equality when the h properties are the same.
Instead of returning a default value when the h comparison yields 0 (which broke rule 4), I decided to refine the comparison so that it never yields 0, using a value that is unique for each node instance. This implementation is probably not the best, and there may be a better source of unique values, but here is what I did.
private class Node
{
private static int globalIncrement = 0;
public Cell cell;
public Node previous;
public int f;
public int h;
public int uid;
public Node(Cell cell, Node previous = null, int f = 0, int h = 0)
{
Node.globalIncrement++;
this.cell = cell;
this.previous = previous;
this.f = f;
this.h = h;
this.uid = Node.globalIncrement;
}
}
private class ByHCost : IComparer<Node>
{
public int Compare(Node n1, Node n2)
{
if(n1.cell == n2.cell)
{
return 0;
}
int result = n1.h.CompareTo(n2.h);
result = (result == 0) ? n1.uid.CompareTo(n2.uid) : result; // Tie-breaker that never returns 0 for distinct instances. Depending on the use case and number of objects, another source of unique values may be better.
return result;
}
}
RemoveWhere method
RemoveWhere uses a predicate to look into the collection, so I didn't think it cared about the comparison. But RemoveWhere internally uses the Remove method, which does care about the comparison. So even if RemoveWhere has found an element, an inconsistent comparison makes it silently skip the removal. That's a pretty weird implementation, no?
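One thing to watch: returning 0 for equal cells before comparing h can still conflict with rule 5 when two nodes for the same cell have different h values. A possible alternative, sketched below under assumptions, is to order strictly by h and then by uid, and to leave the "same cell" test to the RemoveWhere predicate. AStarNode is a hypothetical, reduced version of Node, with an int in place of the Cell reference.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, reduced stand-in for the question's Node class.
public sealed class AStarNode
{
    private static int nextUid = 0;

    public readonly int Cell;   // stands in for the Cell reference
    public readonly int F;
    public readonly int H;
    public readonly int Uid;    // unique per instance, used only as a tie-breaker

    public AStarNode(int cell, int f, int h)
    {
        Cell = cell;
        F = f;
        H = h;
        Uid = ++nextUid;
    }
}

// Total order: h first, then uid. Returns 0 only for the same instance.
public sealed class ByHThenUid : IComparer<AStarNode>
{
    public int Compare(AStarNode a, AStarNode b)
    {
        int byH = a.H.CompareTo(b.H);
        return byH != 0 ? byH : a.Uid.CompareTo(b.Uid);
    }
}

public static class OpenSetDemo
{
    // Removes nodes on the candidate's cell that have a worse f cost.
    public static int RemoveWorse(SortedSet<AStarNode> openSet, AStarNode candidate)
    {
        return openSet.RemoveWhere(n => n.Cell == candidate.Cell && n.F > candidate.F);
    }
}
```

With a consistent ordering like this, the internal Remove call can locate the node the predicate matched, so the count returned by RemoveWhere agrees with what the predicate reported.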

Implementing Custom Int+Range List Solution

I'm wondering if anyone can come up with a way to implement an array of numbers in a more memory efficient manner that will auto-organise itself into ranges. Example;
List testList = new List{1,2,3,4,5,6,7...};
vs
List<Range> testList = new List<Range>{1-3000,3002,4000-5000...};
Previously, I have asked a question just to confirm about whether or not this would in fact be a more memory efficient alternative. This question however pertains to actual application, how to implement this range list solution.
Index Array Storage Memory
I imagine this would perhaps need to be a custom list solution that would be a mix of ints and ranges. I'm picturing being able to .Add([int]) to the list, at which point it would determine if the value would cause a range to be added or to simply add the int value to the list.
Example
RangeList rangeList = new RangeList{1, 4, 7-9};
rangeList.Add(2);
//rangeList -> 1-2, 4, 7-9
rangeList.Add(3);
//rangeList -> 1-3, 4, 7-9
Details specific to my implementation
In my particular case, I'm analysing a very large document, line by line. Lines that meet a certain criteria need to be identified and then the overall list of line indexes need to be presented to the user.
Obviously displaying "Lines 33-32019 identified" is preferable to "Lines 33,34,35...etc". For this case, numbers will always be positive.
The first thing I would do is make a class which represents your range. You can provide some convenience like formatting as a string, and having an implicit cast from an int (This helps later implementation of the range list)
public class Range
{
public int Start{get; private set;}
public int End{get; private set;}
public Range(int startEnd) : this(startEnd,startEnd)
{
}
public Range(int start, int end)
{
this.Start = start;
this.End = end;
}
public static implicit operator Range(int i)
{
return new Range(i);
}
public override string ToString()
{
if(Start == End)
return Start.ToString();
return String.Format("{0}-{1}",Start,End);
}
}
You can then begin a simple implementation of the RangeList. By providing an Add method you can use a list initializer similar to List<T>:
public class RangeList : IEnumerable<Range>
{
private List<Range> ranges = new List<Range>();
public void Add(Range range)
{
this.ranges.Add(range);
}
public IEnumerator<Range> GetEnumerator()
{
return this.ranges.GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator(){
return this.GetEnumerator();
}
}
At this point you can write some test code:
var rangeList = new RangeList(){
new Range(1,10),
15
};
foreach(var range in rangeList)
Console.WriteLine(range);
// Outputs:
// 1-10
// 15
Live example at this point: http://rextester.com/NCZSA71850
The next thing to do is provide an overload of Add which takes an int and finds the right range or adds a new one. A naive implementation might look like the below (assuming the addition of an Update method on Range)
public void Add(int i)
{
// is it within or contiguous to an existing range
foreach(var range in ranges)
{
if(i>=range.Start && i<=range.End)
return; // already in a range
if(i == range.Start-1)
{
range.Update(i,range.End);
return;
}
if(i == range.End + 1)
{
range.Update(range.Start,i);
return;
}
}
// not in any ranges
ranges.Add(i);
}
Live example at this point: http://rextester.com/CHX64125
However, this suffers from a few deficiencies:
1) It does not merge ranges (say you already have 1-10 and 12-20 and you Add(11)).
2) It does not re-order, so if you have 1-5 and 20-25 and Add(7), the new entry will be at the end, not in the middle.
You can solve both problems by applying a sort after each addition, plus some logic to determine whether you should merge ranges:
private void SortAndMerge()
{
ranges.Sort((a,b) => a.Start - b.Start);
var i = ranges.Count-1;
do
{
var start = ranges[i].Start;
var end = ranges[i-1].End;
if(end == start-1)
{
// merge and remove
ranges[i-1].Update(ranges[i-1].Start,ranges[i].End);
ranges.RemoveAt(i);
}
} while(i-- >1);
}
This needs to be called after every change to the list.
public void Add(Range range)
{
this.ranges.Add(range);
SortAndMerge();
}
public void Add(int value)
{
// is it within or contiguous to an existing range
foreach(var range in ranges)
{
if(value>=range.Start && value<=range.End)
return; // already in a range
if(value == range.Start-1)
{
range.Update(value,range.End);
SortAndMerge();
return;
}
if(value == range.End + 1)
{
range.Update(range.Start,value);
SortAndMerge();
return;
}
}
// not in any ranges
ranges.Add(value);
SortAndMerge();
}
Live example here: http://rextester.com/SYLARF47057
There are still some possible edge cases with this, which I urge you to work through.
UPDATE
The below will get this working as expected. This will merge up any added ranges/ints as you would expect and returns them correctly sorted. I've only changed the Add(Range) method, I think this is a fairly clean way of doing this.
public void Add(Range rangeToAdd)
{
var mergableRange = new List<Range>();
foreach (var range in ranges)
{
if (rangeToAdd.Start == range.Start && rangeToAdd.End == range.End)
return; // already exists
if (mergableRange.Any())
{
if (rangeToAdd.End >= range.Start - 1)
{
mergableRange.Add(range);
continue;
}
}
else
{
if (rangeToAdd.Start >= range.Start - 1
&& rangeToAdd.Start <= range.End + 1)
{
mergableRange.Add(range);
continue;
}
if (range.Start >= rangeToAdd.Start
&& range.End <= rangeToAdd.End)
{
mergableRange.Add(range);
continue;
}
}
}
if (!mergableRange.Any()) //Standalone range
{
ranges.Add(rangeToAdd);
}
else //merge overlapping ranges
{
mergableRange.Add(rangeToAdd);
var min = mergableRange.Min(x => x.Start);
var max = mergableRange.Max(x => x.End);
foreach (var range in mergableRange) ranges.Remove(range);
ranges.Add(new Range(min, max));
}
SortAndMerge();
}
Finally, we need if (ranges.Count > 1) in the SortAndMerge() method to prevent an index error when the first range is added.
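Putting the guard and the merge logic together, a guarded SortAndMerge could look roughly like the sketch below. It is not the exact code from the answers: LineRange is a hypothetical stand-in for the Range class (including the assumed Update method), the method takes the list as a parameter so it can run on its own, and it uses a single forward pass that also folds overlapping ranges, not only adjacent ones. It assumes positive line numbers, as stated in the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the answer's Range class, with the assumed Update method.
public class LineRange
{
    public int Start { get; private set; }
    public int End { get; private set; }

    public LineRange(int start, int end)
    {
        Start = start;
        End = end;
    }

    public void Update(int start, int end)
    {
        Start = start;
        End = end;
    }

    public override string ToString()
    {
        return Start == End ? Start.ToString() : Start + "-" + End;
    }
}

public static class RangeMerging
{
    public static void SortAndMerge(List<LineRange> ranges)
    {
        if (ranges.Count <= 1)
            return;   // nothing to sort or merge; avoids the index error

        ranges.Sort((a, b) => a.Start.CompareTo(b.Start));

        int last = 0;   // index of the most recently merged range
        for (int i = 1; i < ranges.Count; i++)
        {
            if (ranges[last].End >= ranges[i].Start - 1)
            {
                // Touching or overlapping: extend the current merged range.
                ranges[last].Update(ranges[last].Start,
                                    Math.Max(ranges[last].End, ranges[i].End));
            }
            else
            {
                // Gap: keep this range as the next merged entry.
                last++;
                ranges[last] = ranges[i];
            }
        }
        // Drop the leftover entries behind the merged prefix.
        ranges.RemoveRange(last + 1, ranges.Count - last - 1);
    }
}
```

For example, a list holding 12-20, 1-10 and 11 collapses into the single range 1-20, and an empty or single-element list is left untouched.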
And with that, I think this fully satisfies my question.

Why is my OrderBy running forever with this comparator?

I have a class,
public class NullsAreLast : IComparer<int?>
{
public int Compare (int? x, int? y)
{
if(y == null)
return -1;
else if(x == null)
return 1;
else
return (int)x - (int)y;
}
}
which is self-explanatory on how it is supposed to work.
Whenever I run
arr.OrderBy(i => i, new NullsAreLast())
with at least two null values in arr it runs forever! Any idea why?
Keep in mind that a sorting algorithm may compare the same two values several times over the process of ordering the whole sequence. Because of this, it's very important to be aware of all three possible results: less than, greater than, and equal.
This is (mostly) fine for your integer comparison at the end (the subtraction operation). There are some weird/rare edge cases, such as overflow when the two integers are far apart, or precision problems when working with floating point numbers instead of integers, and calling .CompareTo() is the preferred practice anyway, but subtraction is usually good enough in this case. However, the null checks here are a real problem.
Think about what happens as a list is nearly finished sorting. You have two null values that have both made their way to the end of the list; the algorithm just needs to verify they are in the correct position. Because both x and y are null, your function should return 0. They are equivalent (for this purpose, at least). Instead, the code always returns -1. Whichever null is passed as x is reported as less than the other, so the algorithm never gets a consistent answer about their order. It swaps, and tries to do the same thing again. And again. And again. And again. It can never finish.
Try this instead:
public class NullsAreLast : IComparer<int?>
{
public int Compare (int? x, int? y)
{
if(!y.HasValue)
{
if (!x.HasValue) return 0;
return -1;
}
if(!x.HasValue) return 1;
return x.Value.CompareTo(y.Value);
}
}
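To illustrate both points (the null handling and the preference for CompareTo over subtraction), here is a small sketch. The class and method names are made up for the example, and the comparer is simply a restatement of the fixed logic above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Restatement of the fixed comparer: nulls sort last, and two nulls are equal.
public sealed class NullsLastComparer : IComparer<int?>
{
    public int Compare(int? x, int? y)
    {
        if (!x.HasValue) return y.HasValue ? 1 : 0;
        if (!y.HasValue) return -1;
        return x.Value.CompareTo(y.Value);
    }
}

public static class ComparerDemo
{
    // Comparison by subtraction; wraps around when the true difference
    // does not fit in an int.
    public static int BySubtraction(int x, int y)
    {
        return unchecked(x - y);
    }

    public static int?[] SortNullsLast(IEnumerable<int?> values)
    {
        return values.OrderBy(v => v, new NullsLastComparer()).ToArray();
    }
}
```

BySubtraction(int.MinValue, 1) wraps around to a positive number, which would claim that int.MinValue is greater than 1, while int.MinValue.CompareTo(1) is negative as expected. Sorting 3, null, 1, null with the comparer terminates and puts both nulls at the end.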
The minus operation at the end of your Compare method isn't the best fit for comparison. A comparer has exactly three outcomes to report - x is smaller, y is smaller, or they are the same.
MSDN
Compares two objects and returns a value indicating whether one is
less than, equal to, or greater than the other.
With this code, suppose X was 1000 and Y was 15. Your result would be 985. Any positive value is a legal "greater than" result, but the magnitude carries no meaning, and the subtraction can overflow when the operands are far apart.
Given your code and method name, I'm going to guess what you meant is this:
public class NullsAreLast : IComparer<int?>
{
    public int Compare (int? x, int? y)
    {
        if (x == null && y == null)
            return 0;   // two nulls are equal
        if (y == null)
            return -1;  // x has a value, so it sorts before the null
        if (x == null)
            return 1;   // the null sorts after y
        int a = x.Value, b = y.Value;
        if (a == b) return 0;   // same
        return a < b ? -1 : 1;  // the smaller value sorts first
    }
}
You could even smash it into a horrible one-liner:
return x == null ? (y == null ? 0 : 1) : (y == null ? -1 : x.Value.CompareTo(y.Value));

c# Linq Except Not Returning List of Different Values

I am trying to find the differences between two lists. List "y" should have 1 unique value when compared to list "x". However, Except does not return the difference. The "differences" list's count always equals 0.
List<EtaNotificationUser> etaNotifications = GetAllNotificationsByCompanyIDAndUserID(PrevSelectedCompany.cmp_ID);
IEnumerable<string> x = etaNotifications.OfType<string>();
IEnumerable<string> y = EmailList.OfType<string>();
IEnumerable<string> differences = x.Except(y, new StringLengthEqualityComparer()).ToList();
foreach(string diff in differences)
{
addDiffs.Add(diff);
}
After reading a few posts and articles on the topic, I created a custom comparer. The comparer looks at string length (kept simple for testing) and uses it as the hash code. Since these are two objects of different types (even though I convert them to string), I thought that might have been the issue.
class StringLengthEqualityComparer : IEqualityComparer<string>
{
public bool Equals(string x, string y)
{
return x.Length == y.Length;
}
public int GetHashCode(string obj)
{
return obj.Length;
}
}
This is my first time using Except. Sounds like a great, optimized way of comparing two lists, but I can't get it to work.
Update
X - Should hold Email Addresses from the database.
GetAllNotificationsByCompanyIDAndUserID - brings back email values from the DB.
Y - Should hold all Email Addresses in the UI Grid.
What I am trying to do is detect whether a new e-mail has been added to the grid. So at this point X will have the saved values from past entries. Y will have any new e-mail addresses added by the user that have not been saved yet.
I have verified this is all working correctly.
The problem is here:
IEnumerable<string> x = etaNotifications.OfType<string>();
but etaNotifications is a List<EtaNotificationUser>, none of which can be a string since string is sealed. OfType returns all instances that are of the given type - it does not "convert" each member to that type.
So x will always be empty.
Maybe you want:
IEnumerable<string> x = etaNotifications.Select(e => e.ToString());
if EtaNotificationUser has overridden ToString to give you the value you want to compare. If the value you want to compare is in a property you can use:
IEnumerable<string> x = etaNotifications.Select(e => e.EmailAddress);
or some other property.
You'll likely have to do something similar for y (unless EmailList is already a List<string> which I doubt).
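A small sketch of the projection approach follows. The EtaNotificationUser shape and its EmailAddress property are assumptions (the real class isn't shown), and so is the case-insensitive comparison. It also assumes the goal from the question's update, namely finding grid addresses that are not saved yet, which means the grid sequence is the one Except is called on.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of the question's class; only the e-mail matters here.
public sealed class EtaNotificationUser
{
    public string EmailAddress { get; set; } = "";
}

public static class ExceptDemo
{
    // Returns the grid addresses that are not among the saved ones.
    public static List<string> NewAddresses(
        IEnumerable<EtaNotificationUser> saved, IEnumerable<string> grid)
    {
        // Select projects each user to a string; OfType<string>() would
        // filter by runtime type and yield nothing here.
        IEnumerable<string> savedEmails = saved.Select(e => e.EmailAddress);
        return grid.Except(savedEmails, StringComparer.OrdinalIgnoreCase).ToList();
    }
}
```

With saved addresses a@x.com and b@x.com and a grid holding a@x.com, B@x.com and c@x.com, only c@x.com comes back, whereas saved.OfType<string>() on the same list is empty.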
Assuming you have verified that your two enumerables x and y actually contain the strings you expect them to, I believe your problem is with your string comparer. According to the docs, Enumerable.Except "Produces the set difference of two sequences. The set difference is the members of the first sequence that don't appear in the second sequence." But your equality comparer equates all strings with the same length. Thus, if a string in the first sequence happens to have the same length as a string in the second, it will not be found as different using your comparer.
Update: yup, I just tested it:
public class StringLengthEqualityComparer : IEqualityComparer<string>
{
public bool Equals(string x, string y)
{
return x.Length == y.Length;
}
public int GetHashCode(string obj)
{
return obj.Length;
}
}
string [] array1 = new string [] { "foo", "bar", "yup" };
string[] array2 = new string[] { "dll" };
int diffCount;
diffCount = 0;
foreach (var diff in array1.Except(array2, new StringLengthEqualityComparer()))
{
diffCount++;
}
Debug.Assert(diffCount == 0); // No assert.
diffCount = 0;
foreach (var diff in array1.Except(array2))
{
diffCount++;
}
Debug.Assert(diffCount == 0); // Assert b/c diffCount == 3.
The assertion passes with the custom comparer but fails with the default one.
