How to group in Linq based on previous Value

How to group in Linq based on previous Value - c#

I want to group a pointcloud based on 2 conditions
simple on Y so I wrote pointcloudH.GroupBy(KVP => KVP.Value.Y) where KVP is an KeyValuePair<string,System.Drawing.Point>
and now I want to group it also by X if X == (previousX + 1)
as far as I know I should us ThenBy() but what do I have to write between the brackets?
and here an example for a better illustration what I want to achieve
Sample pointcloud
(x|y) (1|1),(2|1),(4|1),(1|2),(2|3),(3|3),(4|3),(5|8),(9|10)
after step 1. it looks like this
group1 (1|1),(2|1),(4|1)
group2 (1|2)
group3 (2|3),(3|3),(4|3)
group4 (5|8)
group5 (9|10)
after step 2. it should look like this
group1 (1|1),(2|1)
group2 (4|1)
group3 (1|2)
group4 (2|3),(3|3),(4|3)
group5 (5|8)
group6 (9|10)
current code
var Hgroup = pointcloudH.OrderBy(KVP => KVP.Value.Y) // order by Y
.GroupBy(KVP => KVP.Value.Y) // groub by Y
.ThenBy(KVP => KVP.Value.X); // group by X ???

I don't think LINQ is the best tool for this kind of job, but it can be achieved. The important part is to think of the relation between your Point.X and the index of the relative Point in the Point.Y group. Once you realize you want to group them by Point.X - Index, you can do:
var Hgroup = pointcloudH.OrderBy(p => p.Y)
.GroupBy(p => p.Y)
.SelectMany(yGrp =>
yGrp.Select((p, i) => new {RelativeIndex = p.X - i, Point = p})
.GroupBy(ip => ip.RelativeIndex, ip => ip.Point)
.Select(ipGrp => ipGrp.ToList()))
.ToList();
Note that this will probably perform worst than a regular iterative algorithm. My pointcloudH is an array, but you can just change the lambda to reflect your own list. Also, remove the ToList() if you want to defer execution. This was to ease the result inspection in the debugger.
If you want to group all points in a Point.Y group regardless of their index (ie order by Point.X as well. Add ThenBy(p => p.X) after the first OrderBy clause.

Your problem cannot be solved by doing 2 separate group by clauses. I have created a little sample which should work for your problem. These are the key things that are happening in the code:
Construct 'mirror' array and insert a copy of the first item at index 0, this is used to keep track of the previous point
Create a variable that is incremented whenever a 'chain' is broken. This is whenever the next value is not equal to the previous + 1. This way we can group by an unique key per 'chain'.
class Program
{
public struct Point
{
public static Point Create(int x, int y)
{
return new Point() { X = x, Y = y };
}
public int X { get; set; }
public int Y { get; set; }
public override string ToString()
{
return string.Format("({0}|{1})", X, Y);
}
}
static void Main(string[] args)
{
//helper to avoid to much keystrokes :)
var f = new Func<int, int, Point>(Point.Create);
//compose the point array
//(1|1),(2|1),(4|1),(1|2),(2|3),(3|3),(4|3),(5|8),(9|10)
var points = new[] { f(1, 1), f(2, 1), f(4, 1), f(1, 2), f(2, 3), f(3, 3), f(4, 3), f(5, 8), f(9, 10) }.OrderBy(p => p.Y).ThenBy(p => p.X);;
//create a 'previous point' array which is a copy of the source array with a item inserted at index 0
var firstPoint = points.FirstOrDefault();
var prevPoints = new[] { f(firstPoint.X - 1, firstPoint.Y) }.Union(points);
//keep track of a counter which will be the second group by key. The counter is raised whenever the previous X was not equal
//to the current - 1
int counter = 0;
//the actual group by query
var query = from point in points.Select((x, ix) => new { current = x, prev = prevPoints.ElementAt(ix) })
group point by new { point.current.Y, prev = (point.prev.X == point.current.X - 1 ? counter : ++counter) };
//method chaining equivalent
query = points.Select((x, ix) => new { current = x, prev = prevPoints.ElementAt(ix) })
.GroupBy(point => new { point.current.Y, prev = (point.prev.X == point.current.X - 1 ? counter : ++counter) });
//print results
foreach (var item in query)
Console.WriteLine(string.Join(", ", item.Select(x=> x.current)));
Console.Read();
}
}

Related

convert two loops over two dimensional array to Linq syntax

I have a two dimensional array containing objects of type MyObj.
private MyObj[,] myObjs = new MyObj[maxX, maxY];
I want to get the indices from the array when passing in a matching object. I want to get the x and y value from this array. I can return these two values as a Position object that takes a x and y coordinate.
private Position GetIndices(MyObj obj)
{
for (int x = 0; x < myObjs.GetLength(0); x++)
{
for (int y = 0; y < myObjs.GetLength(1); y++)
{
if (myObjs[x, y] == obj)
{
return new Position(x, y);
}
}
}
}
Is it possible to get this code shorten to some Linq code lines?

But I don't think, it looks nice :)
var result = Enumerable.Range(0, myObjs.GetLength(0))
.Select(x => Enumerable.Range(0, myObjs.GetLength(1)).Select(y => new { x, y }))
.SelectMany(o => o)
.FirstOrDefault(o => myObjs[o.x, o.y] == obj);

Here's another option, if you're interested. It uses an indexer inside the first select and does a little math to find where that index falls inside the two-dimensional array.
var o = new MyObj();
myObjs[1,2] = o;
var p = myObjs.Cast<MyObj>()
.Select((x,i) => Tuple.Create(x,i))
.Where(x => x.Item1 == o)
.Select(x => new Point(x.Item2 / myObjs.GetLength(1), x.Item2 % myObjs.GetLength(1)))
.SingleOrDefault();
Console.WriteLine(p); // prints {X=1,Y=2}
It sorta looks like you're considering the x-coordinate to be the height of the array, and the y-coordinate the width, in which case you'd want to switch it up slightly:
var p = myObjs.Cast<MyObj>()
.Select((x,i) => Tuple.Create(x,i))
.Where(x => x.Item1 == o)
.Select(x => new Point(x.Item2 % myObjs.GetLength(1), x.Item2 / myObjs.GetLength(1)))
.SingleOrDefault();
Console.WriteLine(p); // prints {X=2,Y=1}
I used Point instead of Position since it's built into .NET but you should be able to just swap one for the other.

Yes, that is possible. You can ask Resharper to do the job (loop to linq) for you. After installation, just use the feature.

Grouping data between ranges using LINQ in C#

I have made a following code to create me a range between two numbers, and data is separated in 7 columns:
private List<double> GetRangeForElements(double minPrice, double maxPrice)
{
double determineRange = Math.Round(maxPrice / 7.00d, 3);
var ranges = new List<double>();
ranges.Insert(0, Math.Round(minPrice, 3));
ranges.Insert(1, determineRange);
for (int i = 2; i < 8; i++)
{
ranges.Insert(i, Math.Round(determineRange * i, 3));
}
return ranges;
}
Now I have list of ranges when I call the method:
var ranges = GetRangeForElements(1,1500);
On the other side now I have the data (a list) that contains following data (properties):
public double TransactionPrice {get;set;}
public int SaleAmount {get;set;}
Input data would be:
Transaction price Sale amount
114.5 4
331.5 6
169.59 8
695.99 14
1222.56 5
Generated range for between 1 and 1500 is:
1
214.28
428.57
642.85
857.14
1071.43
1285.71
1500.00
And the desired output would be:
Range Sales
(1 - 214.28) 12
(214.28 - 428.57) 6
(428.57 - 642.85) 0
(642.85 - 857.14) 14
(857.14 - 1071.43) 0
(1071.43 - 1285.71) 5
(1285.71 - 1500) 0
I've tried something like this:
var priceGroups = _groupedItems.GroupBy(x => ranges.FirstOrDefault(r => r > x.TransactionPrice))
.Select(g => new { Price = g.Key, Sales = g.Sum(x=>x.Sales) })
.ToList();
But this doesn't gives me what I want, the results I receive are completely messed up (I was able to verify the data and results manually)...
Can someone help me out?
P.S. guys, the ranges that have no sales should simply have value set to 0...
#blinkenknight here's a pic of what I'm saying, min price is = 2.45 , max price = 2.45
and the output of the 2nd method you posted is:

Since GetRangeForElements returns a List<double>, you cannot group by it. However, you can group by range index, and then use that index to get the range back:
var rangePairs = ranges.Select((r,i) => new {Range = r, Index = i}).ToList();
var priceGroups = _groupedItems
.GroupBy(x => rangePairs.FirstOrDefault(r => r.Range >= x.TransactionPrice)?.Index ?? -1)
.Select(g => new { Price = g.Key >= 0 ? rangePairs[g.Key].Range : g.Max(x => x.TransactionPrice), Sales = g.Sum(x=>x.Sales) })
.ToList();
Assuming that _groupedItems is a list, you could also start with ranges, and produce the results directly:
var priceGroups = ranges.Select(r => new {
Price = r
, Sales = _groupedItems.Where(x=>ranges.FirstOrDefault(y=>y >= x.TransactionPrice) == r).Sum(x => x.Sales)
});
Note: Good chances are, your GetRangeForElements has an error: it assumes that minPrice is relatively small in comparison to maxPrice / 7.00d. To see this problem, consider what would happen if you pass minPrice=630 and maxPrice=700: you will get 630, 100, 200, 300, ... instead of 630, 640, 650, .... To fix this problem, compute (maxPrice - minPrice) / 7.00d and use it as a step starting at minPrice:
private List<double> GetRangeForElements(double minPrice, double maxPrice) {
double step = (maxPrice - minPrice) / 7.0;
return Enumerable.Range(0, 8).Select(i => minPrice + i*step).ToList();
}

Convert array of doubles to list object using linq

I have the following array of coordinates:
double[] points = { 1, 2, 3, 4, 5, 6 };
Then I have the following class:
public class clsPoint
{
public double X { get; set; }
public double Y { get; set; }
}
I need to copy the points into List objects. Where the first point in the array is the X and the second point in the array is the Y. Here is what I have so far but it is not correct:
List<clsPoint> lstPoints = points
.Select(coord => new clsPoint
{
X = coord[0],
Y = coord[1]
}).ToList();
Expected Results
clsPoint Objects List (lstPoints)
X = 1 , Y = 2
X = 3 , Y = 4
X = 5 , Y = 6
Any help would be appreciated. Thanks.

You can generate a sequence of consecutive values until the half your array, then you can project using those values as index to get the pairs.
var result=Enumerable.Range(0, points.Length / 2).Select(i=>new clsPoint{X=points[2*i],Y=points[2*i+1]});
Update
This is another solution using Zip extension method and one overload of Where extension method to get the index:
var r2 = points.Where((e, i) => i % 2 == 0)
.Zip(points.Where((e, i) => i % 2 != 0), (a, b) => new clsPoint{X= a, Y= b });

I think there is probably a better way for you to compose your points prior to feeding them into your class. A simple for loop may suffice better in this situation as well.
However, in LINQ, you would first use a projection to gather the index so that you could group based on pairs and then use a second projection from the grouping to populate the class.
It looks like this
points.Select((v,i) => new {
val = v,
i = i
}).GroupBy(o => o.i%2 != 0 ? o.i-1 : o.i).Select(g => new clsPoint() {
X = g.First().val,
Y = g.Last().val
});

Using the overload of Select that receives the current index you can set a grouping rule (in this case a different id for each 2 numbers), then group by it and eventually create your new clsPoint:
double[] points = { 1, 2, 3, 4, 5, 6 };
var result = points.Select((item, index) => new { item, index = index / 2 })
.GroupBy(item => item.index, item => item.item)
.Select(group => new clsPoint { X = group.First(), Y = group.Last() })
.ToList();
Doing it with a simple for loop would look like:
List<clsPoint> result = new List<clsPoint>();
for (int i = 0; i < points.Length; i += 2)
{
result.Add(new clsPoint { X = points[i], Y = points.ElementAtOrDefault(i+1) });
}

Remove duplicates in a list of XYZ points

Mylist.GroupBy(x => new{x.X, x.Y}).Select(g => g.First()).ToList<XYZ>();
The above code works fine for me. I only want to compare the points based on the round(5) of the point component.
For example x.X = 16.838974347323224 should be only compared as x.X = 16.83897 because I experienced some inaccuracy after the round 5. Any suggestions?
Solution:
Mylist.GroupBy(x => new { X = Math.Round(x.X,5), Y = Math.Round(x.Y,5) })
.Select(g => g.First()).ToList();

Using Round can create a situation where two numbers, even though incredibly close to each other, can end up being considered distinct.
Take this example:
var Mylist = new []
{
new { X = 1.0000051, Y = 1.0 },
new { X = 1.0000049, Y = 1.0 },
new { X = 1.1, Y = 1.0 },
new { X = 1.0, Y = 1.005 },
};
The first two values are very close - in fact they differ in the 6th decimal place.
By what if we run this code:
var result =
Mylist
.GroupBy(x => new
{
X = Math.Round(x.X,5, MidpointRounding.AwayFromZero),
Y = Math.Round(x.Y,5, MidpointRounding.AwayFromZero)
})
.Select(g => g.First())
.ToList();
The result is:
The rounding has allowed these two values to be kept.
The correct approach is to filter by distance. If a subsequent value is within a threshold of the previous values it should be discarded.
Here's the code that does that:
var threshold = 0.000001;
Func<double, double, double, double, double> distance
= (x0, y0, x1, y1) =>
Math.Sqrt(Math.Pow(x1 - x0, 2.0) + Math.Pow(y1 - y0, 2.0));
var result = Mylist.Skip(1).Aggregate(Mylist.Take(1).ToList(), (xys, xy) =>
{
if (xys.All(xy2 => distance(xy.X, xy.Y, xy2.X, xy2.Y) >= threshold))
{
xys.Add(xy);
}
return xys;
});
Now if we run that on the Mylist data we get this:
This is a better ideal for removing duplicates.

To do so use Math.Round:
var result = Mylist.GroupBy(x => new { X = Math.Round(x.X,5, MidpointRounding.AwayFromZero), Y = Math.Round(x.Y,5, MidpointRounding.AwayFromZero) })
.Select(g => g.First()).ToList();
However if what you want is to remove duplicates then instead of GroupBy go for one of these:
Select rounded and then Distinct:
var result = Mylist.Select(item => new XYZ { X = Math.Round(item.X,5, MidpointRounding.AwayFromZero),
Y = Math.Round(item.Y,5, MidpointRounding.AwayFromZero)})
.Distinct().ToList();
Distinct and override Equals and GetHashCode - (equals will do the rounding) - wouldn't suggest
Distinct and implement a custom IEqualityComparer:
public class RoundedXyzComparer : IEqualityComparer<XYZ>
{
public int RoundingDigits { get; set; }
public RoundedXyzComparer(int roundingDigits)
{
RoundingDigits = roundingDigits;
}
public bool Equals(XYZ x, XYZ y)
{
return Math.Round(x.X, RoundingDigits, MidpointRounding.AwayFromZero) == Math.Round(y.X, RoundingDigits, MidpointRounding.AwayFromZero) &&
Math.Round(x.Y,RoundingDigits, MidpointRounding.AwayFromZero) == Math.Round(y.Y, RoundingDigits, MidpointRounding.AwayFromZero);
}
public int GetHashCode(XYZ obj)
{
return Math.Round(obj.X, RoundingDigits, MidpointRounding.AwayFromZero).GetHashCode() ^
Math.Round(obj.Y, RoundingDigits, MidpointRounding.AwayFromZero).GetHashCode();
}
}
//Use:
myList.Distinct(new RoundedXyzComparer(5));

Finding maximum of Nth elements of a list of lists in C#

I have a list of lists in C#, where each sublist has three doubles, representing a 3D point:
{{x1, y1, z1},
{x2, y2, z2},
{x3, y3, z3}}
I want to find the 3D bounding box of this dataset, and that means finding minimum X, maximum X, minimum Y, maximum Y, etc.
With Python/Numpy, I would get it with, say, zmax = list_of_lists[:,2].max(), etc.
Is there an elegant way to do this in C#? I suspect Linq is the way to go, but I haven't understood how it works yet (if some answer includes Linq, please explain how it works, please :o)

Create your list like this, possibly use a specific class for 3D points rather than tuples if you require more functionality.
var points = new[] {
Tuple.Create(x1, y1, z1),
Tuple.Create(x2, y3, z2),
Tuple.Create(x3, y3, z3)
};
Then, like #dasblinkenlight writes, you can use linq to select Max() or Min():
var maxX = points.Select(pt => pt.Item1).Max();
Or even shorter:
var maxX = points.Max(pt => pt.Item1);
Edit: A simple class for ease of use:
class Point3D {
public double X { get; set; }
public double Y { get; set; }
public double Z { get; set; }
public Point3D(double x, double y, double z) {
this.X = x;
this.Y = y;
this.Z = z;
}
}
var points = new[] {
new Point3D(x1, y1, z1),
new Point3D(x2, y3, z2),
new Point3D(x3, y3, z3)
};
var maxX = points.Max(pt => pt.X);

Yes, you can do that in C# with LINQ, like this:
var orig = new List<List<double>>();
var maxX = orig.Select(pt => pt[0]).Max();
var maxY = orig.Select(pt => pt[1]).Max();
var maxZ = orig.Select(pt => pt[2]).Max();
The way this works in LINQ is that the list of points is traversed, for each point, the requested coordinate is selected, and then the Max is computed. There is also a Min function to get you the other corner of the bounding box.
This is somewhat suboptimal, because the list is traversed multiple times. A pair of nested loops would probably do the same thing more efficiently, while remaining just as readable:
var min = new List<double>{double.MaxValue, double.MaxValue, double.MaxValue};
var max = new List<double>{double.MinValue, double.MinValue, double.MinValue};
foreach (var point in orig) {
for (var i = 0 ; i != 3 ; i++) {
min[i] = Math.Min(min[i], point[i]);
max[i] = Math.Max(max[i], point[i]);
}
}

If this is really a IEnumerable of IEnumerables of doubles with always the same order of X, Y, Z, then you could do something like this:
var list = new []
{
new [] { 1, 2, 3 },
new [] { 4, 5, 6 },
new [] { 7, 8, 9 }
};
var maxX = list.Select(s => s.Skip(0).Take(1)).Max();
var minY = list.Select(s => s.Skip(1).Take(1)).Min();

If all elements are of the same type and the property you're interested in is named, say, Z, you can do this:
var max = list_of_points.Max(p => p.Z);
In your case, since you seem to be using a double[][], you can do this:
var max = list_of_lists.Max(p => p[0]);
So the full bounding box would be defined by two points, like this:
var topLeft = new double[]
{
list_of_lists.Min(p => p[0]),
list_of_lists.Min(p => p[1]),
list_of_lists.Min(p => p[2])
};
var bottomRight = new double[]
{
list_of_lists.Max(p => p[0]),
list_of_lists.Max(p => p[1]),
list_of_lists.Max(p => p[2])
};
However, this requires you to make 6 passes through the list (one for each coordinate for each point). You can do this for some performance improvement:
var topLeft = list_of_lists.Aggregate((s, p) => return double[] { Math.Min(s[0], p[0]), Math.Min(s[1], p[1]), Math.Min(s[2], p[2]) });
var bottomRight = list_of_lists.Aggregate((s, p) => return double[] { Math.Max(s[0], p[0]), Math.Max(s[1], p[1]), Math.Max(s[2], p[2]) });
This would make only 2 passes through the list.
NOTE: The same code can be done with List<List<double>> rather than double[][] by simply adding .ToList() after each array declaration.

Have you taken a look at the SelectMany linq operator? I think it may be helpful for your problem.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

How to group in Linq based on previous Value - c#

Related

convert two loops over two dimensional array to Linq syntax

Grouping data between ranges using LINQ in C#

Convert array of doubles to list object using linq

Remove duplicates in a list of XYZ points

Finding maximum of Nth elements of a list of lists in C#

Categories

Resources