Strange behavior of EqualityComparer with nullable fields


Assume there is this class:
public class Foo
{
public int Id { get; set; }
public int? NullableId { get; set; }
public Foo(int id, int? nullableId)
{
Id = id;
NullableId = nullableId;
}
}
I need to compare these objects by the following rules:
If both objects have a value for NullableId, compare both Id and NullableId.
If either object (or both) has no NullableId, ignore it and compare only Id.
To achieve this I have overridden Equals and GetHashCode like this:
public override bool Equals(object obj)
{
var otherFoo = (Foo)obj;
var equalityCondition = Id == otherFoo.Id;
if (NullableId.HasValue && otherFoo.NullableId.HasValue)
equalityCondition &= (NullableId== otherFoo.NullableId);
return equalityCondition;
}
public override int GetHashCode()
{
var hashCode = 806340729;
hashCode = hashCode * -1521134295 + Id.GetHashCode();
return hashCode;
}
Further down I have two lists of Foo:
var first = new List<Foo> { new Foo(1, null) };
var second = new List<Foo> { new Foo(1, 1), new Foo(1, 2), new Foo(1, 3) };
Next, I want to join these lists. If I do it like this:
var result = second.Join(first, s => s, f => f, (f, s) => new {f, s}).ToList();
then the result is what I expect: I get 3 items.
But if I change the order and join first with second:
var result = first.Join(second, f => f, s => s, (f, s) => new {f, s}).ToList();
then the result has only 1 item: new Foo(1, null) paired with new Foo(1, 3).
I can't work out what I'm doing wrong. If I put a breakpoint in the Equals method, I can see that it compares items from the same list (e.g. new Foo(1, 1) and new Foo(1, 2)). It looks to me like this happens because of the Lookup that is created inside the Join method.
Could someone clarify what happens there? What should I change to achieve the desired behavior?

Your Equals method is reflexive and symmetric, but it is not transitive.
Your implementation doesn't meet the requirements specified in the docs:
If (x.Equals(y) && y.Equals(z)) returns true, then x.Equals(z) returns true.
from https://learn.microsoft.com/en-us/dotnet/api/system.object.equals?view=netframework-4.8
For example, suppose you have:
var x = new Foo(1, 100);
var y = new Foo(1, null);
var z = new Foo(1, 200);
You have x.Equals(y) and y.Equals(z) which implies that you should also have x.Equals(z), but your implementation does not do this. Since you don't meet the specification, you can't expect any algorithms reliant on your Equals method to behave correctly.
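To make the broken transitivity concrete, here is a minimal sketch. The Foo class restates the equality rule from the question (with a type check instead of the direct cast), and the ToLookup call stands in for the lookup that Join builds over its inner sequence; that Join works this way internally is an implementation detail, so treat the second half as an illustration rather than a guarantee.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Foo
{
    public int Id { get; }
    public int? NullableId { get; }
    public Foo(int id, int? nullableId) { Id = id; NullableId = nullableId; }

    // Equality rule from the question: a missing NullableId matches anything.
    public override bool Equals(object obj)
    {
        if (!(obj is Foo other)) return false;
        if (Id != other.Id) return false;
        return !NullableId.HasValue || !other.NullableId.HasValue
            || NullableId == other.NullableId;
    }

    public override int GetHashCode() => Id.GetHashCode();
}

public static class TransitivityDemo
{
    public static void Main()
    {
        var x = new Foo(1, 100);
        var y = new Foo(1, null);
        var z = new Foo(1, 200);
        Console.WriteLine(x.Equals(y)); // True
        Console.WriteLine(y.Equals(z)); // True
        Console.WriteLine(x.Equals(z)); // False: transitivity is broken

        // Keys that are pairwise unequal end up in separate groups...
        var second = new List<Foo> { new Foo(1, 1), new Foo(1, 2), new Foo(1, 3) };
        var lookup = second.ToLookup(s => s);
        Console.WriteLine(lookup.Count); // 3

        // ...and a probe that "equals" all of them only ever finds one group.
        Console.WriteLine(lookup[new Foo(1, null)].Count()); // 1
    }
}
```

A hash-based lookup stops at the first group whose key compares equal, which is consistent with the single pairing the question reports for first.Join(second, ...).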
You ask what you can do instead. This depends on exactly what you need to do. Part of the problem is that it's not really clear what is intended in the corner-cases, if indeed they can appear. What should happen if one Id appears multiple times with the same NullableId in one or both lists? For a simple example, if new Foo(1, 1) exists in the first list three times, and the second list three times, what should be in the output? Nine items, one for each pairing?
Here's a naive attempt to solve your problem. This joins on only Id and then filters out any pairings that have incompatible NullableId. But you might not be expecting the duplicates when an Id appears multiple times in each list, as can be seen in the example output.
using System;
using System.Linq;
using System.Collections.Generic;
public class Foo
{
public int Id { get; set; }
public int? NullableId { get; set; }
public Foo(int id, int? nullableId)
{
Id = id;
NullableId = nullableId;
}
public override string ToString() => $"Foo({Id}, {NullableId?.ToString()??"null"})";
}
class MainClass {
public static IEnumerable<Foo> JoinFoos(IEnumerable<Foo> first, IEnumerable<Foo> second) {
return first
.Join(second, f=>f.Id, s=>s.Id, (f,s) => new {f,s})
.Where(fs =>
fs.f.NullableId == null ||
fs.s.NullableId == null ||
fs.f.NullableId == fs.s.NullableId)
.Select(fs => new Foo(fs.f.Id, fs.f.NullableId ?? fs.s.NullableId));
}
public static void Main (string[] args) {
var first = new List<Foo> { new Foo(1, null), new Foo(1, null), new Foo(1, 3) };
var second = new List<Foo> { new Foo(1, 1), new Foo(1, 2), new Foo(1, 3), new Foo(1, null) };
foreach (var f in JoinFoos(first, second)) {
Console.WriteLine(f);
}
}
}
Output:
Foo(1, 1)
Foo(1, 2)
Foo(1, 3)
Foo(1, null)
Foo(1, 1)
Foo(1, 2)
Foo(1, 3)
Foo(1, null)
Foo(1, 3)
Foo(1, 3)
It also might be too slow for you if you have tens of thousands of items with the same Id, because it builds up every possible pair with matching Id before filtering them out. If each list has 10,000 items with Id == 1 then that's 100,000,000 pairs to pick through.

My answer contains a program that I believe is better than the one proposed in Weeble's answer, but first I would like to demonstrate how the Join method works and talk about the problems I see in your approach.
As you can see at https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.join?view=netframework-4.8
the Join method
Correlates the elements of two sequences based on matching keys.
If the keys don't match then elements from both collections are not included. For example, remove your Equals and GetHashCode methods and try this code:
var first = new List<Foo> { new Foo(1, 1) };
var second = new List<Foo> { new Foo(1, 1), new Foo(1, 2), new Foo(1, 3) };
//This is your original code that returns no results
var result = second.Join(first, s => s, f => f, (f, s) => new { f, s }).ToList();
result = first.Join(second, s => s, f => f, (f, s) => new { f, s }).ToList();
//This code is mine and it returns in both calls of the Join method one element in the resulting collection; the element contains two instances of Foo (1,1) - f and s
result = second.Join(first, s => new { s.Id, s.NullableId }, f => new { f.Id, f.NullableId }, (f, s) => new { f, s }).ToList();
result = first.Join(second, s => new { s.Id, s.NullableId }, f => new { f.Id, f.NullableId }, (f, s) => new { f, s }).ToList();
But if you set your original data input that contains null with my code:
var first = new List<Foo> { new Foo(1, null) };
var second = new List<Foo> { new Foo(1, 1), new Foo(1, 2), new Foo(1, 3) };
var result = second.Join(first, s => new { s.Id, s.NullableId }, f => new { f.Id, f.NullableId }, (f, s) => new { f, s }).ToList();
result = first.Join(second, s => new { s.Id, s.NullableId }, f => new { f.Id, f.NullableId }, (f, s) => new { f, s }).ToList();
the result variable will be empty in both cases since the key { 1, null } doesn't match any other key, i.e. { 1, 1 }, { 1, 2 }, { 1, 3 }.
Now, returning to your question. I would suggest you reconsider your entire approach in cases like this, and here is why. Let us imagine that your implementation of the Equals and GetHashCode methods worked as you expected and you never needed to post your question. Then your solution has the following consequences, as I see it:
To understand how your code calculates its output, the user of your code has to have access to the code of the Foo type and spend time reviewing your implementation of the Equals and GetHashCode methods (or reading documentation).
With such an implementation of the Equals and GetHashCode methods, you are trying to change the expected behavior of the Join method. The user may expect that the first element of the first collection, Foo(1, null), will not be considered equal to the first element of the second collection, Foo(1, 1).
Let us imagine that you have multiple classes to join, each written by a different person, and each class has its own logic in the Equals and GetHashCode methods. To figure out how your joining actually works with each type, the user cannot just look into the joining method once. Instead they need to check the source code of all those classes to understand how each type handles its own comparison, facing different variations of things like this, with magic numbers (taken from your code):
public override int GetHashCode()
{
var hashCode = 806340729;
hashCode = hashCode * -1521134295 + Id.GetHashCode();
return hashCode;
}
It may not seem like a big problem, but imagine you are a new person on the project, you have a lot of classes with logic like this and limited time to complete your task, e.g. you have an urgent change request, huge sets of data input, and no unit tests.
If someone inherits from your class Foo and puts an instance of Foo1 into the collection along with Foo instances:
public class Foo1 : Foo
{
public Foo1(int id, int? nullableId) : base (id, nullableId)
{
Id = id;
NullableId = nullableId;
}
public override bool Equals(object obj)
{
var otherFoo1 = (Foo1)obj;
return Id == otherFoo1.Id;
}
public override int GetHashCode()
{
var hashCode = 806340729;
hashCode = hashCode * -1521134295 + Id.GetHashCode();
return hashCode;
}
}
var first = new List<Foo> { new Foo1(1, 1) };
var second = new List<Foo> { new Foo(1, 1), new Foo(1, 2), new Foo(1, 3)};
var result = second.Join(first, s => s, f => f, (f, s) => new { f, s }).ToList();
result = first.Join(second, s => s, f => f, (f, s) => new { f, s }).ToList();
then you get a run-time exception in the Equals method of the type Foo1: System.InvalidCastException, Message=Unable to cast object of type 'ConsoleApp1.Foo' to type 'ConsoleApp1.Foo1'.
With the same input data, my code would work fine in this situation:
var result = second.Join(first, s => s.Id, f => f.Id, (f, s) => new { f, s }).ToList();
result = first.Join(second, s => s.Id, f => f.Id, (f, s) => new { f, s }).ToList();
With your implementation of the Equals and GetHashCode methods when someone modifies the joining code like this:
var result = second.Join(first, s => new { s.Id, s.NullableId }, f => new { f.Id, f.NullableId }, (f, s) => new { f, s }).ToList();
result = first.Join(second, s => new { s.Id, s.NullableId }, f => new { f.Id, f.NullableId }, (f, s) => new { f, s }).ToList();
then your logic in the Equals and GetHashCode methods will be ignored and
you will have a different result.
In my opinion, this approach (with overriding Equals and GetHashCode methods) may be a source of multiple bugs. I think it is better when your code performing joining has an implementation that can be understood without any extra information, the implementation of the logic is concentrated within one method, the implementation is clear, predictable, maintainable, and it is simple to understand.
Please also note that with your input data:
var first = new List<Foo> { new Foo(1, null) };
var second = new List<Foo> { new Foo(1, 1), new Foo(1, 2), new Foo(1, 3) };
the code in Weeble's answer generates the following output:
Foo(1, 1)
Foo(1, 2)
Foo(1, 3)
while as far as I understand you asked for an implementation that with the input produces output that looks like this:
Foo(1, null), Foo(1, 1)
Foo(1, null), Foo(1, 2)
Foo(1, null), Foo(1, 3)
Please consider updating your solution with my code: it produces a result in the format you asked for, it is easier to understand, and it has the other advantages described above:
using System;
using System.Collections.Generic;
using System.Linq;
namespace ConsoleApp40
{
public class Foo
{
public int Id { get; set; }
public int? NullableId { get; set; }
public Foo(int id, int? nullableId)
{
Id = id;
NullableId = nullableId;
}
public override string ToString() => $"Foo({Id}, {NullableId?.ToString() ?? "null"})";
}
class Program
{
static void Main(string[] args)
{
var first = new List<Foo> { new Foo(1, null), new Foo(1, 5), new Foo(2, 3), new Foo(6, 2) };
var second = new List<Foo> { new Foo(1, 1), new Foo(1, 2), new Foo(1, 3), new Foo(2, null) };
var result = second.Join(first, s=>s.Id, f=>f.Id, (f, s) => new { f, s })
.Where(o => !((o.f.NullableId != null && o.s.NullableId != null) &&
(o.f.NullableId != o.s.NullableId)));
foreach (var o in result) {
Console.WriteLine(o.f + ", " + o.s);
}
Console.ReadLine();
}
}
}
Output:
Foo(1, 1), Foo(1, null)
Foo(1, 2), Foo(1, null)
Foo(1, 3), Foo(1, null)
Foo(2, null), Foo(2, 3)

Related

Join multiple lists of objects in c#

I have three lists that contain objects with following structure:
List1
- Status
- ValueA
List2
- Status
- ValueB
List3
- Status
- ValueC
I want to join the lists by status to get a final list that contains objects with the following structure:
- Status
- ValueA
- ValueB
- ValueC
Not every list contains every status, so a simple (left) join won't do it. Any ideas how to achieve the desired result? I tried with
var result = from first in list1
join second in list2 on first.Status equals second.Status into tmp1
from second in tmp1.DefaultIfEmpty()
join third in list3 on first.Status equals third.Status into tmp2
from third in tmp2.DefaultIfEmpty()
select new { ... };
But the result is missing a status. Here is a full MRE:
using System;
using System.Linq;
using System.Collections.Generic;
public class Program
{
public static void Main()
{
List<A> first = new List<A>() { new A("FOO", 1), new A("BAR", 2) };
List<B> second = new List<B>() { new B("FOO", 6), new B("BAR", 3) };
List<C> third = new List<C>() { new C("BAZ", 5) };
var result = from f in first
join s in second on f.Status equals s.Status into tmp1
from s in tmp1.DefaultIfEmpty()
join t in third on f.Status equals t.Status into tmp2
from t in tmp2.DefaultIfEmpty()
select new
{
Status = f.Status,
ValueA = f.ValueA,
ValueB = s.ValueB,
ValueC = t.ValueC,
};
}
}
public record A(string Status, int ValueA);
public record B(string Status, int ValueB);
public record C(string Status, int ValueC);

Unfortunately, it is unclear what should happen if a status occurs multiple times within one list, because your aggregate can only hold one value per status.
One way to solve this would be:
using System;
using System.Linq;
using System.Collections.Generic;
public class Program
{
public static void Main()
{
List<A> first = new List<A>() { new A("FOO", 1), new A("BAR", 2) };
List<B> second = new List<B>() { new B("FOO", 6), new B("BAR", 3) };
List<C> third = new List<C>() { new C("BAZ", 5) };
var allStates = first.Select(a => a.Status)
.Concat(second.Select(b => b.Status))
.Concat(third.Select(c => c.Status))
.Distinct();
var result = allStates
.Select(Status => new
{
Status,
ValueA = first.FirstOrDefault(a => a.Status == Status),
ValueB = second.FirstOrDefault(b => b.Status == Status),
ValueC = third.FirstOrDefault(c => c.Status == Status),
});
foreach (var item in result)
{
Console.WriteLine(item);
}
}
}
public record A(string Status, int ValueA);
public record B(string Status, int ValueB);
public record C(string Status, int ValueC);
Depending on the number of items to aggregate, and provided each status occurs at most once per list, it could make sense to convert your lists to a Dictionary<string, A>, Dictionary<string, B>, etc. to speed up the lookup, and do something like this in the aggregate:
ValueA = firstDict.ContainsKey(Status) ? firstDict[Status] : null
As a further improvement (this line performs the lookup twice), you could factor out a method like this:
private static T GetValueOrDefault<T>(IReadOnlyDictionary<string, T> dict, string status)
{
dict.TryGetValue(status, out T value);
return value;
}
And within the .Select() method call it with
ValueA = GetValueOrDefault(firstDict, Status);
Creating the dictionary for the list could be done with:
var firstDict = first.ToDictionary(a => a.Status);
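Putting the pieces above together, a complete version might look like the following sketch. The Row record, the StatusJoin class and the Find helper are names made up for this illustration; it assumes each status occurs at most once per list (ToDictionary throws otherwise) and flattens the values to nullable ints rather than returning the whole records.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public record A(string Status, int ValueA);
public record B(string Status, int ValueB);
public record C(string Status, int ValueC);

// Hypothetical result type, not part of the question.
public record Row(string Status, int? ValueA, int? ValueB, int? ValueC);

public static class StatusJoin
{
    // One dictionary lookup per key; returns null when the status is missing.
    private static T Find<T>(IReadOnlyDictionary<string, T> dict, string status) where T : class
        => dict.TryGetValue(status, out var value) ? value : null;

    public static List<Row> Combine(IEnumerable<A> first, IEnumerable<B> second, IEnumerable<C> third)
    {
        // ToDictionary throws if a status occurs twice within one list.
        var firstDict = first.ToDictionary(a => a.Status);
        var secondDict = second.ToDictionary(b => b.Status);
        var thirdDict = third.ToDictionary(c => c.Status);

        // Union of all statuses, then one row per status.
        return firstDict.Keys
            .Concat(secondDict.Keys)
            .Concat(thirdDict.Keys)
            .Distinct()
            .Select(status => new Row(
                status,
                Find(firstDict, status)?.ValueA,
                Find(secondDict, status)?.ValueB,
                Find(thirdDict, status)?.ValueC))
            .ToList();
    }

    public static void Main()
    {
        var rows = Combine(
            new List<A> { new A("FOO", 1), new A("BAR", 2) },
            new List<B> { new B("FOO", 6), new B("BAR", 3) },
            new List<C> { new C("BAZ", 5) });
        foreach (var row in rows)
            Console.WriteLine(row);
    }
}
```

With the sample data this yields three rows (FOO, BAR and BAZ), and the BAZ row has no ValueA or ValueB.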

With assumption that status names are unique per list here is a solution
in a single query with help of switch expressions (available since C# 8.0):
using System;
using System.Linq;
using System.Collections.Generic;
List<A> first = new List<A>() { new A("FOO", 1), new A("BAR", 2) };
List<B> second = new List<B>() { new B("FOO", 6), new B("BAR", 3) };
List<C> third = new List<C>() { new C("BAZ", 5) };
var result = first
// concat lists together
.Cast<object>()
.Concat(second)
.Concat(third)
// group on Status value with help of switch expression
.GroupBy(el => el switch {
A a => a.Status,
B b => b.Status,
C c => c.Status,
},
// project groups with anonymous type
(Status, group) => new {
Status,
ValueA = group.OfType<A>().Select(a => a.ValueA).Cast<int?>().FirstOrDefault(),
ValueB = group.OfType<B>().Select(b => b.ValueB).Cast<int?>().FirstOrDefault(),
ValueC = group.OfType<C>().Select(c => c.ValueC).Cast<int?>().FirstOrDefault()
});
public record A(string Status, int ValueA);
public record B(string Status, int ValueB);
public record C(string Status, int ValueC);

A plain left join can't do this. First collect all the keys, then left join each list against those keys:
var keys = first.Select(item => item.Status).ToList();
keys.AddRange(second.Select(item => item.Status));
keys.AddRange(third.Select(item => item.Status));
keys = keys.Distinct().ToList();
var result = (from k in keys join
f in first on k equals f.Status into tmp0
from f in tmp0.DefaultIfEmpty()
join s in second on k equals s.Status into tmp1
from s in tmp1.DefaultIfEmpty()
join t in third on k equals t.Status into tmp2
from t in tmp2.DefaultIfEmpty()
select new {
Status = k,
ValueA = f?.ValueA,
ValueB = s?.ValueB,
ValueC = t?.ValueC,
}
).ToList();

How to combine two different GroupedStreams in Rx.NET?

This question is similar, but it does not apply to my case, since that user needed to merge observable streams from the same IGroupedObservable, while I want to combine streams from different groups.
I have the following structures and streams:
type A = {
Id: int
Value: int
}
type B = {
Id: int
Value: int
}
//subjects to test input, just any source of As and Bs
let subjectA: Subject<A> = Subject.broadcast
let subjectB: Subject<B> = Subject.broadcast
//grouped streams
let groupedA: IObservable<IGroupedObservable<int, A>> = Observable.groupBy (fun a -> a.Id) subjectA
let groupedB: IObservable<IGroupedObservable<int, B>> = Observable.groupBy (fun b -> b.Id) subjectB
I want to somehow merge the internal observables of A and B when groupedA.Key = groupedB.Key, and get an observable of (A, B) pairs where A.Id = B.Id
The signature I want is something like
IObservable<IGroupedObservable<int, A>> -> IObservable<IGroupedObservable<int, B>> -> IObservable<IGroupedObservable<int, (A, B)>> where for all (A, B), A.Id = B.Id
I tried a bunch of combineLatest, groupJoin, filters and maps variations, but with no success.
I'm using F# with Rx.Net and FSharp.Control.Reactive, but if you know the answer in C# (or any language, really) please post it

Here is a custom operator GroupJoin that you could use. It is based on the Select, Merge, GroupBy and Where operators:
/// <summary>
/// Groups and joins the elements of two observable sequences, based on common keys.
/// </summary>
public static IObservable<(TKey Key, IObservable<TLeft> Left, IObservable<TRight> Right)>
GroupJoin<TLeft, TRight, TKey>(
this IObservable<TLeft> left,
IObservable<TRight> right,
Func<TLeft, TKey> leftKeySelector,
Func<TRight, TKey> rightKeySelector,
IEqualityComparer<TKey> keyComparer = null)
{
// Arguments validation omitted
keyComparer ??= EqualityComparer<TKey>.Default;
return left
.Select(x => (x, (TRight)default, Type: 1, Key: leftKeySelector(x)))
.Merge(right.Select(x => ((TLeft)default, x, Type: 2, Key: rightKeySelector(x))))
.GroupBy(e => e.Key, keyComparer)
.Select(g => (
g.Key,
g.Where(e => e.Type == 1).Select(e => e.Item1),
g.Where(e => e.Type == 2).Select(e => e.Item2)
));
}
Usage example:
var subjectA = new Subject<A>();
var subjectB = new Subject<B>();
IObservable<IGroupedObservable<int, (A, B)>> query = subjectA
.GroupJoin(subjectB, a => a.Id, b => b.Id)
.SelectMany(g => g.Left.Zip(g.Right, (a, b) => (g.Key, a, b)))
.GroupBy(e => e.Key, e => (e.a, e.b));

I'm not clear if this is what you want. So it may be helpful to clarify first with runner code. Assuming the following runner code:
var aSubject = new Subject<A>();
var bSubject = new Subject<B>();
var groupedA = aSubject.GroupBy(a => a.Id);
var groupedB = bSubject.GroupBy(b => b.Id);
//Initiate solution
solution.Merge()
.Subscribe(t => Console.WriteLine($"(Id = {t.a.Id}, AValue = {t.a.Value}, BValue = {t.b.Value} )"));
aSubject.OnNext(new A() { Id = 1, Value = 1 });
aSubject.OnNext(new A() { Id = 1, Value = 2 });
bSubject.OnNext(new B() { Id = 1, Value = 10 });
bSubject.OnNext(new B() { Id = 1, Value = 20 });
bSubject.OnNext(new B() { Id = 1, Value = 30 });
Do you want to see the following output:
(Id = 1, AValue = 1, BValue = 10)
(Id = 1, AValue = 2, BValue = 10)
(Id = 1, AValue = 1, BValue = 20)
(Id = 1, AValue = 2, BValue = 20)
(Id = 1, AValue = 1, BValue = 30)
(Id = 1, AValue = 2, BValue = 30)
If that's the case, you can get to solution as follows:
var solution = groupedA.Merge()
.Join(groupedB.Merge(),
_ => Observable.Never<Unit>(),
_ => Observable.Never<Unit>(),
(a, b) => (a, b)
)
.Where(t => t.a.Id == t.b.Id)
.GroupBy(g => g.a.Id);
I'll caution that there are memory/performance impacts here if this is part of a long-running process. This keeps all A and B objects in memory indefinitely, waiting to see if they can be paired off. To shorten the amount of time they're kept in memory, change the Observable.Never() calls to appropriate windows for how long to keep each object in memory.

As a start, this has the signature you want:
let cartesian left right =
rxquery {
for a in left do
for b in right do
yield a, b
}
let mergeGroups left right =
rxquery {
for (leftGroup : IGroupedObservable<'key, 'a>) in left do
for (rightGroup : IGroupedObservable<'key, 'b>) in right do
if leftGroup.Key = rightGroup.Key then
let merged = cartesian leftGroup rightGroup
yield {
new IGroupedObservable<_, _> with
member __.Key = leftGroup.Key
member __.Subscribe(observer) = merged.Subscribe(observer)
}
}
However, in my testing, the groups are all empty. I don't have enough Rx experience to know why, but perhaps someone else does.

How to perform a recursive function in c# using linq

I have the following table:
Column_1 Column_2
val_1 | val_14
val_2 | val_17
val_1 | val_2
val_4 | null
val_1 | val_3
val_20 | val_4
val_17 | null
val_2 | val_20
val_14 | val_6
val_14 | null
val_6 | null
val_3 | val_30
val_3 | val_19
I want to display Column_2 values
Eg: Select with Column_1 = val_1 will return (val_14, val_2, val_3) from Column_2.
Now, I want for each values in (val_14, val_2, val_3) to return also values from Column_2.
In summary:
val_1 => (val_14, val_2, val_3)
val_14 => (val_6, null)
val_6 => null
val_2 => (val_17, val_20)
val_17 => null
val_20 => (val_4)
val_4 => null
val_3 => (val_30, val_19)
etc...
Final output (val_14, val_2, val_3, val_6, val_17, val_20, val_4, val_30, val_19)
I have a function, with string parameter and list of all rows data
public List<string> MyFunction(string value)
{
return (from s in myListOfData where value.Contains(s.Column_1) select s).ToList();
}
This function returns only the first level.
How can I write this query in LINQ so that it returns all children? My attempts have been unsuccessful.
Thank you

The desired order of records is a bit tricky to get: it looks like you first want the plain first level and then want to traverse the tree in a down-left direction.
If order is not important you can:
public List<string> MyFunction(string value)
{
return myListOfData
// compare with ==: value.Contains would also match "val_1" inside "val_14" and recurse forever
.Where(x => x.Column_1 == value && x.Column_2 != null)
.Select(x => x.Column_2)
.Aggregate(new List<string>(), (t, x) => {
t.Add(x);
t.AddRange(MyFunction(x));
return t; })
.ToList();
}
However, this creates lots of intermediate lists, so it is better to return an enumerable:
public IEnumerable<string> MyFunction(string value)
{
// compare with ==: value.Contains would also match "val_1" inside "val_14"
foreach (var record in myListOfData.Where(x => x.Column_1 == value && x.Column_2 != null))
{
yield return record.Column_2;
foreach (var child in MyFunction(record.Column_2))
yield return child;
}
}
And then take ToList() of this IEnumerable.
Still, if order is important you need two functions:
public List<string> MyFunction(string value)
{
return myListOfData
// compare with ==: value.Contains would also match "val_1" inside "val_14"
.Where(x => x.Column_1 == value && x.Column_2 != null)
.Select(x => new Tuple<string, IEnumerable<string>>(x.Column_2, Traverse(x.Column_2)))
.Aggregate(new List<string>(), (t, x) => {
t.Add(x.Item1);
t.AddRange(x.Item2);
return t; })
.ToList();
}
public IEnumerable<string> Traverse(string value)
{
foreach (var record in myListOfData.Where(x => x.Column_1 == value && x.Column_2 != null))
{
yield return record.Column_2;
foreach (var child in Traverse(record.Column_2))
yield return child;
}
}

Assuming this is all in memory and nothing to do with an ORM.
You could use recursion. However, queues and stacks are safer and easier to debug.
Given some weird ill-defined class
public class Data
{
public int? Col1 { get; set; }
public int? Col2 { get; set; }
public Data(int? col1, int? col2)
{
Col1 = col1;
Col2 = col2;
}
}
You could use an iterator method and a Queue
public static IEnumerable<int> GetRecusive(List<Data> source,int val)
{
var q = new Queue<int>();
q.Enqueue(val);
while (q.Any())
{
var current = q.Dequeue();
var potential = source.Where(x => x.Col1 == current && x.Col2 != null);
foreach (var item in potential)
{
yield return item.Col2.Value;
q.Enqueue(item.Col2.Value);
}
}
}
Usage
// some ill-defined test data
var list = new List<Data>()
{
new Data(1, 14),
new Data(2, 17),
new Data(1, 2),
new Data(4, null),
new Data(1, 3),
new Data(20, 4),
new Data(17, null),
new Data(2, 20),
new Data(14, 6),
new Data(14, null),
new Data(6, null),
new Data(3, 30),
new Data(3, 19),
};
var results = GetRecusive(list,1);
// compose as a comma separated list
Console.WriteLine(string.Join(", ",results));
Output
14, 2, 3, 6, 17, 20, 30, 19, 4
If you like, you can turn it into an extension method to give you a LINQ Chain Method feel
public static IEnumerable<int> GetRecusive(this List<Data> source, int val)
Important note: if you have circular references, then kiss your app goodbye. This applies to recursion and queues alike. If you need to protect against this, I suggest using a HashSet of visited ids.
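As a rough sketch of that HashSet idea, the queue-based iterator can skip any value it has already produced. The names GetDescendants and CycleSafeTraversal are made up for this example, and marking the start value as visited (so it is never emitted, even when a cycle leads back to it) is a design choice of this sketch rather than something the question specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Data
{
    public int? Col1 { get; }
    public int? Col2 { get; }
    public Data(int? col1, int? col2) { Col1 = col1; Col2 = col2; }
}

public static class CycleSafeTraversal
{
    public static IEnumerable<int> GetDescendants(List<Data> source, int start)
    {
        // Every value seen so far; the start value counts as seen.
        var visited = new HashSet<int> { start };
        var queue = new Queue<int>();
        queue.Enqueue(start);

        while (queue.Count > 0)
        {
            var current = queue.Dequeue();
            foreach (var item in source.Where(x => x.Col1 == current && x.Col2 != null))
            {
                var child = item.Col2.Value;
                // Add returns false for a value that was already seen,
                // which breaks cycles and drops repeated values.
                if (visited.Add(child))
                {
                    yield return child;
                    queue.Enqueue(child);
                }
            }
        }
    }

    public static void Main()
    {
        // 1 -> 2 -> 3 -> 1 is a cycle; the traversal still terminates.
        var cyclic = new List<Data> { new Data(1, 2), new Data(2, 3), new Data(3, 1) };
        Console.WriteLine(string.Join(", ", GetDescendants(cyclic, 1))); // 2, 3
    }
}
```

Note that this also de-duplicates values that are reachable along more than one path, which may or may not be what you want.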

How to remove duplicate pairs in a List

I have a List with pairs of integers. How do I remove pairs if they're duplicates? Distinct won't work because the pair could be (2, 1) instead of (1, 2).
My list looks like this:
1, 2
2, 3
3, 1
3, 2
2, 4
4, 3
... I don't need (2, 3) and (3, 2)
I made a public struct FaceLine with public int A and B, then var faceline = new List<FaceLine>();.
I'm new to C# and lost.

You could use a custom IEqualityComparer<FaceLine>:
public class UnorderedFacelineComparer : IEqualityComparer<FaceLine>
{
public bool Equals(FaceLine x, FaceLine y)
{
int x1 = Math.Min(x.A, x.B);
int x2 = Math.Max(x.A, x.B);
int y1 = Math.Min(y.A, y.B);
int y2 = Math.Max(y.A, y.B);
return x1 == y1 && x2 == y2;
}
public int GetHashCode(FaceLine obj)
{
return obj.A ^ obj.B;
}
}
Then the query is very simple:
var comparer = new UnorderedFacelineComparer();
List<FaceLine> nonDupList = faceLine
.GroupBy(fl => fl, comparer)
.Where(g => g.Count() == 1)
.Select(g => g.First())
.ToList();
If you want to keep one item from each group of duplicates, you just need to remove the Where:
List<FaceLine> nonDupList = faceLine
.GroupBy(fl => fl, comparer)
.Select(g => g.First())
.ToList();
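For the keep-one-of-each case, the same kind of comparer can also be passed straight to Distinct, which avoids the grouping step. This is a sketch: the FaceLine struct is reconstructed from the question's description (public ints A and B), the comparer is a compact variant of the one above, and the DistinctDemo class is a made-up name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct FaceLine
{
    public int A;
    public int B;
    public FaceLine(int a, int b) { A = a; B = b; }
}

public class UnorderedFaceLineComparer : IEqualityComparer<FaceLine>
{
    // (a, b) equals (b, a).
    public bool Equals(FaceLine x, FaceLine y) =>
        (x.A == y.A && x.B == y.B) || (x.A == y.B && x.B == y.A);

    // XOR is symmetric, so both orderings hash alike.
    public int GetHashCode(FaceLine obj) => obj.A ^ obj.B;
}

public static class DistinctDemo
{
    public static List<FaceLine> KeepOnePerPair(IEnumerable<FaceLine> lines) =>
        lines.Distinct(new UnorderedFaceLineComparer()).ToList();

    public static void Main()
    {
        var faceLine = new List<FaceLine>
        {
            new FaceLine(1, 2), new FaceLine(2, 3), new FaceLine(3, 1),
            new FaceLine(3, 2), new FaceLine(2, 4), new FaceLine(4, 3)
        };
        Console.WriteLine(KeepOnePerPair(faceLine).Count); // 5
    }
}
```

So the plain Distinct() does not help here, but the overload that takes an IEqualityComparer does.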

If you're happy using the common DistinctBy Linq extension (available via NuGet) you can do this fairly simply like so:
var result = list.DistinctBy(x => (x.A > x.B) ? (x.A, x.B) : (x.B, x.A));
Sample console app:
using System;
using System.Collections.Generic;
using MoreLinq;
namespace Demo
{
class Test
{
public Test(int a, int b)
{
A = a;
B = b;
}
public readonly int A;
public readonly int B;
public override string ToString()
{
return $"A={A}, B={B}";
}
}
class Program
{
static void Main()
{
var list = new List<Test>
{
new Test(1, 2),
new Test(2, 3),
new Test(3, 1),
new Test(3, 2),
new Test(2, 4),
new Test(4, 3)
};
var result = list.DistinctBy(x => (x.A > x.B) ? (x.A, x.B) : (x.B, x.A));
foreach (var item in result)
Console.WriteLine(item);
}
}
}

Using Linq :
List<List<int>> data = new List<List<int>>() {
new List<int>() {1, 2},
new List<int>() {2, 3},
new List<int>() {3, 1},
new List<int>() {3, 2},
new List<int>() {2, 4},
new List<int>() {4, 3}
};
List<List<int>> results =
data.Select(x => (x.First() < x.Last())
? new { first = x.First(), last = x.Last() }
: new { first = x.Last(), last = x.First() })
.GroupBy(x => x)
.Select(x => new List<int>() { x.First().first, x.First().last }).ToList();

Form a set of sets and you get the functionality for free (each smaller set contains exactly two integers).
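One way to read the set-of-sets suggestion in C# is a HashSet of HashSet<int> that uses the comparer returned by HashSet<int>.CreateSetComparer(), which compares inner sets by their contents. The following is only a sketch of that idea; the SetOfSetsDemo class and UniquePairs method are names invented here, and the pairs are given as tuples for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SetOfSetsDemo
{
    public static HashSet<HashSet<int>> UniquePairs(IEnumerable<(int A, int B)> pairs)
    {
        // The set comparer treats {2, 3} and {3, 2} as the same element.
        var unique = new HashSet<HashSet<int>>(HashSet<int>.CreateSetComparer());
        foreach (var (a, b) in pairs)
            unique.Add(new HashSet<int> { a, b });
        return unique;
    }

    public static void Main()
    {
        var pairs = new[] { (1, 2), (2, 3), (3, 1), (3, 2), (2, 4), (4, 3) };
        var unique = UniquePairs(pairs);
        Console.WriteLine(unique.Count); // 5
        foreach (var set in unique)
            Console.WriteLine(string.Join(", ", set));
    }
}
```

Keep in mind that a pair such as (1, 1) collapses to a one-element set with this representation.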

How do I merge records using LINQ?

I'd like to merge two records using a condition for each column in the row. I'd give you a code sample but I don't know where to start.
class Foo
{
public int i {get;set;}
public int b{get;set;}
public string first{get;set;}
public string last{get;set;}
}
//...
var list = new List<Foo>() {
new Foo () { i=1, b=0, first="Vince", last="P"},
new Foo () { i=1, b=1, first="Vince", last="P"},
new Foo () { i=1, b=0, first="Bob", last="Z"},
new Foo () { i=0, b=1, first="Bob", last="Z"},
} ;
// This is how I'd like my result to look like
// Record 1 - i = 1, b = 1, first="Vince", last = "P"
// Record 2 - i = 1, b = 1, first="Bob", last = "Z"

You can group the result, then aggregate the fields from the items in the group:
var result = list.GroupBy(f => f.first).Select(
g => new Foo() {
b = g.Aggregate(0, (a, f) => a | f.b),
i = g.Aggregate(0, (a, f) => a | f.i),
first = g.Key,
last = g.First().last
}
);

You could use the Aggregate method in LINQ.
First add a method to Foo, say Merge that returns a new Foo based on your merging rules.
public Foo Merge (Foo other)
{
// Implement merge rules here ...
return new Foo {..., b=Math.Max(this.b, other.b), ...};
}
You could also, instead, create a helper method outside the Foo class that does the merging.
Now use Aggregate over your list, using the first element as the seed, merging each record with the current aggregate value as you go. Or, instead of using Aggregate (since it's a somewhat contrived use of LINQ in this case), just do:
Foo result = list.First();
foreach (var item in list.Skip(1)) result = result.Merge(item);
How are your merge rules specified?
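Until the rules are spelled out, here is one possible sketch that reproduces the expected output from the question. It assumes, purely for illustration, that records are grouped by first and last name and that i and b each take the largest value in the group; the MergeDemo class and MergeByName method are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Foo
{
    public int i { get; set; }
    public int b { get; set; }
    public string first { get; set; }
    public string last { get; set; }

    // Assumed merge rule: keep the larger flag value, keep this record's names.
    public Foo Merge(Foo other) => new Foo
    {
        i = Math.Max(i, other.i),
        b = Math.Max(b, other.b),
        first = first,
        last = last
    };
}

public static class MergeDemo
{
    public static List<Foo> MergeByName(IEnumerable<Foo> list) =>
        list.GroupBy(f => (f.first, f.last))
            // Fold each group into a single record, pairwise.
            .Select(g => g.Aggregate((acc, next) => acc.Merge(next)))
            .ToList();

    public static void Main()
    {
        var list = new List<Foo>
        {
            new Foo { i = 1, b = 0, first = "Vince", last = "P" },
            new Foo { i = 1, b = 1, first = "Vince", last = "P" },
            new Foo { i = 1, b = 0, first = "Bob", last = "Z" },
            new Foo { i = 0, b = 1, first = "Bob", last = "Z" },
        };
        foreach (var f in MergeByName(list))
            Console.WriteLine($"i = {f.i}, b = {f.b}, first = {f.first}, last = {f.last}");
    }
}
```

Because Merge returns a new Foo, the original records are left untouched.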

I found a non-elegant solution that works
var result = list.GroupBy(i=>i.first);
foreach (IGrouping<string, Foo> grp in result)
{
grp.Aggregate ((f1, f2) => {
return new Foo() {
b = f1.b | f2.b,
i = f1.i | f2.i,
first = f1.first,
last = f1.last
};
});
}
