most efficient way to search in sorted binary tree - c#

Suppose I have a binary search tree and I am given the root of the tree (values on the left
are smaller than values on the right). The tree stores IP addresses, e.g.:
       2.1.1.7
       /     \
      /       \
1.1.10.17   3.4.4.5
I need to write a function that searches for a specific address.
So far I have done an in-order traversal like this:
private HashSet<string> adr = new HashSet<string>();
void Inorder(Node root){
if(root.Left != null)
Inorder(root.Left);
adr.Add(root.Data);// <----root.Data it's an ip address (string)
if(root.Right != null)
Inorder(root.Right);
}
Constructor:
private Node root;// <--- points to the root of the addresses tree
public MyClass(){
Inorder(root);
}
Function:
bool FindAddress(string address){
return adr.Contains(address);
}
But my method does not use the fact that the tree is sorted. Do you have an idea for better performance, with a loop or recursion?

You could write your FindAddress function as follows to take advantage of the fact the data is sorted:
var node = FindAddress(IPAddress.Parse(searchAddress), assembledTree, new IPAddressCompare());
static Node FindAddress(IPAddress address, Node root, IComparer<IPAddress> addressCompare)
{
if (root == null) return null;
var comp = addressCompare.Compare(IPAddress.Parse(root.Data), address);
if (comp == 0) return root;
if (comp < 0) return FindAddress(address, root.Left, addressCompare);
if (comp > 0) return FindAddress(address, root.Right, addressCompare);
return null;
}
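This uses a custom comparer that compares two IP addresses by converting each to a UInt32, treating the bytes at the start of the address as most significant. Note that Compare returns intB.CompareTo(intA), i.e. the reverse of the usual order, which is why FindAddress goes left when comp < 0. The Reverse() call needs using System.Linq, and the byte order assumes a little-endian machine (see BitConverter.IsLittleEndian).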
public class IPAddressCompare : IComparer<IPAddress>
{
public int Compare(IPAddress x, IPAddress y)
{
var intA = BitConverter.ToUInt32(x.GetAddressBytes().Reverse().ToArray(), 0);
var intB = BitConverter.ToUInt32(y.GetAddressBytes().Reverse().ToArray(), 0);
return intB.CompareTo(intA);
}
}
Full example: https://dotnetfiddle.net/viRy5b
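Since the question asks for a loop as well as recursion, here is a minimal iterative sketch. It assumes a Node class with string Data and Left/Right fields as in the question, and it assumes the tree is ordered numerically by address rather than by string comparison (as strings, "10.0.0.1" sorts before "9.0.0.1"). The names AddressTree, ToKey and Find are made up for illustration.

```csharp
using System;
using System.Net;

public class Node
{
    public string Data;   // dotted IPv4 string, as in the question
    public Node Left;
    public Node Right;
}

public static class AddressTree
{
    // Packs the four IPv4 bytes into a UInt32 with the first byte most significant.
    public static uint ToKey(string address)
    {
        byte[] b = IPAddress.Parse(address).GetAddressBytes();
        if (b.Length != 4) throw new ArgumentException("IPv4 address expected", nameof(address));
        return ((uint)b[0] << 24) | ((uint)b[1] << 16) | ((uint)b[2] << 8) | b[3];
    }

    // Iterative lookup: one comparison per level, no recursion.
    public static Node Find(Node root, string address)
    {
        uint target = ToKey(address);
        Node current = root;
        while (current != null)
        {
            uint key = ToKey(current.Data);
            if (target == key) return current;
            // Smaller values live in the left subtree, larger ones in the right.
            current = target < key ? current.Left : current.Right;
        }
        return null; // not in the tree
    }
}
```

On a reasonably balanced tree this visits O(log n) nodes instead of all n. Parsing each node's string on every visit is wasteful; storing the parsed key in the node would avoid it.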

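I would go with a simple sorted list and use binary search with a comparer. It avoids the hassle of creating your own tree, and lookups are O(log n). Note that each insert into the list is O(n), because the following elements have to be shifted.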
using System;
using System.Collections.Generic;
using System.Net;
class app
{
static void Main()
{
List<IPAddress> sortedIPs = new List<IPAddress>();
AddToList(sortedIPs, new byte[4] { 6, 10, 54, 100 });
AddToList(sortedIPs, new byte[4] { 143, 0, 254, 10 });
AddToList(sortedIPs, new byte[4] { 48, 0, 0, 1 });
AddToList(sortedIPs, new byte[4] { 0, 0, 82, 19 });
AddToList(sortedIPs, new byte[4] { 13, 0, 254, 1 });
AddToList(sortedIPs, new byte[4] { 63, 93, 4, 111 });
AddToList(sortedIPs, new byte[4] { 98, 3, 74, 1 });
AddToList(sortedIPs, new byte[4] { 98, 4, 74, 1 });
AddToList(sortedIPs, new byte[4] { 98, 3, 14, 1 });
AddToList(sortedIPs, new byte[4] { 98, 3, 14, 2 });
AddToList(sortedIPs, new byte[4] { 7, 175, 25, 65 });
AddToList(sortedIPs, new byte[4] { 46, 86, 21, 91 });
IPAddress findAddress = new IPAddress(new byte[4] { 48, 0, 0, 1 });
int index = sortedIPs.BinarySearch(findAddress, new IPAddressComparer());
}
private static void AddToList(List<IPAddress> list, byte[] address)
{
IPAddress a1 = new IPAddress(address);
IPAddressComparer ipc = new IPAddressComparer();
int index = list.BinarySearch(a1, ipc);
if (index >= 0) throw new Exception("IP address already exists in list");
list.Insert(~index, a1);
}
public class IPAddressComparer : IComparer<IPAddress>
{
public int Compare(IPAddress x, IPAddress y)
{
byte[] xb = x.GetAddressBytes();
byte[] yb = y.GetAddressBytes();
for (int i = 0; i < 4; i++)
{
int result = xb[i].CompareTo(yb[i]);
if (result != 0) return result;
}
return 0;
}
}
}
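The AddToList method above relies on how List<T>.BinarySearch reports a miss: it returns a negative number whose bitwise complement is the index where the item would have to be inserted to keep the list sorted. A minimal sketch of that contract, using plain uint keys instead of IPAddress to keep it short (SortedListDemo and its methods are illustrative names only):

```csharp
using System.Collections.Generic;

public static class SortedListDemo
{
    // Inserts value keeping the list sorted; returns false if it was already present.
    public static bool InsertSorted(List<uint> sorted, uint value)
    {
        int index = sorted.BinarySearch(value);
        if (index >= 0) return false;   // found: index is the position of the match
        sorted.Insert(~index, value);   // not found: ~index is the insertion point
        return true;
    }

    // A lookup only needs the sign of the result.
    public static bool Contains(List<uint> sorted, uint value) =>
        sorted.BinarySearch(value) >= 0;
}
```

The same pattern applies unchanged when a custom IComparer<T> is passed as the second argument to BinarySearch.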

Related

How to subdivide a list of doubles into n chunks having first element of next series be the last of previous series

I have a list like:
1.-10
2.-11
3.-12
4.-13
5.-14
6.-15
7.-16
8.-17
9.-18
10.-19
11.-20
I want to split the list into chunks of n elements, where each chunk starts with the last element of the previous chunk. For instance, n=4 would result in 3 lists:
first list
1.-10
2.-11
3.-12
4.-13
second list
1.-13
2.-14
3.-15
4.-16
third list
1.-16
2.-17
3.-18
4.-19
The remainder is an incomplete list, so it is discarded:
1.-19
2.-20
This is what I am doing:
public static void Main()
{
var list = new List<double>()
{
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
};
var subLists = SplitList(list, 4);
}
public static List<List<T>> SplitList<T>(IList<T> source, int chunkSize)
{
var chunks = new List<List<T>>();
List<T> chunk = null;
var total = source.Count;
var discarded = total % chunkSize;
for (var i = 0; i < total - discarded; i++)
{
if (i % chunkSize == 0)
{
chunk = new List<T>(chunkSize);
chunks.Add(chunk);
}
chunk?.Add(source[i]);
}
return chunks;
}
But it gets:
1.-10
2.-11
3.-12
4.-13
1.-14
2.-15
3.-16
4.-17
Use the LINQ Skip and Take methods:
public static void Main()
{
var list = new List<double>() { 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 };
List<List<double>> chunks = SplitList(list, 4);
}
public static List<List<T>> SplitList<T>(IList<T> source, int chunkSize)
{
List<List<T>> chunks = new List<List<T>>();
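// Step by chunkSize - 1 so each chunk starts on the previous chunk's last element.
// chunkSize must be at least 2, otherwise i never advances.
for (int i = 0; i < source.Count; i += (chunkSize - 1))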
{
var subList = source.Skip(i).Take(chunkSize).ToList();
if (subList.Count == chunkSize)
{
chunks.Add(subList);
}
}
return chunks;
}
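Because Skip(i) walks the source from the start on every iteration, the version above does more work than necessary on long lists. A possible alternative sketch, assuming the input is a List<T> so that GetRange is available (Chunking and SplitOverlapping are illustrative names):

```csharp
using System;
using System.Collections.Generic;

public static class Chunking
{
    // Splits source into chunks of chunkSize where each chunk starts with the
    // last element of the previous chunk; a trailing incomplete chunk is dropped.
    public static List<List<T>> SplitOverlapping<T>(List<T> source, int chunkSize)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (chunkSize < 2)
            throw new ArgumentOutOfRangeException(nameof(chunkSize), "Need at least 2 so the window advances.");

        var chunks = new List<List<T>>();
        int step = chunkSize - 1; // overlap of exactly one element
        for (int start = 0; start + chunkSize <= source.Count; start += step)
            chunks.Add(source.GetRange(start, chunkSize));
        return chunks;
    }
}
```

The loop condition start + chunkSize <= source.Count is what discards the incomplete tail, so no Count check is needed after building each chunk.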
Based on this answer (Split List into Sublists with LINQ), you can use LINQ for that task. Note that this version produces non-overlapping chunks and keeps a shorter final chunk:
using System.Collections.Generic;
using System.Linq;
namespace SplitExample
{
public class Program
{
public static void Main()
{
var list = new List<double>()
{
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
};
var subLists = Split<double>(list, 3);
}
public static List<List<T>> Split<T>(List<T> source, int chunkSize)
{
return source
.Select((x, i) => new { Index = i, Value = x })
.GroupBy(x => x.Index / chunkSize)
.Select(x => x.Select(v => v.Value).ToList())
.ToList();
}
}
}

List grouping dynamically

I want to group a list of objects that each contain a List<int>:
List<CNode> cNodes
and CNode is
public class CNode
{
public List<int> Elements;
// ...
}
I can group the cNodes like this:
var groups = cNodes.GroupBy(node => node.Elements[0]);
foreach (var group in groups )
{
// ...
}
But as you can see, the grouping depends only on the first element. I want to group by all elements.
For example, if node.Elements.Count == 5, the expected grouping result should be the same as for:
var groups = cNodes.GroupBy(node => new
{
A = node.Elements[0],
B = node.Elements[1],
C = node.Elements[2],
D = node.Elements[3],
E = node.Elements[4]
});
I couldn't find the solution.
Thanks.
You can use something like node.Elements.Take(5) with a proper IEqualityComparer, like this:
using System;
using System.Collections.Generic;
using System.Linq;
namespace ConsoleApp
{
class Program
{
static void Main(string[] args)
{
var cNodes = new List<CNode>
{
new CNode{Elements = new List<int>{ 0, 0, 1, 1, 1 } },
new CNode{Elements = new List<int>{ 0, 0, 0, 1, 1 } },
new CNode{Elements = new List<int>{ 0, 1, 1, 0 } },
new CNode{Elements = new List<int>{ 0, 1, 1, 0, 0 } },
new CNode{Elements = new List<int>{ 0, 0, 0, 0 } },
new CNode{Elements = new List<int>{ 0, 0, 0, 0 } }
};
Console.WriteLine("\tGroup by 2:");
foreach (var group in cNodes.GroupByElements(2))
Console.WriteLine($"{string.Join("\n", group)}\n");
Console.WriteLine("\tGroup by 3:");
foreach (var group in cNodes.GroupByElements(3))
Console.WriteLine($"{string.Join("\n", group)}\n");
Console.WriteLine("\tGroup by all:");
foreach (var group in cNodes.GroupByElements())
Console.WriteLine($"{string.Join("\n", group)}\n");
}
}
static class CNodeExtensions
{
public static IEnumerable<IGrouping<IEnumerable<int>, CNode>> GroupByElements(this IEnumerable<CNode> nodes) =>
nodes.GroupByElements(nodes.Min(node => node.Elements.Count));
public static IEnumerable<IGrouping<IEnumerable<int>, CNode>> GroupByElements(this IEnumerable<CNode> nodes, int count) =>
nodes.GroupBy(node => node.Elements.Take(count), new SequenceCompare());
private class SequenceCompare : IEqualityComparer<IEnumerable<int>>
{
public bool Equals(IEnumerable<int> x, IEnumerable<int> y) => x.SequenceEqual(y);
public int GetHashCode(IEnumerable<int> obj)
{
unchecked
{
var hash = 17;
foreach (var i in obj)
hash = hash * 23 + i.GetHashCode();
return hash;
}
}
}
}
internal class CNode
{
public List<int> Elements;
public override string ToString() => string.Join(", ", Elements);
}
}
Output is:
    Group by 2:
0, 0, 1, 1, 1
0, 0, 0, 1, 1
0, 0, 0, 0
0, 0, 0, 0

0, 1, 1, 0
0, 1, 1, 0, 0

    Group by 3:
0, 0, 1, 1, 1

0, 0, 0, 1, 1
0, 0, 0, 0
0, 0, 0, 0

0, 1, 1, 0
0, 1, 1, 0, 0

    Group by all:
0, 0, 1, 1, 1

0, 0, 0, 1, 1

0, 1, 1, 0
0, 1, 1, 0, 0

0, 0, 0, 0
0, 0, 0, 0
You wrote:
I want to group it by all elements
The solution given by Alex will only group by a limited number of elements. You said you want to group it by all elements, even if you have a CNode with 100 elements. Besides: his solution also crashes if property Elements of one of the CNodes equals null.
So let's create a solution that meets your requirement.
The return value will be a sequence of groups, where every group has a Key, which is a sequence of ints. The items in a group are all source CNodes whose property Elements is equal to the Key.
By equal you mean SequenceEqual: Elements[0] == Key[0], Elements[1] == Key[1], etc., and both sequences have the same length.
And of course, you may want to decide when Elements[0] equals Key[0]. For ints the default equality is normally what you want, but the second overload accepts an IEqualityComparer<int> in case you need a different rule.
// both overloads live in a static class, so they can be used as extension methods
// overload without IEqualityComparer, calls the overload with IEqualityComparer:
public static IEnumerable<IGrouping<IEnumerable<int>, CNode>> GroupBy(
this IEnumerable<CNode> cNodes)
{
return GroupBy(cNodes, null);
}
// overload with IEqualityComparer; uses the default int comparer if the parameter equals null
public static IEnumerable<IGrouping<IEnumerable<int>, CNode>> GroupBy(
this IEnumerable<CNode> cNodes,
IEqualityComparer<int> elementComparer)
{
// TODO: check cNodes != null
if (elementComparer == null) elementComparer = EqualityComparer<int>.Default;
ElementSequenceComparer sequenceComparer = new ElementSequenceComparer()
{
ElementComparer = elementComparer,
};
// this call resolves to the LINQ GroupBy (key selector + key comparer)
return cNodes.GroupBy(node => (IEnumerable<int>)node.Elements, sequenceComparer);
}
You've noticed I've transferred my problem to a new EqualityComparer: this comparer takes two sequences of ints and declares them equal if they SequenceEqual, using the provided IEqualityComparer<int>:
class ElementSequenceComparer : IEqualityComparer<IEnumerable<int>>
{
public IEqualityComparer<int> ElementComparer {get; set;}
public bool Equals(IEnumerable<int> x, IEnumerable<int> y)
{
// returns true if same sequence, using ElementComparer
// TODO: implement
}
public int GetHashCode(IEnumerable<int> obj)
{
// must return the same value for sequences that Equals considers equal
// TODO: implement
}
}
One of the things we have to keep in mind is that your property Elements might have the value null (after all, you didn't specify that this isn't the case):
public bool Equals(IEnumerable<int> x, IEnumerable<int> y)
{
if (x == null) return y == null; // true if both null
if (y == null) return false; // false because x not null
// optimizations: true if x and y are same object; false if different types
if (Object.ReferenceEquals(x, y)) return true;
if (x.GetType() != y.GetType()) return false;
return x.SequenceEqual(y, this.ElementComparer);
}
public int GetHashCode(IEnumerable<int> obj)
{
if (obj == null) return 0;
unchecked
{
int hash = 17;
foreach (int element in obj)
hash = hash * 23 + this.ElementComparer.GetHashCode(element);
return hash;
}
}
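If a custom comparer feels like too much machinery, a lighter sketch is to group on a text key built from all elements. This is a swapped-in technique, not what either answer above does; it assumes Elements holds only ints and that a null list should form its own group. CNodeGrouping, KeyOf and GroupByAllElements are illustrative names.

```csharp
using System.Collections.Generic;
using System.Linq;

public class CNode
{
    public List<int> Elements;
}

public static class CNodeGrouping
{
    // Builds a text key such as "0,1,1,0"; a null list gets its own key,
    // which cannot collide with the key of any real list of ints.
    public static string KeyOf(CNode node) =>
        node.Elements == null ? "<null>" : string.Join(",", node.Elements);

    // Groups nodes whose Elements are equal in length, values and order.
    public static List<List<CNode>> GroupByAllElements(IEnumerable<CNode> nodes) =>
        nodes.GroupBy(n => KeyOf(n)).Select(g => g.ToList()).ToList();
}
```

Building a string per node costs some allocations, so for large inputs the comparer-based approach is likely the better fit.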

c# Linq - Get all elements that contains a list of integers

I have a list of objects, each with a list of types inside, very similar to this:
public class ExampleObject
{
public int Id {get; set;}
public IEnumerable <int> Types {get;set;}
}
For example:
var typesAdmited = new List<int> { 13, 11, 67, 226, 82, 1, 66 };
And inside the list of objects I have an object like this:
Object.Id = 288;
Object.Types = new List<int> { 94, 13, 11, 67, 254, 256, 226, 82, 1, 66, 497, 21};
But when I use LINQ to get all the objects that have all the admitted types, I get no results.
I am trying this:
var objectsAdmited = objects.Where(b => b.Types.All(t => typesAdmited.Contains(t)));
Example:
var typesAdmited = new List<int> { 13, 11, 67, 226, 82, 1, 66 };
var objectNotAdmited = new ExampleObject {Id = 1, Types = new List<int> {13,11}};
var objectAdmited = new ExampleObject {Id = 288, Types = new List<int> { 94, 13, 11, 67, 254, 256, 226, 82, 1, 66, 497, 21}};
var allObjects = new List<ExampleObject> { objectNotAdmited, objectAdmited };
var objectsAdmited = allObjects.Where(b => b.Types.All(t => typesAdmited.Contains(t)));
I get:
objectsAdmited = { }
And it should be:
objectsAdmited = { objectAdmited }
You have to swap the two lists in your LINQ query, so that every admitted type is checked against the object's Types:
var objectsAdmited = allObjects.Where(b => typesAdmited.All(t => b.Types.Contains(t)));
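The swapped query is a subset test: the admitted types must be a subset of each object's Types. A sketch of the same check using HashSet<T>.IsSubsetOf, which reads closer to that intent (TypeFilter and WithAllTypes are illustrative names; the null check on Types is an added assumption, not something the question specifies):

```csharp
using System.Collections.Generic;
using System.Linq;

public class ExampleObject
{
    public int Id { get; set; }
    public IEnumerable<int> Types { get; set; }
}

public static class TypeFilter
{
    // Keeps the objects whose Types contain every required value.
    public static List<ExampleObject> WithAllTypes(
        IEnumerable<ExampleObject> objects, IEnumerable<int> required)
    {
        var requiredSet = new HashSet<int>(required);
        return objects
            .Where(o => o.Types != null && requiredSet.IsSubsetOf(o.Types))
            .ToList();
    }
}
```

With the data from the question, only the object with Id 288 passes, because the object with Types {13, 11} is missing most of the admitted types.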
You can solve this using LINQ. See the small code block in the middle; the rest is boilerplate to make it a minimal, complete, verifiable example:
using System;
using System.Collections.Generic;
using System.Linq;
public class ExampleObject
{
public int Id { get; set; }
public IEnumerable<int> Types { get; set; }
}
class Program
{
static void Main (string [] args)
{
var obs = new List<ExampleObject>
{
new ExampleObject
{
Id=1,
Types=new List<int> { 94, 13, 11, 67, 254, 256, 226, 82, 1, 66, 497, 21 }
},
new ExampleObject
{
Id=288,
Types=new List<int> { 94, 13, 11, 67, 256, 226, 82, 1, 66, 497, 21 }
},
};
var must_support = new List<int>{11, 67, 254, 256, 226, 82, }; // only Id 1 fits
var must_support2 = new List<int>{11, 67, 256, 226, 82, }; // both fit
// this is the actual check: see for all objects in obs
// if all values of must_support are in the Types - Listing
var supports = obs.Where(o => must_support.All(i => o.Types.Contains(i)));
var supports2 = obs.Where(o => must_support2.All(i => o.Types.Contains(i)));
Console.WriteLine ("new List<int>{11, 67, 254, 256, 226, 82, };");
foreach (var o in supports)
Console.WriteLine (o.Id);
Console.WriteLine ("new List<int>{11, 67, 256, 226, 82, };");
foreach (var o in supports2)
Console.WriteLine (o.Id);
Console.ReadLine ();
}
}
Output:
new List<int>{11, 67, 254, 256, 226, 82, };
1
new List<int>{11, 67, 256, 226, 82, };
1
288

Delete duplicates in a List of int arrays

Having a List of int arrays like:
List<int[]> intArrList = new List<int[]>();
intArrList.Add(new int[3] { 0, 0, 0 });
intArrList.Add(new int[5] { 20, 30, 10, 4, 6 }); //this
intArrList.Add(new int[3] { 1, 2, 5 });
intArrList.Add(new int[5] { 20, 30, 10, 4, 6 }); //this
intArrList.Add(new int[3] { 12, 22, 54 });
intArrList.Add(new int[5] { 1, 2, 6, 7, 8 });
intArrList.Add(new int[4] { 0, 0, 0, 0 });
How would you remove duplicates? (By duplicate I mean an element of the list that has the same length and the same numbers.)
In the example I would remove one element { 20, 30, 10, 4, 6 } because it is found twice.
I was thinking of sorting the list by element size and then comparing each element against the rest, but I am not sure how to do that.
Another question would be whether another structure, like a hash, would be better. If so, how would I use it?
Use GroupBy:
var result = intArrList.GroupBy(c => String.Join(",", c))
.Select(c => c.First().ToList()).ToList();
The result:
{0, 0, 0}
{20, 30, 10, 4, 6}
{1, 2, 5}
{12, 22, 54}
{1, 2, 6, 7, 8}
{0, 0, 0, 0}
EDIT: If you want to consider {1,2,3,4} to be equal to {2,3,4,1}, you need to use OrderBy like this:
var result = intArrList.GroupBy(p => string.Join(", ", p.OrderBy(c => c)))
.Select(c => c.First().ToList()).ToList();
EDIT2: To help understand how the LINQ GroupBy solution works, consider the following method:
public List<int[]> FindDistinctWithoutLinq(List<int[]> lst)
{
var dic = new Dictionary<string, int[]>();
foreach (var item in lst)
{
string key = string.Join(",", item.OrderBy(c=>c));
if (!dic.ContainsKey(key))
{
dic.Add(key, item);
}
}
return dic.Values.ToList();
}
You can define your own implementation of IEqualityComparer and use it together with Enumerable.Distinct:
class MyComparer : IEqualityComparer<int[]>
{
public int GetHashCode(int[] instance) { return 0; } // TODO: better HashCode for arrays
public bool Equals(int[] instance, int[] other)
{
if (other == null || instance == null || instance.Length != other.Length) return false;
return instance.SequenceEqual(other);
}
}
Now write this to get only distinct values for your list:
var result = intArrList.Distinct(new MyComparer());
However, if you also want different permutations to count as equal, you should implement Equals this way:
public bool Equals(int[] instance, int[] other)
{
if (ReferenceEquals(instance, other)) return true; // this will return true when both arrays are NULL
if (other == null || instance == null) return false;
return instance.All(x => other.Contains(x)) && other.All(x => instance.Contains(x));
}
EDIT: For a better GetHashCode implementation you may have a look at this post, as also suggested in @Mick's answer.
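One caveat with the All/Contains version of Equals: it ignores how often a value occurs, so { 1, 1, 2 } and { 1, 2, 2 } compare as equal. A sketch of a permutation-insensitive comparer that respects multiplicities and pairs it with an order-independent hash code (UnorderedIntArrayComparer is an illustrative name, and the hash formula is just one simple choice):

```csharp
using System.Collections.Generic;
using System.Linq;

// Treats two arrays as equal when they hold the same values with the same
// multiplicities, in any order.
public class UnorderedIntArrayComparer : IEqualityComparer<int[]>
{
    public bool Equals(int[] x, int[] y)
    {
        if (ReferenceEquals(x, y)) return true; // also covers both null
        if (x == null || y == null || x.Length != y.Length) return false;
        // Compare sorted copies so the input arrays are left untouched.
        return x.OrderBy(v => v).SequenceEqual(y.OrderBy(v => v));
    }

    public int GetHashCode(int[] obj)
    {
        if (obj == null) return 0;
        // Addition is commutative, so every permutation hashes the same.
        int hash = obj.Length;
        unchecked
        {
            foreach (int v in obj) hash += v * 31;
        }
        return hash;
    }
}
```

It plugs into Distinct the same way: intArrList.Distinct(new UnorderedIntArrayComparer()). Unlike a constant hash code, this lets Distinct keep arrays with different contents in different buckets most of the time.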
Lifting code from here and here. A more generic implementation of GetHashCode would make this more generic; however, I believe the implementation below is the most robust.
class Program
{
static void Main(string[] args)
{
List<int[]> intArrList = new List<int[]>();
intArrList.Add(new int[3] { 0, 0, 0 });
intArrList.Add(new int[5] { 20, 30, 10, 4, 6 }); //this
intArrList.Add(new int[3] { 1, 2, 5 });
intArrList.Add(new int[5] { 20, 30, 10, 4, 6 }); //this
intArrList.Add(new int[3] { 12, 22, 54 });
intArrList.Add(new int[5] { 1, 2, 6, 7, 8 });
intArrList.Add(new int[4] { 0, 0, 0, 0 });
var test = intArrList.Distinct(new IntArrayEqualityComparer());
Console.WriteLine(test.Count());
Console.WriteLine(intArrList.Count());
}
public class IntArrayEqualityComparer : IEqualityComparer<int[]>
{
public bool Equals(int[] x, int[] y)
{
return ArraysEqual(x, y);
}
public int GetHashCode(int[] obj)
{
int hc = obj.Length;
for (int i = 0; i < obj.Length; ++i)
{
hc = unchecked(hc * 17 + obj[i]);
}
return hc;
}
static bool ArraysEqual<T>(T[] a1, T[] a2)
{
if (ReferenceEquals(a1, a2))
return true;
if (a1 == null || a2 == null)
return false;
if (a1.Length != a2.Length)
return false;
EqualityComparer<T> comparer = EqualityComparer<T>.Default;
for (int i = 0; i < a1.Length; i++)
{
if (!comparer.Equals(a1[i], a2[i])) return false;
}
return true;
}
}
}
Edit: a generic implementation of IEqualityComparer for arrays of any type:
public class ArrayEqualityComparer<T> : IEqualityComparer<T[]>
{
public bool Equals(T[] x, T[] y)
{
if (ReferenceEquals(x, y))
return true;
if (x == null || y == null)
return false;
if (x.Length != y.Length)
return false;
EqualityComparer<T> comparer = EqualityComparer<T>.Default;
for (int i = 0; i < x.Length; i++)
{
if (!comparer.Equals(x[i], y[i])) return false;
}
return true;
}
public int GetHashCode(T[] obj)
{
int hc = obj.Length;
for (int i = 0; i < obj.Length; ++i)
{
hc = unchecked(hc * 17 + obj[i].GetHashCode());
}
return hc;
}
}
Edit2: If the ordering of the integers within the arrays doesn't matter, I would sort each array first:
var test = intArrList.Select(a => a.OrderBy(e => e).ToArray()).Distinct(new IntArrayEqualityComparer()).ToList();
// Plain nested loops: compare every pair and remove the later duplicate.
// Iterating j downwards keeps the remaining indices valid after RemoveAt.
for (int i = 0; i < intArrList.Count; i++)
{
for (int j = intArrList.Count - 1; j > i; j--)
{
if (intArrList[i].Length != intArrList[j].Length)
continue;
var cnt = 0;
for (int k = 0; k < intArrList[i].Length; k++)
{
if (intArrList[i][k] == intArrList[j][k])
cnt++;
else
break;
}
if (cnt == intArrList[i].Length)
intArrList.RemoveAt(j);
}
}
Perf comparison of @S.Akbari's and @Mick's solutions using BenchmarkDotNet
EDIT:
SAkbari_FindDistinctWithoutLinq has a redundant call to ContainsKey, so I added an improved and faster version: SAkbari_FindDistinctWithoutLinq2
Method                           |     Mean |     Error |    StdDev |
-------------------------------- |---------:|----------:|----------:|
SAkbari_FindDistinctWithoutLinq  | 4.021 us | 0.0723 us | 0.0676 us |
SAkbari_FindDistinctWithoutLinq2 | 3.930 us | 0.0529 us | 0.0495 us |
SAkbari_FindDistinctLinq         | 5.597 us | 0.0264 us | 0.0234 us |
Mick_UsingGetHashCode            | 6.339 us | 0.0265 us | 0.0248 us |
BenchmarkDotNet=v0.10.13, OS=Windows 10 Redstone 3 [1709, Fall Creators Update] (10.0.16299.248)
Intel Core i7-7700 CPU 3.60GHz (Kaby Lake), 1 CPU, 8 logical cores and 4 physical cores
Frequency=3515625 Hz, Resolution=284.4444 ns, Timer=TSC
.NET Core SDK=2.1.100
[Host] : .NET Core 2.0.5 (CoreCLR 4.6.26020.03, CoreFX 4.6.26018.01), 64bit RyuJIT
DefaultJob : .NET Core 2.0.5 (CoreCLR 4.6.26020.03, CoreFX 4.6.26018.01), 64bit RyuJIT
Benchmark:
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.Collections.Generic;
using System.Linq;
namespace ConsoleApp1
{
public class Program
{
List<int[]> intArrList = new List<int[]>
{
new int[] { 0, 0, 0 },
new int[] { 20, 30, 10, 4, 6 }, //this
new int[] { 1, 2, 5 },
new int[] { 20, 30, 10, 4, 6 }, //this
new int[] { 12, 22, 54 },
new int[] { 1, 2, 6, 7, 8 },
new int[] { 0, 0, 0, 0 }
};
[Benchmark]
public List<int[]> SAkbari_FindDistinctWithoutLinq() => FindDistinctWithoutLinq(intArrList);
[Benchmark]
public List<int[]> SAkbari_FindDistinctWithoutLinq2() => FindDistinctWithoutLinq2(intArrList);
[Benchmark]
public List<int[]> SAkbari_FindDistinctLinq() => FindDistinctLinq(intArrList);
[Benchmark]
public List<int[]> Mick_UsingGetHashCode() => UsingGetHashCode(intArrList);
static void Main(string[] args)
{
var summary = BenchmarkRunner.Run<Program>();
}
public static List<int[]> FindDistinctWithoutLinq(List<int[]> lst)
{
var dic = new Dictionary<string, int[]>();
foreach (var item in lst)
{
string key = string.Join(",", item.OrderBy(c => c));
if (!dic.ContainsKey(key))
{
dic.Add(key, item);
}
}
return dic.Values.ToList();
}
public static List<int[]> FindDistinctWithoutLinq2(List<int[]> lst)
{
var dic = new Dictionary<string, int[]>();
foreach (var item in lst)
dic.TryAdd(string.Join(",", item.OrderBy(c => c)), item);
return dic.Values.ToList();
}
public static List<int[]> FindDistinctLinq(List<int[]> lst)
{
return lst.GroupBy(p => string.Join(", ", p.OrderBy(c => c)))
.Select(c => c.First().ToArray()).ToList();
}
public static List<int[]> UsingGetHashCode(List<int[]> lst)
{
return lst.Select(a => a.OrderBy(e => e).ToArray()).Distinct(new IntArrayEqualityComparer()).ToList();
}
}
public class IntArrayEqualityComparer : IEqualityComparer<int[]>
{
public bool Equals(int[] x, int[] y)
{
return ArraysEqual(x, y);
}
public int GetHashCode(int[] obj)
{
int hc = obj.Length;
for (int i = 0; i < obj.Length; ++i)
{
hc = unchecked(hc * 17 + obj[i]);
}
return hc;
}
static bool ArraysEqual<T>(T[] a1, T[] a2)
{
if (ReferenceEquals(a1, a2))
return true;
if (a1 == null || a2 == null)
return false;
if (a1.Length != a2.Length)
return false;
EqualityComparer<T> comparer = EqualityComparer<T>.Default;
for (int i = 0; i < a1.Length; i++)
{
if (!comparer.Equals(a1[i], a2[i])) return false;
}
return true;
}
}
}
Input list:
List<List<int>> initList = new List<List<int>>();
initList.Add(new List<int>{ 0, 0, 0 });
initList.Add(new List<int>{ 20, 30, 10, 4, 6 }); //this
initList.Add(new List<int> { 1, 2, 5 });
initList.Add(new List<int> { 20, 30, 10, 4, 6 }); //this
initList.Add(new List<int> { 12, 22, 54 });
initList.Add(new List<int> { 1, 2, 6, 7, 8 });
initList.Add(new List<int> { 0, 0, 0, 0 });
You can create a result list and check, before adding each element, whether it has already been added. I compared the list counts and used p.Except(item).Any() in both directions to check whether two lists contain the same elements.
List<List<int>> returnList = new List<List<int>>();
foreach (var item in initList)
{
if (returnList.Where(p => !p.Except(item).Any() && !item.Except(p).Any()
&& p.Count() == item.Count() ).Count() == 0)
returnList.Add(item);
}
You can use a HashSet.
HashSet is a collection that guarantees uniqueness, and it supports set operations against other collections such as IntersectWith and UnionWith.
Pros: no duplicates, easy to manipulate groups of data, efficient lookups.
Cons: you can't access a specific item by index; for example, list[0] doesn't work for a HashSet. You can only enumerate the items, e.g. with foreach.
Here is an example:
using System;
using System.Collections.Generic;
namespace ConsoleApp2
{
class Program
{
static void Main(string[] args)
{
HashSet<HashSet<int>> intArrList = new HashSet<HashSet<int>>(new HashSetIntComparer());
intArrList.Add(new HashSet<int>(3) { 0, 0, 0 });
intArrList.Add(new HashSet<int>(5) { 20, 30, 10, 4, 6 }); //this
intArrList.Add(new HashSet<int>(3) { 1, 2, 5 });
intArrList.Add(new HashSet<int>(5) { 20, 30, 10, 4, 6 }); //this
intArrList.Add(new HashSet<int>(3) { 12, 22, 54 });
intArrList.Add(new HashSet<int>(5) { 1, 2, 6, 7, 8 });
intArrList.Add(new HashSet<int>(4) { 0, 0, 0, 0 });
// Checking the output
foreach (var item in intArrList)
{
foreach (var subHasSet in item)
{
Console.Write("{0} ", subHasSet);
}
Console.WriteLine();
}
Console.Read();
}
private class HashSetIntComparer : IEqualityComparer<HashSet<int>>
{
public bool Equals(HashSet<int> x, HashSet<int> y)
{
// SetEquals doesn't set anything. It compares the contents of two sets.
// Such a poor name from .NET
return x.SetEquals(y);
}
public int GetHashCode(HashSet<int> obj)
{
// XOR is order-independent, so sets that are SetEquals get the same hash code.
int hash = 0;
foreach (var i in obj) hash ^= i;
return hash;
}
}
}
}
Output:
0
20 30 10 4 6
1 2 5
12 22 54
1 2 6 7 8
Note: Since 0 is repeated several times, the HashSet stores the 0 only
once. If you need to differentiate between 0 0 0 0 and 0 0 0, you can
replace HashSet<HashSet<int>> with HashSet<List<int>> and implement
a comparer for the List instead.
You can use this link to learn how to compare a list:
https://social.msdn.microsoft.com/Forums/en-US/2ff3016c-bd61-4fec-8f8c-7b6c070123fa/c-compare-two-lists-of-objects?forum=csharplanguage
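As a rough sketch of what such a list comparer could look like (ListIntComparer is an illustrative name, and the hash formula is only one simple option), so that HashSet<List<int>> keeps { 0, 0, 0 } and { 0, 0, 0, 0 } apart:

```csharp
using System.Collections.Generic;
using System.Linq;

// Compares lists element by element, so { 0, 0, 0 } and { 0, 0, 0, 0 } differ.
public class ListIntComparer : IEqualityComparer<List<int>>
{
    public bool Equals(List<int> x, List<int> y)
    {
        if (ReferenceEquals(x, y)) return true; // also covers both null
        if (x == null || y == null) return false;
        return x.SequenceEqual(y);              // same length, same values, same order
    }

    public int GetHashCode(List<int> obj)
    {
        if (obj == null) return 0;
        int hash = 17;
        unchecked
        {
            // Order-sensitive combination, matching the order-sensitive Equals.
            foreach (int v in obj) hash = hash * 23 + v;
        }
        return hash;
    }
}
```

Usage would be new HashSet<List<int>>(new ListIntComparer()); adding the seven lists from the question then leaves six entries, because only the repeated { 20, 30, 10, 4, 6 } is rejected.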
If you want to learn more about Collections and DataTypes this course is a perfect place to learn it:
https://app.pluralsight.com/player?course=csharp-collections&author=simon-robinson&name=csharp-collections-fundamentals-m9-sets&clip=1&mode=live
Using MoreLINQ this can be very simple with DistinctBy.
var result = intArrList.DistinctBy(x => string.Join(",", x));
Similar to the GroupBy answer, if you want the distinction to be irrespective of order, just order the elements in the join.
var result = intArrList.DistinctBy(x => string.Join(",", x.OrderBy(y => y)));
EDIT: This is how it's implemented
public static IEnumerable<TSource> DistinctBy<TSource, TKey>(this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer)
{
if (source == null) throw new ArgumentNullException(nameof(source));
if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));
return _(); IEnumerable<TSource> _()
{
var knownKeys = new HashSet<TKey>(comparer);
foreach (var element in source)
{
if (knownKeys.Add(keySelector(element)))
yield return element;
}
}
}
So if you don't need MoreLINQ for anything else, you can just use a method like this:
private static IEnumerable<int[]> GetUniqueArrays(IEnumerable<int[]> source)
{
var knownKeys = new HashSet<string>();
foreach (var element in source)
{
if (knownKeys.Add(string.Join(",", element)))
yield return element;
}
}
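On newer frameworks there is a third option: assuming the project targets .NET 6 or later, Enumerable.DistinctBy is part of System.Linq itself, so neither MoreLINQ nor a hand-written helper is needed. A minimal sketch (DistinctArrays and Unique are illustrative names):

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DistinctArrays
{
    // Requires .NET 6 or later, where Enumerable.DistinctBy is built in.
    // Keeps the first array seen for each distinct "a,b,c" key.
    public static List<int[]> Unique(IEnumerable<int[]> source) =>
        source.DistinctBy(a => string.Join(",", a)).ToList();
}
```

If MoreLINQ is also imported in the same file, the two DistinctBy extension methods can clash, so it is probably simplest to use one or the other.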

What's the best way to apply a "Join" method generically similar to how String.Join(...) works?

If I have a string array, for example: var array = new[] { "the", "cat", "in", "the", "hat" }, and I want to join them with a space between each word I can simply call String.Join(" ", array).
But, say I had an array of integer arrays (just like I can have an array of character arrays). I want to combine them into one large array (flatten them), but at the same time insert a value between each array.
var arrays = new[] { new[] { 1, 2, 3 }, new[] { 4, 5, 6 }, new[] { 7, 8, 9 }};
var result = SomeJoin(0, arrays); // result = { 1, 2, 3, 0, 4, 5, 6, 0, 7, 8, 9 }
I wrote something up, but it is very ugly, and I'm sure that there is a better, cleaner way. Maybe more efficient?
var result = new int[arrays.Sum(a => a.Length) + arrays.Length - 1];
int offset = 0;
foreach (var array in arrays)
{
Array.Copy(array, 0, result, offset, array.Length); // element-wise copy; offset and length are in elements
offset += array.Length;
if (offset < result.Length)
{
result[offset++] = 0;
}
}
Perhaps this is the most efficient? I don't know... just seeing if there is a better way. I thought maybe LINQ would solve this, but sadly I don't see anything that is what I need.
You can generically "join" sequences via:
public static IEnumerable<T> Join<T>(T separator, IEnumerable<IEnumerable<T>> items)
{
var sep = new[] {separator};
var first = items.FirstOrDefault();
if (first == null)
return Enumerable.Empty<T>();
else
return first.Concat(items.Skip(1).SelectMany(i => sep.Concat(i)));
}
This works with your code:
var arrays = new[] { new[] { 1, 2, 3 }, new[] { 4, 5, 6 }, new[] { 7, 8, 9 }};
var result = Join(0, arrays); // result = { 1, 2, 3, 0, 4, 5, 6, 0, 7, 8, 9 }
The advantage here is that this will work with any IEnumerable<IEnumerable<T>>, and isn't restricted to lists or arrays. Note that this will insert a separator in between two empty sequences, but that behavior could be modified if desired.
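One way that modification might look, as a sketch: skip empty inner sequences entirely, so a separator is only emitted between two non-empty ones. SequenceJoin and JoinNonEmpty are illustrative names, and whether empty sequences should be skipped at all is a design choice rather than something the question settles.

```csharp
using System.Collections.Generic;

public static class SequenceJoin
{
    // Flattens items, placing separator only between non-empty inner sequences.
    public static IEnumerable<T> JoinNonEmpty<T>(T separator, IEnumerable<IEnumerable<T>> items)
    {
        bool emittedAny = false; // true once some earlier inner sequence produced output
        foreach (var inner in items)
        {
            bool innerStarted = false;
            foreach (var value in inner)
            {
                if (!innerStarted)
                {
                    innerStarted = true;
                    // Emit the separator lazily, only when this sequence has a first value.
                    if (emittedAny) yield return separator;
                }
                yield return value;
            }
            if (innerStarted) emittedAny = true;
        }
    }
}
```

Each inner sequence is enumerated exactly once, so this also works for sequences that cannot be replayed.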
public T[] SomeJoin<T>(T a, T[][] arrays){
return arrays.SelectMany((x,i)=> i == arrays.Length-1 ? x : x.Concat(new[]{a}))
.ToArray();
}
NOTE: This works efficiently because the input is an array, so arrays.Length is available directly; for a general collection there would be an extra cost to get the count.
This may not be the most efficient, but it is quite extensible:
public static IEnumerable<T> Join<T>(this IEnumerable<IEnumerable<T>> source, T separator)
{
bool firstTime = true;
foreach (var collection in source)
{
if (!firstTime)
yield return separator;
foreach (var value in collection)
yield return value;
firstTime = false;
}
}
...
var arrays = new[] { new[] { 1, 2, 3 }, new[] { 4, 5, 6 }, new[] { 7, 8, 9 }};
var result = arrays.Join(0).ToArray();
// result = { 1, 2, 3, 0, 4, 5, 6, 0, 7, 8, 9 }
