Remove duplicate rows from two dimensional list - c#

I have a two-dimensional list of strings (List<List<string>>).
Is there an easy way to remove the duplicate rows, that is, the List<string> rows that are equal?

Build a custom IEqualityComparer based on SequenceEqual:
class ListComparer : IEqualityComparer<List<string>>
{
    public bool Equals(List<string> x, List<string> y)
    {
        if (x == y)
            return true;
        if (x == null || y == null)
            return false;
        // Sort both lists first if the order should not matter
        return x.SequenceEqual(y);
    }

    public int GetHashCode(List<string> obj)
    {
        if (obj == null)
            return 0;
        return obj.Select(e => e.GetHashCode()).Aggregate(17, (a, b) => 23 * a + b);
    }
}
Apply Distinct() with the comparer:
List<List<string>> original = ...
var distinctRows = original.Distinct(new ListComparer()).ToList();
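If the order of the strings within a row should not matter, one option is to compare sorted copies of the rows. The following is a minimal sketch under that assumption, not part of the original answer; UnorderedListComparer and the sample data are made up for illustration, and ordinal ordering is assumed to be acceptable:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant of ListComparer that treats rows as equal
// when they hold the same strings in any order.
class UnorderedListComparer : IEqualityComparer<List<string>>
{
    public bool Equals(List<string> x, List<string> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        // Compare sorted copies so the original rows are left untouched.
        return x.OrderBy(s => s, StringComparer.Ordinal)
                .SequenceEqual(y.OrderBy(s => s, StringComparer.Ordinal));
    }

    public int GetHashCode(List<string> obj)
    {
        if (obj == null) return 0;
        unchecked
        {
            // Addition is order-independent, so equal rows get equal hashes.
            return obj.Aggregate(17, (hash, s) => hash + (s == null ? 0 : s.GetHashCode()));
        }
    }
}

static class UnorderedDistinctDemo
{
    static void Main()
    {
        var original = new List<List<string>>
        {
            new List<string> { "a", "b" },
            new List<string> { "b", "a" },
            new List<string> { "a", "c" },
        };

        var distinctRows = original.Distinct(new UnorderedListComparer()).ToList();
        Console.WriteLine(distinctRows.Count); // 2
    }
}
```

Sorting on every comparison is simple but not cheap; for large inputs it may be worth sorting each row once up front instead.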

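You did not specify whether the lists should be compared with or without ordering.
Note that Distinct() without a comparer uses the default equality of List<string>, which is reference equality, so the following only removes rows that are the same list instance:
List<List<string>> source = *yourLists*;
var distinctList = source.Distinct();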


IComparer c# List

I have a list of image names like this: {"1.jpg", "10.jpg", "2.jpg"}.
I would like to sort it like this: {"1.jpg", "2.jpg", "10.jpg"}.
I created this comparer. It returns 0 when x or y is not purely numeric (for example "DSC_10.jpg"), so if the list is {"DSC_1.jpg", "DSC_10.jpg", "DSC_2.jpg", ...} it should not be sorted and should keep its order.
var comparer = new CompareImageName();
imageUrls.Sort(comparer);
return imageUrls;
public class CompareImageName : IComparer<string>
{
    public int Compare(string x, string y)
    {
        if (x == null || y == null) return 0;
        var l = x.Split('/');
        var l1 = y.Split('/');
        int a, b;
        var rs = int.TryParse(l[l.Length - 1].Split('.')[0], out a);
        var rs2 = int.TryParse(l1[l1.Length - 1].Split('.')[0], out b);
        if (!rs || !rs2) return 0;
        if (a == b || a == 0 && b == 0) return 0;
        return a > b ? 1 : -1;
    }
}
This sorts correctly for the names {"1.jpg", "10.jpg", "2.jpg"}, but incorrectly if the list is {"DSC_1.jpg", "DSC_10.jpg", "DSC_2.jpg", ...}.
I read in MSDN:
What is wrong with my code?
I think you're better off using a bit of Regex for this. Try this solution:
public class CompareImageName : IComparer<string>
{
    public int Compare(string x, string y)
    {
        if (x == null || y == null) return 0;
        var regex = new Regex(@"/(((?<prefix>\w*)_)|)((?<number>\d+))\.jpg$");
        var mx = regex.Match(x);
        var my = regex.Match(y);
        var r = mx.Groups["prefix"].Value.CompareTo(my.Groups["prefix"].Value);
        if (r == 0)
            r = int.Parse(mx.Groups["number"].Value).CompareTo(int.Parse(my.Groups["number"].Value));
        return r;
    }
}
Apart from the regex string itself, the logic is easier to follow.
Here is a solution. The following class does the comparison:
public class NumericCompare : IComparer<string>
{
    public int Compare(string x, string y)
    {
        int input1, input2;
        input1 = int.Parse(x.Substring(x.IndexOf('_') + 1).Split('.')[0]);
        input2 = int.Parse(y.Substring(y.IndexOf('_') + 1).Split('.')[0]);
        return Comparer<int>.Default.Compare(input1, input2);
    }
}
You can make use of this class like the following:
var imageUrls = new List<string>() { "DSC_1.jpg", "DSC_10.jpg", "DSC_2.jpg" };
var comparer = new NumericCompare();
Try this with simple OrderBy
var SortedList = imageUrls.OrderBy(
Basically, what you want to do is sort by the numeric part within the string. You are almost there. You just have to handle the case where, after splitting a name like DSC_2.jpg on '.', the first part is not all digits. So you need to extract the digits and then compare those. Here is the code. Please note that I take everything after the last forward slash as the file name:
public int Compare(string x, string y)
{
    if (x == null || y == null) return 0;
    // + 1 skips the slash and also works when there is none (LastIndexOf returns -1)
    var nameX = x.Substring(x.LastIndexOf('/') + 1);
    var nameY = y.Substring(y.LastIndexOf('/') + 1);
    var nameXParts = nameX.Split('.');
    var nameYParts = nameY.Split('.');
    int a, b;
    var rs = int.TryParse(nameXParts[0], out a);
    var rs2 = int.TryParse(nameYParts[0], out b);
    var nameXDigits = string.Empty;
    if (!rs)
    {
        for (int i = 0; i < nameXParts[0].Length; i++)
            if (Char.IsDigit(nameXParts[0][i]))
                nameXDigits += nameXParts[0][i];
        // Only re-parse when the name was not purely numeric, so a parsed value is not overwritten
        int.TryParse(nameXDigits, out a);
    }
    var nameYDigits = string.Empty;
    if (!rs2)
    {
        for (int i = 0; i < nameYParts[0].Length; i++)
            if (Char.IsDigit(nameYParts[0][i]))
                nameYDigits += nameYParts[0][i];
        int.TryParse(nameYDigits, out b);
    }
    if (a == b) return 0;
    return a > b ? 1 : -1;
}
Don't use imageUrls.Sort(comparer) on a List: when the comparer returns 0, Sort does not guarantee that the elements keep their original order.
The Sort performs an unstable sort; that is, if two elements are equal, their order might not be preserved. In contrast, a stable sort preserves the order of elements that are equal.
Solution: use OrderBy, which performs a stable sort, with your comparer:
var imageUrls1 = new List<string>() { "1.jpg", "10.jpg", "2.jpg" };
var imageUrls2 = new List<string>() { "DSC_1.jpg", "DSC_10.jpg", "DSC_2.jpg" };
var comparer = new CompareImageName();
//Sort normally
imageUrls1 = imageUrls1.OrderBy(p=>p, comparer).ToList();
//Keep the order as your expectation
imageUrls2 = imageUrls2.OrderBy(p=>p, comparer).ToList();
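The difference between Sort and OrderBy only shows up when the comparer reports elements as equal. The sketch below is an illustration rather than part of the original answer; AlwaysEqualComparer is a hypothetical comparer that returns 0 for every pair, which is what CompareImageName does for non-numeric names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer that treats every pair of strings as equal.
class AlwaysEqualComparer : IComparer<string>
{
    public int Compare(string x, string y) => 0;
}

static class StableSortDemo
{
    // Enumerable.OrderBy is documented as a stable sort,
    // so elements that compare as equal keep their input order.
    public static List<string> StableSort(IEnumerable<string> source, IComparer<string> comparer)
    {
        return source.OrderBy(s => s, comparer).ToList();
    }

    static void Main()
    {
        var names = new List<string> { "DSC_1.jpg", "DSC_10.jpg", "DSC_2.jpg" };
        var sorted = StableSort(names, new AlwaysEqualComparer());
        Console.WriteLine(string.Join(", ", sorted)); // DSC_1.jpg, DSC_10.jpg, DSC_2.jpg
    }
}
```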
Maybe you can try doing this in a function instead of writing a comparator. I can't think of a good way to implement this logic as a comparator since there are different rules based on the contents (don't sort if the file name is not numeric).
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
namespace sortinglists
{
    public class MainProgram
    {
        public static void Main()
        {
            var imageUrlsNumbers = new List<string>();
            CustomSort(ref imageUrlsNumbers);
            foreach (var imageUrl in imageUrlsNumbers)
                Console.WriteLine(imageUrl);
            var imageUrlsText = new List<string>();
            CustomSort(ref imageUrlsText);
            foreach (var imageUrl in imageUrlsText)
                Console.WriteLine(imageUrl);
        }
        public static void CustomSort(ref List<string> imageUrls)
        {
            // If any file name contains letters or underscores, keep the original order
            if (imageUrls
                .Select(s => s.Substring(s.LastIndexOf("/", StringComparison.OrdinalIgnoreCase) + 1))
                .Select(t => t.Substring(0, t.IndexOf(".", StringComparison.OrdinalIgnoreCase)))
                .Where(u => new Regex("[A-Za-z_]").Match(u).Success)
                .Any())
            {
                imageUrls = imageUrls
                    .Select(x => x.Substring(x.LastIndexOf("/", StringComparison.OrdinalIgnoreCase) + 1))
                    .ToList();
            }
            else
            {
                imageUrls = imageUrls
                    .Select(v => v.Substring(v.LastIndexOf("/", StringComparison.OrdinalIgnoreCase) + 1))
                    .OrderBy(w => Convert.ToInt32(w.Substring(0, w.LastIndexOf(".", StringComparison.OrdinalIgnoreCase))))
                    .ToList();
            }
        }
    }
}
The output for imageUrlsNumbers after sorting is:
And the output for imageUrlsText after sorting is:

How to get distinct values with LINQ or a lambda?

I have a list of items, and I am trying to get the unique items by distinct keys.
The class:
class TempClass
{
    public string One { get; set; }
    public string Two { get; set; }
    public string Key
    {
        get { return "Key_" + One + "_" + Two; }
    }
}
I build the dummy list as follows:
List<TempClass> l = new List<TempClass>()
{
    new TempClass(){ One="Da" , Two = "Mi"},
    new TempClass(){ One="Da" , Two = "Mi"},
    new TempClass(){ One="Da" , Two = "Mi"},
    new TempClass(){ One="Mi" , Two = "Da"},
    new TempClass(){ One="Mi" , Two = "Da"},
};
My question is: how do I get only one item per unique key? Here, items whose key is "Key_Da_Mi" or "Key_Mi_Da" should count as the same item, so only one of them should remain.
How can I achieve that?
Group the items on a HashSet of strings containing both values, use HashSet's set comparer to compare the keys as sets (sets are unordered), and then pull out the first (or whichever) item from each group:
var distinct = l.GroupBy(item => new HashSet<string>() { item.One, item.Two },
        HashSet<string>.CreateSetComparer())
    .Select(group => group.First());
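As a self-contained illustration of grouping on an unordered pair, here is a small sketch. DistinctByPair and the trimmed-down TempClass are names chosen for this example only; HashSet<string>.CreateSetComparer() is the framework method that supplies set equality for the keys:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class TempClass
{
    public string One { get; set; }
    public string Two { get; set; }
}

static class SetGroupingDemo
{
    // Groups items whose One/Two values form the same unordered set
    // and keeps the first item of each group.
    public static List<TempClass> DistinctByPair(IEnumerable<TempClass> items)
    {
        return items
            .GroupBy(item => new HashSet<string> { item.One, item.Two },
                     HashSet<string>.CreateSetComparer())
            .Select(group => group.First())
            .ToList();
    }

    static void Main()
    {
        var l = new List<TempClass>
        {
            new TempClass { One = "Da", Two = "Mi" },
            new TempClass { One = "Da", Two = "Mi" },
            new TempClass { One = "Mi", Two = "Da" },
        };
        Console.WriteLine(DistinctByPair(l).Count); // 1
    }
}
```

One thing to keep in mind: because the key is a set, an item with One == Two collapses to a single-element set.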
You should either implement equality comparison, or implement IEqualityComparer<T> with your specific logic:
class TempClassEqualityComparer : IEqualityComparer<TempClass>
{
    public bool Equals(TempClass x, TempClass y)
    {
        if (Object.ReferenceEquals(x, y)) return true;
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;
        // For comparison check both combinations
        return (x.One == y.One && x.Two == y.Two) || (x.One == y.Two && x.Two == y.One);
    }

    public int GetHashCode(TempClass x)
    {
        if (Object.ReferenceEquals(x, null)) return 0;
        return x.One.GetHashCode() ^ x.Two.GetHashCode();
    }
}
Then you can use this comparer in Distinct method:
var result = l.Distinct(new TempClassEqualityComparer());
Just order them before you create the key.
public string Key
{
    get
    {
        List<string> l = new List<string>{One, Two};
        l = l.OrderBy(x => x).ToList();
        return "Key_" + string.Join("_", l);
    }
}

Get Max() of alphanumeric value

I have a dictionary containing IDs which are alphanumeric (e.g. a10a10 & d10a9), from which I want the biggest ID, where 9 < 10 < a ...
When I use the following code, d10a9 is MAX since 9 is sorted before 10
var lsd = new Dictionary<string, string>();
lsd.Add("a", "d10a10");
lsd.Add("b", "d10a9");
string max = lsd.Max(kvp => kvp.Value);
How can I get the Max value of the IDs with the Longest string combined?
I think you may try to roll your own IComparer<string>
class HumanSortComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        // your human sorting logic here
    }
}
var last = collection.OrderBy(x => x.Value, new HumanSortComparer()).LastOrDefault();
if (last != null)
{
    string max = last.Value;
}
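One possible way to fill in the sorting logic is sketched below. It is an assumption about what "human sorting" should mean here, based on the rule 9 < 10 < a from the question: runs of ASCII digits compare as numbers, other runs compare ordinally, and a digit run sorts before a non-digit run. NaturalComparer is a hypothetical name, and digit runs are assumed to fit in a long:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical comparer implementing the rule 9 < 10 < a.
class NaturalComparer : IComparer<string>
{
    // Splits a string into alternating runs of digits and non-digits.
    private static readonly Regex Runs = new Regex("[0-9]+|[^0-9]+");

    private static bool IsNumber(string run)
    {
        return run[0] >= '0' && run[0] <= '9';
    }

    public int Compare(string x, string y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x == null) return -1;
        if (y == null) return 1;

        var xs = Runs.Matches(x).Cast<Match>().Select(m => m.Value).ToList();
        var ys = Runs.Matches(y).Cast<Match>().Select(m => m.Value).ToList();

        for (int i = 0; i < Math.Min(xs.Count, ys.Count); i++)
        {
            bool xIsNumber = IsNumber(xs[i]);
            bool yIsNumber = IsNumber(ys[i]);

            int result;
            if (xIsNumber && yIsNumber)
                result = long.Parse(xs[i]).CompareTo(long.Parse(ys[i])); // assumes the run fits in a long
            else if (!xIsNumber && !yIsNumber)
                result = string.CompareOrdinal(xs[i], ys[i]);
            else
                result = xIsNumber ? -1 : 1; // numbers sort before letters

            if (result != 0) return result;
        }
        // All shared runs are equal: the value with more runs is greater.
        return xs.Count.CompareTo(ys.Count);
    }
}

static class NaturalMaxDemo
{
    static void Main()
    {
        var lsd = new Dictionary<string, string> { { "a", "d10a10" }, { "b", "d10a9" } };
        string max = lsd.Values.OrderBy(v => v, new NaturalComparer()).Last();
        Console.WriteLine(max); // d10a10
    }
}
```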
This works like a charm, assuming the IDs always start with "d10a":
int max = lsd.Max(kvp => Convert.ToInt32(kvp.Value.Substring(4)));
Console.Write(string.Format("d10a{0}", max));
One way would be to do this
string max =lsd.Where(kvp=>kvp.Value.Length==lsd.Max(k=>k.Value.Length)).Max(kvp => kvp.Value);
However, I think this method would evaluate the max length for each item, so you may be better off extracting it to a variable first:
int maxLength=lsd.Max(kvp=>kvp.Value.Length);
string max = lsd.Where(kvp=>kvp.Value.Length == maxLength).Max(kvp => kvp.Value);
If you are going to have null strings in there you may need to perform null checks too
int maxLength=lsd.Max(kvp=>(kvp.Value??String.Empty).Length);
string max = lsd.Where(kvp=>(kvp.Value??String.Empty).Length == maxLength).Max(kvp => kvp.Value);
Alternatively, treat your string as a base-36 number, convert it to a long for the Max function, and then convert it back again to get the max string.
string max = lsd.Max(tvp => tvp.Value.FromBase36()).ToBase36();
public static class Base36 {
    public static long FromBase36(this string src) {
        return src.ToLower().Select(x => (int)x < 58 ? x - 48 : x - 87).Aggregate(0L, (s, x) => s * 36 + x);
    }
    public static string ToBase36(this long src) {
        StringBuilder result = new StringBuilder();
        while (src > 0) {
            var digit = (int)(src % 36);
            digit = (digit < 10) ? digit + 48 : digit + 87;
            result.Insert(0, (char)digit); // prepend: the least significant digit is produced first
            src = src / 36;
        }
        return result.ToString();
    }
}
Finally, just use the Aggregate extension method instead of Max, as this lets you do all the comparison logic yourself:
string max = lsd.Values.Aggregate(string.Empty, (a, b) => a.Length == b.Length ? (string.CompareOrdinal(a, b) > 0 ? a : b) : (a.Length > b.Length ? a : b));
This code doesn't have null checks, but you can easily add them.
I think if you did this:
var max = lsd.OrderByDescending(x => x.Value)
    .GroupBy(x => x.Value.Length)
    .OrderByDescending(x => x.Key)
    .SelectMany(x => x)
    .FirstOrDefault();
it may give you what you want.
You need StringComparer.OrdinalIgnoreCase.
Without the need to use LINQ, a function that does this is quite simple.
Complexity is, of course, O(n).
public static KeyValuePair<string, string> FindMax(IEnumerable<KeyValuePair<string, string>> lsd)
{
    var comparer = StringComparer.OrdinalIgnoreCase;
    var best = default(KeyValuePair<string, string>);
    bool isFirst = true;
    foreach (KeyValuePair<string, string> kvp in lsd)
    {
        if (isFirst || comparer.Compare(kvp.Value, best.Value) > 0)
        {
            isFirst = false;
            best = kvp;
        }
    }
    return best;
}
Okay - I think you need to first turn each key into a series of strings and numbers - since you need the whole number to be able to determine the comparison. Then you implement an IComparer - I've tested this with your two input strings as well as with a few others and it appears to do what you want. The performance could possibly be improved - but I was brainstorming it!
Create this class:
public class ValueChain
{
    public readonly IEnumerable<object> Values;
    public int ValueCount = 0;
    private static readonly Regex _rx =
        new Regex("((?<alpha>[a-z]+)|(?<numeric>([0-9]+)))",
            RegexOptions.Compiled | RegexOptions.IgnoreCase);

    public ValueChain(string valueString)
    {
        Values = Parse(valueString);
    }

    private IEnumerable<object> Parse(string valueString)
    {
        var matches = _rx.Matches(valueString);
        ValueCount = matches.Count;
        foreach (var match in matches.Cast<Match>())
        {
            if (match.Groups["alpha"].Success)
                yield return match.Groups["alpha"].Value;
            else if (match.Groups["numeric"].Success)
                yield return int.Parse(match.Groups["numeric"].Value);
        }
    }
}
Now this comparer:
public class ValueChainComparer : IComparer<ValueChain>
{
    private IComparer<string> StringComparer;

    public ValueChainComparer()
        : this(global::System.StringComparer.OrdinalIgnoreCase)
    {
    }

    public ValueChainComparer(IComparer<string> stringComparer)
    {
        StringComparer = stringComparer;
    }

    #region IComparer<ValueChain> Members
    public int Compare(ValueChain x, ValueChain y)
    {
        //todo: null checks
        int comparison = 0;
        foreach (var pair in x.Values.Zip
            (y.Values, (xVal, yVal) => new { XVal = xVal, YVal = yVal }))
        {
            //types match?
            if (pair.XVal.GetType().Equals(pair.YVal.GetType()))
            {
                if (pair.XVal is string)
                    comparison = StringComparer.Compare(
                        (string)pair.XVal, (string)pair.YVal);
                else if (pair.XVal is int) //unboxing here - could be changed
                    comparison = Comparer<int>.Default.Compare(
                        (int)pair.XVal, (int)pair.YVal);
                if (comparison != 0)
                    return comparison;
            }
            else //according to your rules strings are always greater than numbers.
            {
                if (pair.XVal is string)
                    return 1;
                return -1;
            }
        }
        if (comparison == 0) //ah yes, but were they the same length?
        {
            //whichever one has the most values is greater
            return x.ValueCount == y.ValueCount ?
                0 : x.ValueCount < y.ValueCount ? -1 : 1;
        }
        return comparison;
    }
    #endregion
}
Now you can get the max using OrderByDescending on an IEnumerable<ValueChain> and FirstOrDefault:
public void TestMethod1()
{
    List<ValueChain> values = new List<ValueChain>(new []
    {
        new ValueChain("d10a9"),
        new ValueChain("d10a10")
    });
    ValueChain max =
        values.OrderByDescending(v => v, new ValueChainComparer()).FirstOrDefault();
}
So you can use this to sort the string values in your dictionary:
var maxKvp = lsd.OrderByDescending(kvp => new ValueChain(kvp.Value),
new ValueChainComparer()).FirstOrDefault();

Select unique items with LINQ

When I use the following code I get the same items multiple times.
XElement neededFiles = new XElement("needed",
    from o in _9nFiles.Elements()
    join t in addedToSitePull.Elements()
    on o.Value equals t.Value
    where o.Value == t.Value
    select new XElement("pic", o.Value));
I'd like to get only unique items. I saw a Stack Overflow post, How can I do SELECT UNIQUE with LINQ?, that used Distinct(), and I tried to implement it, but the change had no effect.
The code:
XElement neededFiles = new XElement("needed",
    (from o in _9nFiles.Elements()
     join t in addedToSitePull.Elements()
     on o.Value equals t.Value
     where o.Value == t.Value
     select new XElement("pic", o.Value)).Distinct() );
I imagine the reason this doesn't work is because XElement.Equals uses a simple reference equality check rather than comparing the Value properties of the two items. If you want to compare the values, you could change it to:
_9nFiles.Elements()
    .Join(addedToSitePull.Elements(), o => o.Value, t => t.Value, (o, t) => o.Value)
    .Distinct()
    .Select(val => new XElement("pic", val));
You could also create your own IEqualityComparer<T> for comparing two XElements by their values. Note this assumes all values are non-null:
public class XElementValueEqualityComparer : IEqualityComparer<XElement>
{
    public bool Equals(XElement x, XElement y)
    {
        return x.Value.Equals(y.Value);
    }

    public int GetHashCode(XElement x)
    {
        return x.Value.GetHashCode();
    }
}
Then you could replace the existing call to Distinct with Distinct(new XElementValueEqualityComparer()).
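Put together, the query with a value-based comparer might look like the sketch below. The two sample documents are made up and stand in for _9nFiles and addedToSitePull; BuildNeeded is a hypothetical helper name:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

class XElementValueEqualityComparer : IEqualityComparer<XElement>
{
    public bool Equals(XElement x, XElement y) => x.Value.Equals(y.Value);
    public int GetHashCode(XElement x) => x.Value.GetHashCode();
}

static class XmlDistinctDemo
{
    // Builds <needed> from the values present in both inputs, one <pic> per value.
    public static XElement BuildNeeded(XElement first, XElement second)
    {
        return new XElement("needed",
            (from o in first.Elements()
             join t in second.Elements() on o.Value equals t.Value
             select new XElement("pic", o.Value))
            .Distinct(new XElementValueEqualityComparer()));
    }

    static void Main()
    {
        // Made-up sample data for illustration.
        var first = XElement.Parse("<files><f>a.jpg</f><f>a.jpg</f><f>b.jpg</f></files>");
        var second = XElement.Parse("<files><f>a.jpg</f><f>b.jpg</f><f>c.jpg</f></files>");
        Console.WriteLine(BuildNeeded(first, second).Elements().Count()); // 2
    }
}
```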
Distinct doesn't work because XElements are compared by reference, not by value.
The solution is to use another overload of Distinct: Distinct(IEqualityComparer).
You need to implement IEqualityComparer, for example as follows:
class XElementEqualityComparer : IEqualityComparer<XElement>
{
    #region IEqualityComparer<XElement> Members
    public bool Equals(XElement x, XElement y)
    {
        if (x == null ^ y == null)
            return false;
        if (x == null && y == null)
            return true;
        return x.Value == y.Value;
    }

    public int GetHashCode(XElement obj)
    {
        if (obj == null)
            return 0;
        return obj.Value.GetHashCode();
    }
    #endregion
}
It's not a good solution - but really easy.
foreach (XElement pic in neededFiles.Elements())
List<string> temp = new List<string>();

Decorate-Sort-Undecorate, how to sort an alphabetic field in descending order

I've got a large set of data for which computing the sort key is fairly expensive. What I'd like to do is use the DSU pattern where I take the rows and compute a sort key. An example:
        Qty  Name      Supplier
Row 1:  50   Widgets   IBM
Row 2:  48   Thingies  Dell
Row 3:  99   Googaws   IBM
To sort by Quantity and Supplier I could have the sort keys: 0050 IBM, 0048 Dell, 0099 IBM. The numbers are right-aligned and the text is left-aligned, everything is padded as needed.
If I need to sort by the Quantity in descending order I can just subtract the value from a constant (say, 10000) to build the sort keys: 9950 IBM, 9952 Dell, 9901 IBM.
How do I quickly/cheaply build a descending key for the alphabetic fields in C#?
[My data is all 8-bit ASCII w/ISO 8859 extension characters.]
Note: In Perl, this could be done by bit-complementing the strings:
$subkey = $string ^ ( "\xFF" x length $string );
Porting this solution straight into C# doesn't work:
subkey = encoding.GetString(encoding.GetBytes(stringval).
Select(x => (byte)(x ^ 0xff)).ToArray());
I suspect because of the differences in the way that strings are handled in C#/Perl. Maybe Perl is sorting in ASCII order and C# is trying to be smart?
Here's a sample piece of code that tries to accomplish this:
System.Text.ASCIIEncoding encoding = new System.Text.ASCIIEncoding();
List<List<string>> sample = new List<List<string>>() {
    new List<string>() { "", "apple", "table" },
    new List<string>() { "", "apple", "chair" },
    new List<string>() { "", "apple", "davenport" },
    new List<string>() { "", "orange", "sofa" },
    new List<string>() { "", "peach", "bed" },
};
foreach (List<string> line in sample)
{
    StringBuilder sb = new StringBuilder();
    string key1 = line[1].PadRight(10, ' ');
    string key2 = line[2].PadRight(10, ' ');
    // Comment the next line to sort desc, desc
    key2 = encoding.GetString(encoding.GetBytes(key2).
        Select(x => (byte)(x ^ 0xff)).ToArray());
    sb.Append(key1);
    sb.Append(key2);
    line[0] = sb.ToString();
}
List<List<string>> output = sample.OrderBy(p => p[0]).ToList();
You can get to where you want, although I'll admit I don't know whether there's a better overall way.
The problem you have with the straight translation of the Perl method is that .NET simply will not allow you to be so laissez-faire with encoding. However, if as you say your data is all printable ASCII (ie consists of characters with Unicode codepoints in the range 32..127) - note that there is no such thing as '8-bit ASCII' - then you can do this:
key2 = encoding.GetString(encoding.GetBytes(key2).
Select(x => (byte)(32+95-(x-32))).ToArray());
In this expression I have been explicit about what I'm doing:
Take x (which I assume to be in 32..127)
Map the range to 0..95 to make it zero-based
Reverse by subtracting from 95
Add 32 to map back to the printable range
It's not very nice but it does work.
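The same mirroring can be sketched directly on characters, which avoids the encoding round trip. This is a variation on the expression above rather than the answer's own code: DescendingKey.Reverse is a hypothetical helper, it assumes every character lies in 32..127, and the result only sorts as intended under an ordinal comparison:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DescendingKey
{
    // Mirrors each character within 32..127
    // (32 + 95 - (x - 32) simplifies to 159 - x).
    public static string Reverse(string key)
    {
        var chars = new char[key.Length];
        for (int i = 0; i < key.Length; i++)
        {
            char c = key[i];
            if (c < 32 || c > 127)
                throw new ArgumentException("Only characters in 32..127 are supported.", nameof(key));
            chars[i] = (char)(159 - c);
        }
        return new string(chars);
    }

    static void Main()
    {
        var words = new List<string> { "chair", "table", "davenport" };
        // Pad to a fixed width first, as in the question, then compare ordinally.
        var descending = words
            .OrderBy(w => Reverse(w.PadRight(10, ' ')), StringComparer.Ordinal)
            .ToList();
        Console.WriteLine(string.Join(", ", descending)); // table, davenport, chair
    }
}
```

Passing StringComparer.Ordinal matters here: a culture-sensitive comparison does not order strings by code point, so the mirrored keys would not come out reversed.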
Just write an IComparer that works as a chain of comparers.
In case of equality at a given stage, it should pass evaluation on to the next key part. If the result is less than or greater than, just return it.
You need something like this:
int comparision = 0;
for (int i = 0; i < n; i++)
{
    comparision = a[i].CompareTo(b[i]) * comparisionSign[i];
    if (comparision != 0)
        return comparision;
}
return comparision;
Or even simpler, you can go with:
The first call returns an IOrderedEnumerable<>, which can then be sorted by additional fields.
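A chain of this kind can be written with LINQ's OrderBy, ThenBy and ThenByDescending. The sketch below is an illustration using the sample rows from the question; the Row class, the ThenByDemo name and the chosen sort order (supplier ascending, quantity descending) are assumptions:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Row
{
    public int Qty;
    public string Name;
    public string Supplier;
}

static class ThenByDemo
{
    // Supplier ascending, then quantity descending, without building a string key.
    public static List<Row> Sort(IEnumerable<Row> rows)
    {
        return rows
            .OrderBy(r => r.Supplier, StringComparer.Ordinal)
            .ThenByDescending(r => r.Qty)
            .ToList();
    }

    static void Main()
    {
        var rows = new List<Row>
        {
            new Row { Qty = 50, Name = "Widgets", Supplier = "IBM" },
            new Row { Qty = 48, Name = "Thingies", Supplier = "Dell" },
            new Row { Qty = 99, Name = "Googaws", Supplier = "IBM" },
        };
        foreach (var r in Sort(rows))
            Console.WriteLine(r.Supplier + " " + r.Qty);
        // Dell 48
        // IBM 99
        // IBM 50
    }
}
```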
Answering my own question (but not satisfactorily). To construct a descending alphabetic key I used this code and then appended this subkey to the search key for the object:
if ( reverse )
subkey = encoding.GetString(encoding.GetBytes(subkey)
.Select(x => (byte)(0x80 - x)).ToArray());
Once I had the keys built, I couldn't just do this:
Because the default comparer isn't in ASCII order (which my 0x80 - x trick relies on). So then I had to write an IComparable<RowObject> that used ordinal sorting:
public int CompareTo(RowObject other)
{
    return String.Compare(this.sortKey, other.sortKey,
        StringComparison.Ordinal);
}
If a key computation is expensive, why compute a key at all? String comparison by itself is not free: it is an expensive loop through the characters and is not going to perform any better than a custom comparison loop.
In this test the custom comparison sort performs about 3 times better than DSU.
Note that the DSU key computation is not measured in this test; it's precomputed.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using Microsoft.VisualStudio.TestTools.UnitTesting;
namespace DSUPatternTest
{
    [TestClass]
    public class DSUPatternPerformanceTest
    {
        public class Row
        {
            public int Qty;
            public string Name;
            public string Supplier;
            public string PrecomputedKey;

            public void ComputeKey()
            {
                // Do not need StringBuilder here, String.Concat does better job internally.
                PrecomputedKey =
                    Qty.ToString().PadLeft(4, '0') + " "
                    + Name.PadRight(12, ' ') + " "
                    + Supplier.PadRight(12, ' ');
            }

            public bool Equals(Row other)
            {
                if (ReferenceEquals(null, other)) return false;
                if (ReferenceEquals(this, other)) return true;
                return other.Qty == Qty && Equals(other.Name, Name) && Equals(other.Supplier, Supplier);
            }

            public override bool Equals(object obj)
            {
                if (ReferenceEquals(null, obj)) return false;
                if (ReferenceEquals(this, obj)) return true;
                if (obj.GetType() != typeof (Row)) return false;
                return Equals((Row) obj);
            }

            public override int GetHashCode()
            {
                int result = Qty;
                result = (result*397) ^ (Name != null ? Name.GetHashCode() : 0);
                result = (result*397) ^ (Supplier != null ? Supplier.GetHashCode() : 0);
                return result;
            }
        }

        public class RowComparer : IComparer<Row>
        {
            public int Compare(Row x, Row y)
            {
                int comparision;
                comparision = x.Qty.CompareTo(y.Qty);
                if (comparision != 0) return comparision;
                comparision = x.Name.CompareTo(y.Name);
                if (comparision != 0) return comparision;
                comparision = x.Supplier.CompareTo(y.Supplier);
                return comparision;
            }
        }

        [TestMethod]
        public void CustomLoopIsFaster()
        {
            var random = new Random();
            // Materialize the rows so the precomputed keys are kept
            var rows = Enumerable.Range(0, 5000).Select(i =>
                new Row
                {
                    Qty = (int) (random.NextDouble()*9999),
                    Name = random.Next().ToString(),
                    Supplier = random.Next().ToString()
                }).ToList();
            foreach (var row in rows)
                row.ComputeKey();
            var dsuSw = Stopwatch.StartNew();
            var sortedByDSU = rows.OrderBy(i => i.PrecomputedKey).ToList();
            var dsuTime = dsuSw.ElapsedMilliseconds;
            var customSw = Stopwatch.StartNew();
            var sortedByCustom = rows.OrderBy(i => i, new RowComparer()).ToList();
            var customTime = customSw.ElapsedMilliseconds;
            CollectionAssert.AreEqual(sortedByDSU, sortedByCustom);
            Assert.IsTrue(dsuTime > customTime * 2.5);
        }
    }
}
If you need to build a sorter dynamically you can use something like this:
var comparerChain = new ComparerChain<Row>()
.By(r => r.Qty, false)
.By(r => r.Name, false)
.By(r => r.Supplier, false);
var sortedByCustom = rows.OrderBy(i => i, comparerChain).ToList();
Here is a sample implementation of comparer chain builder:
public class ComparerChain<T> : IComparer<T>
{
    private List<PropComparer<T>> Comparers = new List<PropComparer<T>>();

    public int Compare(T x, T y)
    {
        foreach (var comparer in Comparers)
        {
            var result = comparer._f(x, y);
            if (result != 0)
                return result;
        }
        return 0;
    }

    public ComparerChain<T> By<Tp>(Func<T,Tp> property, bool descending) where Tp:IComparable<Tp>
    {
        Comparers.Add(PropComparer<T>.By(property, descending));
        return this;
    }
}
public class PropComparer<T>
{
    public Func<T, T, int> _f;

    public static PropComparer<T> By<Tp>(Func<T,Tp> property, bool descending) where Tp:IComparable<Tp>
    {
        Func<T, T, int> ascendingCompare = (a, b) => property(a).CompareTo(property(b));
        Func<T, T, int> descendingCompare = (a, b) => property(b).CompareTo(property(a));
        return new PropComparer<T>(descending ? descendingCompare : ascendingCompare);
    }

    public PropComparer(Func<T, T, int> f)
    {
        _f = f;
    }
}
It works a little bit slower, maybe because of the property binding delegate calls.
