List<> sorting by object property - c#

I need to sort a list of objects by one of the object's properties, but it's a string that needs to be sorted as if it's an integer. The objects are custom "Property" objects where the property name (property.Name) is a string; however, 90% of the property names are actually numbers while the other 10% are names/letters (hence why the variable itself has to be a string and not an integer).
I know I can use
propertyList.OrderBy(x => x.Name)
...but that will sort it looking at it as if it's a string (i.e. 15000 is "greater" than 20).
For sorting purposes, I've already split the list into two separate lists (one that holds all the ones with property names that contain letters and another that contains the ones that can be converted to integers), but I don't know how to sort the "integer" list.
I've tried this and it doesn't work, but is there something like this I can use?
propertyList.OrderBy(x => Convert.ToInt32(x.Name))

You don't need to split the data into two lists; also note that you can perform complex operations inside lambda methods, you just need to use a different syntax:
IEnumerable<TItem> sorted = propertyList.OrderBy( x => {
Int32 asInt;
if( Int32.TryParse( x.Name, NumberStyles.Integer, CultureInfo.InvariantCulture, out asInt ) ) {
return asInt;
}
if( x.Name.Length > 0 ) return (Int32)x.Name[0];
return 0;
});
Note this code is a bit ugly and imperfect as it won't sort two textual names correctly if they start with the same character. I suggest using the more advanced overload of OrderBy instead:
class NameComparer : IComparer<String> {
public Int32 Compare(String x, String y) {
// put Name comparison logic here
}
}
IEnumerable<TItem> sorted = propertyList.OrderBy( x => x.Name, new NameComparer() );
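One possible way to fill in the comparison logic is sketched below. It assumes that numeric names should sort before textual ones and that textual names compare ordinally; neither rule comes from the question, and the Property class and NameSortDemo helper are hypothetical stand-ins.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical stand-in for the asker's "Property" type.
public class Property
{
    public string Name { get; set; }
}

public class NameComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        int xInt, yInt;
        bool xIsInt = int.TryParse(x, NumberStyles.Integer, CultureInfo.InvariantCulture, out xInt);
        bool yIsInt = int.TryParse(y, NumberStyles.Integer, CultureInfo.InvariantCulture, out yInt);

        if (xIsInt && yIsInt) return xInt.CompareTo(yInt); // both numeric: compare as numbers
        if (xIsInt) return -1;                             // assumption: numbers sort before text
        if (yIsInt) return 1;
        return string.CompareOrdinal(x, y);                // both textual: ordinal comparison
    }
}

public static class NameSortDemo
{
    // Returns the names in sorted order without modifying the input.
    public static List<string> SortedNames(IEnumerable<Property> properties)
    {
        return properties.OrderBy(p => p.Name, new NameComparer())
                         .Select(p => p.Name)
                         .ToList();
    }
}
```

Because every pair of names is handled by exactly one of the three rules, the comparer gives a consistent ordering and a single list can hold both kinds of name.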

It is not clear what you actually want here. Do you want an ordered view of the original data? Or do you actually want to modify the original data so that it's ordered?
If the former, then the OrderBy() method is what you want. It sounds like you're fine sorting the numeric names separately from the non-numeric ones, but in that case it's not clear what you mean by "it doesn't work" when you tried your second code example. That expression should work fine, and would provide an ordered view of your data. E.g.
foreach (var item in propertyList.OrderBy(x => int.Parse(x.Name)))
{
// do something with the ordered elements
}
If you meant that you want to actually order the original data, then you can just use the List<T>.Sort() method:
propertyList.Sort((x, y) => int.Parse(x.Name).CompareTo(int.Parse(y.Name)));
Note that in the List<T> example, the conversion from string to int is done repeatedly. This should not be a problem for relatively small collections, but if performance becomes an issue then using the LINQ OrderBy() method (which caches the keys for you) would be preferred. You can use ToList() with OrderBy() to materialize the result back to a List<T>.
Of course that involves overhead with the intermediate data structures, so there's little point in doing it that way unless you have a genuine demonstrated performance issue to address, and you have shown that that alternative fixes the issue.

Related

How does this "Sort((x, y) => x.CompareTo(y))" work? [duplicate]

Please look at this block of code:
List<int> list = new List<int>();
list.Add(1);
list.Add(6);
list.Add(2);
list.Add(3);
list.Add(8);
list.Sort((x, y) => x.CompareTo(y));
int[] arr = list.ToArray();
foreach (var i in arr)
{
Console.WriteLine(i);
}
I know that when executed, the code above will print the list in ascending order. If I switch the positions of x and y in "x.CompareTo(y)", then the list will be sorted in descending order instead. I know what CompareTo() does, but how does it work with x and y here to decide the sort order? What do x and y represent here?
Sort method signature is public void Sort(Comparison<T> comparison), and if you see the declaration of Comparison it is public delegate int Comparison<in T>(T x, T y).
Just see the comments for Comparison.
//
// Summary:
// Represents the method that compares two objects of the same type.
//
// Parameters:
// x:
// The first object to compare.
//
// y:
// The second object to compare.
//
// Type parameters:
// T:
// The type of the objects to compare.
//
// Returns:
// A signed integer that indicates the relative values of x and y, as shown in the
// following table.
// Value – Meaning
// Less than 0 –x is less than y.
// 0 –x equals y.
// Greater than 0 –x is greater than y.
So Sort calls this comparison on pairs of elements as it works through the list, and if you swap x and y then every result is reversed, which reverses the sort order.
In simple terms, (x, y) => x.CompareTo(y) is like a one line version of a method - it goes by various names, usually "lambda" or "delegate". You can think of it as a method that is stripped down as much as possible. Here's the full method, I'll talk about stripping it down shortly:
int MyComparer(int x, int y){
return x.CompareTo(y);
}
The list will perform a sort and will call this method every time it needs to decide which of the two ints it's considering is larger. How it does the sort is irrelevant (probably something quicksort-ish), but ultimately it will often want to know whether x is larger than, the same as or smaller than y.
You could take that method I wrote up there and call sort with it instead:
yourList.Sort(MyComparer)
So c# has this notion of being able to pass methods around in the same way that it passes variable data around. Typically when you pass a method you're saying "here's some logic you should call when you need to know that thing you want to know" - like a custom sort algorithm that you can vary by varying the logic you pass in.
In stripping MyComparer down, well.. we know this list is full of ints so we know that x and y are ints, so we can throw the "int" away. We can get rid of the method name too, and because it's just one line we can ditch the curly brackets. The return keyword is unnecessary too because by definition to be useful the single line is going to have to evaluate to something returnable and it shall be returned implicitly. The => separates the parameters from the body logic:
(method_parameters) => logic_that_produces_a_value
So that long winded method can be boiled down to
(x,y) => x.CompareTo(y)
x and y are whatever type the list holds; the result is a negative number, zero or a positive number (typically -1, 0 or 1), which the list uses to determine how x and y relate sorting-wise. They could be called anything; just like when you declare a method MyMethod(int left, int right) and choose to call the parameters "left" and "right", here we do the same because this is just declaring named parameters - you could easily write (left,right) => left.CompareTo(right) and it'd work the same.
You'll see it a lot in c#, because c# devs are quite focused on representing logic in a compact fashion. Often the most useful way of making something flexible and powerful is to implement some of the logic but then leave it possible for another dev to finish the logic off. Microsoft implemented sorting into list, but they leave it up to us to decide "which is greater or lesser" so we can supply logic that influences sort order. They just mandated "any time we want to know which way round two things are we'll ask you" - which means they don't need to have SortAscending and SortDescending - we just flip the logic and declare that 10 is less than 9 if we want 10 to come first.
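As a small sketch of the points above: the same Sort call accepts either the named method or a lambda, and flipping the operands flips the order. The class and method names are illustrative only.

```csharp
using System;
using System.Collections.Generic;

public static class ComparisonDemo
{
    // A named method with the shape Comparison<int> expects.
    static int MyComparer(int x, int y)
    {
        return x.CompareTo(y);
    }

    public static List<int> Ascending(IEnumerable<int> source)
    {
        var list = new List<int>(source);
        list.Sort(MyComparer); // pass the method itself
        return list;
    }

    public static List<int> Descending(IEnumerable<int> source)
    {
        var list = new List<int>(source);
        // Same logic as a lambda, with the operands flipped:
        // "right before left" makes larger values come first.
        list.Sort((left, right) => right.CompareTo(left));
        return list;
    }
}
```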
LINQ relies on "supply me some logic" heavily - you might be looking for all ints less than 3 in your list:
myList.Where(item => item < 3)
LINQ will loop over the list calling your supplied logic (lambda) here on every item and delivering only those where your lambda is true. item is one of the list entries, in this case an int. if the list held Person objects, item would be a Person. To that end I'd recommend calling it p or person to make it clearer what it is. This will become important when you start using more complex lambdas like
string[] wantedNames = new [] { "John", "Luke" };
people.Where(p => wantedNames.Any(n => p.Name == n));
This declares multiple names you're looking for then searches the list of people for all those whose name is in the list of wanted names. For every person p, LINQ asks "wantedNames, do Any of your elements return true for this supplied logic" and then it loops over the wantedNames calling the logic you supplied (n, the item from wantedNames is equal to the person name?).
Here you've used two different methods that take custom logic that LINQ calls repeatedly - Where is a method that takes a delegate, and so is Any; the Any operates on the list of names, the Where operates on the list of people so keeping it clear which is a Person p and which is a name string n using variable names is better than using bland names like a and b.
For more info on your Sort method, see the docs for List<T>.Sort(Comparison<T>):
https://learn.microsoft.com/en-us/dotnet/api/system.collections.generic.list-1.sort?view=net-5.0#System_Collections_Generic_List_1_Sort_System_Comparison__0__

C# - How to sort list string number by linq?

I have a list, each element in the list is a string that contains date and integer in specific format: yyyyMMdd_number.
List<string> listStr = new List<string> { "20170822_10", "20170821_1", "20170823_4", "20170821_10", "20170822_11", "20170822_5",
"20170822_2", "20170821_3", "20170823_6", "20170823_21", "20170823_20", "20170823_2"};
When I use listStr.Sort(), the result is as below:
20170821_1
20170821_10
20170821_3
20170822_10
20170822_11
20170822_2
20170822_5
20170823_2
20170823_20
20170823_21
20170823_4
20170823_6
Expected Output:
20170821_1
20170821_3
20170821_10
20170822_2
20170822_5
20170822_10
20170822_11
20170823_2
20170823_4
20170823_6
20170823_20
20170823_21
My idea: split every string (day_number) on the underscore, then compare and sort by the number.
But please suggest a LINQ solution or a better way to sort in this case.
Since the dates are in the format that can be ordered lexicographically, you could sort by the date prefix using string ordering, and resolve ties by parsing the integer:
var sorted = listStr
.OrderBy(s => s.Split('_')[0])
.ThenBy(s => int.Parse(s.Split('_')[1]));
I imagine any numeric ordering would first require converting the value to a numeric type. So you could split on the underscore, sort by the first value, then by the second value. Something like this:
list.OrderBy(x => x.Split('_')[0]).ThenBy(x => int.Parse(x.Split('_')[1]))
You could improve this, if necessary, by creating a class which takes the string representation on its constructor and provides the numeric representations (and the original string representation) as properties. Then .Select() into a list of that class and sort. That class could internally do type checking, range checking, etc.
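A minimal sketch of that idea follows. The DatedEntry class and its member names are hypothetical, and it assumes every string has exactly one underscore separating a yyyyMMdd date from an integer.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper type: parses "yyyyMMdd_number" once and exposes typed parts.
public class DatedEntry
{
    public string Original { get; private set; }
    public DateTime Date { get; private set; }
    public int Number { get; private set; }

    public DatedEntry(string text)
    {
        string[] parts = text.Split('_');
        if (parts.Length != 2)
            throw new FormatException("Expected yyyyMMdd_number but got: " + text);

        Original = text;
        // ParseExact also rejects impossible dates such as 20171345.
        Date = DateTime.ParseExact(parts[0], "yyyyMMdd", CultureInfo.InvariantCulture);
        Number = int.Parse(parts[1], CultureInfo.InvariantCulture);
    }
}

public static class DatedEntrySort
{
    // Parse each string once, sort on the typed values, return the original strings.
    public static List<string> Sort(IEnumerable<string> items)
    {
        return items.Select(s => new DatedEntry(s))
                    .OrderBy(e => e.Date)
                    .ThenBy(e => e.Number)
                    .Select(e => e.Original)
                    .ToList();
    }
}
```

Parsing in the constructor means each string is split once rather than once per key selector, and malformed input fails early with a clear exception.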
The answers above are much easier to follow / understand, but purely as an alternative for academic interest, you could do the following:
var sorted = listStr.OrderBy(x => Convert.ToInt32(x.Split('_')[0])*100 + Convert.ToInt32(x.Split('_')[1]));
It works on the premise that the suffix part after the underscore is going to be less than 100, and turns the two elements of the string into an integer with the relative 'magnitude' preserved, that can then be sorted.
The other two methods are much, much easier to follow, but one thing going for my alternative is that it sorts on a single integer key rather than a two-part key, so it might be a bit faster (although I doubt it is going to matter for any real-world scenario).

Remove duplicates from array of struct

I can't figure out how to remove duplicate entries from an array of structs.
I have this struct:
public struct stAppInfo
{
public string sTitle;
public string sRelativePath;
public string sCmdLine;
public bool bFindInstalled;
public string sFindTitle;
public string sFindVersion;
public bool bChecked;
}
I have changed the stAppInfo struct to class here thanks to Jon Skeet
The code is like this: (short version)
stAppInfo[] appInfo = new stAppInfo[listView1.Items.Count];
int i = 0;
foreach (ListViewItem item in listView1.Items)
{
appInfo[i].sTitle = item.Text;
appInfo[i].sRelativePath = item.SubItems[1].Text;
appInfo[i].sCmdLine = item.SubItems[2].Text;
appInfo[i].bFindInstalled = (item.SubItems[3].Text.Equals("Sí")) ? true : false;
appInfo[i].sFindTitle = item.SubItems[4].Text;
appInfo[i].sFindVersion = item.SubItems[5].Text;
appInfo[i].bChecked = (item.SubItems[6].Text.Equals("Sí")) ? true : false;
i++;
}
I need the appInfo array to be unique in the sTitle and sRelativePath members; the other members can be duplicates.
EDIT:
Thanks to all for the answers, but this application is "portable": I just need the .exe file and I don't want to add other files such as referenced *.dll assemblies, so please no external references. This app is intended to be used from a pendrive.
All data comes from a *.ini file. What I do is: (pseudocode)
ReadFile()
FillDataFromFileInAppInfoArray()
DeleteDuplicates()
FillListViewControl()
When I want to save that data into a file I have these options:
Using the ListView data
Using the appInfo array (is this faster?)
Any other?
EDIT2:
Big thanks to: Jon Skeet, Michael Hays thanks for your time guys!!
Firstly, please don't use mutable structs. They're a bad idea in all kinds of ways.
Secondly, please don't use public fields. Fields should be an implementation detail - use properties.
Thirdly, it's not at all clear to me that this should be a struct. It looks rather large, and not particularly "a single value".
Fourthly, please follow the .NET naming conventions so your code fits in with all the rest of the code written in .NET.
Fifthly, you can't remove items from an array, as arrays are created with a fixed size... but you can create a new array with only unique elements.
LINQ to Objects will let you do that already using GroupBy as shown by Albin, but a slightly neater (in my view) approach is to use DistinctBy from MoreLINQ:
var unique = appInfo.DistinctBy(x => new { x.sTitle, x.sRelativePath })
.ToArray();
This is generally more efficient than GroupBy, and also more elegant in my view.
Personally I generally prefer using List<T> over arrays, but the above will create an array for you.
Note that with this code there can still be two items with the same title, and there can still be two items with the same relative path - there just can't be two items with the same relative path and title. If there are duplicate items, DistinctBy will always yield the first such item from the input sequence.
EDIT: Just to satisfy Michael, you don't actually need to create an array to start with, or create an array afterwards if you don't need it:
var query = listView1.Items
.Cast<ListViewItem>()
.Select(item => new stAppInfo
{
sTitle = item.Text,
sRelativePath = item.SubItems[1].Text,
sCmdLine = item.SubItems[2].Text,
bFindInstalled = item.SubItems[3].Text == "Sí",
sFindTitle = item.SubItems[4].Text,
sFindVersion = item.SubItems[5].Text,
bChecked = item.SubItems[6].Text == "Sí"
})
.DistinctBy(x => new { x.sTitle, x.sRelativePath });
That will give you an IEnumerable<stAppInfo> which is lazily streamed. Note that if you iterate over it more than once, however, it will iterate over listView1.Items the same number of times, performing the same uniqueness comparisons each time.
I prefer this approach over Michael's as it makes the "distinct by" columns very clear in semantic meaning, and removes the repetition of the code used to extract those columns from a ListViewItem. Yes, it involves building more objects, but I prefer clarity over efficiency until benchmarking has proved that the more efficient code is actually required.
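If pulling in MoreLINQ is not an option, a comparable helper is only a few lines. The sketch below is not MoreLINQ's implementation; the name DistinctByKey, the trimmed-down AppInfo class and the DistinctDemo wrapper are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctExtensions
{
    // Yields the first item seen for each distinct key, preserving input order.
    public static IEnumerable<T> DistinctByKey<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var seen = new HashSet<TKey>();
        foreach (T item in source)
        {
            // HashSet<T>.Add returns false when the key is already present.
            if (seen.Add(keySelector(item)))
                yield return item;
        }
    }
}

// Hypothetical class version of stAppInfo, trimmed to the fields that matter here.
public class AppInfo
{
    public string Title { get; set; }
    public string RelativePath { get; set; }
    public string CmdLine { get; set; }
}

public static class DistinctDemo
{
    public static AppInfo[] Unique(IEnumerable<AppInfo> items)
    {
        // Anonymous types compare by value, so they work as composite keys.
        return items.DistinctByKey(x => new { x.Title, x.RelativePath }).ToArray();
    }
}
```

This keeps everything inside the single .exe, since it relies only on types that ship with the framework.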
What you need is a Set. It ensures that the items entered into it are unique (based on some qualifier which you will set up). Here is how it is done:
First, change your struct to a class. There is really no getting around that.
Second, provide an implementation of IEqualityComparer<stAppInfo>. It may be a hassle, but it is the thing that makes your set work (which we'll see in a moment):
public class AppInfoComparer : IEqualityComparer<stAppInfo>
{
public bool Equals(stAppInfo x, stAppInfo y) {
if (ReferenceEquals(x, y)) return true;
if (x == null || y == null) return false;
return Equals(x.sTitle, y.sTitle) && Equals(x.sRelativePath,
y.sRelativePath);
}
// this part is a pain, but this one is already written
// specifically for your question.
public int GetHashCode(stAppInfo obj) {
unchecked {
return ((obj.sTitle != null
? obj.sTitle.GetHashCode() : 0) * 397)
^ (obj.sRelativePath != null
? obj.sRelativePath.GetHashCode() : 0);
}
}
}
Then, when it is time to make your set, do this:
var appInfoSet = new HashSet<stAppInfo>(new AppInfoComparer());
foreach (ListViewItem item in listView1.Items)
{
var newItem = new stAppInfo {
sTitle = item.Text,
sRelativePath = item.SubItems[1].Text,
sCmdLine = item.SubItems[2].Text,
bFindInstalled = (item.SubItems[3].Text.Equals("Sí")) ? true : false,
sFindTitle = item.SubItems[4].Text,
sFindVersion = item.SubItems[5].Text,
bChecked = (item.SubItems[6].Text.Equals("Sí")) ? true : false};
appInfoSet.Add(newItem);
}
appInfoSet now contains a collection of stAppInfo objects with unique Title/Path combinations, as per your requirement. If you must have an array, do this:
stAppInfo[] appInfo = appInfoSet.ToArray();
Note: I chose this implementation because it looks like the way you are already doing things. It has an easy-to-read foreach loop (so I do not need the counter variable). It does not involve LINQ (which can be troublesome if you aren't familiar with it). It requires no external libraries outside of what the .NET Framework provides to you. And finally, it provides an array just like you've asked. As for reading the data in from an INI file, hopefully you see that the only thing that will change is your foreach loop.
Update
Hash codes can be a pain. You might have been wondering why you need to compute them at all. After all, couldn't you just compare the values of the title and relative path after each insert? Well sure, of course you could, and that's exactly how another set, called SortedSet works. SortedSet makes you implement IComparer in the same way that I implemented IEqualityComparer above.
So, in this case, AppInfoComparer would look like this:
private class AppInfoComparer : IComparer<stAppInfo>
{
// return -1 if x < y, 1 if x > y, or 0 if they are equal
public int Compare(stAppInfo x, stAppInfo y)
{
var comparison = x.sTitle.CompareTo(y.sTitle);
if (comparison != 0) return comparison;
return x.sRelativePath.CompareTo(y.sRelativePath);
}
}
And then the only other change you need to make is to use SortedSet instead of HashSet:
var appInfoSet = new SortedSet<stAppInfo>(new AppInfoComparer());
It's so much easier in fact, that you are probably wondering what gives? The reason that most people choose HashSet over SortedSet is performance. But you should balance that with how much you actually care, since you'll be maintaining that code. I personally use a tool called Resharper, which is available for Visual Studio, and it computes these hash functions for me, because I think computing them is a pain, too.
(I'll talk about the complexity of the two approaches, but if you already know it, or are not interested, feel free to skip it.)
SortedSet has a complexity of O(log n) per insert; that is to say, each time you enter a new item, it will effectively go to the halfway point of your set and compare. If it doesn't find your entry, it will go to the halfway point between its last guess and the group to the left or right of that guess, quickly whittling down the places for your element to hide. For a million entries, this takes about 20 attempts. Not bad at all. But, if you've chosen a good hashing function, then HashSet can do the same job, on average, in one comparison, which is O(1). And before you think 20 is not really that big a deal compared to 1 (after all, computers are pretty quick), remember that you had to insert those million items, so while HashSet took about a million attempts to build that set up, SortedSet took several million attempts.
But there is a price -- HashSet breaks down (very badly) if you choose a poor hashing function. If the numbers for lots of items are not unique, then they will collide in the HashSet, which will then have to try again and again. If lots of items collide with the exact same number, then they will retrace each other's steps, and you will be waiting a long time. The millionth entry alone will take about a million attempts, so building the whole set takes on the order of a million times a million -- HashSet has devolved into O(n^2).
What's important with those big-O notations (which is what O(1), O(log n), and O(n^2) are, in fact) is how quickly the number in parentheses grows as you increase n. Slow growth or no growth is best. Quick growth is sometimes unavoidable. For a dozen or even a hundred items, the difference may be negligible -- but if you can get in the habit of programming efficient functions as easily as alternatives, then it's worth conditioning yourself to do so, as problems are cheapest to correct closest to the point where you created them.
Use LINQ2Objects, group by the things that should be unique and then select the first item in each group.
var noDupes = appInfo.GroupBy(
x => new { x.sTitle, x.sRelativePath })
.Select(g => g.First()).ToArray();
!!! Array of structs (value type) + sorting or any kind of search ==> potentially a lot of boxing/unboxing operations.
I would suggest sticking with the recommendations of Jon and Henk: make it a class and use a generic List<T>.
Use LINQ GroupBy or DistinctBy. For me the built-in GroupBy is much simpler to use, but it is also interesting to take a look at another popular library; perhaps it will give you some insights.
BTW, also take a look at LambdaComparer; it will make your life easier each time you need this kind of in-place sorting/search, etc.

Dynamic Linq: How to specify the StringComparison type?

I'm working on doing some custom filtering and sorting of a dataset, based on a collection of sort fields sent from the client browser, and am using Dynamic Linq to achieve (most of) the desired effect. Where I'm running into a problem is when I try to sort by a column of type String, which contains both traditional strings and numbers stored as strings. It doesn't appear that I can pass in a StringComparison enum value, or specify an IComparer parameter for the Dynamic Linq orderby function.
My sorting code looks like:
myList.AsQueryable().OrderBy("StringColWithNums ASC")
I end up with:
1
10
100
11
12
2
20
instead of:
1
2
10
11
12
20
100
Anyone have any experience doing something similar?
myList.Sort((r, s) => int.Parse(r).CompareTo(int.Parse(s)));
will take some tweaking if those are objects, just use int.Parse(r.StringColWithNums), or whatever the field is.
Oops, sorry, didn't read all the OP to see it has letters too and you want the dynamic linq, editing
EDIT
I don't know that you're going to be able to do that using Dynamic linq and passing IComparer. You may be able to do it after getting the results (i.e. as I was originally writing the sort, with modifications). Comment if you want to pursue that line.
This is a fundamental problem with attempting to perform numeric comparisons within a string comparison. A couple of ways I would do this:
When loading the list, prefix numbers with enough zeroes to accommodate the max string size, i.e. number.ToString("000000"). This will only work if you care mostly about sorting and less about the appearance of the results - even then, you could convert "000010" back to a numeric and call the ToString() method to display the number again without the leading zeroes.
Write your own implementation (extension method) of OrderBy wherein you pass a function (or anonymous function) as a parameter to re-sort the results calling the method passed in.
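The zero-padding option can be sketched as follows. It assumes non-negative integers with at most six digits; the width, class name and method names are illustrative choices, not part of the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PaddedSortDemo
{
    // Pads each number to six digits so that plain string ordering
    // matches numeric ordering (assumes 0 <= n <= 999999).
    public static List<string> SortPadded(IEnumerable<int> numbers)
    {
        return numbers.Select(n => n.ToString("000000"))
                      .OrderBy(s => s, StringComparer.Ordinal)
                      .ToList();
    }

    // Converts a padded value back to its display form without leading zeroes.
    public static string Display(string padded)
    {
        return int.Parse(padded).ToString();
    }
}
```

Once the values are stored padded, a plain string sort such as the Dynamic Linq OrderBy("StringColWithNums ASC") should order them numerically, at the cost of converting them back for display.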
You can solve this by writing a new string comparer
class AlphaNumericComparer : IComparer<string>
{
public int Compare(string x, string y)
{
// if both values are integers then do int comparision
int xValue, yValue;
if (int.TryParse(x, out xValue) && int.TryParse(y, out yValue))
return xValue.CompareTo(yValue);
return x.CompareTo(y); // else do string comparison
}
}
Then you can use the comparer in methods like OrderBy and Sort
var sorted = lst.OrderBy(s => s, new AlphaNumericComparer());
lst.Sort(new AlphaNumericComparer());
This will give you the desired result. If not then just tweak the comparer.
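One tweak worth considering: because the comparer above switches between numeric and string comparison depending on the pair, mixed data can produce inconsistent results (for example "10" sorts before "1a" and "1a" before "9" as strings, yet "9" sorts before "10" as numbers). A sketch that avoids this by sorting on a composite key, assuming numbers should come before text and text should compare ordinally (the class name is hypothetical):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MixedSortDemo
{
    public static List<string> Sort(IEnumerable<string> values)
    {
        return values
            .Select(v =>
            {
                int n;
                bool isNumber = int.TryParse(v, out n); // n is 0 when parsing fails
                return new { Value = v, IsNumber = isNumber, Number = n };
            })
            .OrderBy(k => k.IsNumber ? 0 : 1)             // all numbers first, then all text
            .ThenBy(k => k.Number)                        // numbers by value; text all ties at 0
            .ThenBy(k => k.Value, StringComparer.Ordinal) // text (and equal numbers) by string
            .Select(k => k.Value)
            .ToList();
    }
}
```

This runs in memory, so with Dynamic Linq it would apply after the query results have been retrieved.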
It seems that this is not something that can be accomplished out of the box with Dynamic Linq, at least not in .NET 2.0/3.5. I ended up modifying the Dynamic Linq source code in order to accomplish what I needed.

String "Sort Template" in C#

I'm trying to come up with a clean way of sorting a set of strings based on a "sorting template". I apologize if my wording is confusing, but I can't think of a better way to describe it (maybe someone can come up with a better way to describe it after reading what I'm trying to do?).
Consider the following list of strings (my "sort template", each item in the list a "command"):
[FA, TY, AK, PO, PR, ZZ, QW, BC]
I'd like to use the order of the strings within that list to sort a list of those commands. For example, I'd like the following list:
[TY, PR, PR, ZZ, BC, AK]
to be sorted into the following list based on the "sorting template":
[TY, AK, PR, PR, ZZ, BC]
What would be a good way to accomplish this?
The best idea I have yet is to use an enumeration...
enum Command
{
FA,
TY,
AK,
PO,
PR,
ZZ,
QW,
BC
};
...and do an Enum.Parse() on each command in my list I want sorted, converting that list from a list of strings into a list of Commands, which would then be sorted based on the order of the enumeration.
I don't know. The enumeration seems like it would work, but is there a better way I could go about this?
Here is a very simple way to do it!
List<string> template = new List<string>{ "ZD", "AB", "GR"};
List<string> myList = new List<string>{"AB", "GR", "ZD", "AB", "AB"};
myList.Sort((a, b) => template.IndexOf(a).CompareTo(template.IndexOf(b)));
You could use a Dictionary<string, int> to store and retrieve your sorting template tokens. However, this basically does the same as your enum (only perhaps in a slightly more readable manner), because Enum.Parse here could be confusing.
var ordering = new Dictionary<string, int>();
ordering.Add("FA", 0);
ordering.Add("TY", 1); // …
MyList.Sort((a, b) => ordering[a].CompareTo(ordering[b]));
This uses an appropriate overload of the List<T>.Sort method to compare two elements based on their value in the template dictionary.
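The dictionary can also be built from the template list itself rather than entry by entry. The sketch below does that and places commands missing from the template at the end; that fallback, and the class and method names, are assumptions rather than anything the question specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TemplateSort
{
    public static List<string> SortByTemplate(IList<string> template, IEnumerable<string> commands)
    {
        // Map each command to its position in the template.
        var rank = new Dictionary<string, int>();
        for (int i = 0; i < template.Count; i++)
            rank[template[i]] = i;

        // OrderBy is a stable sort, so equal commands keep their relative order.
        // Assumption: commands not present in the template sort last.
        return commands
            .OrderBy(c =>
            {
                int r;
                return rank.TryGetValue(c, out r) ? r : int.MaxValue;
            })
            .ToList();
    }
}
```

Compared with calling template.IndexOf inside the comparison, each lookup here is a dictionary access instead of a linear scan of the template.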
You could rename your commands like
[1FA, 2TY, 3AK, 4PO, 5PR, 6ZZ, 7QW, 8BC]
and strip out the first character when you were ready to use it. I think that's called a kludge.
I can't help thinking you might get some mileage out of using a SortedList, but in effect it will probably work more or less like your enum
SortedList Commands = new SortedList();
Commands.Add(1, "FA");
Commands.Add(2, "TY");
//etc
Use the Strategy pattern (I think it's called).
Write a sort method that sorts the list, but uses an external method to do the comparison between pairs of objects. Then pass it a delegate to the comparison method. Write the comparison method to take two members of the list, and the sorting template, as input parameters. In the method, return -1, 0 or +1 based on whether the first member of the pair or the second member is found first in the template list.
In your sort method, use the return value from the compare method to implement the sort, whatever kind of sort you do.
