int a, b, n;
...
(a, b) = (2, 3);
// 'a' is now 2 and 'b' is now 3
This sort of thing would be really helpful in C#. In this example 'a' and 'b' aren't encapsulated together the way the X and Y of a position might be. Does this exist in some form?
Below is a less trivial example.
(a, b) = n == 4 ? (2, 3) : (3, n % 2 == 0 ? 1 : 2);
Adam Maras shows in the comments that:
var result = n == 4 ? Tuple.Create(2, 3) : Tuple.Create(3, n % 2 == 0 ? 1 : 2);
This sort of works for the example above; however, as he then points out, it creates a new tuple instead of changing the specified variables.
Eric Lippert asks for use cases, therefore perhaps:
(a, b, c) = (c, a, b); // swap or reorder on one line
(x, y) = move((x, y), dist, heading);
byte (a, b, c, d, e) = (5, 4, 1, 3, 2);
graphics.(PreferredBackBufferWidth, PreferredBackBufferHeight) = 400;
notallama also has use cases, they are in his answer below.
We have considered supporting a syntactic sugar for tuples but it did not make the bar for C# 4.0. It is unlikely to make the bar for C# 5.0; the C# 5.0 team is pretty busy with getting async/await working right. We will consider it for hypothetical future versions of the language.
If you have a really solid usage case that is compelling, that would help us prioritize the feature.
use case:
It'd be really nice for working with IObservables, since those have only one type parameter. You basically want to subscribe with arbitrary delegates, but you're forced to use Action<T>, so if you want multiple parameters, you have to either use tuples or create custom classes for packing and unpacking parameters.
example from a game:
public IObservable<Tuple<GameObject, DamageInfo>> Damaged ...
void RegisterHitEffects() {
(from damaged in Damaged
where damaged.Item2.amount > threshold
select damaged.Item1)
.Subscribe(DoParticleEffect)
.AddToDisposables();
}
becomes:
void RegisterHitEffects() {
(from (gameObject, damage) in Damaged
where damage.amount > threshold
select gameObject)
.Subscribe(DoParticleEffect)
.AddToDisposables();
}
which I think is cleaner.
Also, presumably IAsyncResult will have similar issues when you want to pass several values. Sometimes it's cumbersome to create classes just to shuffle a bit of data around, but using tuples as they are now reduces code clarity. If they're used in the same function, anonymous types fit the bill nicely, but they don't work if you need to pass data between functions.
Also, it'd be nice if the sugar worked for generic parameters, too. So:
IEnumerator<(int, int)>
would desugar to
IEnumerator<Tuple<int,int>>
The behavior that you're looking for can be found in languages that have support or syntactic sugar for tuples. C# is not among these languages; while you can use the Tuple<...> classes to achieve similar behavior, it will come out very verbose (not as clean as what you're looking for).
Deconstruction was introduced in C# 7.0:
https://blogs.msdn.microsoft.com/dotnet/2016/08/24/whats-new-in-csharp-7-0/#user-content-deconstruction
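As a rough sketch of what that looks like for the examples in the question (assuming a C# 7.0 or later compiler; the class and method names here are made up for illustration):

```csharp
using System;

public static class DeconstructionDemo
{
    // Assigns two existing variables in one statement, as in the question's
    // "less trivial" example, and returns them so the result can be inspected.
    public static (int a, int b) AssignBoth(int n)
    {
        int a, b;
        (a, b) = n == 4 ? (2, 3) : (3, n % 2 == 0 ? 1 : 2);
        return (a, b);
    }

    // Reorders three values on one line: (a, b, c) = (c, a, b).
    // The right-hand side is evaluated before any assignment happens.
    public static (int a, int b, int c) Rotate(int a, int b, int c)
    {
        (a, b, c) = (c, a, b);
        return (a, b, c);
    }
}
```

Unlike the Tuple.Create version, this changes the named variables directly instead of producing a new tuple object that has to be picked apart.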
The closest structure I can think of is the Tuple class in version 4.0 of the framework.
As others already wrote, C# 4 Tuples are a nice addition, but nothing really compelling to use as long as there aren't any unpacking mechanisms. What I really demand of any type I use is clarity of what it describes, on both sides of the function protocol (i.e. the caller and callee sides)... like
Complex SolvePQ(double p, double q)
{
...
return new Complex(real, imag);
}
...
var solution = SolvePQ(...);
Console.WriteLine("{0} + {1}i", solution.Real, solution.Imaginary);
This is obvious and clear at both caller and callee side. However this
Tuple<double, double> SolvePQ(double p, double q)
{
...
return Tuple.Create(real, imag);
}
...
var solution = SolvePQ(...);
Console.WriteLine("{0} + {1}i", solution.Item1, solution.Item2);
doesn't leave the slightest clue about what that solution actually is (OK, the string and the method name make it pretty obvious) at the call site. Item1 and Item2 are of the same type, which renders tooltips useless. The only way to know for certain is to "reverse engineer" your way back through SolvePQ.
Obviously, this is far-fetched, and everyone doing serious numerical stuff should have a Complex type (like the one in the BCL).
But every time you get split results and you want to give those results distinct names for the sake of readability, you need tuple unpacking. The rewritten last two lines would be:
var (real, imaginary) = SolvePQ(...); // or var real, imaginary = SolvePQ(...);
Console.WriteLine("{0} + {1}i", real, imaginary);
This leaves no room for confusion, except for getting used to the syntax.
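With the C# 7.0 deconstruction mentioned elsewhere in this thread, that call site can be written more or less as proposed. The following is only a sketch: the body of SolvePQ is a guess at what the elided method might do (one root of x^2 + px + q = 0), and the Describe helper is hypothetical.

```csharp
using System;

public static class Quadratic
{
    // Hypothetical SolvePQ: returns one root of x^2 + p*x + q = 0 as a named tuple.
    // For a negative discriminant it returns the root with positive imaginary part.
    public static (double Real, double Imaginary) SolvePQ(double p, double q)
    {
        double half = -p / 2;
        double disc = half * half - q;
        return disc >= 0
            ? (half + Math.Sqrt(disc), 0.0)
            : (half, Math.Sqrt(-disc));
    }

    // The call site names both parts directly instead of using Item1/Item2.
    public static string Describe(double p, double q)
    {
        var (real, imaginary) = SolvePQ(p, q);
        return string.Format("{0} + {1}i", real, imaginary);
    }
}
```

The element names on the return type also show up in tooltips, which addresses the Item1/Item2 complaint even when the caller does not deconstruct.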
Creating a set of Unpack<T1, T2>(this Tuple<T1, T2>, out T1, out T2) methods would be a more idiomatic C# way of doing this.
Your example would then become
int a, b, n;
...
Tuple.Create(2, 3).Unpack(out a, out b);
// 'a' is now 2 and 'b' is now 3
which is no more complex than your proposal, and a lot clearer.
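A minimal sketch of the two-element overload described above (the class name TupleExtensions is an assumption; larger arities would follow the same pattern):

```csharp
using System;

public static class TupleExtensions
{
    // Copies the two items of a Tuple into variables supplied by the caller.
    public static void Unpack<T1, T2>(this Tuple<T1, T2> tuple, out T1 item1, out T2 item2)
    {
        item1 = tuple.Item1;
        item2 = tuple.Item2;
    }
}
```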
Please look at this block of code:
List<int> list = new List<int>();
list.Add(1);
list.Add(6);
list.Add(2);
list.Add(3);
list.Add(8);
list.Sort((x, y) => x.CompareTo(y));
int[] arr = list.ToArray();
foreach (var i in arr)
{
Console.WriteLine(i);
}
I know that when executed, the code above will print the list in ascending order. If I switch the positions of x and y in "x.CompareTo(y)", then the list will be sorted in descending order instead. I know what CompareTo() does, but how does it work with x and y here to decide the sort order? What do x and y represent here?
The Sort method's signature is public void Sort(Comparison<T> comparison), and if you look at the declaration of Comparison, it is public delegate int Comparison<in T>(T x, T y).
Just see the documentation comments for Comparison:
//
// Summary:
// Represents the method that compares two objects of the same type.
//
// Parameters:
// x:
// The first object to compare.
//
// y:
// The second object to compare.
//
// Type parameters:
// T:
// The type of the objects to compare.
//
// Returns:
// A signed integer that indicates the relative values of x and y, as shown in the
// following table.
// Value          – Meaning
// Less than 0    – x is less than y.
// 0              – x equals y.
// Greater than 0 – x is greater than y.
So Sort calls the comparison on pairs of elements to decide their relative order, and if you swap x and y in the body, every result changes sign, so you get the reverse order.
In simple terms, (x, y) => x.CompareTo(y) is like a one line version of a method - it goes by various names, usually "lambda" or "delegate". You can think of it as a method that is stripped down as much as possible. Here's the full method, I'll talk about stripping it down shortly:
int MyComparer(int x, int y){
return x.CompareTo(y);
}
The list will perform a sort and will call this method every time it needs to decide which of the two ints it's considering is larger. How it does the sort is irrelevant (probably something quicksort-ish), but ultimately it will often want to know whether x is larger than, the same as, or smaller than y.
You could take that method I wrote up there and call sort with it instead:
yourList.Sort(MyComparer)
So C# has this notion of being able to pass methods around in the same way that it passes variable data around. Typically when you pass a method you're saying "here's some logic you should call when you need to know that thing you want to know" - like a custom sort algorithm that you can vary by varying the logic you pass in.
In stripping MyComparer down, well.. we know this list is full of ints so we know that x and y are ints, so we can throw the "int" away. We can get rid of the method name too, and because it's just one line we can ditch the curly brackets. The return keyword is unnecessary too because by definition to be useful the single line is going to have to evaluate to something returnable and it shall be returned implicitly. The => separates the parameters from the body logic
(method, parameters) => logic_that_produces_a_value
So that long winded method can be boiled down to
(x,y) => x.CompareTo(y)
x and y are whatever type the list holds; the result is a negative number, zero, or a positive number, which the list uses to determine how x and y relate sorting-wise. They could be called anything; just like when you declare a method MyMethod(int left, int right) and choose to call the parameters "left" and "right", here we do the same because this is just declaring named parameters - you could easily write (left,right) => left.CompareTo(right) and it'd work the same.
You'll see it a lot in c#, because c# devs are quite focused on representing logic in a compact fashion. Often the most useful way of making something flexible and powerful is to implement some of the logic but then leave it possible for another dev to finish the logic off. Microsoft implemented sorting into list, but they leave it up to us to decide "which is greater or lesser" so we can supply logic that influences sort order. They just mandated "any time we want to know which way round two things are we'll ask you" - which means they don't need to have SortAscending and SortDescending - we just flip the logic and declare that 10 is less than 9 if we want 10 to come first.
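To make the "just flip the logic" point concrete, here is a small sketch using the numbers from the question (SortDemo and its method names are made up for the example):

```csharp
using System;
using System.Collections.Generic;

public static class SortDemo
{
    // x.CompareTo(y): "x is smaller" sorts x first, giving ascending order.
    public static List<int> Ascending(IEnumerable<int> source)
    {
        var list = new List<int>(source);
        list.Sort((x, y) => x.CompareTo(y));
        return list;
    }

    // y.CompareTo(x): every answer is reversed, giving descending order.
    public static List<int> Descending(IEnumerable<int> source)
    {
        var list = new List<int>(source);
        list.Sort((x, y) => y.CompareTo(x));
        return list;
    }
}
```

Both methods use the same List<T>.Sort; only the supplied comparison differs.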
LINQ relies on "supply me some logic" heavily - you might be looking for all ints less than 3 in your list:
myList.Where(item => item < 3)
LINQ will loop over the list calling your supplied logic (lambda) here on every item and delivering only those where your lambda is true. item is one of the list entries, in this case an int. if the list held Person objects, item would be a Person. To that end I'd recommend calling it p or person to make it clearer what it is. This will become important when you start using more complex lambdas like
string[] wantedNames = new [] { "John", "Luke" };
people.Where(p => wantedNames.Any(n => p.Name == n));
This declares multiple names you're looking for then searches the list of people for all those whose name is in the list of wanted names. For every person p, LINQ asks "wantedNames, do Any of your elements return true for this supplied logic" and then it loops over the wantedNames calling the logic you supplied (n, the item from wantedNames is equal to the person name?).
Here you've used two different methods that take custom logic that LINQ calls repeatedly - Where is a method that takes a delegate, and so is Any; the Any operates on the list of names, the Where operates on the list of people so keeping it clear which is a Person p and which is a name string n using variable names is better than using bland names like a and b.
For more info on your Sort method, see the docs for List<T>.Sort(Comparison<T>):
https://learn.microsoft.com/en-us/dotnet/api/system.collections.generic.list-1.sort?view=net-5.0#System_Collections_Generic_List_1_Sort_System_Comparison__0__
Consider the following simple code with LINQ OrderBy and ThenBy:
static void Main()
{
var arr1 = new[] { "Alpha", "Bravo", "Charlie", };
var coStr = Comparer<string>.Create((x, y) =>
{
Console.WriteLine($"Strings: {x} versus {y}");
return string.CompareOrdinal(x, y);
});
arr1.OrderBy(x => x, coStr).ToList();
Console.WriteLine("--");
var arr2 = new[]
{
new { P = "Alpha", Q = 7, },
new { P = "Bravo", Q = 9, },
new { P = "Charlie", Q = 13, },
};
var coInt = Comparer<int>.Create((x, y) =>
{
Console.WriteLine($"Ints: {x} versus {y}");
return x.CompareTo(y);
});
arr2.OrderBy(x => x.P, coStr).ThenBy(x => x.Q, coInt).ToList();
}
This simply uses some comparers that write out to the console what they compare.
On my hardware and version of the Framework (.NET 4.6.2), this is the output:
Strings: Bravo versus Alpha
Strings: Bravo versus Bravo
Strings: Bravo versus Charlie
Strings: Bravo versus Bravo
--
Strings: Bravo versus Alpha
Strings: Bravo versus Bravo
Ints: 9 versus 9
Strings: Bravo versus Charlie
Strings: Bravo versus Bravo
Ints: 9 versus 9
My question is: Why would they compare an item from the query to itself?
In the first case, before the -- separator, they do four comparisons. Two of them compare an entry to itself ("Strings: Bravo versus Bravo"). Why?
In the second case, there should not ever be a need for resorting to comparing the Q properties (integers); for there are no duplicates (wrt. ordinal comparison) in the P values, so no tie-breaking from ThenBy should be needed ever. Still we see "Ints: 9 versus 9" twice. Why use the ThenBy comparer with identical arguments?
Note: Any comparer has to return 0 upon comparing something to itself. So unless the algorithm just wants to check if we implemented a comparer correctly (which it will never be able to do fully anyway), what is going on?
Be aware: There are no duplicates in the elements yielded by the queries in my examples.
I saw the same issue with another example with more entries yielded from the query. Above I just give a small example. This happens with an even number of elements yielded, as well.
In the reference source of the QuickSort method used by OrderBy you can see these two lines:
while (i < map.Length && CompareKeys(x, map[i]) > 0) i++;
while (j >= 0 && CompareKeys(x, map[j]) < 0) j--;
These while loops run until they find an element that is no longer "greater" (resp. "less") than the one x points to. So they will break when the identical element is compared.
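The effect can be reproduced with a simplified quicksort built around those two loops. This is a sketch for illustration, not the framework's actual code: the names are invented, the pivot is assumed to be the middle element of the range, and the key comparison is a plain ordinal string compare that logs its arguments.

```csharp
using System;
using System.Collections.Generic;

public static class QuickSortTrace
{
    // Sorts an index map over 'keys' and returns a log of every compared pair.
    public static List<string> Run(string[] keys)
    {
        var log = new List<string>();
        var map = new int[keys.Length];
        for (int k = 0; k < map.Length; k++) map[k] = k;
        if (map.Length > 1) Sort(keys, map, 0, map.Length - 1, log);
        return log;
    }

    private static int CompareKeys(string[] keys, int a, int b, List<string> log)
    {
        log.Add(keys[a] + " versus " + keys[b]);
        return string.CompareOrdinal(keys[a], keys[b]);
    }

    private static void Sort(string[] keys, int[] map, int left, int right, List<string> log)
    {
        int i = left, j = right;
        int x = map[i + ((j - i) >> 1)]; // pivot: assumed to be the middle of the range
        do
        {
            // Each scan stops at the first element that is not strictly on the
            // wrong side of the pivot, and that can be the pivot itself.
            while (i < map.Length && CompareKeys(keys, x, map[i], log) > 0) i++;
            while (j >= 0 && CompareKeys(keys, x, map[j], log) < 0) j--;
            if (i > j) break;
            if (i < j) { int t = map[i]; map[i] = map[j]; map[j] = t; }
            i++;
            j--;
        } while (i <= j);
        if (left < j) Sort(keys, map, left, j, log);
        if (i < right) Sort(keys, map, i, right, log);
    }
}
```

For the three strings from the question, both scans end on the pivot "Bravo", which accounts for the two self-comparisons among the four logged lines.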
I can't prove it mathematically, but I guess that avoiding the comparison of identical elements would make the algorithm more complicated and introduce overhead that would impact performance more than this single comparison does.
(Note that your comparer should be implemented cleverly enough to quickly return 0 for identical elements.)
In the first case, before the -- separator, they do four comparisons. Two of them compare an entry to itself ("Strings: Bravo versus Bravo"). Why?
Efficiency. Sure it would be possible to check the object isn't itself first, but that means doing an extra operation on every comparison to avoid a case that comes up relatively rarely and which in most cases is pretty cheap (most comparers are). That would be a nett loss.
(Incidentally, I did experiment with just such a change to the algorithm, and when measured it really was an efficiency loss with common comparisons such as the default int comparer).
In the second case, there should not ever be a need for resorting to comparing the Q properties (integers); for there are no duplicates (wrt. ordinal comparison) in the P values, so no tie-breaking from ThenBy should be needed ever. Still we see "Ints: 9 versus 9" twice. Why use the ThenBy comparer with identical arguments?
Who is to say there are no duplicates? The internal comparison was given two things (not necessarily reference types, so short-circuiting on reference identity isn't always an option) and has two rules to follow. The first rule needed a tie-break so the tie-break was done.
The code is designed to work for cases where there can be equivalent values for the first comparison.
If it's known that there won't be equivalent values for the OrderBy then it's for the person who knows that to not use an unnecessary ThenBy, as they are the person who can potentially know that.
Ok, let's see the possibilities here:
T is a value type
In order to check if it's comparing an item against itself, it first needs to check if both items are the same one. How would you do that?
You could call Equals first and then CompareTo if the items are not the same. Do you really want to do that? The cost of Equals is going to be roughly the same as comparing so you'd actually be doubling the cost of the ordering for exactly what? OrderBy simply compares all items, period.
T is a reference type
C# doesn't let you overload on generic constraints alone, so you'd need to check at runtime whether T is a reference type and then call a specific implementation that would change the behavior described above. Do you want to incur that cost in every case? Of course not.
If the comparison is expensive, then implement a reference-equality optimization in your comparison logic to avoid incurring needless costs when comparing an item to itself, but that choice must be yours. I'm pretty sure string.CompareTo does precisely that.
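One way to opt in to that optimization is a wrapper comparer. This is only a sketch with invented names; the InnerCalls counter exists purely to make the effect observable.

```csharp
using System;
using System.Collections.Generic;

// Wraps a possibly expensive comparer and skips it when both arguments
// are the same object instance.
public sealed class ReferenceShortCircuitComparer<T> : IComparer<T> where T : class
{
    private readonly IComparer<T> inner;

    // Number of times the wrapped comparer was actually invoked.
    public int InnerCalls { get; private set; }

    public ReferenceShortCircuitComparer(IComparer<T> inner)
    {
        this.inner = inner ?? throw new ArgumentNullException(nameof(inner));
    }

    public int Compare(T x, T y)
    {
        if (ReferenceEquals(x, y)) return 0; // same instance, or both null
        InnerCalls++;
        return inner.Compare(x, y);
    }
}
```

The class constraint is what makes the reference check meaningful; for value types there is no identity to test, which is the point made above.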
I hope this makes my answer clearer; sorry for the previous short answer, my reasoning wasn't that obvious.
In simple terms in case 1
var coStr = Comparer<string>.Create((x, y) =>
{
Console.WriteLine($"Strings: {x} versus {y}");
return string.CompareOrdinal(x, y);
});
Here we are just comparing the elements; nothing skips the logging when the two arguments are equal, so Console.WriteLine runs regardless of the result of the comparison. If you change your code as below
var coStr = Comparer<string>.Create((x, y) =>
{
if (x != y)
Console.WriteLine($"Strings: {x} versus {y}");
return string.CompareOrdinal(x, y);
});
Your output will be like
Strings: Bravo versus Alpha
Strings: Bravo versus Charlie
The same goes for the second statement: both comparers are consulted, so when the string comparison returns 0 the sort goes on to the int comparison and logs that as well. Hope it clears your doubt :)
I was at a career fair and was asked the question "what does this query do, and in which language is it?"
I said that it is .NET LINQ, but was unable to predict what it does.
Can anyone help me?
I wrote it in .NET and tried it:
var youShould = from c
in "3%.$#9/52#2%35-%#4/#./3,!#+%23 !2#526%N#/-"
select (char)(c ^ 3 << 5);
Label1.Text = youShould.ToString();
And got this output :
System.Linq.Enumerable+WhereSelectEnumerableIterator`2[System.Char,System.Char]
First of all, don't feel bad that you didn't get the answer. I know exactly what's going on here and I'd have probably just laughed and walked away if someone asked me what this did.
There's a number of things going on here, but start with the output:
var youShould = from c in "3%.$#9/52#2%35-%#4/#./3,!#+%23 !2#526%N#/-"
select (char)(c ^ 3 << 5);
Label1.Text = youShould.ToString();
>>> System.Linq.Enumerable+WhereSelectEnumerableIterator`2[System.Char,System.Char]
When you run a LINQ query, or use any of the equivalent methods like Select() that return sets, what you get back is a special, internal type of object called an "iterator", specifically, an object that implements the IEnumerable interface. .NET uses these objects all over the place; for example, the foreach loop's purpose is to iterate over iterators.
One of the most important things to know about these kind of objects is that just creating one doesn't actually "do" anything. The iterator doesn't actually contain a set of things; rather, it contains the instructions needed to produce a set of things. If you try to do something like, e.g. ToString on it, the result you get won't be very useful.
However, it does tell us one thing: it tells us that this particular iterator takes a source list of type char and returns a new set, also of type char. (I know that because I know that's what the two generic parameters of a "select iterator" do.) To get the actual results out of this iterator you just need to loop over it somehow, e.g.:
foreach (var c in youShould)
{
myLabel.Text += c;
}
or, slightly easier,
myLabel.Text = new string(youShould.ToArray());
To actually figure out what it does, you have to also recognize the second fact: LINQ treats a string as a "set of characters". It is going to process each character in that string, one at a time, and perform the bit-wise operations on the value.
The long-form equivalent of that query is something like this:
var input= "3%.$#9/52#2%35-%#4/#./3,!#+%23 !2#526%N#/-";
var output = string.Empty;
foreach (var c in input)
{
var i = (int)c;
var i2 = i ^ (3 << 5);
var c2= (char)i2;
output += c2;
}
If you did the math by hand you'd get the correct output message. To save you the brain-numbing exercise, I'll just tell you: it toggles bits 5 and 6 of the ASCII value, changing each character to one further up the ASCII table. The resulting string is:
SEND YOUR RESUME TO [an email address]
Demonstrative .NET Fiddle: https://dotnetfiddle.net/x7UvYA
For each character in the string, project the character by xor-ing it with (3 left shifted by 5), then cast the numeric value back to a char.
You could generate your own code strings by running the query again over an uncoded string, because if you XOR a number twice by the same value, you'll be left with the same number you started with. (e.g. x ^ y ^ y = x)
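A small sketch of that round trip, using the same expression as the query (XorCodec and Transform are names chosen for this example):

```csharp
using System;
using System.Linq;

public static class XorCodec
{
    // '<<' binds tighter than '^', so c ^ 3 << 5 means c ^ (3 << 5), i.e. c ^ 96.
    // Because XOR with the same value undoes itself, this one method both
    // encodes and decodes.
    public static string Transform(string text)
    {
        return new string(text.Select(c => (char)(c ^ 3 << 5)).ToArray());
    }
}
```

For instance, Transform("SEND") gives "3%.$", the first four characters of the string in the question, and applying Transform to that result gives "SEND" back.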
I'll leave it as an exercise to the reader to figure out what the following is:
4()3#)3#!#$5-"#4%34
I suppose it tests:
Linq to objects
Understanding of IEnumerable<T> interface and how it relates to strings
Casting
Bitwise operations
Bitwise operator precedence
Personally, I think this is a useless test that doesn't really reflect real world problems.
I have a linq query which is not ordered the way I would like.
The Query:
return (from obj in context.table_orders
orderby obj.order_no
select obj.order_no.ToString() + '-' + obj.order_description).ToList<string>();
What happens is that my records are ordered alphabetically. Is there a LINQ keyword I can use so my records are ordered correctly (so order 30 comes before order 100)?
I want the result to be a list of string since this is used to populate a ComboBox.
Also some of the 'order_no' in the DB are like '2.10' and '9.1.1'.
What happens is that my records are ordered alphabetically. Is there a LINQ keyword I can use so my records are
ordered correctly (so order #30 comes before order #100)?
If I got a cent every time someone asks this, I would be rich.
Yes, there is - simple answer: ORDER A NUMBER NOT A STRING.
so order #30 comes before order #100)
But #30 comes AFTER #100 for the simple reason that they ARE sorted alphabetically because THEY ARE STRINGS.
Parse the string, convert the number to - well - a number, and order by it.
WHOEVER had the idea that order_no should be a string WITHOUT A FIXED LENGTH (like 00030) should - well - ;) get a basic education on database modelling. I really like things like invoice numbers etc. to be strings (they are NOT numbers) but keeping them in (a) a definable pattern and (b) checksummed (so that data entry errors are easily caught) should be basics ;)
This is the kind of issue you get with junior people defining databases and data models and not thinking about the consequences.
You are in for some pain - parse the string, order by parsing result.
If obj.order_no is a string, then convert it to a number for sorting
orderby Int32.Parse(obj.order_no)
or
orderby Decimal.Parse(obj.order_no)
Of course, this works only if the string represents a valid number.
If order_no is not a valid number (e.g. "17.7-8A") then write a function that formats it to contain right aligned numbers, like "00017.007-0008A" and sort using this function
orderby FormatOrderNo(obj.order_no)
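FormatOrderNo is not spelled out here, so the following is one possible sketch rather than the intended implementation. It assumes that zero-padding every run of digits to a single fixed width (5 by default) is enough, which pads more uniformly than the "00017.007-0008A" illustration above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OrderKeys
{
    // Left-pads every run of digits with zeros to a fixed width so that an
    // ordinal string comparison orders the numeric parts numerically.
    // Runs longer than 'width' are left unchanged.
    public static string FormatOrderNo(string orderNo, int width = 5)
    {
        return Regex.Replace(orderNo, "[0-9]+", m => m.Value.PadLeft(width, '0'));
    }
}
```

With this, "30" becomes "00030" and "100" becomes "00100", so "30" sorts first under an ordinal comparison such as StringComparer.Ordinal.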
UPDATE
Since you are working with EF you cannot call this function in the EF part of your query. Convert the EF-result to an IEnumerable<T> and perform the sorting using LINQ-To-Objects
return (from obj in context.table_orders select ...)
.AsEnumerable()
.OrderBy(obj => FormatOrderNo(obj.order_no))
.ToList();
Based on what you said - that the numbers are not really numbers but rather custom sequence identifiers (i.e. you don't even know what level of depth you get) I would suggest implementing a custom comparer.
If you do this you can define exactly what you want, and I believe that is something along these lines:
split the string on .
compare the sequences up to and including the array.length of the shorter sequence
if there was a shorter sequence and by now there is a tie, pick the shorter before the longer (i.e. 2.1 before 2.1.1)
if both sequences had the same length, by the end of the comparison you should know which one is 'bigger'
solved.
If you need inspiration for an IComparer implementation, see the example below:
http://zootfroot.blogspot.co.uk/2009/09/natural-sort-compare-with-linq-orderby.html
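The steps listed above can be sketched as a comparer like the following. It assumes every segment between dots is a plain non-negative integer (int.Parse throws otherwise), and the class name is invented for the example.

```csharp
using System;
using System.Collections.Generic;

public sealed class DottedNumberComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        string[] a = x.Split('.');
        string[] b = y.Split('.');

        // Compare segment by segment up to the length of the shorter sequence.
        int shared = Math.Min(a.Length, b.Length);
        for (int i = 0; i < shared; i++)
        {
            int result = int.Parse(a[i]).CompareTo(int.Parse(b[i]));
            if (result != 0) return result;
        }

        // Tie so far: the shorter sequence comes first (2.1 before 2.1.1).
        return a.Length.CompareTo(b.Length);
    }
}
```

It can be passed to the LINQ to Objects overload OrderBy(keySelector, comparer) once the data is in memory; it cannot be translated into SQL.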
I would either change the datatype if possible or add it as another field if applicable.
If that is a no-go you can look at the solutions mentioned by others in this question, but beware: the .ToList() option is nice for small tables if you are pulling everything from them, but getting used to it will eventually give you a world of pain. You don't want to fetch everything in the long run; use either a Where or a top criterion.
The other solutions are nice but too complex for my taste to accomplish the task.
You could go with shooting sql directly through LinqToSql. http://msdn.microsoft.com/en-us/library/bb399403.aspx
In the SQL you are free to convert and sort however you like. Some think this is a great idea and some will tell you it is bad. You lose strong typing and gain performance. You have to know WHY you make this kind of decision; that is the most important thing, imho.
Since nobody came up with a custom orderby function translatable into SQL, I went for the IComparer function like so:
public class OrderComparer : IComparer<string>
{
#region IComparer<string> Members
public int Compare(string x, string y)
{
return GetOrderableValue(x.Split('-').First()).CompareTo(GetOrderableValue(y.Split('-').First()));
}
#endregion
private int GetOrderableValue(string value)
{
string[] splitValue = value.Split('.');
int orderableValue = 0;
if (splitValue.Length.Equals(1))
orderableValue = int.Parse(splitValue[0]) * 1000;
else if (splitValue.Length.Equals(2))
orderableValue = int.Parse(splitValue[0]) * 1000 + int.Parse(splitValue[1]) * 100;
else if (splitValue.Length.Equals(3))
orderableValue = int.Parse(splitValue[0]) * 1000 + int.Parse(splitValue[1]) * 100 + int.Parse(splitValue[2]) * 10;
else
orderableValue = int.Parse(splitValue[0]) * 1000 + int.Parse(splitValue[1]) * 100 + int.Parse(splitValue[2]) * 10 + int.Parse(splitValue[3]);
return orderableValue;
}
}
The values have a maximum of 4 levels.
Does anyone have a recommendation?
I can't figure out how to remove duplicate entries from an array of structs.
I have this struct:
public struct stAppInfo
{
public string sTitle;
public string sRelativePath;
public string sCmdLine;
public bool bFindInstalled;
public string sFindTitle;
public string sFindVersion;
public bool bChecked;
}
I have changed the stAppInfo struct to class here thanks to Jon Skeet
The code is like this: (short version)
stAppInfo[] appInfo = new stAppInfo[listView1.Items.Count];
int i = 0;
foreach (ListViewItem item in listView1.Items)
{
appInfo[i].sTitle = item.Text;
appInfo[i].sRelativePath = item.SubItems[1].Text;
appInfo[i].sCmdLine = item.SubItems[2].Text;
appInfo[i].bFindInstalled = (item.SubItems[3].Text.Equals("Sí")) ? true : false;
appInfo[i].sFindTitle = item.SubItems[4].Text;
appInfo[i].sFindVersion = item.SubItems[5].Text;
appInfo[i].bChecked = (item.SubItems[6].Text.Equals("Sí")) ? true : false;
i++;
}
I need the appInfo array to be unique in the sTitle and sRelativePath members; the other members can be duplicates.
EDIT:
Thanks to all for the answers, but this application is "portable": I just need the .exe file and I don't want to add other files such as referenced *.dll files, so please no external references. This app is intended to be used from a pendrive.
All data comes from a *.ini file. What I do is: (pseudocode)
ReadFile()
FillDataFromFileInAppInfoArray()
DeleteDuplicates()
FillListViewControl()
When I want to save that data into a file I have these options:
Using ListView data
Using the appInfo array (is this faster?)
Any other?
EDIT2:
Big thanks to: Jon Skeet, Michael Hays thanks for your time guys!!
Firstly, please don't use mutable structs. They're a bad idea in all kinds of ways.
Secondly, please don't use public fields. Fields should be an implementation detail - use properties.
Thirdly, it's not at all clear to me that this should be a struct. It looks rather large, and not particularly "a single value".
Fourthly, please follow the .NET naming conventions so your code fits in with all the rest of the code written in .NET.
Fifthly, you can't remove items from an array, as arrays are created with a fixed size... but you can create a new array with only unique elements.
LINQ to Objects will let you do that already using GroupBy as shown by Albin, but a slightly neater (in my view) approach is to use DistinctBy from MoreLINQ:
var unique = appInfo.DistinctBy(x => new { x.sTitle, x.sRelativePath })
.ToArray();
This is generally more efficient than GroupBy, and also more elegant in my view.
Personally I generally prefer using List<T> over arrays, but the above will create an array for you.
Note that with this code there can still be two items with the same title, and there can still be two items with the same relative path - there just can't be two items with the same relative path and title. If there are duplicate items, DistinctBy will always yield the first such item from the input sequence.
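Given the question's "no external references" constraint, roughly the same result can be had with GroupBy from LINQ to Objects alone. This is a sketch of that alternative, not MoreLINQ's implementation; AppInfo is a trimmed, hypothetical stand-in for stAppInfo.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for stAppInfo with only the members needed here.
public sealed class AppInfo
{
    public string Title { get; set; }
    public string RelativePath { get; set; }
    public string CmdLine { get; set; }
}

public static class AppInfoFilter
{
    // Keeps the first item seen for each (Title, RelativePath) combination.
    // Anonymous types compare by value, so they work as a composite key.
    public static AppInfo[] DistinctByTitleAndPath(IEnumerable<AppInfo> source)
    {
        return source
            .GroupBy(x => new { x.Title, x.RelativePath })
            .Select(g => g.First())
            .ToArray();
    }
}
```

GroupBy buffers every element of each group before First() picks one, which is the extra work that makes a dedicated DistinctBy generally more efficient.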
EDIT: Just to satisfy Michael, you don't actually need to create an array to start with, or create an array afterwards if you don't need it:
var query = listView1.Items
.Cast<ListViewItem>()
.Select(item => new stAppInfo
{
sTitle = item.Text,
sRelativePath = item.SubItems[1].Text,
sCmdLine = item.SubItems[2].Text,
bFindInstalled = item.SubItems[3].Text == "Sí",
sFindTitle = item.SubItems[4].Text,
sFindVersion = item.SubItems[5].Text,
bChecked = item.SubItems[6].Text == "Sí"
})
.DistinctBy(x => new { x.sTitle, x.sRelativePath });
That will give you an IEnumerable<stAppInfo> which is lazily streamed. Note that if you iterate over it more than once, however, it will iterate over listView1.Items the same number of times, performing the same uniqueness comparisons each time.
I prefer this approach over Michael's as it makes the "distinct by" columns very clear in semantic meaning, and removes the repetition of the code used to extract those columns from a ListViewItem. Yes, it involves building more objects, but I prefer clarity over efficiency until benchmarking has proved that the more efficient code is actually required.
What you need is a Set. It ensures that the items entered into it are unique (based on some qualifier which you will set up). Here is how it is done:
First, change your struct to a class. There is really no getting around that.
Second, provide an implementation of IEqualityComparer<stAppInfo>. It may be a hassle, but it is the thing that makes your set work (which we'll see in a moment):
public class AppInfoComparer : IEqualityComparer<stAppInfo>
{
public bool Equals(stAppInfo x, stAppInfo y) {
if (ReferenceEquals(x, y)) return true;
if (x == null || y == null) return false;
return Equals(x.sTitle, y.sTitle) && Equals(x.sRelativePath,
y.sRelativePath);
}
// this part is a pain, but this one is already written
// specifically for your question.
public int GetHashCode(stAppInfo obj) {
unchecked {
return ((obj.sTitle != null
? obj.sTitle.GetHashCode() : 0) * 397)
^ (obj.sRelativePath != null
? obj.sRelativePath.GetHashCode() : 0);
}
}
}
Then, when it is time to make your set, do this:
var appInfoSet = new HashSet<stAppInfo>(new AppInfoComparer());
foreach (ListViewItem item in listView1.Items)
{
var newItem = new stAppInfo {
sTitle = item.Text,
sRelativePath = item.SubItems[1].Text,
sCmdLine = item.SubItems[2].Text,
bFindInstalled = (item.SubItems[3].Text.Equals("Sí")) ? true : false,
sFindTitle = item.SubItems[4].Text,
sFindVersion = item.SubItems[5].Text,
bChecked = (item.SubItems[6].Text.Equals("Sí")) ? true : false};
appInfoSet.Add(newItem);
}
appInfoSet now contains a collection of stAppInfo objects with unique Title/Path combinations, as per your requirement. If you must have an array, do this:
stAppInfo[] appInfo = appInfoSet.ToArray();
Note: I chose this implementation because it looks like the way you are already doing things. It has an easy-to-read foreach loop (and no longer needs the counter variable). It does not involve LINQ queries (which can be troublesome if you aren't familiar with them). It requires no external libraries outside of what the .NET Framework provides to you. And finally, it provides an array just like you've asked. As for reading the data in from an INI file, hopefully you see that the only thing that will change is your foreach loop.
Update
Hash codes can be a pain. You might have been wondering why you need to compute them at all. After all, couldn't you just compare the values of the title and relative path on each insert? Sure you could, and that's exactly how another set, called SortedSet, works. SortedSet makes you implement IComparer<stAppInfo> in the same way that I implemented IEqualityComparer<stAppInfo> above.
So, in this case, AppInfoComparer would look like this:
public class AppInfoComparer : IComparer<stAppInfo>
{
// return a negative number if x < y, a positive number if x > y, or 0 if they are equal
public int Compare(stAppInfo x, stAppInfo y)
{
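// CompareOrdinal tolerates null strings and ignores culture-specific rules
var comparison = string.CompareOrdinal(x.sTitle, y.sTitle);
if (comparison != 0) return comparison;
return string.CompareOrdinal(x.sRelativePath, y.sRelativePath);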
}
}
And then the only other change you need to make is to use SortedSet instead of HashSet:
var appInfoSet = new SortedSet<stAppInfo>(new AppInfoComparer());
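To see the SortedSet behaviour in isolation, here is a minimal sketch. AppEntry is a hypothetical stand-in for stAppInfo that carries only the two key fields, and the use of string.CompareOrdinal is my assumption rather than something the question requires. It shows that a duplicate title/path pair is rejected and that the set enumerates in comparer order.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for stAppInfo, reduced to the two key fields.
public class AppEntry
{
    public string sTitle;
    public string sRelativePath;
}

// Orders entries by title, then by relative path (ordinal comparison is an assumption).
public class AppEntryOrder : IComparer<AppEntry>
{
    public int Compare(AppEntry x, AppEntry y)
    {
        int byTitle = string.CompareOrdinal(x.sTitle, y.sTitle);
        if (byTitle != 0) return byTitle;
        return string.CompareOrdinal(x.sRelativePath, y.sRelativePath);
    }
}

public static class SortedSetDemo
{
    // Returns the titles in the order the set enumerates them.
    public static List<string> Run()
    {
        var set = new SortedSet<AppEntry>(new AppEntryOrder());
        set.Add(new AppEntry { sTitle = "Notepad", sRelativePath = "tools" });
        set.Add(new AppEntry { sTitle = "Calc", sRelativePath = "tools" });
        // Same title and path as the first entry: Compare returns 0, so Add returns false.
        set.Add(new AppEntry { sTitle = "Notepad", sRelativePath = "tools" });

        var titles = new List<string>();
        foreach (var entry in set) titles.Add(entry.sTitle);
        return titles;
    }
}
```

Calling SortedSetDemo.Run() should give two titles, "Calc" followed by "Notepad", because the third Add compares equal to the first and is dropped.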
It's so much easier, in fact, that you are probably wondering what the catch is. The reason that most people choose HashSet over SortedSet is performance. But you should balance that against how much you actually care, since you'll be the one maintaining that code. I personally use a tool called ReSharper, which is available for Visual Studio, and it generates these hash functions for me, because I think writing them is a pain, too.
(I'll talk about the complexity of the two approaches, but if you already know it, or are not interested, feel free to skip it.)
SortedSet has a complexity of O(log n) per insertion. That is to say, each time you add a new item, it effectively goes to the halfway point of your set and compares. If it doesn't find your entry, it goes to the halfway point between its last guess and the group to the left or right of that guess, quickly whittling down the places where your element could hide. For a million entries, this takes about 20 attempts. Not bad at all.
But if you've chosen a good hashing function, then HashSet can do the same job, on average, in one comparison, which is O(1). And before you think 20 is not really that big a deal compared to 1 (after all, computers are pretty quick), remember that you had to insert those million items: while HashSet took about a million attempts to build that set, SortedSet took on the order of 20 million.
But there is a price: HashSet breaks down (very badly) if you choose a poor hashing function. If the numbers for lots of items are not unique, they will collide in the HashSet, which will then have to try again and again. If lots of items collide with the exact same number, they will retrace each other's steps, and you will be waiting a long time. The millionth entry alone will take about a million attempts, and building the whole set will take on the order of a million times a million: HashSet has devolved into O(n^2).
What's important with big-O notation (which is what O(1), O(log n), and O(n^2) are, in fact) is how quickly the number in parentheses grows as you increase n. Slow growth or no growth is best. Quick growth is sometimes unavoidable. For a dozen or even a hundred items, the difference may be negligible, but if you can get in the habit of writing efficient functions as easily as the alternatives, it's worth conditioning yourself to do so, since problems are cheapest to correct closest to the point where you created them.
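The cost of a poor hashing function can be made visible with a small experiment. The sketch below is my own illustration, not part of the original solution: CountingComparer and HashQualityDemo are hypothetical names, and the deliberately bad hash simply returns the same constant for every item. It counts how many times Equals is called while a HashSet is being filled.

```csharp
using System;
using System.Collections.Generic;

// Counts Equals calls; optionally returns the same hash code for every string.
public class CountingComparer : IEqualityComparer<string>
{
    private readonly bool constantHash;
    public long EqualsCalls;

    public CountingComparer(bool constantHash)
    {
        this.constantHash = constantHash;
    }

    public bool Equals(string x, string y)
    {
        EqualsCalls++;
        return string.Equals(x, y);
    }

    public int GetHashCode(string s)
    {
        // A constant hash forces every item into the same bucket.
        return constantHash ? 42 : s.GetHashCode();
    }
}

public static class HashQualityDemo
{
    // Inserts n distinct strings and reports how many Equals calls were needed.
    public static long CountComparisons(bool constantHash, int n)
    {
        var comparer = new CountingComparer(constantHash);
        var set = new HashSet<string>(comparer);
        for (int i = 0; i < n; i++) set.Add("item" + i);
        return comparer.EqualsCalls;
    }
}
```

With the constant hash, every new item has to be compared against everything already in the set, so the count should grow roughly with the square of n. With the built-in string hash, the count should stay close to zero. Trying CountComparisons(true, 1000) against CountComparisons(false, 1000) makes the gap obvious.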
Use LINQ to Objects: group by the fields that should be unique, then select the first item in each group.
var noDupes = appInfo.GroupBy(
x => new { x.sTitle, x.sRelativePath })
.Select(g => g.First()).ToArray();
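The reason this works is that anonymous types compare by value, so two items with the same title and path land in the same group. Here is a self-contained sketch of the same query, where AppRow is a hypothetical stand-in for stAppInfo with just enough fields to show which duplicate survives:

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for stAppInfo.
public class AppRow
{
    public string sTitle;
    public string sRelativePath;
    public string sCmdLine;
}

public static class GroupByDemo
{
    // Keeps the first row seen for each (sTitle, sRelativePath) pair.
    public static AppRow[] RemoveDuplicates(AppRow[] rows)
    {
        return rows
            .GroupBy(x => new { x.sTitle, x.sRelativePath }) // anonymous types use value equality
            .Select(g => g.First())
            .ToArray();
    }
}
```

Since LINQ to Objects yields groups in the order their keys first appear, the result keeps the original ordering of the surviving items.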
Warning: an array of structs (a value type) combined with sorting or any kind of search can lead to a lot of boxing and unboxing operations.
I would suggest sticking with the recommendations of Jon and Henk: make it a class and use a generic List<T>.
Use LINQ GroupBy or DistinctBy. For me, the built-in GroupBy is much simpler to use, but it is also interesting to take a look at another popular library; perhaps it will give you some insights.
BTW, also take a look at the LambdaComparer. It will make your life easier each time you need this kind of in-place sorting, searching, etc.
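To give an idea of what such a helper might look like, here is a minimal sketch of a key-based equality comparer. This is not the LambdaComparer mentioned above; KeyEqualityComparer is a hypothetical name, and the design (a key selector plus the default comparer for the key type) is my assumption about one reasonable way to build it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: treats two items as equal when their selected keys are equal.
public class KeyEqualityComparer<T, TKey> : IEqualityComparer<T>
{
    private readonly Func<T, TKey> keySelector;
    private readonly IEqualityComparer<TKey> keyComparer = EqualityComparer<TKey>.Default;

    public KeyEqualityComparer(Func<T, TKey> keySelector)
    {
        if (keySelector == null) throw new ArgumentNullException("keySelector");
        this.keySelector = keySelector;
    }

    public bool Equals(T x, T y)
    {
        return keyComparer.Equals(keySelector(x), keySelector(y));
    }

    public int GetHashCode(T obj)
    {
        TKey key = keySelector(obj);
        // A null key hashes to 0 so it never reaches the key comparer.
        return key == null ? 0 : keyComparer.GetHashCode(key);
    }
}
```

For the question at hand, something like new KeyEqualityComparer<stAppInfo, Tuple<string, string>>(x => Tuple.Create(x.sTitle, x.sRelativePath)) could be passed to a HashSet<stAppInfo>, which would avoid writing a dedicated comparer class for each new type.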