How to order lists without respecting certain characters? - c#

I currently have a list of strings that needs to be sorted while ignoring the following characters: '.', ',', '-', '\''.
Example
var cities = new List<string>()
{
"Aigle ",
"Bulle",
"La Chaux-de-Fonds",
"L'Abbaye",
"Malleray",
"Sierre",
"S. City",
"St-Aubin",
"St-Cergue",
"St-Gingolph",
"St-Légier-La Chiesaz",
"St-Maurice",
"St-Sulpice",
"St-Sulpice",
"Staad"
};
Sorting with the default ordering:
var ordered = cities
.OrderBy(x => x)
.ToList();
Output
"Aigle"
"Bulle"
"La Chaux-de-Fonds"
"L'Abbaye"
"Malleray"
"S. City"
"Sierre"
"Staad"
"St-Aubin"
"St-Cergue"
"St-Gingolph"
"St-Légier-La Chiesaz"
"St-Maurice"
"St-Sulpice"
"St-Sulpice"
And the output I want has to be like this.
"Aigle "
"Bulle"
"L'Abbaye"
"La Chaux-de-Fonds"
"Malleray"
"S. City"
"Sierre"
"St-Aubin"
"St-Cergue"
"St-Gingolph"
"St-Légier-La Chiesaz"
"St-Maurice"
"St-Sulpice"
"St-Sulpice"
"Staad"
I got the output I want by doing this.
var ordered = cities
.OrderBy(x => x.Replace(".", " ").Replace("-", " ").Replace("'", " "))
.ToList();
I honestly don't know if it's okay what I'm doing.
Is there any other way to get the desired result?

Perhaps a transformation can help you
var ordered = cities
.Select(city => new { Name = city, NameForOrdering = string.Join(string.Empty, city.Where(c => Char.IsLetterOrDigit(c)).ToArray()) })
.OrderBy(city => city.NameForOrdering)
.Select(city => city.Name)
.ToList();
This can serve as a quick and dirty way to get past a hurdle or to test things out, but the cleaner solution is to use the OrderBy overload that accepts a custom IComparer<string>.
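A minimal sketch of what such a comparer could look like. The class name IgnoreCharsComparer and its constructor parameters are made up for illustration; it replaces each ignored character with a space, as the question's own code does, and then defers to an inner comparer (current culture by default).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer (not from the answer above): compares strings after
// replacing each ignored character with a space, then defers to an inner comparer.
public sealed class IgnoreCharsComparer : IComparer<string>
{
    private readonly HashSet<char> _ignored;
    private readonly IComparer<string> _inner;

    public IgnoreCharsComparer(IEnumerable<char> ignored, IComparer<string> inner = null)
    {
        _ignored = new HashSet<char>(ignored);
        // Assumption: culture-sensitive comparison is the wanted default, as in OrderBy(x => x).
        _inner = inner ?? StringComparer.CurrentCulture;
    }

    // Build the sort key: ignored characters become spaces, everything else is kept.
    private string Normalize(string s) =>
        s == null ? null : new string(s.Select(c => _ignored.Contains(c) ? ' ' : c).ToArray());

    public int Compare(string x, string y) => _inner.Compare(Normalize(x), Normalize(y));
}
```

It could then be used as cities.OrderBy(x => x, new IgnoreCharsComparer(new[] { '.', ',', '-', '\'' })).ToList(). Note that the key is rebuilt on every comparison, so for large lists the key-selector approaches are likely cheaper.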

Well, we can order by letters only (we ignore, i.e. remove all non letter chars):
var ordered = cities
.OrderBy(city => string.Concat(city.Where(c => char.IsLetter(c))),
StringComparer.CurrentCultureIgnoreCase)
.ToList();
// Let's have a look
Console.Write(string.Join(Environment.NewLine, ordered));
We'll get the following order
Aigle
Bulle
L'Abbaye
La Chaux-de-Fonds
Malleray
S. City
Sierre
Staad
St-Aubin
St-Cergue
St-Gingolph
St-Légier-La Chiesaz
St-Maurice
St-Sulpice
St-Sulpice
If you want to treat all non letters as spaces ' ' (your current code):
var ordered = cities
.OrderBy(city => string.Concat(city.Select(c => char.IsLetter(c) ? c : ' ')),
StringComparer.CurrentCultureIgnoreCase)
.ToList();
And the order will be
Aigle
Bulle
L'Abbaye
La Chaux-de-Fonds
Malleray
S. City
Sierre
St-Aubin
St-Cergue
St-Gingolph
St-Légier-La Chiesaz
St-Maurice
St-Sulpice
St-Sulpice
Staad
The only difference between the two orders is the position of Staad.
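To see why only Staad moves, it can help to look at the sort keys the two queries produce. A minimal sketch; the SortKeys class and its method names are illustrative and not part of the answer above:

```csharp
using System;
using System.Linq;

public static class SortKeys
{
    // Key used by the first query: keep letters only.
    public static string LettersOnly(string s) =>
        string.Concat(s.Where(c => char.IsLetter(c)));

    // Key used by the second query: every non-letter becomes a space.
    public static string NonLettersAsSpaces(string s) =>
        string.Concat(s.Select(c => char.IsLetter(c) ? c : ' '));
}
```

With letters only, "St-Aubin" becomes "StAubin", which compares after "Staad" when case is ignored ('u' comes after 'a'). With spaces, it becomes "St Aubin", and the space sorts before the 'a' in "Staad", so all the St-… names come first.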

One way to sort the list while ignoring the specified characters is to replace those characters before comparing.
For example, for the list of cities:
var cities = new List<string>()
{
"Aigle ",
"Bulle",
"La Chaux-de-Fonds",
"L'Abbaye",
"Malleray",
"Sierre",
"S. City",
"St-Aubin",
"St-Cergue",
"St-Gingolph",
"St-Légier-La Chiesaz",
"St-Maurice",
"St-Sulpice",
"St-Sulpice",
"Staad"
};
Option 1 : Without using Regex
var charList = new List<char>{'.', ',', '-', '\''};
var result = cities.OrderBy(x => charList.Aggregate(x, (c1, c2) => c1.Replace(c2, ' '))).ToArray();
Option 2 : Using Regex.
var charList = new List<char>{'.', ',', '-', '\''};
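// Sorted, the character class is [',-.]; the range ,-. covers exactly ',', '-' and '.'.
// No quantifier: each listed character is replaced by one space, as in Option 1.
var regex = new Regex($"[{string.Join("",charList.OrderBy(x=>x))}]");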
var result = cities.OrderBy(x=> regex.Replace(x," "));
Output
Aigle
Bulle
L'Abbaye
La Chaux-de-Fonds
Malleray
S. City
Sierre
St-Aubin
St-Cergue
St-Gingolph
St-Légier-La Chiesaz
St-Maurice
St-Sulpice
St-Sulpice
Staad

Related

LINQ query to group strings by first letter and determine total length

A sequence of non-empty strings stringList is given, containing only uppercase letters of the Latin alphabet. For all strings starting with the same letter, determine their total length and obtain a sequence of strings of the form "S-C", where S is the total length of all strings from stringList that begin with the character C.
var stringList = new[] { "YELLOW", "GREEN", "YIELD" };
var expected = new[] { "11-Y", "5-G" };
I tried this:
var groups =
from word in stringList
orderby word ascending
group word by word[0] into groupedByFirstLetter
orderby groupedByFirstLetter.Key descending
select new { key = groupedByFirstLetter.Key, Words = groupedByFirstLetter.Select(x => x.Length) };
But the output of this query is Y 6 5 G 5 instead of 11-Y 5-G.
What I would like to know is how to sum the lengths if there is more than 1 word in the group, and how to format the result/display it as expected?
This should do it:
var results = stringList.OrderByDescending(x => x[0])
.ThenBy(x => x)
.GroupBy(x => x[0])
.Select(g => $"{g.Sum(x => x.Length)}-{g.Key}")
.ToArray();
var result = stringList.GroupBy(e => e[0]).Select(e => $"{e.Sum(o => o.Length)}-{e.Key}").ToArray();
Not sure I am able to rewrite it in your form.

Compare two lists from user

I have a predefined list of words. Say it has 7 elements:
List<string> resourceList = new List<string> {"xyz","dfgabr","asxy","abec","def","geh","mnbj"};
Say the user gives the input "xy+ ab", i.e. he wants to search for "xy" or "ab":
string searchword="xy+ ab";
Then I have to find all the words in the predefined list which have "xy" or "ab" i.e all words split by '+'
So, the output will have:
{"xyz","dfgabr","asxy","abec"}
I am trying something like:
resourceList.Where(s => s.Name.ToLower().Contains(searchWords.Any().ToString().ToLower())).ToList()
But I am unable to frame the LINQ query because there are two arrays. One approach I saw was to concatenate the two arrays and then search, but since my second array only contains parts of the elements of the first, my LINQ does not work.
You first need to split your search pattern on the + sign; then you can easily find the items in the list that contain any of the search terms:
var result = resourceList.Where(x => searchword.Split('+').Any(y => x.Contains(y.Trim()))).ToList();
Where:
Your resourceList is
List<string> resourceList = new List<string> { "xyz", "dfgabr", "asxy", "abec", "def", "geh", "mnbj" };
And search pattern is,
string searchword = "xy+ ab";
Try the following, which doesn't need Regex:
List<string> resourceList= new List<string>() {"xyz","dfgabr","asxy","abec","def","geh","mnbj"};
List<string> searchPattern = new List<string>() {"xy","ab"};
List<string> results = resourceList.Where(r => searchPattern.Any(s => r.Contains(s))).ToList();
You can try querying with the help of LINQ:
List<string> resourceList = new List<string> {
"xyz", "dfgabr", "asxy", "abec", "def", "geh", "mnbj"
};
string input = "xy+ ab";
string[] toFind = input
.Split('+')
.Select(item => item.Trim()) // we are looking for "ab", not for " ab"
.ToArray();
// {"xyz", "dfgabr", "asxy", "abec"}
string[] result = resourceList
.Where(item => toFind
.Any(find => item.IndexOf(find) >= 0))
.ToArray();
// Let's have a look at the array
Console.Write(string.Join(", ", result));
Outcome:
xyz, dfgabr, asxy, abec
If you want to ignore case, pass StringComparison.OrdinalIgnoreCase to IndexOf:
string[] result = resourceList
.Where(item => toFind
.Any(find => item.IndexOf(find, StringComparison.OrdinalIgnoreCase) >= 0))
.ToArray();

Remove From Duplicate Starting Names From List Linq

I have a list of paths that look like
//servername/d$/directory
I am getting the serverName from the path with the following
var host = somePath.Split(new[] { '\\' }, StringSplitOptions.RemoveEmptyEntries).FirstOrDefault();
I want to reduce this list so that each server name is listed only once (say, the first path found for each server).
Example
if the list contains
//serverA/d$/directoryA
//serverA/d$/directoryB
//serverA/d$/directoryC
//serverB/d$/directoryD
//serverB/d$/directoryE
the list would turn into
//serverA/d$/directoryA
//serverB/d$/directoryD
You can group them by the server name (by trimming the start and splitting on the / character and taking the first item), and then select the first item from each group into a new list:
var serverNames = new List<string>
{
"//serverA/d$/directoryA",
"//serverA/d$/directoryB",
"//serverA/d$/directoryC",
"//serverB/d$/directoryD",
"//serverB/d$/directoryE",
};
var results = serverNames
.GroupBy(name => name.TrimStart('/').Split('/')[0])
.Select(group => group.First())
.ToList();
From your first code example it's not clear if the paths begin with \, so to handle both cases you can do:
var results = serverNames
.GroupBy(name => name.TrimStart('\\', '/', ' ').Split('\\', '/')[0])
.Select(group => group.First())
.ToList();

Using C# Lambda to split string and search value

I have a string with the following value:
0:12211,90:33221,23:09011
In each pair, the first value (before the : (colon)) is an employee id, the second value after is a payroll id.
So If I want to get the payroll id for employee id 23 right now I have to do:
var arrayValues = mystring.Split(',');
and then for each arrayValue do the same:
var employeeData = arrayValue.Split(':');
That way I will get the key and the value.
Is there a way to get the Payroll ID by a given employee id using lambda?
If the employeeId is not in the string then by default it should return the payrollid for employeeid 0 zero.
Using a Linq pipeline and anonymous objects:
"0:12211,90:33221,23:09011"
.Split(',')
.Select(x => x.Split(':'))
.Select(x => new { employeeId = x[0], payrollId = x[1] })
.Where(x=> x.employeeId == "23")
Results in this:
{
employeeId = "23",
payrollId = "09011"
}
These three lines represent your data processing and projection logic:
.Split(',')
.Select(x => x.Split(':'))
.Select(x => new { employeeId = x[0], payrollId = x[1] })
You can then add any filtering logic with Where after the second Select.
You can try something like this:
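The question also asks for a fallback to employee 0 when the id is missing. A minimal sketch of how the same pipeline could be wrapped to do that; the PayrollLookup class and Find method are hypothetical names, and the length check is an added assumption about how to treat malformed pairs:

```csharp
using System;
using System.Linq;

public static class PayrollLookup
{
    // Hypothetical helper: returns the payroll id for employeeId,
    // falling back to the entry for employee "0"; null if neither exists.
    public static string Find(string source, string employeeId)
    {
        var pairs = source
            .Split(',')
            .Select(x => x.Split(':'))
            .Where(x => x.Length == 2) // assumption: silently skip malformed entries
            .Select(x => new { EmployeeId = x[0], PayrollId = x[1] })
            .ToList();

        // Try the requested employee first, then the default entry for "0".
        var match = pairs.FirstOrDefault(p => p.EmployeeId == employeeId)
                    ?? pairs.FirstOrDefault(p => p.EmployeeId == "0");
        return match?.PayrollId;
    }
}
```

For the sample string, PayrollLookup.Find(data, "23") yields "09011", while an unknown id such as "32" falls back to "12211".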
"0:12211,90:33221,23:09011"
.Split(new char[] { ',' })
.Select(c => {
var pair = c.Split(new char[] { ':' });
return new KeyValuePair<string, string>(pair[0], pair[1]);
})
.ToList();
Be aware that you still need to validate the data.
If I were you, I'd use a dictionary. Especially if you're going to do more than one lookup.
Dictionary<int, int> employeeIDToPayrollID = "0:12211,90:33221,23:09011"
.Split(',') //Split on comma into ["0:12211", "90:33221", "23:09011"]
.Select(x => x.Split(':')) //Split each string on colon into [ ["0", "12211"]... ]
.ToDictionary(x => int.Parse(x[0]), x => int.Parse(x[1]));
and now, you just have to write employeeIDToPayrollID[0] to get 12211 back. Notice that int.Parse will throw an exception if your IDs aren't integers, and that it drops leading zeros (09011 becomes 9011). You can remove those calls if you want to have a Dictionary<string, string>.
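If throwing on bad input is not acceptable, one option is to build the dictionary with int.TryParse instead. This is only a sketch under that assumption; the PayrollParsing class and Parse method are hypothetical names, and skipping invalid pairs is a design choice, not something the question specifies:

```csharp
using System;
using System.Collections.Generic;

public static class PayrollParsing
{
    // Hypothetical helper: builds the lookup but skips any pair that is
    // malformed or whose ids are not integers, instead of throwing.
    public static Dictionary<int, int> Parse(string source)
    {
        var result = new Dictionary<int, int>();
        foreach (var pair in source.Split(','))
        {
            var parts = pair.Split(':');
            if (parts.Length == 2
                && int.TryParse(parts[0], out var employeeId)
                && int.TryParse(parts[1], out var payrollId))
            {
                result[employeeId] = payrollId; // last entry wins on duplicate ids
            }
        }
        return result;
    }
}
```

As with int.Parse, the payroll id "09011" is stored as the integer 9011, so a string-keyed dictionary may be the safer choice if leading zeros matter.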
You can use string.Split along with string.Substring.
var result =
str.Split(',')
.Where(s => s.Substring(0,s.IndexOf(":",StringComparison.Ordinal)) == "23")
.Select(s => s.Substring(s.IndexOf(":",StringComparison.Ordinal) + 1))
.FirstOrDefault();
If this logic will be used more than once, I'd put it in a method:
public string GetPayrollIdByEmployeeId(string source, string employeeId){
return source.Split(',')
.Where(s => s.Substring(0, s.IndexOf(":", StringComparison.Ordinal)) == employeeId)
.Select(s => s.Substring(s.IndexOf(":", StringComparison.Ordinal) + 1))
.FirstOrDefault();
}
Assuming you have more than three pairs in the string (how long is that string, anyway?) you can convert it to a Dictionary and use that going forward.
First, split on the comma and then on the colon and put in a Dictionary:
var empInfo = src.Split(',').Select(p => p.Split(':')).ToDictionary(pa => pa[0], pa => pa[1]);
Now, you can write a function to lookup payroll IDs from employee IDs:
string LookupPayrollID(Dictionary<string, string> empInfo, string empID) => empInfo.TryGetValue(empID, out var prID) ? prID : empInfo["0"];
And you can call it to get the answer:
var emp23prid = LookupPayrollID(empInfo, "23");
var emp32prid = LookupPayrollID(empInfo, "32");
If you just have three employees in the string, creating a Dictionary is probably overkill and a simpler answer may be appropriate, such as searching the string.

LINQ query that combines grouping and sorting

I am relatively new to LINQ and currently working on a query that combines grouping and sorting. I am going to start with an example here. Basically I have an arbitrary sequence of numbers represented as strings:
List<string> sNumbers = new List<string> {"34521", "38450", "138477", "38451", "28384", "13841", "12345"};
I need to find all sNumbers in this list that contain a search pattern (say "384"),
then return the filtered sequence such that the sNumbers that start with the search pattern ("384") are sorted first, followed by the remaining sNumbers that contain the search pattern somewhere. So it will be like this (please also notice the alphabetical sort within the groups):
{"38450", "38451", "13841", "28384", "138477"}
Here is how I have started:
outputlist = (from n in sNumbers
where n.Contains(searchPattern)
select n).ToList();
So now we have all numbers that contain the search pattern. And this is where I am stuck. I know that at this point I need to 'group' the results into two sequences: one with the numbers that start with the search pattern and one with those that don't. Then I need to apply a secondary alphabetical sort within each group. How do I write a query that combines all that?
I don't think you need any grouping or list splitting to get your desired result, so instead of an answer about combining and grouping I will post what I would do:
sNumbers.Where(x=>x.Contains(pattern))
.OrderByDescending(x => x.StartsWith(pattern)) // first criteria
.ThenBy(x=>Convert.ToInt32(x)) // this does the trick instead of GroupBy
.ToList();
This seems fairly straightforward, unless I've misunderstood something:
List<string> outputlist =
sNumbers
.Where(n => n.Contains("384"))
.OrderBy(n => int.Parse(n))
.OrderByDescending(n => n.StartsWith("384"))
.ToList();
I get this:
var result = sNumbers
.Where(e => e.StartsWith("384"))
.OrderBy(e => Int32.Parse(e))
.Union(sNumbers
.Where(e => e.Contains("384"))
.OrderBy(e => Int32.Parse(e)));
Here is the optimized version, which only needs one LINQ statement:
string match = "384";
List<string> sNumbers = new List<string> {"34521", "38450", "138477", "38451", "28384", "13841", "12345"};
// That's all it is
var result =
(from x in sNumbers
group x by new { Start = x.StartsWith(match), Contain = x.Contains(match)}
into g
where g.Key.Start || g.Key.Contain
orderby !g.Key.Start
select g.OrderBy(Convert.ToInt32)).SelectMany(x => x);
result.ToList().ForEach(x => Console.Write(x + " "));
Steps:
1.) Group into group g based on StartsWith and Contains
2.) Just select those groups which contain the match
3.) Order by the inverse of the StartsWith key (So that StartsWith = true comes before StartsWith = false)
4.) Select the sorted list of elements of both groups
5.) Do a flatMap (SelectMany) over both lists to receive one final result list
Here is an unoptimized version:
string match = "384";
List<string> sNumbers = new List<string> {"34521", "38450", "138477", "38451", "28384", "13841", "12345"};
var matching = from x in sNumbers
where x.StartsWith(match)
orderby Convert.ToInt32(x)
select x;
var nonMatching = from x in sNumbers
where !x.StartsWith(match) && x.Contains(match)
orderby Convert.ToInt32(x)
select x;
var result = matching.Concat(nonMatching);
result.ToList().ForEach(x => Console.Write(x + " "));
LINQ has an OrderBy overload that lets you supply a custom class to decide how things should be sorted. Look here: https://msdn.microsoft.com/en-us/library/bb549422(v=vs.100).aspx
You can then write your own IComparer class that takes a value in its constructor and has a Compare method that prefers strings starting with that value.
Something like this, maybe:
public class CompareStringsWithPreference : IComparer<string> {
private readonly string _valueToPrefer;
public CompareStringsWithPreference(string valueToPrefer) {
_valueToPrefer = valueToPrefer;
}
public int Compare(string s1, string s2) {
bool s1Preferred = s1.StartsWith(_valueToPrefer);
bool s2Preferred = s2.StartsWith(_valueToPrefer);
// Both or neither start with the preferred value: fall back to a case-insensitive comparison.
if (s1Preferred == s2Preferred)
return string.Compare(s1, s2, true);
// Exactly one starts with the preferred value: that one sorts first.
return s1Preferred ? -1 : 1;
}
}
Then use it like this:
outputlist = (from n in sNumbers
where n.Contains(searchPattern)
select n).OrderBy(n => n, new CompareStringsWithPreference(searchPattern)).ToList();
You can create a list with strings starting with searchPattern variable and another containing searchPattern but not starting with (to avoid repeating elements in both lists):
string searchPattern = "384";
List<string> sNumbers = new List<string> { "34521", "38450", "138477", "38451", "28384", "13841", "12345" };
var list1 = sNumbers.Where(s => s.StartsWith(searchPattern)).OrderBy(s => s).ToList();
var list2 = sNumbers.Where(s => !s.StartsWith(searchPattern) && s.Contains(searchPattern)).OrderBy(s => s).ToList();
var outputList = new List<string>();
outputList.AddRange(list1);
outputList.AddRange(list2);
Sorry guys, after reading through the responses, I realize that I made a mistake in my question. The correct output would be as follows (sort by "starts with" first and then alphabetically, not numerically):
// output: {"38450", "38451", "13841", "138477", "28384"}
I was able to achieve that with the following query:
string searchPattern = "384";
List<string> result =
sNumbers
.Where(n => n.Contains(searchPattern))
.OrderBy(s => !s.StartsWith(searchPattern))
.ThenBy(s => s)
.ToList();
Thanks
