Split a string with condition - c#

I have a string variable that contains a CSV value like this:
string str = "105, c#, vb, 345, 53, sql51";
Now I want to get only the alphanumeric items in a list or array, without using a loop.
Required result:
string result = "c#, vb, sql51";
Or as a list or an array.

string str = "105, c#, vb, 345, 53, sql51";
var separator = ", ";
int dummy;
var parts = str.Split(new[]{separator}, StringSplitOptions.RemoveEmptyEntries)
.Where(s => !int.TryParse(s, out dummy));
string result = string.Join(separator, parts);
Console.WriteLine(result);
prints:
c#, vb, sql51
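One caveat with the int.TryParse filter above: a token made only of digits but too large for an int (for example 99999999999) fails to parse and would therefore be kept. A minimal sketch of a digit-based filter that avoids this, assuming tokens are separated by commas with optional spaces; the names TokenFilter and NonNumeric are made up for illustration:
```csharp
using System;
using System.Linq;

public static class TokenFilter
{
    // Keeps every token that is not made up solely of digits.
    // Unlike int.TryParse, this does not depend on the value fitting in an int.
    public static string[] NonNumeric(string csv)
    {
        return csv
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(token => token.Trim())                               // drop the space after each comma
            .Where(token => token.Length > 0 && !token.All(char.IsDigit))
            .ToArray();
    }
}
```
Calling TokenFilter.NonNumeric("105, c#, vb, 345, 53, sql51, 99999999999") returns c#, vb and sql51.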

Split using the Split method, filter with a LINQ expression, and call ToArray or ToList on the result to produce a filtered array:
var res = str
.Split(new[] {',', ' '})
.Where(s => s.Any(c => !Char.IsDigit(c)))
.ToList();

Something like:
var str = "test,test,tes,123,5";
var result = string.Join(",", str.Split(',').Where(s => !s.All(t => Char.IsNumber(t))));
result.Dump();

"105, c#, vb, 345, 53, sql51".Split(",")
.Where(item => new Regex("[#A-Za-z0-9]").IsMatch(item))
.Select(item=> item.Trim())
.ToList();
Note: I'm not sure why the OP wants the numbers filtered out; numbers are alphanumeric, so this version keeps them.

Related

How to order lists without respecting certain characters?

I currently have a string list that needs to be sorted without taking into account the following characters ('.', ',', '-', '\'')
Example
var cities = new List<string>()
{
"Aigle ",
"Bulle",
"La Chaux-de-Fonds",
"L'Abbaye",
"Malleray",
"Sierre",
"S. City",
"St-Aubin",
"St-Cergue",
"St-Gingolph",
"St-Légier-La Chiesaz",
"St-Maurice",
"St-Sulpice",
"St-Sulpice",
"Staad"
};
Sorting with the default ordering:
var ordered = cities
.OrderBy(x => x)
.ToList();
Output
"Aigle"
"Bulle"
"La Chaux-de-Fonds"
"L'Abbaye"
"Malleray"
"S. City"
"Sierre"
"Staad"
"St-Aubin"
"St-Cergue"
"St-Gingolph"
"St-Légier-La Chiesaz"
"St-Maurice"
"St-Sulpice"
"St-Sulpice"
And the output I want has to be like this.
"Aigle "
"Bulle"
"L'Abbaye"
"La Chaux-de-Fonds"
"Malleray"
"S. City"
"Sierre"
"St-Aubin"
"St-Cergue"
"St-Gingolph"
"St-Légier-La Chiesaz"
"St-Maurice"
"St-Sulpice"
"St-Sulpice"
"Staad"
I got the output I want by doing this.
var ordered = cities
.OrderBy(x => x.Replace(".", " ").Replace("-", " ").Replace("'", " "))
.ToList();
I honestly don't know if what I'm doing is okay.
Is there any other way to get the desired result?

Perhaps a transformation can help you
var ordered = cities
.Select(city => new { Name = city, NameForOrdering = string.Join(string.Empty, city.Where(c => Char.IsLetterOrDigit(c)).ToArray()) })
.OrderBy(city => city.NameForOrdering)
.Select(city => city.Name)
.ToList();
This can serve as a quick and dirty way to get you past a hurdle or to test things out, but the real solution would be to use the second overload of OrderBy, which takes a custom comparer (IComparer<TKey>).
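A minimal sketch of what such a comparer might look like, assuming that everything other than letters and digits should be ignored and that case does not matter. The class name IgnoreSymbolsComparer is hypothetical, and the ordinal comparison is an assumption chosen to keep the result deterministic; a culture-aware comparison may suit city names better.
```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: compares two strings after dropping everything
// that is not a letter or digit, ignoring case.
public sealed class IgnoreSymbolsComparer : IComparer<string>
{
    // Builds the sort key, e.g. "St-Aubin" becomes "StAubin".
    private static string Normalize(string value) =>
        string.Concat((value ?? string.Empty).Where(char.IsLetterOrDigit));

    public int Compare(string x, string y) =>
        string.Compare(Normalize(x), Normalize(y), StringComparison.OrdinalIgnoreCase);
}
```
It would be used as cities.OrderBy(city => city, new IgnoreSymbolsComparer()).ToList(). Because the symbols are removed rather than replaced by spaces, "Staad" sorts before "St-Aubin" here.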

Well, we can order by letters only (that is, we ignore, i.e. remove, all non-letter characters):
var ordered = cities
.OrderBy(city => string.Concat(city.Where(c => char.IsLetter(c))),
StringComparer.CurrentCultureIgnoreCase)
.ToList();
// Let's have a look
Console.Write(string.Join(Environment.NewLine, ordered));
We'll get the following order
Aigle
Bulle
L'Abbaye
La Chaux-de-Fonds
Malleray
S. City
Sierre
Staad
St-Aubin
St-Cergue
St-Gingolph
St-Légier-La Chiesaz
St-Maurice
St-Sulpice
St-Sulpice
If you want to treat all non letters as spaces ' ' (your current code):
var ordered = cities
.OrderBy(city => string.Concat(city.Select(c => char.IsLetter(c) ? c : ' ')),
StringComparer.CurrentCultureIgnoreCase)
.ToList();
And the order will be
Aigle
Bulle
L'Abbaye
La Chaux-de-Fonds
Malleray
S. City
Sierre
St-Aubin
St-Cergue
St-Gingolph
St-Légier-La Chiesaz
St-Maurice
St-Sulpice
St-Sulpice
Staad
The two orderings differ only in where Staad is placed.

One way to sort the list while ignoring the specified characters is to replace the characters that need to be ignored.
For example, for this list of cities:
var cities = new List<string>()
{
"Aigle ",
"Bulle",
"La Chaux-de-Fonds",
"L'Abbaye",
"Malleray",
"Sierre",
"S. City",
"St-Aubin",
"St-Cergue",
"St-Gingolph",
"St-Légier-La Chiesaz",
"St-Maurice",
"St-Sulpice",
"St-Sulpice",
"Staad"
};
Option 1 : Without using Regex
var charList = new List<char>{'.', ',', '-', '\''};
var result = cities.OrderBy(x => charList.Aggregate(x, (c1, c2) => c1.Replace(c2, ' '))).ToArray();
Option 2 : Using Regex.
var charList = new List<char>{'.', ',', '-', '\''};
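var regex = new Regex($"[{string.Join("", charList.OrderBy(x => x))}]"); // no trailing '*': a pattern that can match the empty string would insert a space at every position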
var result = cities.OrderBy(x=> regex.Replace(x," "));
Output
Aigle
Bulle
L'Abbaye
La Chaux-de-Fonds
Malleray
S. City
Sierre
St-Aubin
St-Cergue
St-Gingolph
St-Légier-La Chiesaz
St-Maurice
St-Sulpice
St-Sulpice
Staad

Compare two lists from user

I have a predefined List<string>. Say it has 7 elements:
List<string> resourceList = new List<string> { "xyz", "dfgabr", "asxy", "abec", "def", "geh", "mnbj" };
Say the user gives the input "xy+ ab", i.e. they want to search for "xy" or "ab":
string searchword = "xy+ ab";
Then I have to find all the words in the predefined list that contain "xy" or "ab", i.e. any of the terms obtained by splitting the input on '+'.
So the output will have:
{"xyz", "dfgabr", "asxy", "abec"}
I am trying something like:
resourceList.Where(s => s.Name.ToLower().Contains(searchWords.Any().ToString().ToLower())).ToList()
But, I am unable to frame the LINQ query as there are 2 arrays and one approach I saw was concatenate 2 arrays and then try; but since my second array only contains part of the first array, my LINQ does not work.

You first need to split your search pattern on the + sign; then you can easily find which items in the list contain any of the search terms:
var result = resourceList.Where(x => searchword.Split('+').Any(y => x.Contains(y.Trim()))).ToList();
Where:
Your resourceList is
List<string> resourceList = new List<string> { "xyz", "dfgabr", "asxy", "abec", "def", "geh", "mnbj" };
And search pattern is,
string searchword = "xy+ ab";
Output (from the debugger): "xyz", "dfgabr", "asxy", "abec"

Try the following, which doesn't need Regex:
List<string> resourceList= new List<string>() {"xyz","dfgabr","asxy","abec","def","geh","mnbj"};
List<string> searchPattern = new List<string>() {"xy","ab"};
List<string> results = resourceList.Where(r => searchPattern.Any(s => r.Contains(s))).ToList();

You can try querying with the help of LINQ:
List<string> resourceList = new List<string> {
"xyz", "dfgabr", "asxy", "abec", "def", "geh", "mnbj"
};
string input = "xy+ ab";
string[] toFind = input
.Split('+')
.Select(item => item.Trim()) // we are looking for "ab", not for " ab"
.ToArray();
// {"xyz", "dfgabr", "asxy", "abec"}
string[] result = resourceList
.Where(item => toFind
.Any(find => item.IndexOf(find) >= 0))
.ToArray();
// Let's have a look at the array
Console.Write(string.Join(", ", result));
Outcome:
xyz, dfgabr, asxy, abec
If you want to ignore case, add the StringComparison.OrdinalIgnoreCase parameter to IndexOf:
string[] result = resourceList
.Where(item => toFind
.Any(find => item.IndexOf(find, StringComparison.OrdinalIgnoreCase) >= 0))
.ToArray();

Using C# Lambda to split string and search value

I have a string with the following value:
0:12211,90:33221,23:09011
In each pair, the first value (before the colon) is an employee ID and the second value (after it) is a payroll ID.
So if I want to get the payroll ID for employee ID 23, right now I have to do:
var arrayValues = mystring.Split(',');
and then, for each arrayValue, do the same:
var employeeData = arrayValue.Split(':');
That way I get the key and the value.
Is there a way to get the payroll ID for a given employee ID using a lambda?
If the employee ID is not in the string, it should by default return the payroll ID for employee ID 0 (zero).

Using a Linq pipeline and anonymous objects:
"0:12211,90:33221,23:09011"
.Split(',')
.Select(x => x.Split(':'))
.Select(x => new { employeeId = x[0], payrollId = x[1] })
.Where(x=> x.employeeId == "23")
Results in a sequence containing this single element:
{
employeeId = "23",
payrollId = "09011"
}
These three lines represent your data processing and projection logic:
.Split(',')
.Select(x => x.Split(':'))
.Select(x => new { employeeId = x[0], payrollId = x[1] })
You can then add any filtering logic with Where after the second Select.
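The question also asks for a fallback to employee ID 0 when the requested ID is missing. A minimal sketch of how the same pipeline could cover that, assuming IDs are compared as strings and malformed entries are skipped; the names PayrollLookup and Find are made up for illustration:
```csharp
using System;
using System.Linq;

public static class PayrollLookup
{
    // Returns the payroll id for employeeId, or the one stored under
    // employee id "0" when employeeId is absent. Returns null if neither exists.
    public static string Find(string source, string employeeId)
    {
        var pairs = source
            .Split(',')
            .Select(entry => entry.Split(':'))
            .Where(parts => parts.Length == 2)   // skip malformed entries
            .Select(parts => new { EmployeeId = parts[0], PayrollId = parts[1] })
            .ToList();

        // Anonymous types are reference types, so FirstOrDefault yields null on a miss.
        var match = pairs.FirstOrDefault(p => p.EmployeeId == employeeId)
                    ?? pairs.FirstOrDefault(p => p.EmployeeId == "0");
        return match?.PayrollId;
    }
}
```
With the sample string, PayrollLookup.Find(source, "23") returns "09011", and PayrollLookup.Find(source, "32") falls back to "12211".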

You can try something like this:
"0:12211,90:33221,23:09011"
.Split(new char[] { ',' })
.Select(c => {
var pair = c.Split(new char[] { ':' });
return new KeyValuePair<string, string>(pair[0], pair[1]);
})
.ToList();
Be aware that you need to validate the data: for example, an entry without a colon makes pair[1] throw.

If I were you, I'd use a dictionary. Especially if you're going to do more than one lookup.
Dictionary<int, int> employeeIDToPayrollID = "0:12211,90:33221,23:09011"
.Split(',') //Split on comma into ["0:12211", "90:33221", "23:09011"]
.Select(x => x.Split(':')) //Split each string on colon into [ ["0", "12211"]... ]
.ToDictionary(x => int.Parse(x[0]), x => int.Parse(x[1]));
Now you just have to write employeeIDToPayrollID[0] to get 12211 back. Note that int.Parse will throw an exception if your IDs aren't integers, and that parsing "09011" as an int drops its leading zero. You can remove those calls if you want a Dictionary<string, string> instead.

You can use string.Split along with string.Substring.
var result =
str.Split(',')
.Where(s => s.Substring(0,s.IndexOf(":",StringComparison.Ordinal)) == "23")
.Select(s => s.Substring(s.IndexOf(":",StringComparison.Ordinal) + 1))
.FirstOrDefault();
If this logic will be used more than once, I'd put it in a method:
public string GetPayrollIdByEmployeeId(string source, string employeeId){
return source.Split(',')
.Where(s => s.Substring(0, s.IndexOf(":", StringComparison.Ordinal)) == employeeId)
.Select(s => s.Substring(s.IndexOf(":", StringComparison.Ordinal) + 1))
.FirstOrDefault();
}

Assuming you have more than three pairs in the string (how long is that string, anyway?) you can convert it to a Dictionary and use that going forward.
First, split on the comma and then on the colon and put in a Dictionary:
var empInfo = src.Split(',').Select(p => p.Split(':')).ToDictionary(pa => pa[0], pa => pa[1]);
Now, you can write a function to lookup payroll IDs from employee IDs:
string LookupPayrollID(Dictionary<string, string> empInfo, string empID) => empInfo.TryGetValue(empID, out var prID) ? prID : empInfo["0"];
And you can call it to get the answer:
var emp23prid = LookupPayrollID(empInfo, "23");
var emp32prid = LookupPayrollID(empInfo, "32");
If you just have three employees in the string, creating a Dictionary is probably overkill and a simpler answer may be appropriate, such as searching the string.

LINQ Query to find string of multidimensional array with most duplicates

I have written a function that fills a jagged array (FileCheck[][]) with the results of matching file names against multiple regexes.
FileCheck[0] // This string[] contains all the file names
FileCheck[1] // This string[] is "0" or "1" depending on whether a Regex match is found
FileCheck[2] // This string[] contains the index of the first Regex that matched
foreach (string File in InputFolder)
{
    int j = 0;
    FileCheck[0][k] = Path.GetFileName(File);
    Console.WriteLine(FileCheck[0][k]);
    foreach (Regex Filemask in Filemasks)
    {
        if (string.IsNullOrEmpty(FileCheck[1][k]) || FileCheck[1][k] == "0")
        {
            if (Filemask.IsMatch(FileCheck[0][k]))
            {
                FileCheck[1][k] = "1";
                FileCheck[2][k] = j.ToString(); // This is the Index of the Regex thats Valid
            }
            else
            {
                FileCheck[1][k] = "0";
            }
            j++;
        }
        Console.WriteLine(FileCheck[1][k]);
    }
    k++;
}
Console.ReadLine();
// I need the Index of the Regex with the most valid hits
I'm trying to write a function that gives me the regex index (as a string) that occurs most often.
This is what I tried, but it did not work :( (I only get the count for the string with the most duplicates, not the string itself)
// I need the Index of the Regex with the most valid hits
var LINQ = Enumerable.Range(0, FileCheck[0].GetLength(0))
.Where(x => FileCheck[1][x] == "1")
.GroupBy(x => FileCheck[2][x])
.OrderByDescending(x => x.Count())
.First().ToList();
Console.WriteLine(LINQ[1]);
Example Data
string[][] FileCheck = new string[3][];
FileCheck[0] = new string[]{ "1.csv", "TestValid1.txt", "TestValid2.txt", "2.xml", "TestAlsoValid.xml", "TestValid3.txt"};
FileCheck[1] = new string[]{ "0","1","1","0","1","1"};
FileCheck[2] = new string[]{ null, "3", "3", null,"1","2"};
In this example I need as result of the Linq query:
string result = "3";

With your current code, substituting 'ToList()' with 'Key' would do the trick.
var LINQ = Enumerable.Range(0, FileCheck[0].GetLength(0))
.Where(x => FileCheck[1][x] == "1")
.GroupBy(x => FileCheck[2][x])
.OrderByDescending(x => x.Count())
.First().Key;
Since the index is null for values that are not found, you could also filter out null values and skip looking at the FileCheck[1] array. For example:
var maxOccurringIndex = FileCheck[2].Where(ind => ind != null)
.GroupBy(ind=>ind)
.OrderByDescending(x => x.Count())
.First().Key;
However, just a suggestion, you can use classes instead of a nested array, e.g.:
class FileCheckInfo
{
public string File{get;set;}
public bool Match => Index.HasValue;
public int? Index{get;set;}
public override string ToString() => $"{File} [{(Match ? Index.ToString() : "no match")}]";
}
Assuming InputFolder is an enumerable of string and Filemasks an enumerable of 'Regex', an array can be filled with:
FileCheckInfo[] FileCheck = InputFolder.Select(f=>
new FileCheckInfo{
File = f,
Index = Filemasks.Select((rx,ind) => new {ind, IsMatch = rx.IsMatch(f)}).FirstOrDefault(r=>r.IsMatch)?.ind
}).ToArray();
Getting the max occurring would be much the same:
var maxOccurringIndex = FileCheck.Where(f=>f.Match).GroupBy(f=>f.Index).OrderByDescending(gr=>gr.Count()).First().Key;
Edit: PS, the above all assumes that you need to reuse the results. If you only have to find the maximum occurrence, you're much better off with an approach such as the one Martin suggested!
If the goal is only to get the max occurrence, you can use:
var maxOccurringIndex = Filemasks.Select((rx,ind) => new {ind, Count = InputFolder.Count(f=>rx.IsMatch(f))})
.OrderByDescending(m=>m.Count).FirstOrDefault()?.ind;

Your question and code seem very convoluted. I am guessing that you have a list of file names and another list of file masks (regular expressions) and you want to find the file mask that matches the most file names. Here is a way to do that:
var fileNames = new[] { "1.csv", "TestValid1.txt", "TestValid2.txt", "2.xml", "TestAlsoValid.xml", "TestValid3.txt" };
var fileMasks = new[] { @"\.txt$", @"\.xml$", "valid" };
var fileMaskWithMostMatches = fileMasks
.Select(
fileMask => new {
FileMask = fileMask,
FileNamesMatched = fileNames.Count(
fileName => Regex.Match(
fileName,
fileMask,
RegexOptions.IgnoreCase | RegexOptions.CultureInvariant
)
.Success
)
}
)
.OrderByDescending(x => x.FileNamesMatched)
.First()
.FileMask;
With the sample data the value of fileMaskWithMostMatches is valid.
Note that the Regex class does some caching of regular expressions, but if you have many regular expressions it will be more efficient to create them outside the implied for-each loop of fileNames.Count, to avoid recreating the same regular expression again and again (creating a regular expression may take a non-trivial amount of time, depending on its complexity).
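A minimal sketch of that idea, building each Regex once before counting matches. The names MaskStats and MostMatched are hypothetical, and the method assumes at least one pattern is supplied, because First() throws on an empty sequence.
```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class MaskStats
{
    // Returns the pattern that matches the most file names.
    // Ties go to the pattern that appears first (OrderByDescending is a stable sort).
    public static string MostMatched(string[] fileNames, string[] patterns)
    {
        return patterns
            .Select(pattern => new
            {
                Pattern = pattern,
                // Built once per pattern, outside the per-file loop.
                Matcher = new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.CultureInvariant)
            })
            .Select(m => new { m.Pattern, Count = fileNames.Count(name => m.Matcher.IsMatch(name)) })
            .OrderByDescending(m => m.Count)
            .First()
            .Pattern;
    }
}
```
With the sample file names and masks above, MaskStats.MostMatched(fileNames, fileMasks) returns "valid".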

As an alternative to Martin's answer, here's a simpler version of your existing LINQ query that gives the desired result:
var LINQ = FileCheck[2]
.ToLookup(x => x) // Makes a lookup table
.OrderByDescending(x => x.Count()) // Sorts by count, descending
.Select(x => x.Key) // Extract the key
.FirstOrDefault(x => x != null); // Return the first non null key
// or null if none found.

Isn't this much easier?
string result = FileCheck[2]
.Where(x => x != null)
.GroupBy(x => x)
.OrderByDescending(x => x.Count())
.FirstOrDefault()?.Key; // ?. avoids a NullReferenceException when there are no matches

Get distinct output from an array

From a string array:
string[] str1 = { "u1-u2", "u1-u2", "u1-u4", "u4-u1" };
string[] str2 = str1.Distinct().ToArray();
The distinct elements in the array are: "u1-u2", "u1-u4", "u4-u1"
But I need distinct output like this: "u1-u2", "u1-u4".
So please help me out.

You can do like this:
string[] output = str1.Select(s => new { Value = s, NormalizedValue = string.Join("-", s.Split('-').OrderBy(_ => _)) })
.GroupBy(p => p.NormalizedValue)
.Select(g => g.OrderBy(p => p.Value).First().Value)
.ToArray();

You can convert all values to their normalized form and then call Distinct() on that:
string[] output = str1.Select(s => string.Join("-", s.Split('-').OrderBy(x => x)))
.Distinct()
.ToArray();
(This is based on the code from Ulugbek Umirov's answer.)
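Another option, sketched here under the assumption that the original spelling of each value should be kept, is to pass a custom IEqualityComparer<string> to Distinct. The class name UnorderedPairComparer is hypothetical.
```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: "u4-u1" and "u1-u4" are considered equal because
// their '-'-separated parts are compared in sorted order.
public sealed class UnorderedPairComparer : IEqualityComparer<string>
{
    private static string Normalize(string value) =>
        string.Join("-", value.Split('-').OrderBy(part => part, StringComparer.Ordinal));

    public bool Equals(string x, string y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return Normalize(x) == Normalize(y);
    }

    // Equal values must produce equal hash codes, so hash the normalized form.
    public int GetHashCode(string obj) =>
        obj == null ? 0 : Normalize(obj).GetHashCode();
}
```
It would be used as str1.Distinct(new UnorderedPairComparer()).ToArray(), which leaves two elements for the sample array. Which spelling of a pair survives depends on the order in which Distinct yields elements, so sort or normalize the result if that matters.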
