LINQ approach to parse lines with keys/values - c#

I have the following string
MyKey1=MyVal1
MyKey2=MyVal2
MyKey3=MyVal3
MyKey3=MyVal3
So first I need to split the string into lines, then split each line on the '=' character to get the key and value from that line. What I want as a result is a List<KeyValuePair<string, string>> (why not a Dictionary? Because there may be duplicate keys in the list), so I can't use the .ToDictionary() extension.
I'm pretty stuck with the following:
List<KeyValuePair<string, string>> fields =
(from lines in Regex.Split(input, @"\r?\n|\r", RegexOptions.None)
where !String.IsNullOrWhiteSpace(lines)
.Select(x => x.Split(new [] { '='}, 2, StringSplitOptions.RemoveEmptyEntries))
.ToList()
--> select new KeyValuePair? Or with 'let' for splitting by '='?
what about exception handling (e.g. ignoring empty values)

If you're concerned about duplicate keys, you could use an ILookup instead:
var fields =
(from line in Regex.Split(input, @"\r?\n|\r", RegexOptions.None)
select line.Split(new [] { '=' }, 2))
.ToLookup(x => x[0], x => x[1]);
var items = fields["MyKey3"]; // [ "MyVal3", "MyVal3" ]

You could use a Lookup<TKey, TValue> instead of a dictionary:
var keyValLookup = text.Split(new[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries)
.Select(l =>
{
var keyVal = l.Split('=');
return new { Key = keyVal[0].Trim(), Value = keyVal.ElementAtOrDefault(1) };
})
.Where(x => x.Key.Length > 0) // not required, just to show how to handle invalid data
.ToLookup(x => x.Key, x => x.Value);
IEnumerable<string> values = keyValLookup["MyKey3"];
Console.Write(string.Join(", ",values)); // MyVal3, MyVal3
A lookup always returns a sequence, even if the key is not present; in that case the sequence is empty. The keys do not have to be unique, so you don't need to group or remove duplicates before you call ToLookup.
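To make the two properties above concrete, here is a minimal sketch. The class and method names (LookupSketch, Parse) are made up for illustration, and it assumes that lines without an '=' or with an empty key should simply be skipped:

```csharp
using System;
using System.Linq;

public static class LookupSketch
{
    // Hypothetical helper: parses "key=value" lines into a lookup.
    // Lines without '=' or with an empty key are skipped (an assumption).
    public static ILookup<string, string> Parse(string text)
    {
        return text
            .Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(line => line.Split(new[] { '=' }, 2))
            .Where(parts => parts.Length == 2 && parts[0].Trim().Length > 0)
            .ToLookup(parts => parts[0].Trim(), parts => parts[1].Trim());
    }
}
```

With the input from the question, Parse(input)["MyKey3"] yields both "MyVal3" entries, while Parse(input)["Missing"] yields an empty sequence rather than throwing. If you need to tell "no such key" apart from "key with no values", ILookup also has a Contains method.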

You're pretty close (I changed your example to all method syntax for consistency):
List<KeyValuePair<string, string>> fields =
Regex.Split(input, @"\r?\n|\r", RegexOptions.None)
.Where(s => !String.IsNullOrWhiteSpace(s))
.Select(x => x.Split(new [] {'='}, 2, StringSplitOptions.RemoveEmptyEntries))
.Where(p => p.Length == 2) // to avoid IndexOutOfRangeException
.Select(p => new KeyValuePair<string, string>(p[0], p[1]))
.ToList();
Although I agree with Jon's comment that a grouping would be cleaner if you have duplicate keys:
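IEnumerable<IGrouping<string, string>> fields =
Regex.Split(input, @"\r?\n|\r", RegexOptions.None)
.Where(s => !String.IsNullOrWhiteSpace(s))
.Select(x => x.Split(new [] {'='}, 2, StringSplitOptions.RemoveEmptyEntries))
.Where(p => p.Length == 2) // skip lines without both a key and a value
.GroupBy(p => p[0], p => p[1]); // key selector, then value selector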

I suggest you try matching the Key/Value instead of splitting. If you want a dictionary with multiple values for a key, you could use ToLookup (an ILookup):
var result = Regex.Matches(input, @"(?<key>[^=\r\n]+)=(?<value>[^=\r\n]+)")
.OfType<Match>()
.ToLookup(m => m.Groups["key"].Value,
m => m.Groups["value"].Value);
If you need to add to that list later on or want to keep using a list:
var result = Regex.Matches(input, @"(?<key>[^=\r\n]+)=(?<value>[^=\r\n]+)")
.OfType<Match>()
.Select(m => new KeyValuePair<string, string>(m.Groups["key"].Value, m.Groups["value"].Value))
.ToList();
Note: the Regex used might not be suited for your uses as we don't know the inputs you might have.

Related

How to modify string list for duplicate values?

I am working on an ASP.NET Core MVC project. I want to merge the entries of a string list that share the same key into a single entry with comma-separated values.
List<string> stringList = surveylist.Split('&').ToList();
This produces the following list:
7=55
6=33
5=MCC
4=GHI
3=ABC
1003=DEF
1003=ABC
1=JKL
And I want to change output like this
7=55
6=33
5=MCC
4=GHI
3=ABC
1003=DEF,ABC
1=JKL
Duplicate items values should be comma separated.
There are probably 20 ways to do this. One simple one would be:
List<string> newStringList = stringList
.Select(a => new { KeyValue = a.Split("=") })
.GroupBy(a => a.KeyValue[0])
.Select(a => $"{a.Select(x => x.KeyValue[0]).First()}={string.Join(",", a.Select(x => x.KeyValue[1]))}")
.ToList();
Take a look at your output. Notice that an equal sign separates each string into a key-value pair. Think about how you want to approach this problem. Is a list of strings really the structure you want to build on? You could take a different approach and use a list of KeyValuePairs or a Dictionary instead.
If you really need to do it with a List, then look at the methods LINQ's Enumerable has to offer. Namely Select and GroupBy.
You can use Select to split once more on the equal sign: .Select(s => s.Split('=')).
You can use GroupBy to group values by a key: .GroupBy(pair => pair[0]).
To join it back to a string, you can use a Select again.
An end result could look something like this:
List<string> stringList = values.Split('&')
.Select(s => {
string[] pair = s.Split('=');
return new { Key = pair[0], Value = pair[1] };
})
.GroupBy(pair => pair.Key)
.Select(g => string.Concat(
g.Key,
'=',
string.Join(
",",
g.Select(pair => pair.Value)
)
))
.ToList();
The group contains pairs so you need to select the value of each pair and join them into a string.
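If you follow the earlier suggestion and keep the data in a dictionary instead of a list of strings, a rough sketch could look like the following. SurveySketch, ToMultiMap and Format are hypothetical names, and skipping items that have no '=' is an assumption about how you want to treat bad input:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SurveySketch
{
    // Hypothetical helper: maps each key to all of its values, in input order.
    public static Dictionary<string, List<string>> ToMultiMap(string surveyList)
    {
        return surveyList
            .Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(item => item.Split(new[] { '=' }, 2))
            .Where(pair => pair.Length == 2)            // skip items without '='
            .GroupBy(pair => pair[0], pair => pair[1])  // key, then value
            .ToDictionary(group => group.Key, group => group.ToList());
    }

    // Formats one entry back into the "key=v1,v2" shape from the question.
    public static string Format(KeyValuePair<string, List<string>> entry)
        => entry.Key + "=" + string.Join(",", entry.Value);
}
```

The dictionary keeps the values separate, so you only join them with commas at the point where you actually need a string, for example with map.Select(SurveySketch.Format).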

Using C# Lambda to split string and search value

I have a string with the following value:
0:12211,90:33221,23:09011
In each pair, the first value (before the : (colon)) is an employee id, the second value after is a payroll id.
So to get the payroll id for employee id 23, right now I have to do:
var arrayValues = mystring.Split(',');
and then, for each arrayValue in arrayValues, do the same:
var employeeData = arrayValue.Split(':');
That way I will get the key and the value.
Is there a way to get the Payroll ID by a given employee id using lambda?
If the employeeId is not in the string then by default it should return the payrollid for employeeid 0 zero.
Using a Linq pipeline and anonymous objects:
"0:12211,90:33221,23:09011"
.Split(',')
.Select(x => x.Split(':'))
.Select(x => new { employeeId = x[0], payrollId = x[1] })
.Where(x=> x.employeeId == "23")
Results in this:
{
employeeId = "23",
payrollId = "09011"
}
These three lines represent your data processing and projection logic:
.Split(',')
.Select(x => x.Split(':'))
.Select(x => new { employeeId = x[0], payrollId = x[1] })
Then you can add any filtering logic with Where after the second Select.
You can try something like this:
"0:12211,90:33221,23:09011"
.Split(new char[] { ',' })
.Select(c => {
var pair = c.Split(new char[] { ':' });
return new KeyValuePair<string, string>(pair[0], pair[1]);
})
.ToList();
Keep in mind that you need to validate the data: this throws if a pair does not contain a colon.
If I were you, I'd use a dictionary. Especially if you're going to do more than one lookup.
Dictionary<int, int> employeeIDToPayrollID = "0:12211,90:33221,23:09011"
.Split(',') //Split on comma into ["0:12211", "90:33221", "23:09011"]
.Select(x => x.Split(':')) //Split each string on colon into [ ["0", "12211"]... ]
.ToDictionary(x => int.Parse(x[0]), x => int.Parse(x[1]));
and now you just have to write employeeIDToPayrollID[0] to get 12211 back. Note that int.Parse will throw an exception if your IDs aren't integers, and that it drops leading zeros (09011 becomes 9011). You can remove those calls if you want a Dictionary<string, string> instead.
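If throwing on bad input is not what you want, one possible variant uses int.TryParse and silently skips malformed pairs. This is only a sketch: PayrollSketch, Parse and PayrollFor are made-up names, and "skip bad pairs, last duplicate wins" is an assumption rather than something the question specifies:

```csharp
using System;
using System.Collections.Generic;

public static class PayrollSketch
{
    // Hypothetical helper: keeps only pairs whose two halves both parse as integers.
    public static Dictionary<int, int> Parse(string source)
    {
        var result = new Dictionary<int, int>();
        foreach (var pair in source.Split(','))
        {
            var parts = pair.Split(':');
            if (parts.Length == 2
                && int.TryParse(parts[0], out var employeeId)
                && int.TryParse(parts[1], out var payrollId))
            {
                result[employeeId] = payrollId; // on duplicate IDs the last one wins
            }
        }
        return result;
    }

    // Falls back to employee 0, as the question asks; throws if 0 is missing too.
    public static int PayrollFor(Dictionary<int, int> map, int employeeId)
        => map.TryGetValue(employeeId, out var payrollId) ? payrollId : map[0];
}
```

Because the values are stored as int, the payroll ID for employee 23 comes back as 9011, not "09011". If the leading zero matters, keep the values as strings.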
You can use string.Split along with string.Substring.
var result =
str.Split(',')
.Where(s => s.Substring(0,s.IndexOf(":",StringComparison.Ordinal)) == "23")
.Select(s => s.Substring(s.IndexOf(":",StringComparison.Ordinal) + 1))
.FirstOrDefault();
If this logic will be used more than once, I'd put it in a method:
public string GetPayrollIdByEmployeeId(string source, string employeeId){
return source.Split(',')
.Where(s => s.Substring(0, s.IndexOf(":", StringComparison.Ordinal)) == employeeId)
.Select(s => s.Substring(s.IndexOf(":", StringComparison.Ordinal) + 1))
.FirstOrDefault();
}
Assuming you have more than three pairs in the string (how long is that string, anyway?) you can convert it to a Dictionary and use that going forward.
First, split on the comma and then on the colon and put in a Dictionary:
var empInfo = src.Split(',').Select(p => p.Split(':')).ToDictionary(pa => pa[0], pa => pa[1]);
Now, you can write a function to lookup payroll IDs from employee IDs:
string LookupPayrollID(Dictionary<string, string> empInfo, string empID) => empInfo.TryGetValue(empID, out var prID) ? prID : empInfo["0"];
And you can call it to get the answer:
var emp23prid = LookupPayrollID(empInfo, "23");
var emp32prid = LookupPayrollID(empInfo, "32");
If you just have three employees in the string, creating a Dictionary is probably overkill and a simpler answer may be appropriate, such as searching the string.

Saving a split string to an arraylist using LINQ

I have some code that takes a string and processes it by splitting it into words, and giving the count of each word.
The trouble is that it only returns void, because all I can do is print to the screen after the processing is done. Is there any way I can save the results in an ArrayList, so that I can return it to the method that called it?
The current code:
message.Split(' ').Where(messagestr => !string.IsNullOrEmpty(messagestr))
.GroupBy(messagestr => messagestr).OrderByDescending(groupCount => groupCount.Count())
.Take(20).ToList().ForEach(groupCount => Console.WriteLine("{0}\t{1}", groupCount.Key, groupCount.Count()));
Thank you.
Try this code
var wordCountList = message.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
.GroupBy(messagestr => messagestr)
.OrderByDescending(grp => grp.Count())
.Take(20) //or take the whole
.Select(grp => new KeyValuePair<string, int>(grp.Key, grp.Count()))
.ToList(); //return wordCountList
//usage
wordCountList.ForEach(item => Console.WriteLine("{0}\t{1}", item.Key, item.Value));
If you want, you can return the wordCountList which is a List<KeyValuePair<string, int>> containing all the words and their counts in descending order.
How you can use that, is also shown in the last line.
And rather than taking first 20 from the list, if you want to take the whole, remove this .Take(20) part.
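Since the question is about returning the result to the caller, the same pipeline could be wrapped in a method along these lines. WordCounter, CountWords and the top parameter are illustrative names, not part of the original code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordCounter
{
    // Hypothetical method: returns word counts, most frequent words first.
    // 'top' limits the number of entries, mirroring the Take(20) above.
    public static List<KeyValuePair<string, int>> CountWords(string message, int top = 20)
    {
        return message
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .GroupBy(word => word)
            .OrderByDescending(group => group.Count())
            .Take(top)
            .Select(group => new KeyValuePair<string, int>(group.Key, group.Count()))
            .ToList();
    }
}
```

The caller then decides what to do with the list, for example printing each item.Key and item.Value, instead of the counting code writing to the console itself.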
First of all, by calling Take(20) you keep only the first 20 words and discard the rest. So, if you want all the results, remove it.
After that, you can do it like this:
var words = message.Split(' ').
Where(messagestr => !string.IsNullOrEmpty(messagestr)).
GroupBy(messagestr => messagestr).
OrderByDescending(groupCount => groupCount.Count()).
ToList();
words.ForEach(groupCount => Console.WriteLine("{0}\t{1}", groupCount.Key, groupCount.Count()));
To put the results into some other data structure, you can use one of these ways:
var w = words.Select(g => g.Key).ToList(); // all the distinct words, as a list
// OR Use Dictionary
var dic = new Dictionary<string, int>();
foreach(var item in words)
{
dic.Add(item.Key, item.Count());
}

Convert Colon Separated String to a Dictionary<string, string>

I have a string Number1.pdf:Alpha1.pdf; Number2.pdf:Alpha2.pdf; Number3.pdf:Alpha3.pdf; and I would like to convert it to a Dictionary.
Dictionary<Number1,Alpha1> etc.
I looked for examples online, but most of them convert a Dictionary to a String. Can someone help me?
I would go with LINQ:
var input = "Number1.pdf:Alpha1.pdf; Number2.pdf:Alpha2.pdf; Number3.pdf:Alpha3.pdf;";
var items = input.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
var result = items.Select(x => x.Split(':'))
.ToDictionary(x => x[0].Split('.').First().Trim(),
x => x[1].Split('.').First().Trim());
It will skip .pdf at the end of both keys and values (as described in question).
foreach (var i in result)
Console.WriteLine(i);
prints
[Number1, Alpha1]
[Number2, Alpha2]
[Number3, Alpha3]
string s = "Number1.pdf:Alpha1.pdf; Number2.pdf:Alpha2.pdf; Number3.pdf:Alpha3.pdf;";
var names = s.Replace(".pdf","")
.Split(";".ToCharArray(), StringSplitOptions.RemoveEmptyEntries)
.Select(x => x.Split(':'))
.ToDictionary(x => x[0].Trim(), x => x[1]);
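Both answers above strip the extension by string manipulation, which assumes that ".pdf" only ever appears as the extension and that names contain no other dots. As a sketch of an alternative, Path.GetFileNameWithoutExtension removes only the final extension. PdfMapSketch and Parse are made-up names, and ignoring entries that have no colon is an assumption:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class PdfMapSketch
{
    // Hypothetical helper: "A.pdf:B.pdf; C.pdf:D.pdf;" -> { A: B, C: D }.
    public static Dictionary<string, string> Parse(string input)
    {
        return input
            .Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(item => item.Split(':'))
            .Where(parts => parts.Length == 2)  // ignore entries without exactly one colon
            .ToDictionary(
                parts => Path.GetFileNameWithoutExtension(parts[0].Trim()),
                parts => Path.GetFileNameWithoutExtension(parts[1].Trim()));
    }
}
```

Like ToDictionary in the answers above, this throws an ArgumentException if the same key occurs twice.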

IEnumerable<string> to Dictionary<char, IEnumerable<string>>

I suppose this question might partially duplicate other similar questions, but I'm having trouble with the following situation:
I want to extract the words from a sentence.
For example, from
string sentence = "We can store these chars in separate variables. We can also test against other string characters.";
I want to build an IEnumerable<string> words;
var separators = new[] {',', ' ', '.'};
IEnumerable<string> words = sentence.Split(separators, StringSplitOptions.RemoveEmptyEntries);
After that, go through all these words and collect their first characters into a distinct collection of characters in ascending order.
var firstChars = words.Select(x => x.ToCharArray().First()).OrderBy(x => x).Distinct();
After that, go through both collections and, for each character in firstChars, get all items from words whose first character equals the current character, and create a Dictionary<char, IEnumerable<string>> dictionary.
I'm doing it this way:
var dictionary = (from k in firstChars
from v in words
where v.ToCharArray().First().Equals(k)
select new { k, v })
.ToDictionary(x => x);
and here is the problem: An item with the same key has already been added.
This is because the query tries to add a character that is already in the dictionary.
I added a GroupBy to my query:
var dictionary = (from k in firstChars
from v in words
where v.ToCharArray().First().Equals(k)
select new { k, v })
.GroupBy(x => x)
.ToDictionary(x => x);
The solution above avoids the exception, but it gives me a different type than I need.
What should I do to get a Dictionary<char, IEnumerable<string>> as the result, rather than a dictionary keyed by IGrouping<'a, 'a>?
The result I want is as in the image below:
But here I have to iterate with two foreach loops to show what I want, and I don't really understand how this happens.
Any suggestions and advice are welcome. Thank you.
As the relation is one to many, you can use a lookup instead of a dictionary:
var lookup = words.ToLookup(word => word[0]);
lookup['s'] // store, separate, ... as an IEnumerable<string>
And if you want to display the key/values sorted by first char:
foreach (var sortedEntry in lookup.OrderBy(entry => entry.Key))
{
Console.WriteLine(string.Format("First letter: {0}", sortedEntry.Key));
foreach (string word in sortedEntry)
{
Console.WriteLine(word);
}
}
You can do this:
var words = ...
var dictionary = words.GroupBy(w => w[0])
.ToDictionary(g => g.Key, g => g.AsEnumerable());
But for that matter, why not use an ILookup?
var lookup = words.ToLookup(w => w[0]);
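Putting the pieces together for the sentence from the question, a self-contained sketch of the dictionary version might look like this. FirstLetterSketch and GroupByFirstLetter are hypothetical names; the separators are the ones the question uses:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstLetterSketch
{
    // Hypothetical helper: groups the words of a sentence by their first character.
    public static Dictionary<char, IEnumerable<string>> GroupByFirstLetter(string sentence)
    {
        var separators = new[] { ',', ' ', '.' };
        return sentence
            .Split(separators, StringSplitOptions.RemoveEmptyEntries)
            .GroupBy(word => word[0])
            .ToDictionary(group => group.Key, group => group.AsEnumerable());
    }
}
```

Note that a Dictionary does not promise any enumeration order, so if you want the keys in ascending order, sort when you read them (for example with OrderBy(entry => entry.Key)) rather than before building the dictionary. The lookup is case-sensitive: 'W' and 'w' would be different keys.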
