IEnumerable<string> to Dictionary<char, IEnumerable<string>> - c#

I suppose this question may partially duplicate other similar questions, but I'm having trouble with the following situation.
I want to extract the words from a string that contains sentences.
For example, from
`string sentence = "We can store these chars in separate variables. We can also test against other string characters.";`
I want to build an IEnumerable<string> of words:
var separators = new[] {',', ' ', '.'};
IEnumerable<string> words = sentence.Split(separators, StringSplitOptions.RemoveEmptyEntries);
After that, I go through all these words and take the first character of each into a distinct collection of characters in ascending order.
var firstChars = words.Select(x => x.ToCharArray().First()).OrderBy(x => x).Distinct();
After that, I go through both collections and, for each character in firstChars, get all the items from words whose first character equals the current character, in order to create a Dictionary<char, IEnumerable<string>>.
I'm doing it this way:
var dictionary = (from k in firstChars
from v in words
where v.ToCharArray().First().Equals(k)
select new { k, v })
.ToDictionary(x => x);
and here is the problem: An item with the same key has already been added.
This is because the query tries to add a character that already exists in the dictionary as a key.
I added a GroupBy call to my query:
var dictionary = (from k in firstChars
from v in words
where v.ToCharArray().First().Equals(k)
select new { k, v })
.GroupBy(x => x)
.ToDictionary(x => x);
The solution above runs without errors, but it gives me a different type than the one I need.
What should I do to get a Dictionary<char, IEnumerable<string>> as the result, rather than a dictionary keyed by IGrouping<'a, 'a>?
The result I want is shown in the image below:
But here I have to iterate with two foreach loops to show what I want, and I don't fully understand how this happens.
Any suggestions and advice are welcome. Thank you.

As the relation is one to many, you can use a lookup instead of a dictionary:
var lookup = words.ToLookup(word => word[0]);
lookup['s'] -> store, separate... as an IEnumerable<string>
And if you want to display the key/values sorted by first char:
foreach (var sortedEntry in lookup.OrderBy(entry => entry.Key))
{
Console.WriteLine(string.Format("First letter: {0}", sortedEntry.Key));
foreach (string word in sortedEntry)
{
Console.WriteLine(word);
}
}

You can do this:
var words = ...
var dictionary = words.GroupBy(w => w[0])
.ToDictionary(g => g.Key, g => g.AsEnumerable());
But for that matter, why not use an ILookup?
var lookup = words.ToLookup(w => w[0]);
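To tie the pieces together, here is a minimal sketch that goes from the sentence in the question to a Dictionary<char, IEnumerable<string>>. The class and method names (FirstLetterIndex, Build) are made up for illustration and are not part of any library. Note that Dictionary makes no promise about enumeration order, so sort the keys when you display them rather than when you build the dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, named for illustration only.
public static class FirstLetterIndex
{
    private static readonly char[] Separators = { ',', ' ', '.' };

    // Splits the sentence into words and groups the words by their first character.
    public static Dictionary<char, IEnumerable<string>> Build(string sentence)
    {
        return sentence
            .Split(Separators, StringSplitOptions.RemoveEmptyEntries)
            .GroupBy(word => word[0])
            // AsEnumerable exposes each group as IEnumerable<string>,
            // so the value type is not IGrouping<char, string>.
            .ToDictionary(group => group.Key, group => group.AsEnumerable());
    }
}
```

For display you could iterate with foreach (var entry in FirstLetterIndex.Build(sentence).OrderBy(e => e.Key)) and print entry.Key followed by each word in entry.Value.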

Related

How to modify string list for duplicate values?

I am working on an ASP.NET Core MVC project. I want to merge entries with duplicate keys in a list of strings into a single entry with comma-separated values.
I have a list of strings:
List<string> stringList = surveylist.Split('&').ToList();
This generates the following output:
7=55
6=33
5=MCC
4=GHI
3=ABC
1003=DEF
1003=ABC
1=JKL
And I want to change the output to this:
7=55
6=33
5=MCC
4=GHI
3=ABC
1003=DEF,ABC
1=JKL
The values of duplicate items should be comma-separated.
There are probably 20 ways to do this. One simple one would be:
List<string> newStringList = stringList
.Select(a => new { KeyValue = a.Split("=") })
.GroupBy(a => a.KeyValue[0])
.Select(a => $"{a.Select(x => x.KeyValue[0]).First()}={string.Join(",", a.Select(x => x.KeyValue[1]))}")
.ToList();
Take a look at your output. Notice that an equal sign separates each string into a key-value pair. Think about how you want to approach this problem. Is a list of strings really the structure you want to build on? You could take a different approach and use a list of KeyValuePairs or a Dictionary instead.
If you really need to do it with a List, then look at the methods LINQ's Enumerable has to offer. Namely Select and GroupBy.
You can use Select to split once more on the equal sign: .Select(s => s.Split('=')).
You can use GroupBy to group values by a key: .GroupBy(pair => pair[0]).
To join it back to a string, you can use a Select again.
An end result could look something like this:
List<string> stringList = values.Split('&')
.Select(s => {
string[] pair = s.Split('=');
return new { Key = pair[0], Value = pair[1] };
})
.GroupBy(pair => pair.Key)
.Select(g => string.Concat(
g.Key,
'=',
string.Join(
",",
g.Select(pair => pair.Value)
)
))
.ToList();
The group contains pairs so you need to select the value of each pair and join them into a string.
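The same idea can be packaged as a small method. This is only a sketch: it assumes the input is the '&'-separated string from the question, the names SurveyPairs and MergeDuplicateKeys are invented for illustration, and items without an '=' are silently skipped, which may or may not be what you want.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, named for illustration only.
public static class SurveyPairs
{
    // Turns "1003=DEF&1003=ABC&1=JKL" into ["1003=DEF,ABC", "1=JKL"].
    public static List<string> MergeDuplicateKeys(string surveyList)
    {
        return surveyList
            .Split('&')
            // Split on the first '=' only, so a value may itself contain '='.
            .Select(item => item.Split(new[] { '=' }, 2))
            // Assumption: malformed items (no '=') are dropped.
            .Where(parts => parts.Length == 2)
            // Group the values by key; keys keep their first-seen order.
            .GroupBy(parts => parts[0], parts => parts[1])
            .Select(group => group.Key + "=" + string.Join(",", group))
            .ToList();
    }
}
```

Using the two-argument GroupBy (key selector plus element selector) means each group already contains just the values, so no second Select over pairs is needed before joining.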

How to custom sort a list of strings given a desired sorting hierarchy?

I would like to sort a List<string> in a particular way. Below is a unit test showing the input, the specific way (which I am calling a "hierarchy" - feel free to correct my terminology so that I may learn), and the desired output. The code should be self explanatory.
[Test]
public void CustomSortByHierarchy()
{
List<string> input = new List<string>{"TJ", "DJ", "HR", "HR", "TJ"};
List<string> hierarchy = new List<string>{"HR", "TJ", "DJ" };
List<string> sorted = input.Sort(hierarchy); // <-- does not compile. How do I sort by the hierarchy?
// ...and if the sort worked as desired, these assert statements would return true:
Assert.AreEqual("HR", sorted[0]);
Assert.AreEqual("HR", sorted[1]);
Assert.AreEqual("TJ", sorted[2]);
Assert.AreEqual("TJ", sorted[3]);
Assert.AreEqual("DJ", sorted[4]);
}
Another way to do it:
var hierarchy = new Dictionary<string, int>{
{ "HR", 1},
{ "TJ", 2},
{ "DJ", 3} };
var sorted = input.OrderBy(s => hierarchy[s]).ToList();
There are so many ways to do this.
It's not great to create a static dictionary - especially when you have a static list of the values already in the order that you want (i.e. List<string> hierarchy = new List<string>{"HR", "TJ", "DJ" };). The problem with a static dictionary is that it is static - to change it you must recompile your program - and also it's prone to errors - you might mistype a number. It's best to dynamically create the dictionary. That way you can adjust your hierarchy at run-time and use it to order your input.
Here's the basic way to create the dictionary:
Dictionary<string, int> indices =
hierarchy
.Select((value, index) => new { value, index })
.ToDictionary(x => x.value, x => x.index);
Then it's an easy sort:
List<string> sorted = input.OrderBy(x => indices[x]).ToList();
However, if the input contains a value that is missing from the hierarchy, this will blow up with a KeyNotFoundException.
Try with this input:
List<string> input = new List<string> { "TJ", "DJ", "HR", "HR", "TJ", "XX" };
You need to decide whether to remove the missing items from the list or to place them at the end of the list.
To remove you'd do this:
List<string> sorted =
input
.Where(x => indices.ContainsKey(x))
.OrderBy(x => indices[x])
.ToList();
Or to sort to the end you'd do this:
List<string> sorted =
input
.OrderBy(x => indices.ContainsKey(x) ? indices[x] : int.MaxValue)
.ThenBy(x => x) // groups missing items together and is optional
.ToList();
If you simply want to remove items from input that aren't in hierarchy then there are a couple of other options that might be appealing.
Try this:
List<string> sorted =
(
from x in input
join y in hierarchy.Select((value, index) => new { value, index })
on x equals y.value
orderby y.index
select x
).ToList();
Or this:
ILookup<string, string> lookup = input.ToLookup(x => x);
List<string> sorted = hierarchy.SelectMany(x => lookup[x]).ToList();
Personally, I like this last one. It's a two liner and it doesn't rely on indices at all.
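If you want something close to the input.Sort(hierarchy) call from the question, the dictionary approach can be wrapped in an extension method. This is a sketch under a few assumptions: the name SortByHierarchy is invented (List<string> already has its own Sort methods), items missing from the hierarchy go to the end in their original relative order, and the hierarchy must not contain duplicates or ToDictionary will throw.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical extension method, named for illustration only.
public static class HierarchySortExtensions
{
    public static List<string> SortByHierarchy(
        this IEnumerable<string> input, IEnumerable<string> hierarchy)
    {
        // Map each hierarchy value to its position.
        Dictionary<string, int> indices = hierarchy
            .Select((value, index) => new { value, index })
            .ToDictionary(x => x.value, x => x.index);

        // OrderBy is a stable sort, so unknown items keep their relative order.
        return input
            .OrderBy(x => indices.TryGetValue(x, out var rank) ? rank : int.MaxValue)
            .ToList();
    }
}
```

With this in scope, the line in the unit test becomes List<string> sorted = input.SortByHierarchy(hierarchy);.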

LINQ query that combines grouping and sorting

I am relatively new to LINQ and currently working on a query that combines grouping and sorting. I am going to start with an example here. Basically I have an arbitrary sequence of numbers represented as strings:
List<string> sNumbers = new List<string> {"34521", "38450", "138477", "38451", "28384", "13841", "12345"};
I need to find all sNumbers in this list that contain a search pattern (say "384"),
then return the filtered sequence such that the sNumbers that start with the search pattern ("384") come first, followed by the remaining sNumbers that contain the search pattern somewhere. So it will look like this (please also notice the alphabetical sort within the groups):
{"38450", "38451", "13841", "28384", "138477"}
Here is how I have started:
outputlist = (from n in sNumbers
where n.Contains(searchPattern)
select n).ToList();
So now we have all the numbers that contain the search pattern, and this is where I am stuck. I know that at this point I need to 'group' the results into two sequences: one with the numbers that start with the search pattern and one with those that don't. Then I need to apply a secondary alphabetical sort within each group. How do I write a query that combines all that?
I don't think you need any grouping or list splitting to get the desired result, so instead of an answer about combining and grouping, here is what I would do:
sNumbers.Where(x=>x.Contains(pattern))
.OrderByDescending(x => x.StartsWith(pattern)) // first criteria
.ThenBy(x=>Convert.ToInt32(x)) // this does the trick instead of GroupBy
.ToList();
This seems fairly straight forward, unless I've misunderstood something:
List<string> outputlist =
sNumbers
.Where(n => n.Contains("384"))
.OrderBy(n => int.Parse(n))
.OrderByDescending(n => n.StartsWith("384"))
.ToList();
I get this:
var result = sNumbers
.Where(e => e.StartsWith("384"))
.OrderBy(e => Int32.Parse(e))
.Union(sNumbers
.Where(e => e.Contains("384"))
.OrderBy(e => Int32.Parse(e)));
Here is the optimized version, which needs only one LINQ statement:
string match = "384";
List<string> sNumbers = new List<string> {"34521", "38450", "138477", "38451", "28384", "13841", "12345"};
// That's all it is
var result =
(from x in sNumbers
group x by new { Start = x.StartsWith(match), Contain = x.Contains(match)}
into g
where g.Key.Start || g.Key.Contain
orderby !g.Key.Start
select g.OrderBy(Convert.ToInt32)).SelectMany(x => x);
result.ToList().ForEach(x => Console.Write(x + " "));
Steps:
1.) Group into group g based on StartsWith and Contains
2.) Just select those groups which contain the match
3.) Order by the inverse of the StartsWith key (So that StartsWith = true comes before StartsWith = false)
4.) Select the sorted list of elements of both groups
5.) Do a flatMap (SelectMany) over both lists to receive one final result list
Here is an unoptimized version:
string match = "384";
List<string> sNumbers = new List<string> {"34521", "38450", "138477", "38451", "28384", "13841", "12345"};
var matching = from x in sNumbers
where x.StartsWith(match)
orderby Convert.ToInt32(x)
select x;
var nonMatching = from x in sNumbers
where !x.StartsWith(match) && x.Contains(match)
orderby Convert.ToInt32(x)
select x;
var result = matching.Concat(nonMatching);
result.ToList().ForEach(x => Console.Write(x + " "));
LINQ has an OrderBy overload that lets you supply a custom comparer class to decide how things should be sorted. Look here: https://msdn.microsoft.com/en-us/library/bb549422(v=vs.100).aspx
You can then write your own IComparer<string> class that takes a value in its constructor and has a Compare method that prefers strings starting with that value.
Something like this maybe:
public class CompareStringsWithPreference : IComparer<string> {
private readonly string _valueToPrefer;
public CompareStringsWithPreference(string valueToPrefer) {
_valueToPrefer = valueToPrefer;
}
public int Compare(string s1, string s2) {
// Both start with the preferred value, or neither does: fall back to a case-insensitive comparison
if ((s1.StartsWith(_valueToPrefer) && s2.StartsWith(_valueToPrefer)) ||
(!s1.StartsWith(_valueToPrefer) && !s2.StartsWith(_valueToPrefer)))
return string.Compare(s1, s2, true);
// Exactly one of them starts with the preferred value: that one sorts first
if (s1.StartsWith(_valueToPrefer)) return -1;
return 1;
}
}
Then use it like this:
outputlist = (from n in sNumbers
where n.Contains(searchPattern)
select n).OrderBy(n => n, new CompareStringsWithPreference(searchPattern)).ToList();
You can create one list with the strings that start with searchPattern and another with the strings that contain searchPattern but do not start with it (to avoid repeating elements in both lists):
string searchPattern = "384";
List<string> sNumbers = new List<string> { "34521", "38450", "138477", "38451", "28384", "13841", "12345" };
var list1 = sNumbers.Where(s => s.StartsWith(searchPattern)).OrderBy(s => s).ToList();
var list2 = sNumbers.Where(s => !s.StartsWith(searchPattern) && s.Contains(searchPattern)).OrderBy(s => s).ToList();
var outputList = new List<string>();
outputList.AddRange(list1);
outputList.AddRange(list2);
Sorry guys, after reading through the responses, I realize that I made a mistake in my question. The correct output would be as follows (sorted by "starts with" first and then alphabetically, not numerically):
// output: {"38450", "38451", "13841", "138477", "28384"}
I was able to achieve that with the following query:
string searchPattern = "384";
List<string> result =
sNumbers
.Where(n => n.Contains(searchPattern))
.OrderBy(s => !s.StartsWith(searchPattern))
.ThenBy(s => s)
.ToList();
Thanks
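For completeness, here is a self-contained sketch of that final query as a reusable method. The names PatternSearch and FindAndRank are invented for illustration. It also passes ordinal comparers explicitly, which is my own assumption about what "alphabetically" should mean for digit strings: the default StartsWith(string) and ThenBy comparisons are culture-sensitive.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, named for illustration only.
public static class PatternSearch
{
    // Keeps the strings containing the pattern; those that start with it come
    // first, and each of the two groups is sorted alphabetically.
    public static List<string> FindAndRank(IEnumerable<string> numbers, string pattern)
    {
        return numbers
            .Where(n => n.Contains(pattern))
            // true sorts after false, so order descending to put "starts with" first.
            .OrderByDescending(n => n.StartsWith(pattern, StringComparison.Ordinal))
            .ThenBy(n => n, StringComparer.Ordinal)
            .ToList();
    }
}
```

Because ThenBy only breaks ties left by the first key, no explicit grouping is needed: the "starts with" flag acts as the group and the string itself as the order within it.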

Saving a split string to an arraylist using LINQ

I have some code that takes a string and processes it by splitting it into words, and giving the count of each word.
The trouble is that the method returns void, because all I can do is print to the screen once the processing is done. Is there any way I can save the results in an ArrayList, so that I can return them to the calling method?
The current code:
message.Split(' ').Where(messagestr => !string.IsNullOrEmpty(messagestr))
.GroupBy(messagestr => messagestr).OrderByDescending(groupCount => groupCount.Count())
.Take(20).ToList().ForEach(groupCount => Console.WriteLine("{0}\t{1}", groupCount.Key, groupCount.Count()));
Thank you.
Try this code
var wordCountList = message.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
.GroupBy(messagestr => messagestr)
.OrderByDescending(grp => grp.Count())
.Take(20) //or take the whole
.Select(grp => new KeyValuePair<string, int>(grp.Key, grp.Count()))
.ToList(); //return wordCountList
//usage
wordCountList.ForEach(item => Console.WriteLine("{0}\t{1}", item.Key, item.Value));
If you want, you can return wordCountList, which is a List<KeyValuePair<string, int>> containing the words and their counts in descending order of count.
The last line shows how you can use it.
If you want the whole list rather than only the first 20 entries, remove the .Take(20) call.
First of all, by calling Take(20) you take only the first 20 words and discard the rest. So, if you want all the results, remove it.
After that, you can do it like this:
var words = message.Split(' ').
Where(messagestr => !string.IsNullOrEmpty(messagestr)).
GroupBy(messagestr => messagestr).
OrderByDescending(groupCount => groupCount.Count()).
ToList();
words.ForEach(groupCount => Console.WriteLine("{0}\t{1}", groupCount.Key, groupCount.Count()));
To put the results into some other data structure, you can use one of these ways:
var w = words.Select(x => x.Key).ToList(); // Add this line to get all the distinct words in a list
// OR Use Dictionary
var dic = new Dictionary<string, int>();
foreach(var item in words)
{
dic.Add(item.Key, item.Count());
}
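The foreach loop above can also be replaced by ToDictionary, and the whole thing wrapped in methods that return a value instead of printing. This is a rough sketch with invented names (WordCounter, CountWords, TopWords). Since Dictionary does not guarantee any enumeration order, the ordering is applied only when the top entries are requested, with ties broken by the word itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, named for illustration only.
public static class WordCounter
{
    // Returns every distinct word with the number of times it occurs.
    public static Dictionary<string, int> CountWords(string message)
    {
        return message
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .GroupBy(word => word)
            .ToDictionary(group => group.Key, group => group.Count());
    }

    // Returns the most frequent words first, at most 'limit' of them.
    public static List<KeyValuePair<string, int>> TopWords(string message, int limit)
    {
        return CountWords(message)
            .OrderByDescending(pair => pair.Value)
            .ThenBy(pair => pair.Key, StringComparer.Ordinal)
            .Take(limit)
            .ToList();
    }
}
```

The caller can then decide what to do with the result, for example foreach (var pair in WordCounter.TopWords(message, 20)) Console.WriteLine("{0}\t{1}", pair.Key, pair.Value);.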

Converting Collection of Strings to Dictionary

This is probably a simple question, but the answer is eluding me.
I have a collection of strings that I'm trying to convert to a dictionary.
Each string in the collection is a comma-separated list of values that I obtained from a regex match. I would like the key for each entry in the dictionary to be the fourth element in the comma-separated list, and the corresponding value to be the second element in the comma-separated list.
When I attempt a direct call to ToDictionary, I end up in some kind of loop that appears to kick me out of the BackgroundWorker thread I'm in:
var MoveFromItems = matches.Cast<Match>()
.SelectMany(m => m.Groups["args"].Captures
.Cast<Capture>().Select(c => c.Value));
var dictionary1 = MoveFromItems.ToDictionary(s => s.Split(',')[3],
s => s.Split(',')[1]);
When I create the dictionary manually, everything works fine:
var MoveFroms = new Dictionary<string, string>();
foreach(string sItem in MoveFromItems)
{
string sKey = sItem.Split(',')[3];
string sVal = sItem.Split(',')[1];
if(!MoveFroms.ContainsKey(sKey))
MoveFroms[sKey.ToUpper()] = sVal;
}
I appreciate any help you might be able to provide.
The problem is most likely that the keys have duplicates. You have three options.
Keep First Entry (This is what you're currently doing in the foreach loop)
Keys only have one entry, the first one that shows up - meaning you can have a Dictionary:
var first = MoveFromItems.Select(x => x.Split(','))
.GroupBy(x => x[3])
.ToDictionary(x => x.Key, x => x.First()[1]);
Keep All Entries, Grouped
Keys will have more than one entry (each key returns an Enumerable), and you use a Lookup instead of a Dictionary:
var lookup = MoveFromItems.Select(x => x.Split(','))
.ToLookup(x => x[3], x => x[1]);
Keep All Entries, Flattened
No such thing as a key, simply a flattened list of entries:
var flat = MoveFromItems.Select(x => x.Split(','))
.Select(x => new KeyValuePair<string,string>(x[3], x[1]));
You could also use a tuple here (Tuple.Create(x[3], x[1]);) instead.
Note: You will need to decide where/if you want the keys to be upper or lower case in these cases. I haven't done anything related to that yet. If you want to store the key as upper, just change x[3] to x[3].ToUpper() in everything above.
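As one possible way to handle the casing question, here is a sketch of the "keep first entry" option that treats keys case-insensitively and stores them in upper case. The names MoveItems and FirstWins are invented, and skipping strings with fewer than four fields is my own assumption; the original code would throw on such input.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, named for illustration only.
public static class MoveItems
{
    // Key = 4th comma-separated field (upper-cased), value = 2nd field.
    // When several items share a key (ignoring case), the first one wins.
    public static Dictionary<string, string> FirstWins(IEnumerable<string> items)
    {
        return items
            .Select(item => item.Split(','))
            // Assumption: items with too few fields are ignored.
            .Where(fields => fields.Length > 3)
            .GroupBy(fields => fields[3], StringComparer.OrdinalIgnoreCase)
            .ToDictionary(
                group => group.Key.ToUpperInvariant(),
                group => group.First()[1],
                StringComparer.OrdinalIgnoreCase);
    }
}
```

Passing the comparer to GroupBy is what makes "key" and "KEY" land in the same group; passing it to ToDictionary as well lets later lookups ignore case too.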
This splits each item and selects key out of the 4th split-value, and value out of the 2nd split-value, all into a dictionary.
var dictionary = MoveFromItems.Select(s => s.Split(','))
.ToDictionary(split => split[3],
split => split[1]);
There is no point in splitting the string twice, just to use different indices.
This would be just like saving the split results into a local variable, then using it to access index 3 and 1.
However, if indeed you don't know if keys might reoccur, I would go for the simple loop you've implemented, without a doubt.
Note, however, that you have a small bug in your loop:
MoveFroms = new Dictionary<string, string>();
foreach(string sItem in MoveFromItems)
{
string sKey = sItem.Split(',')[3];
string sVal = sItem.Split(',')[1];
// sKey might not exist as a key
if (!MoveFroms.ContainsKey(sKey))
//if (!MoveFroms.ContainsKey(sKey.ToUpper()))
{
// but sKey.ToUpper() might exist!
MoveFroms[sKey.ToUpper()] = sVal;
}
}
You should call ContainsKey(sKey.ToUpper()) in your condition as well, if you really want the keys to be all upper case.
This will split each string in MoveFromItems on ',' and use the 4th item (index 3) as the key and the 2nd item (index 1) as the value.
var dict = MoveFromItems.Select(x => x.Split(','))
.ToLookup(x => x[3], x => x[1]);
