How to create group By in LINQ in a dictionary - c#

I am learning LINQ and am stuck at one point. Suppose I have a strongly typed DataTable like the one below:
idGroup idUnit Status
1 12 foo
1 13 bar
1 15 hello
2 12 nofoo
2 16 nohello
I want the result like below:
Key (int) Value (List<int>)
1 12,13,15
2 12,16
In other words, I want to create a dictionary grouped by idGroup.
My Attempt:
Dictionary<int, List<int>> temp = Mydatatable.ToDictionary(p => p.idGroup, p => p.idUnit);
Error: the LINQ above gives me <int, int>, but my expected result is <int, List<int>>.
I want something like below:
Dictionary<int, List<int>> temp = Mydatatable.ToDictionary(p => p.idGroup,
p => p.idUnit.ToList());

First of all, the error occurs because the value you specify in ToDictionary is idUnit, which is an int and not a List<int> (for instance, writing p => new List<int> { p.idUnit } would resolve that error).
To get a dictionary as output, call GroupBy first and then ToDictionary. Otherwise you will get an ArgumentException stating that an item with the same key has already been added.
var result = Mydatatable.GroupBy(key => key.idGroup, val => val.idUnit)
.ToDictionary(key => key.Key, val => val.ToList());
Another option is to use a Lookup instead of a Dictionary:
var result = Mydatatable.ToLookup(p => p.idGroup, p => p.idUnit);
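To compare the two approaches side by side, here is a minimal sketch using the sample data from the question. The Row class and the GroupingSketch, SampleRows, BuildDictionary and BuildLookup names are made up for illustration; Row only stands in for a row of the strongly typed table.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a row of the strongly typed table.
public class Row
{
    public int idGroup { get; set; }
    public int idUnit { get; set; }
}

public static class GroupingSketch
{
    // Sample data copied from the question (Status column omitted).
    public static List<Row> SampleRows() => new List<Row>
    {
        new Row { idGroup = 1, idUnit = 12 },
        new Row { idGroup = 1, idUnit = 13 },
        new Row { idGroup = 1, idUnit = 15 },
        new Row { idGroup = 2, idUnit = 12 },
        new Row { idGroup = 2, idUnit = 16 },
    };

    // GroupBy first, so each key appears once, then materialize each group.
    public static Dictionary<int, List<int>> BuildDictionary(IEnumerable<Row> rows) =>
        rows.GroupBy(r => r.idGroup, r => r.idUnit)
            .ToDictionary(g => g.Key, g => g.ToList());

    // ToLookup groups and indexes in one step.
    public static ILookup<int, int> BuildLookup(IEnumerable<Row> rows) =>
        rows.ToLookup(r => r.idGroup, r => r.idUnit);
}
```

One practical difference: indexing the dictionary with a group that does not exist throws a KeyNotFoundException, while indexing the lookup with a missing key returns an empty sequence.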

Related

LINQ to JSON group query on array

I have a sample of JSON data that I am converting to a JArray with NewtonSoft.
string jsonString = @"[{'features': ['sunroof','mag wheels']},{'features': ['sunroof']},{'features': ['mag wheels']},{'features': ['sunroof','mag wheels','spoiler']},{'features': ['sunroof','spoiler']},{'features': ['sunroof','mag wheels']},{'features': ['spoiler']}]";
I am trying to retrieve the features that are most commonly requested together. Based on the above dataset, my expected output would be:
sunroof, mag wheels, 2
sunroof, 1
mag wheels, 1
sunroof, mag wheels, spoiler, 1
sunroof, spoiler, 1
spoiler, 1
However, my LINQ is rusty, and the code I am using to query my JSON data is returning the count of the individual features, not the features selected together:
JArray autoFeatures = JArray.Parse(jsonString);
var features = from f in autoFeatures.Select(feat => feat["features"]).Values<string>()
group f by f into grp
orderby grp.Count() descending
select new { indFeature = grp.Key, count = grp.Count() };
foreach (var feature in features)
{
Console.WriteLine("{0}, {1}", feature.indFeature, feature.count);
}
Actual Output:
sunroof, 5
mag wheels, 4
spoiler, 3
I was thinking maybe my query needs a 'distinct' in it, but I'm just not sure.
This is a problem with the Select. Calling Values<string>() on the whole projection flattens every array, so each individual feature becomes its own item. Instead, you need to combine the values of each features array into a single string and group on that. Here is how you do it:
var features = from f in autoFeatures.Select(feat => string.Join(",",feat["features"].Values<string>()))
group f by f into grp
orderby grp.Count() descending
select new { indFeature = grp.Key, count = grp.Count() };
Produces the following output
sunroof,mag wheels, 2
sunroof, 1
mag wheels, 1
sunroof,mag wheels,spoiler, 1
sunroof,spoiler, 1
spoiler, 1
You could use a HashSet to identify the distinct sets of features, and group on those sets. That way, your Linq looks basically identical to what you have now, but you need an additional IEqualityComparer class in the GroupBy to help compare one set of features to another to check if they're the same.
For example:
var featureSets = autoFeatures
.Select(feature => new HashSet<string>(feature["features"].Values<string>()))
.GroupBy(a => a, new HashSetComparer<string>())
.Select(a => new { Set = a.Key, Count = a.Count() })
.OrderByDescending(a => a.Count);
foreach (var result in featureSets)
{
Console.WriteLine($"{String.Join(",", result.Set)}: {result.Count}");
}
And the comparer class leverages the SetEquals method of the HashSet class to check if one set is the same as another (and this handles the strings being in a different order within the set, etc.)
public class HashSetComparer<T> : IEqualityComparer<HashSet<T>>
{
public bool Equals(HashSet<T> x, HashSet<T> y)
{
// so if x and y both contain "sunroof" only, this is true
// even if x and y are a different instance
return x.SetEquals(y);
}
public int GetHashCode(HashSet<T> obj)
{
// force comparison every time by always returning the same,
// or we could do something smarter like hash the contents
return 0;
}
}
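The "something smarter" mentioned in the comment above could look like the following sketch. Returning a constant hash is correct but puts every set into the same bucket, so GroupBy has to call SetEquals against each distinct set seen so far. Combining the element hashes with XOR is independent of order, so equal sets still get equal hashes. ContentHashSetComparer is a hypothetical name, and the sketch assumes the sets use the default equality comparer for T.

```csharp
using System.Collections.Generic;

// Hypothetical variant of HashSetComparer<T> that hashes the set contents.
// Assumes every HashSet<T> uses the default equality comparer for T.
public class ContentHashSetComparer<T> : IEqualityComparer<HashSet<T>>
{
    public bool Equals(HashSet<T> x, HashSet<T> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SetEquals(y);
    }

    public int GetHashCode(HashSet<T> obj)
    {
        if (obj == null) return 0;

        // XOR is commutative, so enumeration order does not affect the result.
        int hash = 0;
        foreach (T item in obj)
        {
            hash ^= item == null ? 0 : item.GetHashCode();
        }
        return hash;
    }
}
```

It drops into the query in place of the original comparer: .GroupBy(a => a, new ContentHashSetComparer<string>()). The framework also ships HashSet<T>.CreateSetComparer(), which returns a ready-made IEqualityComparer<HashSet<T>> with set semantics, so a custom class may not be needed at all.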

How to group a list with Linq

I have a list which I get from a database. The structure looks like (which I'm representing with JSON as it's easier for me to visualise)
{id:1
value:"a"
},
{id:1
value:"b"
},
{id:1
value:"c"
},
{id:2
value:"t"
}
As you can see, I have 2 unique IDs, 1 and 2. I want to group by the ID. The end result I'd like is
{id:1,
values:["a","b","c"],
},
{id:2,
values:["t"]
}
Is this possible with Linq? At the moment, I have a massive, complex foreach, which first sorts the list (by ID) and then detects whether an item has already been added, etc., but this monstrous loop made me realise I'm doing it wrong and, honestly, it's too embarrassing to share.
You can group by the item Id and have the resulting type be a Dictionary<int, List<string>>
var result = myList.GroupBy(item => item.Id)
.ToDictionary(item => item.Key,
item => item.Select(i => i.Value).ToList());
You can either use the GroupBy method on IEnumerable to create IGrouping objects, each containing a key and its grouped objects, or you can use ToLookup to create exactly what you want as a result:
yourList.ToLookup(m => m.id, m => m.value);
This creates a hashed collection of keys with their values.
For more information please see below post:
https://www.c-sharpcorner.com/UploadFile/d3e4b1/practical-usage-of-using-tolookup-method-in-linq-C-Sharp/
Just a little more detail to emphasize the difference between the ToLookup approach and the GroupBy approach:
// class definition
public class Item
{
public long Id { get; set; }
public string Value { get; set; }
}
// create your list
var items = new List<Item>
{
new Item{Id = 0, Value = "value0a"},
new Item{Id = 0, Value = "value0b"},
new Item{Id = 1, Value = "value1"}
};
// this approach results in a List<string> (a collection of the values)
var lookup = items.ToLookup(i => i.Id, i => i.Value);
var groupOfValues = lookup[0].ToList();
// this approach results in a List<Item> (a collection of the objects)
var itemsGroupedById = items.GroupBy(i => i.Id).ToList();
var groupOfItems = itemsGroupedById[0].ToList();
So, if you want to work with values only after grouping, you could take the first approach; if you want to work with objects after grouping, you could take the second approach. These are just a couple of example implementations; there are plenty of ways to accomplish your goal.
First convert to a Lookup then select into a list, like so:
var groups = list
.ToLookup
(
item => item.ID,
item => item.Value
)
.Select
(
item => new
{
ID = item.Key,
Values = item.ToList()
}
)
.ToList();
The resulting JSON looks like this:
[{"ID":1,"Values":["a","b","c"]},{"ID":2,"Values":["t"]}]

TextBox display closest match string

How can I get the string from a list that best matches a base string, using the Levenshtein distance?
This is my code:
{
string basestring = "Coke 600ml";
List<string> liststr = new List<string>
{
"ccoca cola",
"cola",
"coca cola 1L",
"coca cola 600",
"Coke 600ml",
"coca cola 600ml",
};
Dictionary<string, int> resultset = new Dictionary<string, int>();
foreach(string test in liststr)
{
resultset.Add(test, Ldis.Compute(basestring, test));
}
int minimun = resultset.Min(c => c.Value);
var closest = resultset.Where(c => c.Value == minimun);
Textbox1.Text = closest.ToString();
}
In this example, running the code gives 0 changes for string number 5 in the list, so how can I display the string itself in the TextBox?
For example: "Coke 600ml". Right now my TextBox just shows:
System.Linq.Enumerable+WhereEnumerableIterator`1[System.Collections.Generic.KeyValuePair`2[System.String,System.Int32]]
Thanks.
Try this
var closest = resultset.First(c => c.Value == minimun);
Your existing code is trying to display a list of items in the textbox. It looks like it should just grab a single item where Value == min.
resultset.Where() returns a sequence of matches (an IEnumerable<KeyValuePair<string, int>>), not a single item, so you should use
var closest = resultset.First(c => c.Value == minimun);
to select a single result.
Then the closest is a KeyValuePair<string, int>, so you should use
Textbox1.Text = closest.Key;
to get the string. (You added the string as the Key and the change count as the Value to resultset earlier.)
There is a good solution on CodeProject:
http://www.codeproject.com/Articles/36869/Fuzzy-Search
It can be very much simplified like so:
var res = liststr.Select(x => new {Str = x, Dist = Ldis.Compute(basestring, x)})
.OrderBy(x => x.Dist)
.Select(x => x.Str)
.ToArray();
This will order the list of strings from most similar to least similar.
To only get the most similar one, simply replace ToArray() with First().
Short explanation:
For every string in the list, it creates an anonymous type which contains the original string and its distance, computed using the Ldis class. Then, it orders the collection by the distance and maps back to the original string, so as to lose the "extra" information calculated for the ordering.
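The question never shows the Ldis class, so for anyone who wants to run the snippets, here is a minimal sketch of a Levenshtein distance using the standard dynamic-programming table. LevenshteinSketch, Compute and ClosestMatch are assumed names for illustration, not the asker's actual implementation, and the comparison is case-sensitive.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Ldis class used in the question.
public static class LevenshteinSketch
{
    // Number of single-character insertions, deletions and substitutions
    // needed to turn s into t.
    public static int Compute(string s, string t)
    {
        var d = new int[s.Length + 1, t.Length + 1];

        // Turning a prefix into the empty string costs one deletion per character.
        for (int i = 0; i <= s.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= t.Length; j++) d[0, j] = j;

        for (int i = 1; i <= s.Length; i++)
        {
            for (int j = 1; j <= t.Length; j++)
            {
                int cost = s[i - 1] == t[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1,   // deletion
                             d[i, j - 1] + 1),  // insertion
                    d[i - 1, j - 1] + cost);    // substitution or match
            }
        }
        return d[s.Length, t.Length];
    }

    // Returns the candidate with the smallest distance to target.
    // Throws InvalidOperationException if candidates is empty.
    public static string ClosestMatch(string target, IEnumerable<string> candidates) =>
        candidates.OrderBy(c => Compute(target, c)).First();
}
```

With this in place, Textbox1.Text = LevenshteinSketch.ClosestMatch(basestring, liststr); puts the best match straight into the TextBox. OrderBy is a stable sort, so ties go to the candidate that appears first in the list.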

Frequency table with zero counts for all values [duplicate]

Possible Duplicate:
Dictionary returning a default value if the key does not exist
I have a string that contains only digits. I'm interested in generating a frequency table of the digits. Here's an example string:
var candidate = "424256";
This code works, but looking up a digit that's not in the string throws a KeyNotFoundException:
var frequencyTable = candidate
.GroupBy(x => x)
.ToDictionary(g => g.Key, g => g.Count());
Which yields:
Key Count
4 2
2 2
5 1
6 1
So, I used this code, which works:
var frequencyTable = (candidate + "1234567890")
.GroupBy(x => x)
.ToDictionary(g => g.Key, g => g.Count() - 1);
However, in other use cases, I don't want to have to specify all the possible key values.
Is there an elegant way of inserting 0-count records into the frequencyTable dictionary without resorting to creating a custom collection with this behavior, such as this?
public class FrequencyTable<K> : Dictionary<K, int>
{
public FrequencyTable(IDictionary<K, int> dictionary)
: base(dictionary)
{ }
public new int this[K index]
{
get
{
if (ContainsKey(index))
return base[index];
return 0;
}
}
}
If you do not somehow specify all possible key values, your dictionary will not contain an entry for such keys.
Rather than storing zero counts, you may wish to use
Dictionary.TryGetValue(...)
to test the existence of the key before trying to access it. If TryGetValue returns false, simply return 0.
You could easily wrap that in an extension method (rather than creating a custom collection).
public static class Extensions
{
    public static int GetFrequencyCount<K>(this Dictionary<K, int> counts, K value)
    {
        int result;
        if (counts.TryGetValue(value, out result))
        {
            return result;
        }
        return 0;
    }
}
Usage:
Dictionary<char, int> counts = new Dictionary<char, int>();
counts.Add('1', 42);
int count = counts.GetFrequencyCount<char>('1');
If there is a pattern for all the possible keys, you can use Enumerable.Range (or a for loop) to generate 0-value keys as a base table, then left join in the frequency data to populate the relevant values:
// test value
var candidate = "424256";
// generate base table of all possible keys
var baseTable = Enumerable.Range('0', '9' - '0' + 1).Select(e => (char)e);
// generate freqTable
var freqTable = candidate.ToCharArray().GroupBy (c => c);
// left join frequency table results to base table
var result =
from b in baseTable
join f in freqTable on b equals f.Key into gj
from subFreq in gj.DefaultIfEmpty()
select new { Key = b, Value = (subFreq == null) ? 0 : subFreq.Count() };
// convert final result into dictionary
var dict = result.ToDictionary(r => r.Key, r => r.Value);
Sample result:
Key Value
0 0
1 0
2 2
3 0
4 2
5 1
6 1
7 0
8 0
9 0
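If the join syntax feels heavy, the same zero-filled table can be built by projecting the set of possible keys directly and falling back to 0 when a key was never counted. This is only a sketch: FrequencySketch and CountWithZeros are hypothetical names, allKeys is assumed to contain no duplicates, and characters in candidate that are not in allKeys are simply left out of the result.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class FrequencySketch
{
    // Hypothetical helper: counts characters in candidate, then emits one
    // entry per key in allKeys, using 0 for keys that never occurred.
    public static Dictionary<char, int> CountWithZeros(string candidate, IEnumerable<char> allKeys)
    {
        Dictionary<char, int> counts = candidate
            .GroupBy(c => c)
            .ToDictionary(g => g.Key, g => g.Count());

        return allKeys.ToDictionary(
            key => key,
            key => counts.TryGetValue(key, out int n) ? n : 0);
    }
}
```

For the example string, FrequencySketch.CountWithZeros("424256", "0123456789") yields the same ten entries as the left join above.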

Generate a map of list element indices using Linq

I want to take a List, and generate a Dictionary which maps each element to its index in the List. I can do this like so, for a List<string>:
var myList = new List<string>{ /* populate list */ };
var orderMap = new Dictionary<string, int>();
foreach (var element in myList)
{
orderMap[element] = myList.IndexOf(element);
}
Basically, I want to take a list like:
Apple
Banana
Orange
And return a map showing indices:
Apple -> 0
Banana -> 1
Orange -> 2
How can I do this with Linq? I think something like this should work:
orderMap = myList.Select( x => /* return a key value pair mapping x to myList.IndexOf(x) */ );
But I can't figure out the right syntax for it. Besides, can you refer to the list itself in the delegate used for Select?
While you can refer to the list within the delegate, it's not generally a good idea. You really want to use the overload of Select which provides the index as well as the value:
var dictionary = list.Select((value, index) => new { value, index })
.ToDictionary(p => p.value, p => p.index);
Note that this will throw an exception if you have any duplicate elements.
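If duplicates are possible and mapping each element to the index of its first occurrence is acceptable (which is also what the IndexOf loop in the question produces), one way to avoid the exception is to group before building the dictionary. This is a sketch under that assumption; IndexMapSketch and FirstIndexMap are hypothetical names, and the list is assumed to contain no null elements, since a null key would make ToDictionary throw.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class IndexMapSketch
{
    // Maps each distinct element to the index of its first occurrence.
    public static Dictionary<string, int> FirstIndexMap(IEnumerable<string> items) =>
        items.Select((value, index) => new { value, index })
             .GroupBy(p => p.value)
             // GroupBy keeps elements in source order within each group,
             // so First() is the earliest occurrence.
             .ToDictionary(g => g.Key, g => g.First().index);
}
```

Unlike calling IndexOf for every element, this walks the list only once.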
You could try the ToDictionary extension method:
int index = 0;
orderMap = myList.ToDictionary(x => x, x => index++);
Take a look at this overload of ToDictionary<TKey, TValue>(). It takes two functions to convert the input element into a Key and a Value.
e.g.
var myList = new List<string>{ /* populate list */ };
var orderMap = myList.ToDictionary(x => x, x => myList.IndexOf(x));
However, this throws an exception if the elements of myList aren't unique, and calling IndexOf for every element makes it quadratic in the length of the list.
