Linq groupby list matching and count - c#

I have a line of json that I deserialize and create a list of lists:
var json = @"[{'names': ['a','b']} ,{'names': ['z','y','j']} ,{'names': ['a','b']}]";
var json_converted = JsonConvert.DeserializeObject<List<RootObject>>(json);
var namelist = json_converted
I want to use LINQ to compare the list contents and count how many lists contain the same items, for example:
names          matching list count
List {a,b}     2
List {z,y,j}   1
I've tried the following, but no dice :/
var namelist = json_converted
.GroupBy(n => n.names)
.Select(i => new { name = i.Key, Count = i.Count() });
Any suggestions?

You can group by a string produced from your list items taken in the same order. Assuming that '|' character is not allowed inside names, you can do this:
var namelist = json_converted
.GroupBy(n => string.Join("|", n.names.OrderBy(s => s)))
.Select(g => new {
Name = g.First().names
, Count = g.Count()
});
This approach constructs the string "a|b" from both ["a", "b"] and ["b", "a"], and uses that string as the grouping key.
Because the key is built from the sorted names, g.First().names reflects only the order of the first element in the group; other elements of the same group may list the same names in a different order.
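To see the effect of the sorted key, here is a minimal self-contained sketch. The JSON step is replaced by hard-coded lists, the names KeyGroupingDemo and CountByContent are made up for illustration, and StringComparer.Ordinal is my choice for a culture-independent sort (the answer above uses the default comparer). It still assumes '|' never appears inside a name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeyGroupingDemo
{
    // Groups lists by their sorted, '|'-joined contents and counts each group.
    // Returns a dictionary from the joined key (e.g. "a|b") to the number of matching lists.
    public static Dictionary<string, int> CountByContent(IEnumerable<List<string>> lists)
    {
        return lists
            .GroupBy(names => string.Join("|", names.OrderBy(s => s, StringComparer.Ordinal)))
            .ToDictionary(g => g.Key, g => g.Count());
    }

    public static void Main()
    {
        var lists = new List<List<string>>
        {
            new List<string> { "a", "b" },
            new List<string> { "z", "y", "j" },
            new List<string> { "b", "a" }   // same contents as the first list, different order
        };

        foreach (var pair in CountByContent(lists))
            Console.WriteLine(pair.Key + " = " + pair.Value);
    }
}
```

With this input the key "a|b" maps to 2 and the key "j|y|z" maps to 1, because the first and third lists produce the same sorted key.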

You can write a comparer for the list of names.
public class NamesComparer : IEqualityComparer<IEnumerable<string>>
{
public bool Equals(IEnumerable<string> x, IEnumerable<string> y)
{
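if (ReferenceEquals(x, y)) return true;
if (x == null || y == null) return false;
return x.SequenceEqual(y);
}
public int GetHashCode(IEnumerable<string> obj)
{
// A constant hash is valid: every key lands in the same bucket, so Equals decides each match. Correct, but slow when there are many groups.
return 0;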
}
}
Then use that in GroupBy:
var namelist = json_converted.GroupBy(n => n.names, new NamesComparer())
.Select(i => new { name = i.Key, Count = i.Count() });

Related

Linq group by multiple elements and get the count

I have the following json file:
[{"fruits": ["strawberry"]}, {"fruits": ["mango"]}, {"fruits": ["strawberry", "kiwi"]}]
I have the following class
class shake
{
public List<string> fruits = new List<string>();
}
I know this will give me all unique fruits:
var fruits = JsonConvert.DeserializeObject<List<shake>>(json);
var shakes = fruits
.GroupBy(t => t.fruits[0])
.Select(group => new
{
fruit = group.Key,
Count = group.Count()
})
.OrderByDescending(x => x.Count);
I'm trying to get a sorted list with the most popular fruits.
Is it possible to group by multiple elements and get the frequency? Is it possible to do this with LINQ?
I think in your example, you are missing the kiwi. I think what you are after is the SelectMany function in Linq.
public class Shake
{
public List<string> Fruits = new List<string>();
}
var list =
(
from shake in shakes
from fruit in shake.Fruits
group fruit by fruit into fruitGroup
let category = new { Fruit = fruitGroup.Key, Count = fruitGroup.Count() }
orderby category.Count descending
select category
).ToList();
I am not sure exactly what final result you are looking for. How close is this?
class Shake
{
public List<string> fruits = new List<string>();
}
static void Main(string[] args)
{
string json = "[{\"fruits\": [\"strawberry\"]}, {\"fruits\": [\"mango\"]}, {\"fruits\": [\"strawberry\", \"kiwi\"]}]";
List<Shake> fruitList = JsonConvert.DeserializeObject<List<Shake>>(json);
var shakes = fruitList.SelectMany(x => x.fruits)
.GroupBy(t => t)
.Select(group => new
{
fruit = group.Key,
Count = group.Count()
})
.OrderByDescending(x => x.Count).ToList();
Console.WriteLine(JsonConvert.SerializeObject(shakes));
}
The output is:
[{"fruit":"strawberry","Count":2},{"fruit":"mango","Count":1},{"fruit":"kiwi","Count":1}]
fruits.SelectMany(x=>x.fruits).GroupBy(x=>x).OrderByDescending(x=>x.Count()).ToDictionary(x=>x.Key, x=>x.Count())
This yields a dictionary where the key is the name of the fruit and the value is its count.

C# Linq Find all indexes of item in List<int> within another List<int>

I have a List that looks like:
List<int> List1= new List<int>(){3,4,5};
and another looks like:
List<int> List2 = new List<int>(){1,2,3,4,5,6};
How can I use Linq to get an array of all of the indices of List1 from List2 like below:
var ResultList = {2,3,4};
var ResultList = List1.Select(x => List2.IndexOf(x));
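This is a longer solution, but it avoids scanning List2 once for every element of List1, which may be faster if the lists are huge (but slower if they are small). Note that if List2 contains duplicate values, the lookup keeps the last index, whereas IndexOf returns the first.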
List<int> List1= new List<int>(){3,4,5};
List<int> List2 = new List<int>(){1,2,3,4,5,6};
var lookup = new Dictionary<int, int>();
for(var i=0; i<List2.Count; i++) {
lookup[List2[i]] = i;
}
List<int> Result = List1.Select(i => {
int index;
return lookup.TryGetValue(i, out index) ? index : -1;
}).ToList();
You can also use the indexed overload of Select to pair each value with its index and return the matching indexes:
var result = List2.Select((a, b) => new {Value = a, Index = b})
.Where(x => List1.Any(d => d == x.Value))
.Select(c => c.Index).ToArray();
If List2 contains more than one occurrence of a List1 value (for any type that supports equality comparison), you can use the indexed overload of Select to find all of the occurrences:
var List1= new List<int>(){3,4,5};
var List2 = new List<int>(){1,2,3,4,5,6,1,2,3,5};
var result = List2.Select((x, idx) => Tuple.Create(x, idx))
.Where(t => List1.Contains(t.Item1))
.Select(x => x.Item2);
// 2,3,4,8,9
or better, using C#7 Value Tuples
List2.Select((x, idx) => (X:x, Idx:idx))
.Where(t => List1.Contains(t.X))
.Select(x => x.Idx);
(.IndexOf returns just the first index found in the target)
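For larger lists, calling List1.Contains inside Where scans List1 once per element of List2. Here is a sketch of the same indexed Select using a HashSet<int> for the membership test. IndexFinder and IndicesOf are hypothetical names, not part of any answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IndexFinder
{
    // Returns every index in 'source' whose value also occurs in 'wanted',
    // in ascending index order. Duplicates in 'source' are all reported.
    public static int[] IndicesOf(IReadOnlyList<int> source, IEnumerable<int> wanted)
    {
        var wantedSet = new HashSet<int>(wanted);   // average O(1) membership test
        return source
            .Select((value, index) => new { value, index })
            .Where(p => wantedSet.Contains(p.value))
            .Select(p => p.index)
            .ToArray();
    }

    public static void Main()
    {
        var list1 = new List<int> { 3, 4, 5 };
        var list2 = new List<int> { 1, 2, 3, 4, 5, 6, 1, 2, 3, 5 };

        Console.WriteLine(string.Join(",", IndicesOf(list2, list1)));   // prints 2,3,4,8,9
    }
}
```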

How to group by and count similar lists in a List<List<String>>

I have a list of a list of strings:
List<List<String>> pChain;
It might have repeated lists of strings (two list of strings are equal if they have the same strings in the same order). I want to have the count of each distinct list in the main list. I tried:
var results = (from t in pChain
group t by new { t }
into g
select new
{
g.Key,
Count = g.Count(),
}).OrderByDescending(x => x.Count).ToList();
foreach (var v in results)
{
ListViewItem lv = listView2.Items.Add(v.Key.ToString());
lv.SubItems.Add(v.Count + "");
}
But it doesn't group similar list of strings into one list and doesn't count them.
You can use SelectMany + Distinct:
var allDistinctItems = pChain.SelectMany(list => list).Distinct();
If you want the count use int countOfDistinctItems = allDistinctItems.Count();.
If you want a dictionary you could use:
Dictionary<string, int> itemCounts = pChain.SelectMany(list => list)
.GroupBy(item => item)
.ToDictionary(g => g.Key, g => g.Count());
You can check whether a list of lists contains a specific list by iterating through its elements and comparing them with SequenceEqual(). You should be able to remove the duplicate lists with this:
// Iterate backwards so that RemoveAt does not shift the elements still to be visited.
for (int i = pChain.Count - 1; i >= 0; i--)
{
// If more than one list in pChain is SequenceEqual to pChain[i], drop this copy.
if (pChain.Count(l => l.SequenceEqual(pChain[i])) > 1)
pChain.RemoveAt(i);
}
Thus the number of distinct lists would be:
int count = pChain.Count;
You can put the code above into a single LINQ statement that keeps only the first occurrence of each list:
pChain = pChain
.Where((list, index) => !pChain.Take(index).Any(earlier => earlier.SequenceEqual(list)))
.ToList();
I used the Aggregate function to concatenate the strings of each inner list into a single string, then applied GroupBy to the resulting list of strings.
Dictionary<string, int> itemCounts =
pChain.Select(list => list.Aggregate((i, j) => j + '/' + i))
.GroupBy(item => item).OrderByDescending(x => x.Key)
.ToDictionary(g => g.Key.ToString(), g => g.Count());
foreach (var v in itemCounts)
{
ListViewItem lv = listView2.Items.Add(v.Key.ToString());
lv.SubItems.Add(v.Value + "");
}
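The joined-string key above gives wrong groups if a string can itself contain the separator ('/' here), and Aggregate without a seed throws on an empty inner list. As an alternative sketch, the inner lists can be grouped directly with a custom IEqualityComparer that treats two lists as equal when they hold the same strings in the same order, which is how the question defines equality. StringListComparer, ListCounter and CountDistinct are hypothetical names, and the hash constants 17 and 31 are a conventional but arbitrary choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: lists are equal when they contain the same strings in the same order.
public sealed class StringListComparer : IEqualityComparer<List<string>>
{
    public bool Equals(List<string> x, List<string> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SequenceEqual(y);
    }

    public int GetHashCode(List<string> list)
    {
        if (list == null) return 0;
        unchecked
        {
            // Combine element hashes in order, so equal sequences always hash alike.
            int hash = 17;
            foreach (string s in list)
                hash = hash * 31 + (s == null ? 0 : s.GetHashCode());
            return hash;
        }
    }
}

public static class ListCounter
{
    // Returns each distinct list with its occurrence count, most frequent first.
    public static List<KeyValuePair<List<string>, int>> CountDistinct(List<List<string>> chain)
    {
        return chain
            .GroupBy(list => list, new StringListComparer())
            .Select(g => new KeyValuePair<List<string>, int>(g.Key, g.Count()))
            .OrderByDescending(p => p.Value)
            .ToList();
    }

    public static void Main()
    {
        var pChain = new List<List<string>>
        {
            new List<string> { "a", "b" },
            new List<string> { "b", "a" },   // different order, so a different list
            new List<string> { "a", "b" }
        };

        foreach (var pair in CountDistinct(pChain))
            Console.WriteLine(string.Join(",", pair.Key) + " : " + pair.Value);
    }
}
```

In the ListView loop from the question, string.Join(",", pair.Key) would take the place of v.Key.ToString(), since calling ToString() on a List<string> prints only the type name.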

How to sort list<string> in custom order

Given a string like "CAATCCAAC", I am generating all k-mers from it (k is variable but has to be less than the length of the string) by doing:
string dna = "CAATCCAAC";
dna = dna.Replace("\n", "");
int k = 5;
List<string> kmerList = new List<string>();
var r = new Regex(@"(.{" + k + @"})");
while (dna.Length >= k)
{
Match m = r.Match(dna);
//Console.WriteLine(m.ToString());
kmerList.Add(m.ToString());
dna = dna.Substring(1);
}
var sortedList = kmerList.OrderBy(i =>'A').
ThenBy(i => 'C').
ThenBy(i => 'G').
ThenBy(i => 'T').ToList();
foreach (string result in sortedList)
{
Console.WriteLine(result);
}
I want to sort result
AATCC
ATCCA
CAATC
CCAAC
TCCAA
However I am getting
CAATC
AATCC
ATCCA
TCCAA
CCAAC
How can I sort elements so they are ordered first by 'A' then by 'C' then by 'G' and finally 'T' ?
I tried
var sortedList = kmerList.OrderBy(i =>'A').
ThenBy(i => 'C').
ThenBy(i => 'G').
ThenBy(i => 'T').ToList();
but that wouldn't work
I want the same ordering to be applied to all strings, like
AAAA
AACG
ACCC
ACCG
ACCT
...
TTTT
To sort a list in alphabetical order, you should use the built-in Sort function:
kmerList.Sort();
If you want to order in alphabetical order you can use:
List<string> sorted = kmerList.OrderBy(x => x).ToList();
To get the reverse:
List<string> sorted = kmerList.OrderByDescending(x => x).ToList();
There's a built-in sort function. Try kmerList.Sort()
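The order A, C, G, T happens to be alphabetical, which is why Sort() and OrderBy(x => x) are enough here. If the required order ever differs from alphabetical, one possible approach is a comparer that ranks each character by its position in a custom alphabet. The sketch below is only an illustration under that assumption: AlphabetComparer, KmerDemo and Kmers are made-up names, and the k-mers are built with Substring instead of the Regex from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: orders strings character by character using a custom alphabet.
public sealed class AlphabetComparer : IComparer<string>
{
    private readonly string _alphabet;

    public AlphabetComparer(string alphabet)
    {
        _alphabet = alphabet;
    }

    public int Compare(string x, string y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x == null) return -1;
        if (y == null) return 1;

        int shared = Math.Min(x.Length, y.Length);
        for (int i = 0; i < shared; i++)
        {
            // Rank is the character's position in the alphabet;
            // characters missing from the alphabet get -1 and sort first.
            int diff = _alphabet.IndexOf(x[i]) - _alphabet.IndexOf(y[i]);
            if (diff != 0) return diff;
        }
        return x.Length - y.Length;   // on a shared prefix, the shorter string comes first
    }
}

public static class KmerDemo
{
    // Returns all substrings of length k, in order of their start position.
    public static List<string> Kmers(string dna, int k)
    {
        return Enumerable.Range(0, dna.Length - k + 1)
                         .Select(i => dna.Substring(i, k))
                         .ToList();
    }

    public static void Main()
    {
        List<string> kmers = Kmers("CAATCCAAC", 5);
        kmers.Sort(new AlphabetComparer("ACGT"));

        // Prints AATCC, ATCCA, CAATC, CCAAC, TCCAA on separate lines.
        foreach (string kmer in kmers)
            Console.WriteLine(kmer);
    }
}
```

Passing a different alphabet string, for example "TGCA", changes the ranking without touching the rest of the code.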

How to sort List according to value in an array

I have
List<string> strs;
double[] values;
where the values array contains the value of each of the string in strs list
Say strs={"abc","def","ghi"}
and values={3,1,2}
this means "abc" has value 3 and so on.
I wish to sort strs and values ordered by values, such that it becomes
strs={"def","ghi","abc"}
values={1,2,3}
Is there any easy way to achieve this?
The Array.Sort method has an overload that takes two arrays and sorts both arrays according to the values in the first array, so make an array out of the list:
string[] strsArr = strs.ToArray();
Then sorting them can't be simpler:
Array.Sort(values, strsArr);
And then back to a list, if you need that:
strs = strsArr.ToList();
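Put together, the three steps above might look like the sketch below. PairedSortDemo and SortByValues are illustrative names. Note that Array.Sort is documented as an unstable sort, so strings whose values are equal may not keep their original relative order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PairedSortDemo
{
    // Sorts 'values' ascending in place and returns the strings reordered to match.
    public static List<string> SortByValues(List<string> strs, double[] values)
    {
        string[] items = strs.ToArray();
        Array.Sort(values, items);   // 'values' are the keys, 'items' follow them
        return items.ToList();
    }

    public static void Main()
    {
        var strs = new List<string> { "abc", "def", "ghi" };
        double[] values = { 3, 1, 2 };

        strs = SortByValues(strs, values);

        Console.WriteLine(string.Join(",", strs));     // prints def,ghi,abc
        Console.WriteLine(string.Join(",", values));   // prints 1,2,3
    }
}
```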
You can use Enumerable.Zip, then sort the result, then extract the list of strings.
Something like:
var result = strs.Zip(values, (first, second) => new Tuple<string, double>(first, second))
.OrderBy(x => x.Item2)
.Select(x => x.Item1)
.ToList();
How are you setting up these collections? Or are you given these two parameters?
You could create a StringAndOrder class and use LINQ:
public class StringAndOrder
{
public string String { get; set; }
public double Order { get; set; }
}
List<StringAndOrder> list; //create with this structure instead
var orderedStrings = list.OrderBy(item => item.Order).Select(item => item.String);
var sortedStrs = strs.Select((s, i) => new {Value = values[i], Str = s})
.OrderBy(x => x.Value)
.Select(x => x.Str).ToList();
If you could logically put those values as properties of a class, such as:
class NameAndOrder
{
public string Name;
public double Order; // double, to match the values array
}
Then it would be better and more organized, and then you could do:
var items = new List<NameAndOrder>(strs.Count);
for (var i = 0; i < strs.Count; i++)
{
items.Add(new NameAndOrder { Name = strs[i], Order = values[i] });
}
items.Sort((a, b) => a.Order.CompareTo(b.Order));
Why don't you use a Dictionary object?
Dictionary<string, int> dictionary =
new Dictionary<string, int>();
dictionary.Add("cat", 2);
dictionary.Add("dog", 1);
dictionary.Add("llama", 0);
dictionary.Add("iguana", -1);
// Acquire keys and sort them.
var list = dictionary.Keys.ToList();
list.Sort();
var strs = new[] { "abc", "def", "ghi" };
var values = new[] { 3, 1, 2 };
var newArr = strs.Select((s, i) => new { s, i })
.OrderBy(x => values[x.i])
.Select(x => x.s)
.ToArray();
