Find most common combination of elements in several arrays - c#

I have several arrays, like:
var arr1 = new[] { "A", "B", "C", "D" };
var arr2 = new[] { "A", "D" };
var arr3 = new[] { "A", "B", };
var arr4 = new[] { "C", "D" };
var arr5 = new[] { "B", "C", "D" };
var arr6 = new[] { "B", "A", };
... etc.
How can I get the most common combination of elements in all of those arrays?
In this case it is A and B, because they occur in arr1, arr3 and arr6, and C and D, because they occur in arr1, arr4 and arr5.
Note that the elements can be in any kind of collection, e.g. in ArrayLists as well.
UPDATE: I was not clear enough. I am looking for the most common combinations of two elements in an array. That's what I tried to show in the example, but I did not mention it in my question.
Sorry :-(

If you are sure that each item appears only once in each array, you could just concatenate them together and get the counts, for example:
var arrs = new[] { arr1, arr2, arr3, arr4, arr5, arr6 };
var intermediate = arrs.SelectMany(a => a)
.GroupBy(x => x)
.Select(g => new { g.Key, Count = g.Count() })
.OrderByDescending(x => x.Count);
var maxCount = intermediate.First().Count;
var results = intermediate.TakeWhile(x => x.Count == maxCount);
Or if you prefer query syntax, that would be:
var arrs = new[] { arr1, arr2, arr3, arr4, arr5, arr6 };
var intermediate =
from a in arrs.SelectMany(a => a)
group a by a into g
orderby g.Count() descending
select new { g.Key, Count = g.Count() };
var maxCount = intermediate.First().Count;
var results = intermediate.TakeWhile(x => x.Count == maxCount);
The result set will contain 3 items:
Key, Count
"A", 4
"B", 4
"D", 4
Update
Given your updated question, something like this should work:
var items = arrs.SelectMany(a => a).Distinct();
var pairs =
from a in items
from b in items
where a.CompareTo(b) < 0
select new { a, b };
var results =
(from arr in arrs
from p in pairs
where arr.Contains(p.a) && arr.Contains(p.b)
group arr by p into g
orderby g.Count() descending
select g.Key)
.First();
The logic here is:
First find all distinct items in any array
Then find every pair of items to search for
For every pair, group together the arrays that contain that pair
Order the groups by the number of arrays that contain each pair, descending
Return the first pair
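The steps above return a single pair, while the question's example has a tie (A/B and C/D each occur in three arrays). As a minimal sketch, assuming C# 7 value tuples are available, the pairs can also be counted directly per array and every pair that reaches the maximum returned. The names PairCounter and MostCommonPairs are made up for this illustration and are not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PairCounter
{
    // Returns every unordered pair that occurs in the largest number of arrays.
    public static List<(string First, string Second, int Count)> MostCommonPairs(
        IEnumerable<IEnumerable<string>> arrays)
    {
        var counts = new Dictionary<(string, string), int>();
        foreach (var array in arrays)
        {
            // Distinct + ordinal sort: each pair is counted once per array and
            // is always stored in the same order ("A","B", never "B","A").
            var items = array.Distinct().OrderBy(x => x, StringComparer.Ordinal).ToList();
            for (int i = 0; i < items.Count; i++)
            {
                for (int j = i + 1; j < items.Count; j++)
                {
                    var key = (items[i], items[j]);
                    counts.TryGetValue(key, out int seen); // seen is 0 for a new pair
                    counts[key] = seen + 1;
                }
            }
        }

        if (counts.Count == 0)
            return new List<(string First, string Second, int Count)>();

        int max = counts.Values.Max();
        return counts
            .Where(kv => kv.Value == max)
            .Select(kv => (kv.Key.Item1, kv.Key.Item2, kv.Value))
            .OrderBy(t => t.Item1, StringComparer.Ordinal)
            .ThenBy(t => t.Item2, StringComparer.Ordinal)
            .ToList();
    }
}
```

Calling PairCounter.MostCommonPairs(new[] { arr1, arr2, arr3, arr4, arr5, arr6 }) on the arrays from the question should give the two pairs (A, B) and (C, D), each with a count of 3. This only enumerates pairs that actually occur together, instead of testing every possible pair against every array.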

Use a Dictionary that stores each element as the key and its occurrence count as the value. Iterate over each list and count the occurrences.
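A minimal sketch of that idea, assuming the elements are strings. The names ElementCounter and CountOccurrences are made up for this illustration.

```csharp
using System.Collections.Generic;

public static class ElementCounter
{
    // Counts how many times each element occurs across all lists.
    public static Dictionary<string, int> CountOccurrences(
        IEnumerable<IEnumerable<string>> lists)
    {
        var counts = new Dictionary<string, int>();
        foreach (var list in lists)
        {
            foreach (var item in list)
            {
                // TryGetValue leaves current at 0 when the key is not present yet.
                counts.TryGetValue(item, out int current);
                counts[item] = current + 1;
            }
        }
        return counts;
    }
}
```

For the six arrays in the question this should yield A = 4, B = 4, C = 3 and D = 4. Note that it counts single elements, not pairs.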

var arr1 = new[] { "A", "B", "C", "D" };
var arr2 = new[] { "A", "D" };
var arr3 = new[] { "A", "B", };
var arr4 = new[] { "C", "D" };
var arr5 = new[] { "B", "C", "D" };
var arr6 = new[] { "B", "A", };
var results = new List<IEnumerable<string>>() { arr1, arr2, arr3, arr4, arr5, arr6 }
.Select(arr => arr.Distinct())
.SelectMany(s => s)
.GroupBy(s => s)
.Select(grp => new { Text = grp.Key, Count = grp.Count() })
.OrderByDescending(t => t.Count)
.ToList();
Gives you {A, 4}, {B, 4}, {D, 4}, {C, 3}

var result = new IEnumerable<String>[] {arr1, arr2, arr3, arr4, arr5, arr6}
.SelectMany(a => a)
.GroupBy(s => s)
.GroupBy(g => g.Count())
.OrderByDescending(g => g.Key)
.FirstOrDefault()
.Select(g => g.Key); // yields the string keys that share the highest count

Your question is unclear as you have not clearly defined what you are looking for. In general, you could combine all the arrays into one large array and count the distinct elements. By then ordering the elements you can do whatever you intend to do with the "most common".
static void Main()
{
var arr1 = new[] { "A", "B", "C", "D" };
var arr2 = new[] { "A", "D" };
var arr3 = new[] { "A", "B", };
var arr4 = new[] { "C", "D" };
var arr5 = new[] { "B", "C", "D" };
var arr6 = new[] { "B", "A", };
List<string> combined = Combine(arr1, arr2, arr3, arr4, arr5, arr6);
var ordered = combined.OrderBy(i => i);//sorted list will probably help other functions work more quickly such as distinct
var distinct = ordered.Distinct();
var counts = new Dictionary<string, int>();
foreach (var element in distinct)
{
var count = ordered.Count(i => i == element);
counts.Add(element, count);
}
var orderedCount = counts.OrderByDescending(c => c.Value);
foreach (var count in orderedCount)
{
Console.WriteLine("{0} : {1}", count.Key, count.Value);
}
Console.ReadLine();
}
private static List<string> Combine(string[] arr1, string[] arr2, string[] arr3, string[] arr4, string[] arr5, string[] arr6)
{
List<string> combined = new List<string>();
combined.AddRange(arr1);
combined.AddRange(arr2);
combined.AddRange(arr3);
combined.AddRange(arr4);
combined.AddRange(arr5);
combined.AddRange(arr6);
return combined;
}
Outputs: A : 4, B : 4, D : 4, C : 3

Related

C# - TextLogger

My English isn't that good.
I am trying to make a TextLogger. For example,
if I have two different arrays:
string[] array1 = {"a", "b", "c", "d"};
string[] array2 = {"y", "c", "h", "f"};
and the string "c" is in both arrays, then both occurrences of "c" should be removed.
Output:
a, b, d, h, y, f
this is what I managed to do so far:
string[] array1 = {"a", "b", "c", "d"};
string[] array2 = {"y", "c", "h", "f"};
for(int i = 0; i < array1.Length; i++)
{
if(array1[i] == array2[i])
{
}
}
Edit (sorry for changing my question again):
How can I do it with this:
ArrayList array1 = new ArrayList();
array1.Add("a");
array1.Add("b");
array1.Add("c");
array1.Add("d");
ArrayList array2 = new ArrayList();
array2.Add("y");
array2.Add("c");
array2.Add("h");
array2.Add("f");
1- Get the common items using Enumerable.Intersect
2- Replace each array with the same array minus the common items, using Enumerable.Except
string[] array1 = { "a", "b", "c", "d" };
string[] array2 = { "y", "c", "h", "f" };
var intersect = array1.Intersect(array2); // 1
array1 = array1.Except(intersect).ToArray(); //2
array2 = array2.Except(intersect).ToArray(); //2
Edit: to take into account duplicate values, as mentioned in the comment:
string[] array1 = { "a", "b", "b", "b", "c", "d" };
string[] array2 = { "y", "b", "c", "h", "f" };
var grpArray1 = array1.GroupBy(a => a)
.Select(grp => new { item = grp.Key, count = grp.Count() });
var grpArray2 = array2.GroupBy(a => a)
.Select(grp => new { item = grp.Key, count = grp.Count() });
// Compute both results against the original arrays before reassigning either one;
// otherwise the second query would count against the already-filtered array1.
var newArray1 = grpArray1.Select(a =>
{
var bCount = array2.Count(x => x.Equals(a.item));
return new { item = a.item, finalCount = a.count - bCount };
})
.Where(a => a.finalCount > 0)
.SelectMany(a => Enumerable.Repeat(a.item, a.finalCount))
.ToArray();
var newArray2 = grpArray2.Select(a =>
{
var bCount = array1.Count(x => x.Equals(a.item));
return new { item = a.item, finalCount = a.count - bCount };
})
.Where(a => a.finalCount > 0)
.SelectMany(a => Enumerable.Repeat(a.item, a.finalCount))
.ToArray();
array1 = newArray1;
array2 = newArray2;
Console.WriteLine("-->array1:");
foreach (var item in array1)
Console.WriteLine(item);
Console.WriteLine("-->array2:");
foreach (var item in array2)
Console.WriteLine(item);
The results:
-->array1:
a
b
b
d
-->array2:
y
h
f
If I understand your problem correctly, you don't need to change the arrays (array1 and array2), but only to get a result from both of them.
So you can solve your problem using the GroupBy method:
string[] array1 = { "a", "b", "c", "d" };
string[] array2 = { "y", "c", "h", "f" };
var filteredArray = array1.Concat(array2).GroupBy(x => x).Where(x => x.Count() == 1).Select(x=>x.Key);
Console.WriteLine(string.Join(" ", filteredArray));
Console.ReadLine();
What happens here: we concatenate the arrays into one sequence, group it by value, keep only the groups that contain exactly one element (dropping any value that occurs more than once), and finally turn the result back into a sequence of strings instead of a sequence of groups.
Edit:
Following the comment about the duplicated "b" inside one of the arrays, I created a new version (slightly more complex) that works for your case:
string[] array1 = { "a", "b", "b", "c", "d" };
string[] array2 = { "y", "c", "h", "f" };
var filteredArray = array1.GroupBy(x => x)
.Concat(array2.GroupBy(x => x))
.GroupBy(x => x.Key)
.Where(x => x.Count() == 1).Select(x => x.Key);
Console.WriteLine(string.Join(" ", filteredArray));
Console.ReadLine();
What happens there? We group each array on its own, concatenate the groups, then group those groups by their keys and keep only the keys that have exactly one inner group, i.e. values that appear in just one of the arrays. Finally we select the keys, which also guarantees there is only one instance of each value.
Hope that helps!
The following code snippet should provide you a clear insight about how to perform your task:
String[] array1 = new String[] {"a", "b", "c", "d"};
String[] array2 = new String[] {"y", "c", "h", "f"};
String[] n1 = array1.Where(x => !array2.Contains(x)).ToArray();
String[] n2 = array2.Where(x => !array1.Contains(x)).ToArray();
Console.WriteLine("Array 1");
foreach (String s in n1)
Console.WriteLine(s);
Console.WriteLine("\nArray 2");
foreach (String s in n2)
Console.WriteLine(s);
Console.ReadLine();
The output is:
Array 1
a
b
d
Array 2
y
h
f
You can see a working demo by visiting this link. For more information about the methods I used, see the following:
Enumerable.Contains
Enumerable.ToArray
Enumerable.Where

Two arrays into one Dictionary

I would like to create a
Dictionary<string, int[]> dict
out of two arrays:
string[] keys = { "A", "B", "A", "D" };
int[] values = { 1, 2, 5, 2 };
the result:
["A"] = {1,5}
["B"] = {2}
["D"] = {2}
Is there a way I can do this with LINQ?
I have read about Zip, but I don't think I can use it, since I need to add values to an existing key's value array.
Use .Zip to bind the two collections together and then GroupBy to group the keys.
string[] keys = { "A", "B", "A", "D" };
int[] values = { 1, 2, 5, 2 };
var result = keys.Zip(values, (k, v) => new { k, v })
.GroupBy(item => item.k, selection => selection.v)
.ToDictionary(key => key.Key, value => value.ToArray());
Then to add these items into the dictionary that you already have:
I changed the int[] to List<int> so it is easier to handle Add/AddRange
Dictionary<string, List<int>> existingDictionary = new Dictionary<string, List<int>>();
foreach (var item in result)
{
if (existingDictionary.ContainsKey(item.Key))
existingDictionary[item.Key].AddRange(item.Value);
else
existingDictionary.Add(item.Key, item.Value.ToList());
}
Linq solution:
string[] keys = { "A", "B", "A", "D" };
int[] values = { 1, 2, 5, 2 };
Dictionary<string, int[]> dict = keys
.Zip(values, (k, v) => new {
key = k,
value = v })
.GroupBy(pair => pair.key, pair => pair.value)
.ToDictionary(chunk => chunk.Key,
chunk => chunk.ToArray());
Test:
string report = String.Join(Environment.NewLine, dict
.Select(pair => $"{pair.Key} [{string.Join(", ", pair.Value)}]"));
Console.Write(report);
Outcome:
A [1, 5]
B [2]
D [2]
Try this :
string[] keys = { "A", "B", "A", "D" };
int[] values = { 1, 2, 5, 2 };
Dictionary<string, int[]> dict = keys.Select((x, i) => new { key = x, value = values[i] }).GroupBy(x => x.key, y => y.value).ToDictionary(x => x.Key, y => y.ToArray());

Generate a list holding/making mapping between a List<string> and a List<int>

This is probably a simple question, but I'm still new to C# and LINQ (which I assume is useful in this case).
I have a List with different groups:
e.g. List<string>() { a, a, b, c, a, b, b };
I would like to make a corresponding List (sort of GroupID), holding:
List<int>() { 1, 1, 2, 3, 1, 2, 2}
The amount of different groups could be anything from 1-x, so a dynamic generation of the List is needed. Duplicate groups should get same numbers.
All this should end up in a LINQ Zip() of the two into 'CombinedList', and a SqlBulkCopy with other data to a database, using a foreach:
table.Rows.Add(data1, data2, data3, CombinedList.GroupID.ToString(), CombinedList.Group.ToString());
Hope it makes sense.
Example:
List<string>() { a, a, b, c, a, b, b };
This list holds 3 unique groups: a, b and c.
assign an incrementing number to the groups, starting from 1:
a = 1, b = 2, c = 3.
The generated result list should then hold
List<int>() { 1, 1, 2, 3, 1, 2, 2 };
This works for me:
var source = new List<string>() { "a", "a", "b", "c", "a", "b", "b" };
var map =
source
.Distinct()
.Select((x, n) => new { x, n })
.ToDictionary(xn => xn.x, xn => xn.n + 1);
var result =
source
.Select(x => new { GroupID = map[x], Value = x });
I get this result: GroupID values 1, 1, 2, 3, 1, 2, 2, paired with the values a, a, b, c, a, b, b.
Generate a List:
var strings = new List<string>() { "a", "a", "b", "c", "a", "b", "b" };
var uniqueStrings = strings.Distinct().ToList();
var numbers = strings.Select(s => uniqueStrings.IndexOf(s)).ToList();
This produces:
List<int> { 0, 0, 1, 2, 0, 1, 1 }
If you want to have your values starting at 1 instead of 0, then modify the last line to include +1 as per below:
var numbers = strings.Select(s => uniqueStrings.IndexOf(s) + 1).ToList();
Not sure but I think this can help you.
var keyValue = new List<KeyValuePair<int, string>>();
var listString = new List<string>() { "a", "a", "b", "c", "a", "b", "b"};
var listInt = new List<int>();
int count = 1;
foreach (var item in listString)
{
if(keyValue.Count(c=>c.Value == item) == 0)
{
keyValue.Add(new KeyValuePair<int, string>(count, item));
count++;
}
}
foreach (var item in listString)
{
listInt.Add(keyValue.Single(s=>s.Value == item).Key);
}
How about this:
int groupId = 1;
var list = new List<string>() { "a", "a", "b", "c", "a", "b", "b" };
var dict = list
.GroupBy(x => x)
.Select(x => new {Id = groupId++, Val = x})
.ToDictionary(x => x.Val.First(), x => x.Id);
var result = list.Select(x => dict[x]).ToList();
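The LINQ version above increments groupId from inside a Select, which relies on the query being evaluated exactly once. As a hedged alternative, here is a minimal sketch that assigns the IDs in a single plain loop. The names GroupIds and Assign are made up for this illustration.

```csharp
using System.Collections.Generic;

public static class GroupIds
{
    // Maps each group name to an ID in order of first appearance, starting at 1,
    // and returns the ID for every element of the input in its original order.
    public static List<int> Assign(IEnumerable<string> groups)
    {
        var ids = new Dictionary<string, int>();
        var result = new List<int>();
        foreach (var group in groups)
        {
            if (!ids.TryGetValue(group, out int id))
            {
                id = ids.Count + 1; // next unused ID
                ids.Add(group, id);
            }
            result.Add(id);
        }
        return result;
    }
}
```

For the input { "a", "a", "b", "c", "a", "b", "b" } this should return 1, 1, 2, 3, 1, 2, 2, and each lookup is a dictionary access rather than a linear search.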

Find common items in list of lists of strings

Hi, I have allLists, which contains lists of strings. I want to find the items that are common to all of these string lists.
I have tried:
var intersection = allLists
.Skip(1)
.Aggregate(
new HashSet<string>(allLists.First()),
(h, e) => { h.IntersectWith(e); return h; });
I also tried intersecting the lists with hard-coded indexes. Neither approach worked when I tried it:
var inter = allLists[0].Intersect(allLists[1]).Intersect(allLists[2])
.Intersect(allLists[3]).ToList();
foreach ( string s in inter) Debug.WriteLine(s+"\n ");
So how can I do this dynamically and get the common string items in the lists?
Is there a way to avoid LINQ?
Isn't this the easiest way?
var stringLists = new List<string>[]
{
new List<string>(){ "a", "b", "c" },
new List<string>(){ "d", "b", "c" },
new List<string>(){ "a", "e", "c" }
};
var commonElements =
stringLists
.Aggregate((xs, ys) => xs.Intersect(ys).ToList());
I get a list with just "c" in it.
This also handles the case if elements within each list can be repeated.
I'd do it like this:
class Program
{
static void Main(string[] args)
{
List<string>[] stringLists = new List<string>[]
{
new List<string>(){ "a", "b", "c" },
new List<string>(){ "d", "b", "c" },
new List<string>(){ "a", "e", "c" }
};
// Will contain only 'c' because it's the only item common to all three lists.
var commonItems =
stringLists
.SelectMany(list => list)
.GroupBy(item => item)
.Select(group => new { Count = group.Count(), Item = group.Key })
.Where(item => item.Count == stringLists.Length);
foreach (var item in commonItems)
{
Console.WriteLine(String.Format("Item: {0}, Count: {1}", item.Item, item.Count));
}
Console.ReadKey();
}
}
An item is a common item if it occurs in all groups hence the condition that its count must be equal to the number of groups:
.Where(item => item.Count == stringLists.Length)
EDIT:
I should have used the HashSet like in the question. For lists you can replace the SelectMany line with this one:
.SelectMany(list => list.Distinct())
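The question also asks whether LINQ can be avoided. A minimal sketch without LINQ, assuming allLists is indexable (for example a List<List<string>> or an array of lists), uses HashSet<string>.IntersectWith in a plain loop. The names CommonItems and Find are made up for this illustration.

```csharp
using System.Collections.Generic;

public static class CommonItems
{
    // Returns the strings that occur in every list; empty if there are no lists.
    public static HashSet<string> Find(IList<List<string>> allLists)
    {
        if (allLists == null || allLists.Count == 0)
            return new HashSet<string>();

        // Start with the first list, then narrow the set with each remaining list.
        var common = new HashSet<string>(allLists[0]);
        for (int i = 1; i < allLists.Count; i++)
        {
            common.IntersectWith(allLists[i]);
            if (common.Count == 0)
                break; // nothing left in common, no need to look further
        }
        return common;
    }
}
```

With the three sample lists used in the answers above, the returned set should contain only "c". Because a set is used, repeated elements within a list do not affect the result.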

Find duplicate in a list from a reference list

I'd like to know whether at least one element of listRef is present more than once in listA. The other values can be present more than once.
List<string> listA = new List<string> { "A", "A", "B", "C", "D", "E" };
List<string> listRef = new List<string> { "B", "D" };
Thanks,
Try this:
bool hasRef = listRef.Any(r => listA.Count(a => a == r) > 1);
I would use ToLookup method to generate Lookup<string, string> first, and then use it to check your condition:
var lookup = listA.ToLookup(x => x);
return listRef.Any(x => lookup.Contains(x) && lookup[x].Count() > 1);
You could use GroupBy and ToDictionary to achieve the same:
var groups = listA.GroupBy(x => x).ToDictionary(g => g.Key, g => g.Count());
return listRef.Any(x => groups.ContainsKey(x) && groups[x] > 1);
something like this
var query = listRef.Where(x=>
listA.Where(a => a == x)
.Skip(1)
.Any());
listRef.ForEach(refEl => {
var count = listA.Count(aEl => aEl == refEl);
if(count > 1) {
//Do something
}
});
Finding the best performing option in this case is not simple because that depends on the number of items in the lists and the expected result.
Here's a way to do it that is performant in the face of big lists:
var appearances = listA.GroupBy(s => s)
.Where(g => g.Count() > 1)
.ToDictionary(g => g.Key, g => g.Count());
var hasItemAppearingMoreThanOnce = listRef.Any(r => appearances.ContainsKey(r));
this works
List<string> listA = new List<string> { "A", "A", "B", "C", "D", "E" };
List<string> listRef = new List<string> { "A", "D" };
foreach (var item in listRef)
{
if (listA.Where(x => x.Equals(item)).Count() > 1)
{
//item is present more than once
}
}
this can be another way to do
List<string> listA = new List<string> { "A", "A", "B", "C", "D", "E" , "D" };
List<string> listRef = new List<string> { "B", "D" };
var duplicates = listA.GroupBy(s => s).SelectMany(grp => grp.Skip(1));
var newData = duplicates.Select(i => i.ToString()).Intersect(listRef);
var result = listA.GroupBy(x=>x)
.Where(g=>g.Count()>1&&listRef.Contains(g.Key))
.Select(x=>x.First());
bool a = result.Any();
If the second list is large and can contain duplicates, I would use a HashSet<string> and IntersectWith to remove duplicates, as well as strings that are not in the first list, from the second:
var refSet = new HashSet<string>(listRef);
refSet.IntersectWith(listA);
bool anyMoreThanOne = refSet.Any(rs => listA.ContainsMoreThanOnce(rs, StringComparison.OrdinalIgnoreCase));
Here is the extension method, which is not very elegant but works:
public static bool ContainsMoreThanOnce(this IEnumerable<string> coll, String value, StringComparison comparer)
{
if (coll == null) throw new ArgumentNullException("coll");
bool contains = false;
foreach (string str in coll)
{
if (String.Compare(value, str, comparer) == 0)
{
if (contains)
return true;
else
contains = true;
}
}
return false;
}
However, if the second listRef isn't large or doesn't contain duplicates you can just use:
bool anyMoreThanOne = listRef
.Any(rs => listA.ContainsMoreThanOnce(rs, StringComparison.OrdinalIgnoreCase));
