Remove duplicate items and calculate average values using LINQ - c#

For example, I have a list of objects (properties: Name and Value):
item1 20;
item2 30;
item1 50;
I want the result:
item1 35 (20+50)/2
item2 30
How can I do this?
Sorry guys, duplicate is based on item.Name.

var results =
    from kvp in source
    group kvp by kvp.Key.ToUpper() into g
    select new
    {
        Key = g.Key,
        Value = g.Average(kvp => kvp.Value)
    };
or
var results = source.GroupBy(c => c.Name)
    .Select(c => new { c.Key, Average = c.Average(d => d.Value) });

You could do it using average and group by:
public class myObject
{
public string Name {get;set;}
public double Value {get;set;}
}
var testData = new List<myObject>() {
new myObject() { Name = "item1", Value = 20 },
new myObject() { Name = "item2", Value = 30 },
new myObject() { Name = "item1", Value = 50 }
};
var result = from x in testData
group x by x.Name into grp
select new myObject() {
Name=grp.Key,
Value= grp.Average(obj => obj.Value)
};
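As a variation on the answers above, here is a minimal sketch that treats names differing only in case as duplicates by passing a comparer to GroupBy instead of calling ToUpper. The tuple-based sample data and the "ITEM1" spelling are assumptions made for illustration, and the snippet is written as C# 9+ top-level statements to stay short.

```csharp
using System;
using System.Linq;

// Sample data from the question; the upper-case "ITEM1" is an assumption
// added to show the case-insensitive grouping.
var source = new[]
{
    (Name: "item1", Value: 20.0),
    (Name: "item2", Value: 30.0),
    (Name: "ITEM1", Value: 50.0),
};

// The comparer decides which names count as duplicates;
// each group's key is the first spelling encountered.
var averages = source
    .GroupBy(x => x.Name, StringComparer.OrdinalIgnoreCase)
    .Select(g => (Name: g.Key, Average: g.Average(x => x.Value)))
    .ToList();

foreach (var (name, average) in averages)
    Console.WriteLine($"{name} {average}");
```

With a comparer the original spelling of the key is preserved, whereas ToUpper changes the key that ends up in the result.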

Related

group and combine items using LINQ

public class Product
{
public string Code { get; set; }
public string Name { get; set; }
public string Amount { get; set; }
}
public List<Product> Products = new List<Product>()
{
new Product() { Code="A1", Name="Vacuum", Amount = "10" },
new Product() { Code="A2", Name="Iron", Amount = "20" },
new Product() { Code="A3", Name="Kettle", Amount = "13" },
new Product() { Code="A2", Name="Microwave", Amount = "11" },
new Product() { Code="A3", Name="Dryer", Amount = "3" }
};
I need to select all products without duplicate codes. Products with the same code should be combined into one line; in that case the names and amounts should be separated by commas. How can I modify the following code to make it more elegant?
var list1 = new List<Product>();
var gl = Products.GroupBy(x => x.Code).Where(x => x.Count() > 1);
gl.ToList().ForEach(x => list1.AddRange(x));
var list2 = Products.Where(x => !list1.Contains(x)).ToList(); // uniq values
var list3 = gl.Select(x =>
{
var p = new Product() { Code = x.Key };
p.Name = string.Join(",", x.ToList().Select(r => r.Name).Distinct());
p.Amount = string.Join(",", x.ToList().Select(r => r.Amount).Distinct());
return p;
}).ToList();
list2.AddRange(list3);
list2.ForEach(x =>
{
Console.WriteLine($"{x.Code.PadRight(20)},{x.Name.PadRight(20)},{x.Amount.PadRight(20)}");
});
the result should be:
Code Name Amount
A1 Vacuum 10
A2 Iron, Microwave 20, 11
A3 Kettle, Dryer 13, 3
Use GroupBy on Code, then iterate over it to get the individual elements to combine using string.Join().
var results = Products
.GroupBy
(
p => p.Code
)
.Select
(
g => new
{
Code = g.Key,
Name = string.Join
(
",",
g.Select( p => p.Name )
),
Amount = string.Join
(
",",
g.Select( p => p.Amount.ToString() )
)
}
);
Output:
A1 Vacuum 10
A2 Iron,Microwave 20,11
A3 Kettle,Dryer 13,3
Link to working example on DotNetFiddle
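The question's expected output separates values with ", " and the original code also removes repeated names and amounts with Distinct. A minimal sketch that keeps both details, assuming tuples in place of the Product class and C# 9+ top-level statements:

```csharp
using System;
using System.Linq;

// Same sample data as the question, held in tuples for brevity (an assumption).
var products = new[]
{
    (Code: "A1", Name: "Vacuum", Amount: "10"),
    (Code: "A2", Name: "Iron", Amount: "20"),
    (Code: "A3", Name: "Kettle", Amount: "13"),
    (Code: "A2", Name: "Microwave", Amount: "11"),
    (Code: "A3", Name: "Dryer", Amount: "3"),
};

// One row per code; names and amounts are de-duplicated, then joined.
var combined = products
    .GroupBy(p => p.Code)
    .Select(g => (
        Code: g.Key,
        Name: string.Join(", ", g.Select(p => p.Name).Distinct()),
        Amount: string.Join(", ", g.Select(p => p.Amount).Distinct())))
    .ToList();

foreach (var row in combined)
    Console.WriteLine($"{row.Code,-6}{row.Name,-20}{row.Amount}");
```

Because a group with a single element joins to that element unchanged, there is no need to handle unique and duplicated codes separately as the question's code does.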

Get all combinations of a list grouped by name

I have the following list of TestParam... This is just a parameter list that is going to determine how a query is run. In the following case, the expected result is for the query to be executed against all combinations of the different parameters. Hence, a list of lists, with CustomerId 33 together with each ProductId available in the list...
List<TestParam> testList = new List<TestParam>();
testList.Add(new TestParam() { Name = "CustomerId", Value = "33" });
testList.Add(new TestParam() { Name = "ProductId", Value = "1" });
testList.Add(new TestParam() { Name = "ProductId", Value = "2" });
testList.Add(new TestParam() { Name = "ProductId", Value = "3" });
testList.Add(new TestParam() { Name = "ProductId", Value = "4" });
testList.Add(new TestParam() { Name = "ProductId", Value = "5" });
testList.Add(new TestParam() { Name = "ProductId", Value = "6" });
testList.Add(new TestParam() { Name = "ProductId", Value = "7" });
testList.Add(new TestParam() { Name = "ProductId", Value = "8" });
TestParam is a normal encapsulated parameter class having a name and a value...
public class TestParam
{
public string Name { get; set; }
public string Value { get; set; }
}
The end result would be a list of lists, having CustomerId 33, with all the rest of the products. The same result would be acquired if I had different names and values in the list of TestParam (the above is just an example).
The following code, ends up with several lists depending on the combinations of the list above...
// First get a list of distinct unique param collections...
List<string> distinctParameterNames = new List<string>();
testList.GroupBy(x => x.Name).ForEach(paramName => {
distinctParameterNames.Add(paramName.Key);
});
// Get counts
List<int> combinationList = new List<int>();
foreach (var x in distinctParameterNames) {
combinationList.Add(testList.Where(y=>y.Name == x).Count());
}
// Will contain 2 lists, one having all combinations of parameters named CustomerId, and another with ProductId combinations...
List<List<TestParam>> parameterList = new List<List<TestParam>>();
foreach (var x in distinctParameterNames) {
// Loop
List<TestParam> parameter = new List<TestParam>();
testList.Where(paramName => paramName.Name == x).ForEach(y =>
{
parameter.Add(new TestParam() { Name = y.Name, Value = y.Value });
});
parameterList.Add(parameter);
}
It would be an intersect between the list, and the end result will be a list of lists, and each list will have the combinations below... So a run would return (in this case) :
Customer 33, Product Id 1
Customer 33, Product Id 2
Customer 33, Product Id 3
Customer 33, Product Id 4
Customer 33, Product Id 5
Customer 33, Product Id 6
Customer 33, Product Id 7
Customer 33, Product Id 8
What would be the most efficient and generic way to do this?
The following is the solution that I was looking for...
public static List<List<T>> AllCombinationsOf<T>(params List<T>[] sets)
{
// need array bounds checking etc for production
var combinations = new List<List<T>>();
// prime the data
foreach (var value in sets[0])
combinations.Add(new List<T> { value });
foreach (var set in sets.Skip(1))
combinations = AddExtraSet(combinations, set);
return combinations;
}
private static List<List<T>> AddExtraSet<T>
(List<List<T>> combinations, List<T> set)
{
var newCombinations = from value in set
from combination in combinations
select new List<T>(combination) { value };
return newCombinations.ToList();
}
Usage (continues with my code snippet of the question itself) :
var intersection = AllCombinationsOf(parameterList.ToArray());
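First, get all the customers and all the products:
var customers = testList.Where(a => a.Name == "CustomerId");
var products = testList.Where(a => a.Name == "ProductId");
Then loop over the customers and, inside that loop, over the products:
// CustomerProducts is assumed to be a class with two string properties,
// Customer and Product.
var customerProducts = new List<CustomerProducts>();
foreach (var c in customers)
{
    foreach (var p in products)
    {
        // Add one entry per customer/product pair to the list.
        customerProducts.Add(new CustomerProducts
        {
            Customer = c.Name + " " + c.Value,
            Product = p.Name + " " + p.Value
        });
    }
}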
The list needs to be grouped by Name, then it can be joined several times depending on count of groups:
var groups = testList.GroupBy(_ => _.Name);
IEnumerable<IEnumerable<TestParam>> result = null;
foreach (var g in groups)
{
var current = g.Select(_ => new[] { _ });
if (result == null)
{
result = current;
continue;
}
result = result.Join(current, _ => true, _ => true, (actual, c) => actual.Concat(c));
}
// check result
foreach (var i in result)
{
Console.WriteLine(string.Join(", ", i.Select(_ => string.Format("{0}-{1}", _.Name, _.Value))));
}
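The group-then-cross-join idea from the answers above can also be written as a single Aggregate over the groups. This is a different formulation, not any of the answerers' code: a sketch assuming tuples in place of TestParam, a shortened sample list, C# 9+ top-level statements, and a runtime where Enumerable.Append is available.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Shortened version of the question's list, held in tuples (an assumption).
var testList = new List<(string Name, string Value)>
{
    ("CustomerId", "33"),
    ("ProductId", "1"),
    ("ProductId", "2"),
    ("ProductId", "3"),
};

// Start with a single empty combination...
IEnumerable<IEnumerable<(string Name, string Value)>> seed =
    new[] { Enumerable.Empty<(string Name, string Value)>() };

// ...then extend every combination with every value of the next group.
var combinations = testList
    .GroupBy(p => p.Name)
    .Aggregate(seed, (acc, grp) =>
        from combination in acc
        from param in grp
        select combination.Append(param))
    .Select(c => c.ToList())
    .ToList();

foreach (var c in combinations)
    Console.WriteLine(string.Join(", ", c.Select(p => $"{p.Name} {p.Value}")));
```

The number of combinations is the product of the group sizes, so the result grows quickly as more parameter names with several values are added.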

How to intersect list in c#?

I have following list of Item objects in c#:
public class Item
{
    public int Id { get; set; }
    public int Code { get; set; }
    public List<string> Orders { get; set; }
}
List<Item> item = new List<Item>() {
new Item() { Id = 1, Code = 23, Orders = new List<string>() { "A", "B" }},
new Item() { Id = 2, Code = 24, Orders = new List<string>() { "C", "D" }},
new Item() { Id = 1, Code = 23, Orders = new List<string>() { "E", "F" }},
new Item() { Id = 3, Code = 25, Orders = new List<string>() { "G", "H" }}
};
I want to concatenate the Orders of items whose Id is the same, so the output for the above list should be:
{
    new Item() { Id = 1, Code = 23, Orders = new List<string>() { "A", "B", "E", "F" } },
    new Item() { Id = 2, Code = 24, Orders = new List<string>() { "C", "D" } },
    new Item() { Id = 3, Code = 25, Orders = new List<string>() { "G", "H" } }
};
How can I do this efficiently in C# (using LINQ if possible)?
You want to group the items based on their ID, and then create a new sequence from all of the Orders in that group.
var query = items.GroupBy(item => item.Id)
.Select(group => new Item
{
Id = group.Key,
Orders = group.SelectMany(item => item.Orders).ToList()
});
Note that this is not the intersection of any data. You're getting the union of all data within each group.
It appears what you want is something like this:
var output = items.GroupBy(i => i.Id)
    .Select(g => new Item()
    {
        Id = g.Key,
        Orders = g.SelectMany(i => i.Orders)
            .ToList()
    });
Or in query syntax:
var output =
    from i in items
    group i by i.Id into g
    select new Item()
    {
        Id = g.Key,
        Orders = g.SelectMany(i => i.Orders).ToList()
    };
You can group your items by their Id, then create a new item for each Id with the orders concatenated:
item.GroupBy(x => x.Id)
.Select(x => new Item
{
Id = x.Key,
Orders = x.SelectMany(a => a.Orders).ToList()
}).ToList();
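The answers above group on Id only, so the Code value from the question's data is not carried into the result. A minimal sketch that keeps it by grouping on a composite key, assuming Code is always the same for a given Id (as in the sample data), with tuples in place of the Item class and C# 9+ top-level statements:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample data from the question, held in tuples (an assumption).
var items = new[]
{
    (Id: 1, Code: 23, Orders: new List<string> { "A", "B" }),
    (Id: 2, Code: 24, Orders: new List<string> { "C", "D" }),
    (Id: 1, Code: 23, Orders: new List<string> { "E", "F" }),
    (Id: 3, Code: 25, Orders: new List<string> { "G", "H" }),
};

// Group on (Id, Code) so both values are available from the key,
// then flatten every Orders list in the group into one list.
var merged = items
    .GroupBy(i => (i.Id, i.Code))
    .Select(g => (
        Id: g.Key.Id,
        Code: g.Key.Code,
        Orders: g.SelectMany(i => i.Orders).ToList()))
    .ToList();

foreach (var m in merged)
    Console.WriteLine($"{m.Id} {m.Code} {string.Join(",", m.Orders)}");
```

If two items could share an Id but differ in Code, this key would put them in separate groups, so grouping on Id alone and picking one Code would be the alternative.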

how to get a SUM in Linq?

I have a List of a class that contains two integers, id and count.
Now I want to write the following LINQ query:
get the sum of count for each id.
There can be several items with the same id, so they should be summed, e.g.:
id=1, count=12
id=2, count=1
id=1, count=2
should be:
id=1 -> sum 14
id=2 -> sum 1
How can I do this?
Group the items by Id and then sum the Counts in each group:
var result = items.GroupBy(x => x.Id)
.Select(g => new { Id = g.Key, Sum = g.Sum(x => x.Count) });
Try this (note that it returns only the sums, without the ids):
items.GroupBy(x => x.id)
    .Select(n => n.Sum(m => m.count));
The following program...
using System;
using System.Linq;
struct Item {
public int Id;
public int Count;
}
class Program {
static void Main(string[] args) {
var items = new [] {
new Item { Id = 1, Count = 12 },
new Item { Id = 2, Count = 1 },
new Item { Id = 1, Count = 2 }
};
var results =
from item in items
group item by item.Id
into g
select new { Id = g.Key, Count = g.Sum(item => item.Count) };
foreach (var result in results) {
Console.Write(result.Id);
Console.Write("\t");
Console.WriteLine(result.Count);
}
}
}
...prints:
1 14
2 1
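If the sums are needed for lookups by id rather than for printing, the grouped result can be turned into a dictionary directly. A minimal sketch, assuming tuples in place of the Item type and C# 9+ top-level statements:

```csharp
using System;
using System.Linq;

// Sample data from the question, held in tuples (an assumption).
var items = new[]
{
    (Id: 1, Count: 12),
    (Id: 2, Count: 1),
    (Id: 1, Count: 2),
};

// Key: the id; value: the sum of Count over all items with that id.
var sumById = items
    .GroupBy(x => x.Id)
    .ToDictionary(g => g.Key, g => g.Sum(x => x.Count));

Console.WriteLine(sumById[1]); // 14
Console.WriteLine(sumById[2]); // 1
```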

Doing pivot with LINQ

I've got this problem: I have a CSV file in the following format (customer, bought item pair):
customer1 item1
customer1 item2
customer1 item3
customer2 item4
customer2 item2
customer3 item5
customer3 item1
customer3 item2
customer4 item1
customer4 item2
customer5 item5
customer5 item1
Now, I wish to show in the query results:
item x; item y; how many customers have bought item x and item y together
For example:
item1 item2 3 (because cust1, cust3 and cust4 bought item1 and item2 together)
item1 item5 2 (because cust3 and cust5 bought item1 and item5 together)
The query should return all possible combinations of items that customers have bought in pairs. Also notice that Pair(x, y) is the same as Pair(y, x).
An SQL query would look like this:
SELECT a1.item_id, a2.item_id, COUNT(a1.cust_id) AS how_many_custs_bought_both
FROM data AS a1
INNER JOIN data AS a2
ON a2.cust_id=a1.cust_id AND a2.item_id<>a1.item_id AND a1.item_id<a2.item_id
GROUP BY a1.item_id, a2.item_id
How would you do that in C# 1) using regular for/foreach loops and 2) using LINQ?
I tried doing it in LINQ first but got stuck when I noticed that LINQ doesn't support multiple equals keywords in a join clause. Then I tried using normal loops; however, it became so inefficient that it could only process about 30 lines (of CSV file rows) per second.
Please advise!
Using LINQ (and following the first 5 lines from Tim's answer) combining the chained method syntax with the query syntax for the join part:
var custItems = new [] {
new { customer = 1, item = 1 },
new { customer = 1, item = 2 },
new { customer = 1, item = 3 },
new { customer = 2, item = 4 },
new { customer = 2, item = 2 },
new { customer = 3, item = 5 },
new { customer = 3, item = 1 },
new { customer = 3, item = 2 },
new { customer = 4, item = 1 },
new { customer = 4, item = 2 },
new { customer = 5, item = 5 },
new { customer = 5, item = 1 }
};
var pairs = custItems.GroupBy(x => x.customer)
.Where(g => g.Count() > 1)
.Select(x => (from a in x.Select( y => y.item )
from b in x.Select( y => y.item )
where a < b //If you want to avoid duplicate (a,b)+(b,a)
// or just: where a != b, if you want to keep the dupes.
select new { a, b}))
.SelectMany(x => x)
.GroupBy(x => x)
.Select(g => new { Pair = g.Key, Count = g.Count() })
.ToList();
pairs.ForEach(x => Console.WriteLine(x));
EDIT: Forgot that OP wanted the pair occurrence count, added another .GroupBy() magic.
EDIT: Completed the example to show what it would output:
{ Pair = { a = 1, b = 2 }, Count = 3 }
{ Pair = { a = 1, b = 3 }, Count = 1 }
{ Pair = { a = 2, b = 3 }, Count = 1 }
{ Pair = { a = 2, b = 4 }, Count = 1 }
{ Pair = { a = 1, b = 5 }, Count = 2 }
{ Pair = { a = 2, b = 5 }, Count = 1 }
EDIT: rolled back and changed strings to integers, as OP shows a dataset with integers as IDs, and that removes the need for .GetHashCode()
Perhaps:
var lines = File.ReadLines(csvFilePath);
var custItems = lines
.Select(l => new { split = l.Split() })
.Select(x => new { customer = x.split[0].Trim(), item = x.split[1].Trim() })
.ToList();
var groups = from ci1 in custItems
join ci2 in custItems
on ci1.customer equals ci2.customer
where ci1.item != ci2.item
group new { Item1 = ci1.item, Item2 = ci2.item } by new { Item1 = ci1.item, Item2 = ci2.item } into ItemGroup
select ItemGroup;
var result = groups.Select(g => new
{
g.Key.Item1,
g.Key.Item2,
how_many_custs_bought_both = g.Count()
});
Note that the materialization with ToList is important when the file is large because of the self-join.
Output:
{ Item1 = item1, Item2 = item2, how_many_custs_bought_both = 3 }
{ Item1 = item1, Item2 = item3, how_many_custs_bought_both = 1 }
{ Item1 = item2, Item2 = item1, how_many_custs_bought_both = 3 }
{ Item1 = item2, Item2 = item3, how_many_custs_bought_both = 1 }
{ Item1 = item3, Item2 = item1, how_many_custs_bought_both = 1 }
{ Item1 = item3, Item2 = item2, how_many_custs_bought_both = 1 }
{ Item1 = item4, Item2 = item2, how_many_custs_bought_both = 1 }
{ Item1 = item2, Item2 = item4, how_many_custs_bought_both = 1 }
{ Item1 = item5, Item2 = item1, how_many_custs_bought_both = 2 }
{ Item1 = item5, Item2 = item2, how_many_custs_bought_both = 1 }
{ Item1 = item1, Item2 = item5, how_many_custs_bought_both = 2 }
{ Item1 = item2, Item2 = item5, how_many_custs_bought_both = 1 }
You can write something like this:
IDictionary<int, int> pivotResult = customerItems.ToLookup(c => c.Customer)
.ToDictionary(x=>x.Key, y=>y.Count());
Working LINQ example, not too pretty!
using System;
using System.Collections.Generic;
using System.Linq;
class Data
{
public Data(int cust, int item)
{
item_id = item;
cust_id = cust;
}
public int item_id { get; set; }
public int cust_id { get; set; }
static void Main(string[] args)
{
var data = new List<Data>
{new Data(1,1),new Data(1,2),new Data(1,3),
new Data(2,4),new Data(2,2),new Data(3,5),
new Data(3,1),new Data(3,2),new Data(4,1),
new Data(4,2),new Data(5,5),new Data(5,1)};
(from a1 in data
from a2 in data
where a2.cust_id == a1.cust_id && a2.item_id != a1.item_id && a1.item_id < a2.item_id
group new {a1, a2} by new {item1 = a1.item_id, item2 = a2.item_id}
into g
select new {g.Key.item1, g.Key.item2, count = g.Count()})
.ToList()
.ForEach(x=>Console.WriteLine("{0} {1} {2}",x.item1,x.item2,x.count))
;
Console.Read();
}
}
Output:
1 2 3
1 3 1
2 3 1
2 4 1
1 5 2
2 5 1
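The question also asks for a version using regular for/foreach loops, which none of the answers above show. A minimal sketch, assuming integer ids as in the answers, tuples for the rows, and C# 9+ top-level statements: it first collects each customer's items, then counts each (smaller, larger) pair once per customer in a dictionary.

```csharp
using System;
using System.Collections.Generic;

// Same (customer, item) rows as the answers above.
var rows = new (int Customer, int Item)[]
{
    (1, 1), (1, 2), (1, 3), (2, 4), (2, 2), (3, 5),
    (3, 1), (3, 2), (4, 1), (4, 2), (5, 5), (5, 1),
};

// Step 1: collect each customer's distinct items, kept sorted.
var itemsByCustomer = new Dictionary<int, SortedSet<int>>();
foreach (var (customer, item) in rows)
{
    if (!itemsByCustomer.TryGetValue(customer, out var bought))
        itemsByCustomer[customer] = bought = new SortedSet<int>();
    bought.Add(item);
}

// Step 2: for each customer, count every (smaller, larger) pair once.
var pairCounts = new Dictionary<(int, int), int>();
foreach (var customerItems in itemsByCustomer.Values)
{
    var list = new List<int>(customerItems);
    for (int i = 0; i < list.Count; i++)
        for (int j = i + 1; j < list.Count; j++)
        {
            var key = (list[i], list[j]);
            pairCounts.TryGetValue(key, out var n); // n is 0 if the pair is new
            pairCounts[key] = n + 1;
        }
}

foreach (var entry in pairCounts)
    Console.WriteLine($"{entry.Key.Item1} {entry.Key.Item2} {entry.Value}");
```

The work is proportional to the number of rows plus the number of item pairs per customer, rather than comparing every row with every other row. Because the sets drop repeated rows, a customer who bought the same item twice is still counted once per pair, which differs from the SQL query if the file contains duplicate rows.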
