Linq & C# - Inserting distinct data from one class into another

Afternoon all,
I have a web service call that returns a list; we'll call it List<Everything>.
It returns something along the lines of:
Product ProductName SomethingElse
1 Dave abc
1 Dave def
1 Dave ghi
2 Jason abc
2 Jason def
3 Terry abc
3 Terry def
3 Terry ghi
3 Terry jkl
I then have another List<Products> (int Product, string ProductName) that I would like to populate using the distinct product information in List<Everything>.
So I'm trying to get the following result:
Product ProductName
1 Dave
2 Jason
3 Terry
How can I achieve this using Linq?
Apologies for what is probably bloody obvious.

List<Products> products = (from x in everythingList
group x by new { x.Product, x.ProductName } into xg
select new Products
{
Product = xg.Key.Product,
ProductName = xg.Key.ProductName
}).ToList();

var distinctProducts = everything.Select(e => new { e.Product, e.ProductName }).Distinct();

How about this:
List<Everything> items = ...
var results = items.GroupBy(x => new { x.Product, x.ProductName })
.Select(g => new Products()
{
Product = g.Key.Product,
ProductName = g.Key.ProductName
})
.ToList();

You could try to do it like this.
List<Products> products = new List<Products>();
// everythingList is the data returned from your web service
var listEverything = (from d in everythingList
select new { d.Product, d.ProductName }).Distinct();
foreach(var item in listEverything)
{
products.Add(new Products { Product=item.Product, ProductName=item.ProductName});
}

List<Products> products = GetEverythingService()
.Select(p => new Products { Product = p.Product, ProductName = p.ProductName})
.Distinct()
.ToList();
As rightly pointed out by dlev in the comments, Distinct will only work in this case if the Products class overrides the Equals and GetHashCode methods (optionally implementing IEquatable<Products>), or if an IEqualityComparer<Products> is passed to Distinct.
This could be overkill if a comparison is only required in this one situation, although if product object comparisons are to be carried out elsewhere using the id and product name then it is a viable option. I personally find it a bit more readable than the GroupBy LINQ extension, but obviously opinions will vary on this.
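A minimal sketch of what that could look like, assuming Products is a plain class with the two properties from the question. The Everything class, the DistinctDemo helper and the sample rows are hypothetical stand-ins for the web service result, not code from the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the items returned by the web service.
public class Everything
{
    public int Product { get; set; }
    public string ProductName { get; set; }
    public string SomethingElse { get; set; }
}

public class Products : IEquatable<Products>
{
    public int Product { get; set; }
    public string ProductName { get; set; }

    // Two Products are equal when both the id and the name match.
    public bool Equals(Products other)
    {
        return other != null
            && Product == other.Product
            && ProductName == other.ProductName;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Products);
    }

    // Equal objects must produce equal hash codes, so hash the same two fields.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + Product.GetHashCode();
            hash = hash * 31 + (ProductName == null ? 0 : ProductName.GetHashCode());
            return hash;
        }
    }
}

public static class DistinctDemo
{
    public static List<Products> DistinctProducts(IEnumerable<Everything> source)
    {
        return source
            .Select(e => new Products { Product = e.Product, ProductName = e.ProductName })
            .Distinct() // relies on Products.Equals / GetHashCode
            .ToList();
    }

    public static void Main()
    {
        var everything = new List<Everything>
        {
            new Everything { Product = 1, ProductName = "Dave", SomethingElse = "abc" },
            new Everything { Product = 1, ProductName = "Dave", SomethingElse = "def" },
            new Everything { Product = 2, ProductName = "Jason", SomethingElse = "abc" },
            new Everything { Product = 3, ProductName = "Terry", SomethingElse = "abc" },
            new Everything { Product = 3, ProductName = "Terry", SomethingElse = "jkl" }
        };

        foreach (var p in DistinctProducts(everything))
        {
            Console.WriteLine(p.Product + " " + p.ProductName);
        }
    }
}
```

With the sample rows above, the result contains one entry each for Dave, Jason and Terry. Without the Equals and GetHashCode overrides, Distinct would compare references and keep all five projected objects.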

You can try this.
var unique = list.GroupBy(item => item.Product)
.Select(group => new
{
Product = group.Key,
group.First().ProductName
})
.ToList();

Related

c# - list of objects - group by - get distinct values by key - lambda / linq

I'm trying to get all keys that have identical values.
data:
public class CustItems
{
public string CustID { get; set; }
public string ItemID { get; set; }
}
List<CustItems> custItems = new List<CustItems>();
// GetData => fill list
custItems.Add(new CustItems { CustID = "1", ItemID = "1" });
No of items: 50'000,
No of customers: 2'000
The base list contains 2 fields; it records which customer can buy which item:
CustID ItemID
1 1
1 2
2 2
3 2
4 1
5 1
5 2
1 3
4 3
5 3
I'm trying to find out which items can be bought by the same customers.
According to the demo data:
item 1 by customers 1,4,5
item 2 by customers 1,2,3,5
item 3 by customers 1,4,5
So items 1 and 3 can be bought by the same customers.
I couldn't find out how to solve this in a performant way using lambdas or LINQ.
I'd appreciate any hint very much! Thanks a lot!
P.S.
I started with something like:
var groupedList = from c in custItems
group c by c.ItemID into grp
select new
{
ID = grp.Key,
CustList = grp.Select(g => g.CustID).ToList()
};
After that, CustList contains all customers per key (ItemID), but I couldn't find a good way to find out which of the keys (= items) have identical values (= CustList).

Since your CustID and ItemID are strings (not very optimal performance-wise), I came up with the following LINQ solution:
var res = custItems
.GroupBy(s => s.ItemID)
.Select(g => new { ItemId = g.Key, Customers = g.Select(i => i.CustID).OrderBy(c => c).Aggregate((c0, c1) => $"{c0},{c1}") })
.GroupBy(g => g.Customers)
.Select(g => new { Customers = g.Key.Split(',').ToList(), Items = g.Select(i => i.ItemId).ToList() })
.ToList();
First, you group your list by ItemID to find all the customers that buy each individual item.
You then create an anonymous type containing the ItemID and the set of CustIDs. I've used string concatenation here to convert the set of IDs into a single value that can be used for further grouping; this is the first spot for improvement.
Then you group the results by the CustID sets.
In the end you turn the CustID sets back into lists of IDs and store them in an anonymous type containing the list of CustIDs and the list of ItemIDs that this set of customers buys.
Finally, you convert everything into a list for structured browsing.
Again, combining and splitting the customers (2nd and 4th steps) is what can be optimised.
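The steps above can be sketched as a self-contained program using the demo data from the question. This variant swaps Aggregate for string.Join and keeps the joined string as the dictionary key instead of splitting it again; the SharedCustomersDemo class and the ItemsByCustomerSet method are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class CustItems
{
    public string CustID { get; set; }
    public string ItemID { get; set; }
}

public static class SharedCustomersDemo
{
    // Maps a sorted, comma-joined customer list to the items bought by exactly those customers.
    public static Dictionary<string, List<string>> ItemsByCustomerSet(IEnumerable<CustItems> custItems)
    {
        return custItems
            .GroupBy(ci => ci.ItemID)
            .Select(g => new
            {
                ItemId = g.Key,
                // Sort the customer ids so that equal sets always give the same key.
                CustomerKey = string.Join(",",
                    g.Select(ci => ci.CustID).Distinct().OrderBy(c => c, StringComparer.Ordinal))
            })
            .GroupBy(x => x.CustomerKey, x => x.ItemId)
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    public static void Main()
    {
        // Demo data from the question as (CustID, ItemID) pairs.
        string[,] demo =
        {
            { "1", "1" }, { "1", "2" }, { "2", "2" }, { "3", "2" }, { "4", "1" },
            { "5", "1" }, { "5", "2" }, { "1", "3" }, { "4", "3" }, { "5", "3" }
        };

        var custItems = new List<CustItems>();
        for (int i = 0; i < demo.GetLength(0); i++)
        {
            custItems.Add(new CustItems { CustID = demo[i, 0], ItemID = demo[i, 1] });
        }

        foreach (var entry in ItemsByCustomerSet(custItems))
        {
            Console.WriteLine("customers " + entry.Key + " -> items " + string.Join(",", entry.Value));
        }
    }
}
```

For the demo data, customers 1,4,5 map to items 1 and 3, and customers 1,2,3,5 map to item 2. A joined string is only safe as a key while the ids cannot themselves contain the separator.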

Selecting unique values of different columns using LINQ

I have a table (orders for ex) which has Multiple Columns.
products categories subcategories
--------------------------------------
prod1 cat1 sub1
prod1 cat2 sub2
prod2 cat3 sub6
prod1 cat1 sub1
prod5 cat2 sub8
prod2 cat1 sub1
prod1 cat7 sub3
prod8 cat2 sub2
prod2 cat3 sub1
Now I can write three different queries to get distinct values
var prod = (from p in _context.orders select p.products).ToList().Distinct();
similarly I can write it for others.
Now I need to get the distinct values of each column in a single query for which the result needs to look like
products categories subcategories
--------------------------------------
prod1 cat1 sub1
prod2 cat2 sub2
prod5 cat3 sub6
prod8 cat7 sub8
sub3
My ClassType for unique fields looks like this
public class UniqueProductFields
{
public IEnumerable<string> Products { get; set; }
public IEnumerable<string> Categories { get; set; }
public IEnumerable<string> Subcategories { get; set; }
}
Not sure how to do this in an efficient manner so that I don't have to write three methods. The table is in the database (hence the need for optimization).
Thanks!

Is it an absolutely unchangeable requirement to use Linq? Why do you need it to be returned in a single query?
Suggestion: Use SQL. It can be done in a single query but you won't like the query. I'm assuming SQL Server (can be done differently for other DBMSes).
WITH V AS (
SELECT DISTINCT
V.*
FROM
Orders O
CROSS APPLY (
VALUES (1, O.Products), (2, O.Categories), (3, O.Subcategories)
) V (Which, Value)
),
Nums AS (
SELECT
Num = Row_Number() OVER (PARTITION BY V.Which ORDER BY V.Value),
V.Which,
V.Value
FROM
V
)
SELECT
Products = P.[1],
Categories = P.[2],
Subcategories = P.[3]
FROM
Nums N
PIVOT (Max(N.Value) FOR N.Which IN ([1], [2], [3])) P
;
See this working at db<>fiddle
Output:
Products Categories Subcategories
-------- ---------- -------------
prod1 cat1 sub1
prod2 cat2 sub2
prod5 cat3 sub3
prod8 cat7 sub6
null null sub8
If you are bound and determined to use Linq, well, I can't help you with the query-style syntax. I only know the C# code style syntax, but here's a stab at that. Unfortunately, I don't think this will do you any good, because I had to use some pretty funky stuff to make it work. It uses essentially the same technique as the SQL query above, only, there's no equivalent of PIVOT in Linq and there's no real natural row object other than a custom class.
using System;
using System.Collections.Generic;
using System.Linq;
public class Program {
public static void Main() {
var data = new List<Order> {
new Order("prod1", "cat1", "sub1"),
new Order("prod1", "cat2", "sub2"),
new Order("prod2", "cat3", "sub6"),
new Order("prod1", "cat1", "sub1"),
new Order("prod5", "cat2", "sub8"),
new Order("prod2", "cat1", "sub1"),
new Order("prod1", "cat7", "sub3"),
new Order("prod8", "cat2", "sub2"),
new Order("prod2", "cat3", "sub1")
};
int max = 0;
var items = data
.SelectMany(o => new List<KeyValuePair<int, string>> {
new KeyValuePair<int, string>(1, o.Products),
new KeyValuePair<int, string>(2, o.Categories),
new KeyValuePair<int, string>(3, o.Subcategories)
})
.Distinct()
.GroupBy(d => d.Key)
.Select(g => {
var l = g.Select(d => d.Value).ToList();
max = Math.Max(max, l.Count);
return l;
})
.ToList();
Enumerable
.Range(0, max)
.Select(i => new {
p = items[0].ItemAtOrDefault(i, null),
c = items[1].ItemAtOrDefault(i, null),
s = items[2].ItemAtOrDefault(i, null)
})
.ToList()
.ForEach(row => Console.WriteLine($"p: {row.p}, c: {row.c}, s: {row.s}"));
}
}
public static class ListExtensions {
public static T ItemAtOrDefault<T>(this List<T> list, int index, T defaultValue)
=> index >= list.Count ? defaultValue : list[index];
}
public class Order {
public Order(string products, string categories, string subcategories) {
Products = products;
Categories = categories;
Subcategories = subcategories;
}
public string Products { get; set; }
public string Categories { get; set; }
public string Subcategories { get; set; }
}
I suppose that we could swap this
.Select(i => new {
p = items[0].ItemAtOrDefault(i, null),
c = items[1].ItemAtOrDefault(i, null),
s = items[2].ItemAtOrDefault(i, null)
})
for this:
.Select(i => new Order(
items[0].ItemAtOrDefault(i, null),
items[1].ItemAtOrDefault(i, null),
items[2].ItemAtOrDefault(i, null)
))
Then use that class's properties in the output section.

As far as I know, you won't be able to do it in a single query. Before thinking about how to do it in C#, think about how you would do it in SQL; I might be wrong, but it looks to me like you'll be writing three queries anyway.
If you notice performance issues and this is your actual code:
var prod = (from p in _context.orders select p.products).ToList().Distinct();
You may want to start by removing the .ToList() call, because it retrieves all records into memory and only then is the distinct applied.
That's because your query expression (from p in ...) returns an IQueryable, and calling .ToList() on it forces the SQL query built so far to run and brings the results into memory; the Distinct() that follows then runs on the in-memory list instead of in the database.
The difference in this case is deferred execution.
See: https://www.c-sharpcorner.com/UploadFile/rahul4_saxena/ienumerable-vs-iqueryable/
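A hedged sketch of the reordered query. It uses List<T>.AsQueryable() as an in-memory stand-in for _context.orders, so it only shows the operator order; with a real Entity Framework context the Distinct() would become part of the generated SQL. The Order class and the DistinctBeforeToList helper are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity mirroring the columns from the question.
public class Order
{
    public string products { get; set; }
    public string categories { get; set; }
    public string subcategories { get; set; }
}

public static class DistinctBeforeToList
{
    // Distinct() is applied while the query is still an IQueryable;
    // ToList() comes last, so only the distinct values are materialized.
    public static List<string> DistinctProducts(IQueryable<Order> orders)
    {
        return orders
            .Select(o => o.products)
            .Distinct()
            .ToList();
    }

    public static void Main()
    {
        // In-memory stand-in for _context.orders.
        IQueryable<Order> orders = new List<Order>
        {
            new Order { products = "prod1", categories = "cat1", subcategories = "sub1" },
            new Order { products = "prod1", categories = "cat2", subcategories = "sub2" },
            new Order { products = "prod2", categories = "cat3", subcategories = "sub6" }
        }.AsQueryable();

        foreach (var product in DistinctProducts(orders))
        {
            Console.WriteLine(product);
        }
    }
}
```

The same pattern would be repeated for the categories and subcategories columns, giving three small queries that each return only distinct values.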

seeking elegant linq solution

I have a model like this:
public class Post
{
public int PostId;
public List<Category> Categories;
}
Posts have at least 1 category, but can also have many categories.
I have a List, this list contains Posts (some with the same PostId), and each entry in the List contains exactly one unique Category (Categories.Count = 1 for each).
I want to create a new List with only distinct Posts (distinct PostId), with the Categories list populated with each category in the original List having the same PostId.
Basically, find each Post in the original list, and populate the Categories field by adding each of their First (and only) entry in their Categories field together.
Is there a nice solution for this in linq?
Category is just an enum.
I have tried using various nested foreach and for loops and it works, but it is just gross. I know there is a clean way to do it.
Example:
Categories = { PostId = 1, Category = Shopping }, { PostId = 1, Category = Pizza }, { PostId = 2, Category = Laundry }
after sequence desired output to be:
Categories = { PostId = 1, Categories = Shopping, Pizza }, { PostId = 2, Categories = Laundry }
Order does not matter for the category list

Given that you will have only one category per post (as stated in the second paragraph), you can try
var result = aPosts
.GroupBy(item => item.PostId, item => item.Categories[0])
.Select(group => new Post() { PostId = group.Key, Categories = new List<Category>(group) })
.ToList();
Note that having a Post constructor that accepts both PostId and Categories would allow a more simplified version of any solution.
Post(int postId, IEnumerable<Category> categories)
{
PostId = postId;
Categories = new List<Category>(categories);
}
Would allow the following:
var result = aPosts
.GroupBy(item => item.PostId, item => item.Categories[0])
.Select(group => new Post(group.Key, group))
.ToList();

something like below
var result = yourlist.GroupBy(l=>l.PostId)
.Select(x=>new Post{ PostId =x.Key, Categories =x.SelectMany(y=>y.Categories).ToList()})
.ToList();
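A self-contained sketch of this GroupBy/SelectMany approach, run against the example data from the question. The Category enum values are taken from that example; the MergePostsDemo class and its Merge method are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum Category { Shopping, Pizza, Laundry }

public class Post
{
    public int PostId { get; set; }
    public List<Category> Categories { get; set; }
}

public static class MergePostsDemo
{
    // Collapses entries that share a PostId into one Post holding all of their categories.
    public static List<Post> Merge(IEnumerable<Post> posts)
    {
        return posts
            .GroupBy(p => p.PostId)
            .Select(g => new Post
            {
                PostId = g.Key,
                Categories = g.SelectMany(p => p.Categories).ToList()
            })
            .ToList();
    }

    public static void Main()
    {
        var posts = new List<Post>
        {
            new Post { PostId = 1, Categories = new List<Category> { Category.Shopping } },
            new Post { PostId = 1, Categories = new List<Category> { Category.Pizza } },
            new Post { PostId = 2, Categories = new List<Category> { Category.Laundry } }
        };

        foreach (var post in Merge(posts))
        {
            Console.WriteLine(post.PostId + ": " + string.Join(", ", post.Categories));
        }
    }
}
```

For the example input, post 1 ends up with Shopping and Pizza, and post 2 with Laundry. SelectMany also copes with entries that carry more than one category, which the Categories[0] variant would silently truncate.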

With LINQ expressions:
var result = from o in posts
group o by o.PostId into gr
select new Post
{
PostId = gr.Key,
Categories = gr.SelectMany(c=>c.Categories).ToList()
};

All the other given solutions would work. But if you might have more than 1 category in the Category list, and you need only the first of each Post you can use following.
var posts =
postList.GroupBy(p => p.PostId)
.Select(
g =>
new Post
{
PostId = g.Key,
Categories =
g.Select(p => p.Categories.FirstOrDefault())
.Where(c => c != null).ToList()
});
Also, make sure you initialize your Categories property (e.g. in the constructor of the Post class) before using the LINQ given in the answers. Otherwise you might get a NullReferenceException.

Group By Query with Entity Framework

In my application I have Movements, each associated with a category.
I want a list of the most frequent categories.
My objects are:
Category: catId, catName
Movement: Movid, movDate, movMount, catId
I think this would need a "Group By" query (grouping by catId and taking the categories with the highest counts).
(I'm using Entity Framework 6 in C#.)
Thanks in advance!

IMPORTANT: Entity Framework 7 (now renamed to Entity Framework Core 1.0) does not yet support GroupBy() for translation to GROUP BY in generated SQL. Any grouping logic will run on the client side, which could cause a lot of data to be loaded.
https://blogs.msdn.microsoft.com/dotnet/2016/05/16/announcing-entity-framework-core-rc2

Group the movements by category and select the catId and count.
Join this result with Category to get the name, and then sort the results by count in descending order.
var groupedCategories = context.Movements.GroupBy(m=>m.catId).Select(g=>new {CatId = g.Key, Count = g.Count()});
var frequentCategories = groupedCategories.Join(context.Categories, g => g.CatId, c => c.catId, (g,c) => new { catId = c.catId, catName = c.catName, Count = g.Count }).OrderByDescending(r => r.Count);
foreach (var category in frequentCategories)
{
// category.catId, category.catName and category.Count
}
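The same group, join and sort pipeline can be tried out on in-memory lists. This is only a sketch with LINQ to Objects standing in for the Entity Framework context; the Category and Movement classes are reduced to the fields the query needs, and FrequentCategoriesDemo with its sample data is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Category
{
    public int catId { get; set; }
    public string catName { get; set; }
}

public class Movement
{
    public int movId { get; set; }
    public int catId { get; set; }
}

public static class FrequentCategoriesDemo
{
    // Returns (category name, movement count) pairs, most frequent first.
    public static List<KeyValuePair<string, int>> MostFrequent(
        IEnumerable<Movement> movements, IEnumerable<Category> categories)
    {
        return movements
            .GroupBy(m => m.catId)
            .Select(g => new { CatId = g.Key, Count = g.Count() })
            .Join(categories,
                  g => g.CatId,
                  c => c.catId,
                  (g, c) => new KeyValuePair<string, int>(c.catName, g.Count))
            .OrderByDescending(r => r.Value)
            .ToList();
    }

    public static void Main()
    {
        var categories = new List<Category>
        {
            new Category { catId = 1, catName = "Food" },
            new Category { catId = 2, catName = "Rent" },
            new Category { catId = 3, catName = "Travel" }
        };
        var movements = new List<Movement>
        {
            new Movement { movId = 1, catId = 2 },
            new Movement { movId = 2, catId = 1 },
            new Movement { movId = 3, catId = 1 },
            new Movement { movId = 4, catId = 1 }
        };

        foreach (var entry in MostFrequent(movements, categories))
        {
            Console.WriteLine(entry.Key + ": " + entry.Value);
        }
    }
}
```

Because Join is an inner join, a category without any movements (Travel in the sample data) does not appear in the result at all.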

I hope this helps:
var query = dbContext.Category.Select(u => new
{
Cat = u,
MovementCount = u.Movement.Count()
})
.ToList()
.OrderByDescending(u => u.MovementCount)
.Select(u => u.Cat)
.ToList();

I resolved the problem!
I used the solution proposed by "Raja" (thanks a lot!).
It returns a collection composed of "Category" and "Count". I changed it a bit to return a list of categories.
var groupedCategories = model.Movement.GroupBy(m => m.catId).Select(
g => new {catId= g.Key, Count = g.Count() });
var freqCategories= groupedCategories.Join(model.Category,
g => g.catId,
c => c.catId,
(g, c) => new {category = c, count = g.Count}).OrderByDescending(ca => ca.count).Select(fc => fc.category).ToList();

You just need to use a navigation property on the category. Assuming Category has a navigation property containing all of its related Movement rows (I call it Movements in the following query), you can write the query like this, with a minimal number of round trips to the database.
class Cat
{
public Guid catId { get; set; }
public string catName { get; set; }
public IEnumerable<Movement> Movements { get; set; }
public int MovementsCount { get { return Movements.Count(); } }
}
var Categories = category.Select(u => new Cat()
{
catId = u.catId,
catName = u.catName,
Movements = u.Movements.AsEnumerable()
}).ToList();
// Most frequent categories first
var CategoriesIncludeCount = Categories.OrderByDescending(u => u.MovementsCount).ToList();

linq query repeated columns in a list

I have a table similar to the one below.
Branch Dept Product ID Product Val Product Date
Branch 1 Dept 1 ID 1 1 5/23/2013
Branch 1 Dept 1 ID 2 1 5/23/2013
Branch 1 Dept 2 ID 3 1 5/23/2013
Branch 2 Dept 11 ID 4 1 5/23/2013
Branch 2 Dept 11 ID 5 1 5/23/2013
Branch 2 Dept 11 ID 6 1 5/23/2013
Branch 3 Dept 21 ID 7 1 5/23/2013
I am trying to use LINQ (I am a rookie with LINQ) to load this as a collection of objects into an object like:
Products = { Branch1 { Dept1 {ID1,ID2},
Dept2 {ID3}},
Branch2 { Dept11 {ID4, ID5, ID6}},
Branch3 { Dept21 {ID7}}
}
I have been working hard on this overnight but could not get the right solution. So far I have the following code:
var branches = (from p in ProductsList
select p.Branch).Distinct();
var products = from s in branches
select new
{
branch = s,
depts = (from p in ProductsList
where p.Branch == s
select new
{
dept = p.Dept,
branch = s,
prod = (from t in ProductsList
where t.Branch == s
where t.Dept == p.Dept
select t.ProductID)
})
};
where ProductsList is a List holding the data of the whole table.
Any help is much appreciated. Thanks in advance!

I would go for something like this if you really want to use LINQ.
Sometimes a few foreach loops are much clearer!
var myDic = ProductsList
.GroupBy(m => m.Branch)
.ToDictionary(
m => m.Key,
m => m.GroupBy(x => x.Dept)
.ToDictionary(
x => x.Key,
x => x.Select(z => z.ProductID)));
The result will be a
Dictionary<string, Dictionary<string, IEnumerable<string>>>
where the outer dictionary key is Branch, the inner dictionary key is Dept, and the string sequences are the ProductIDs,
which seems to correspond to your desired result.
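A runnable sketch of this nested-dictionary idea with the rows from the question's table. The ProductRow class is a hypothetical shape for one table row, and this variant calls ToList() on the innermost sequence so that the ids are materialized once instead of being re-evaluated lazily on each enumeration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of one row of the table.
public class ProductRow
{
    public string Branch { get; set; }
    public string Dept { get; set; }
    public string ProductID { get; set; }
}

public static class NestedGroupingDemo
{
    // Branch -> Dept -> list of product ids.
    public static Dictionary<string, Dictionary<string, List<string>>> Build(IEnumerable<ProductRow> rows)
    {
        return rows
            .GroupBy(r => r.Branch)
            .ToDictionary(
                b => b.Key,
                b => b.GroupBy(r => r.Dept)
                      .ToDictionary(
                          d => d.Key,
                          d => d.Select(r => r.ProductID).ToList()));
    }

    public static void Main()
    {
        var productsList = new List<ProductRow>
        {
            new ProductRow { Branch = "Branch 1", Dept = "Dept 1", ProductID = "ID 1" },
            new ProductRow { Branch = "Branch 1", Dept = "Dept 1", ProductID = "ID 2" },
            new ProductRow { Branch = "Branch 1", Dept = "Dept 2", ProductID = "ID 3" },
            new ProductRow { Branch = "Branch 2", Dept = "Dept 11", ProductID = "ID 4" },
            new ProductRow { Branch = "Branch 2", Dept = "Dept 11", ProductID = "ID 5" },
            new ProductRow { Branch = "Branch 2", Dept = "Dept 11", ProductID = "ID 6" },
            new ProductRow { Branch = "Branch 3", Dept = "Dept 21", ProductID = "ID 7" }
        };

        foreach (var branch in Build(productsList))
        {
            foreach (var dept in branch.Value)
            {
                Console.WriteLine(branch.Key + " / " + dept.Key + ": " + string.Join(", ", dept.Value));
            }
        }
    }
}
```

Each row is visited once by the outer GroupBy and once by the inner one, whereas the nested Where queries in the question rescan the whole list for every branch and department.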

Something like this, maybe?
Products
.Select(prop => prop.Branch)
.Distinct()
.Select(b => new
{
Branch = b,
Departments = Products
.Where(p => p.Branch == b)
.Select(p => p.Dept)
.Distinct()
.Select(d => new
{
Dept = d,
Products = Products
.Where(p => p.Branch == b && p.Dept == d)
.Select(p => p.ProductID)
.Distinct()
})
});
