Trying to work a threshold value into linq query(ies) - c#

Good Evening everyone,
I've been trying to figure out the most efficient way to do this, but am falling short. Here's how it goes...
I am ultimately trying to determine "like customers" based on a specific customer's buying habits and a given threshold, say 50%. For example, customer 1 purchased products A, B, C, D and customer 2 purchased B, C, D, E. These two customers have >= 50% "likeness", so they should be matched.
My schema is what you would expect:
CLIENT (1 ----- many) CLIENT_PURCHASE (1 -------many) PRODUCT
*clientID *clientID *prodID *prodID
For now I am ignoring the threshold and am simply trying to find customers who have purchased any item in customer 1's history. I think I have this working with the following two queries:
var clientOneHistory = (from cp in client.Client_Purchase
                        select cp.prodID).ToList();
var matchedClients = (from cp in db.Client_Purchase
                      where clientOneHistory.Contains(cp.prodID)
                      select cp.Client.fullname).Distinct().ToList();
So my ultimate question is, "How do I work in the threshold portion?"
Thanks for your time

I'm not sure exactly how to form these queries for your particular schema, so, assuming you want to use LINQ-to-SQL, I'll use the Northwind database as an example instead. You can then apply the same ideas in your implementation. I don't think it's possible to do this entirely in LINQ-to-SQL; you'll have to mix it with in-memory LINQ.
var threshold = 0.5M;
// let's pick a customer id
var myId = "VINET";
// get products of current customer
var myProducts = (from c in db.Customers
                  where c.CustomerID == myId
                  join o in db.Orders on c.CustomerID equals o.CustomerID
                  join od in db.Order_Details on o.OrderID equals od.OrderID
                  select od.ProductID)
                 .Distinct()
                 .ToArray();
// get the products of all other customers
var others = (from c in db.Customers
              where c.CustomerID != myId
              join o in db.Orders on c.CustomerID equals o.CustomerID
              join od in db.Order_Details on o.OrderID equals od.OrderID
              group od.ProductID by c.CustomerID into g
              select new { CustomerID = g.Key, Products = g.Distinct() })
             .AsEnumerable();
// calculate "likeness" values for each person (runs in memory after AsEnumerable)
var likeness = from o in others
               let Percent = Decimal.Divide(myProducts.Intersect(o.Products).Count(), myProducts.Length)
               where Percent >= threshold
               select new { o.CustomerID, Percent };
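To see the threshold arithmetic in isolation, here is a minimal in-memory sketch using the A,B,C,D / B,C,D,E example from the question. The client names and the dictionary layout are made up for illustration; it is not tied to either schema above, and it measures likeness as the share of the chosen customer's products that the other customer also bought.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory purchase data: client name -> distinct product IDs.
var purchases = new Dictionary<string, HashSet<string>>
{
    ["client1"] = new HashSet<string> { "A", "B", "C", "D" },
    ["client2"] = new HashSet<string> { "B", "C", "D", "E" },
    ["client3"] = new HashSet<string> { "D", "X", "Y", "Z" },
};

var threshold = 0.5m;
var me = "client1";
var myProducts = purchases[me];

// For every other client, compute the share of my products they also bought.
// The Count > 0 guard avoids dividing by zero for a customer with no history.
var likeness = purchases
    .Where(kv => kv.Key != me && myProducts.Count > 0)
    .Select(kv => new
    {
        Client = kv.Key,
        Percent = (decimal)kv.Value.Intersect(myProducts).Count() / myProducts.Count
    })
    .Where(x => x.Percent >= threshold)
    .ToList();

foreach (var match in likeness)
    Console.WriteLine($"{match.Client}: {match.Percent}");
```

With this data only client2 passes, since it shares 3 of client1's 4 products (0.75), while client3 shares 1 of 4 (0.25).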

Related

How do I write top 5 attribute and group by in a linq Expression?

I am trying to find the 5 companies that generate the most profit.
Here is what I have written so far:
var result = from c in db.Customers
             join o in db.Orders
                 on c.CustomerID equals o.CustomerID
             join od in db.Order_Details
                 on o.OrderID equals od.OrderID
             select new
             {
                 CompanyName = c.CompanyName,
                 Profit = (float)od.UnitPrice * (float)od.Quantity * (1 - od.Discount)
             };
However, it doesn't yet include the group by or the top 5 part, which is what I'm actually looking for. I tried
group c by c.CompanyName into CompanyName
but it doesn't work, and I couldn't figure out how to write the top 5 part of the query.
I think you need to group your result by company name and sum the profit, then order by the total descending and take the first 5.
Something like the below (the syntax may not be exactly right):
var grouped = from p in result
              group p by p.CompanyName into g
              select new
              {
                  CompanyName = g.Key,
                  TotalProfit = g.Sum(x => x.Profit)
              };
var top5 = grouped.OrderByDescending(x => x.TotalProfit).Take(5);
Here is a similar solution:
var totalCompanies = db.Order_Details
    .GroupBy(x => x.Order.Customer.CompanyName)
    // parenthesize (1 - Discount) so the discount scales the line total
    .OrderByDescending(x => x.Sum(y => (float)y.Quantity * (float)y.UnitPrice * (1 - y.Discount)))
    .Take(5)
    .ToList();
List<string> bestCompanies = new List<string>();
foreach (var item in totalCompanies)
{
    bestCompanies.Add(item.Key);
}
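The group, sum, order, take pattern can be tried without a database. The following is a small LINQ to Objects sketch with made-up order lines (the company names and figures are hypothetical), taking the top 2 instead of the top 5 to keep the data short.

```csharp
using System;
using System.Linq;

// Hypothetical order lines: company, unit price, quantity, discount fraction.
var lines = new[]
{
    (Company: "Alpha", UnitPrice: 10m, Quantity: 5, Discount: 0.0m),
    (Company: "Beta", UnitPrice: 20m, Quantity: 2, Discount: 0.5m),
    (Company: "Alpha", UnitPrice: 4m, Quantity: 10, Discount: 0.25m),
    (Company: "Gamma", UnitPrice: 100m, Quantity: 1, Discount: 0.0m),
};

var top = lines
    .GroupBy(l => l.Company)                      // one group per company
    .Select(g => new
    {
        CompanyName = g.Key,
        // Sum the discounted line totals within the group.
        TotalProfit = g.Sum(l => l.UnitPrice * l.Quantity * (1 - l.Discount))
    })
    .OrderByDescending(x => x.TotalProfit)        // highest total first
    .Take(2)                                      // use Take(5) for a top 5
    .ToList();

foreach (var row in top)
    Console.WriteLine($"{row.CompanyName}: {row.TotalProfit}");
```

Here Gamma totals 100 and Alpha totals 80 (50 + 30), so Beta (20) is dropped. Ordering has to happen after the grouping, because the value being sorted on only exists once the per-company sums are computed.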

LINQ multiple joins performance optimization

I need to retrieve a list of orders records, and on the same view I have to show for each order:
how many invoices have been issued (InvoiceCount)
the total amount covered by the invoices (Invoiced)
how much has been paid for the issued invoices (Payed)
all on the same grid view.
I think it's a basic view for business management software.
Order > invoices = one-to-many
Payments are related to invoices with a look-up table PaymentsToInvoices
GridView is paged server-side, with a page-size of 50 records.
I've written the following LINQ query, which works fine, but I have some doubts about whether it is well written in terms of performance,
especially if it ends up being translated into a couple of sub-queries per row, which would mean 50*2 queries for each request!
var orders = (from o in db.Orders
              join c in db.Customers on o.CustomerID equals c.CustomerID
              join os in db.OrderStatuses on o.OrderStatusID equals os.OrderStatusID
              join ps in db.PaymentStatuses on o.PaymentStatusID equals ps.PaymentStatusID
              join ec in db.EntryChannels on o.EntryChannelID equals ec.EntryChannelID
              orderby o.InsertDate descending
              where o.OrderStatusID == 2
              select new
              {
                  o.OrderID,
                  o.CustomerID,
                  o.InsertDate,
                  o.TotalPrice,
                  o.Notes,
                  c.FirstName,
                  c.LastName,
                  OrderStatus = os.Name,
                  PaymentStatus = ps.Name,
                  EntryChannel = ec.Name,
                  InvoiceCount = (db.InvoiceDetails.Where(i => i.OrderID == o.OrderID).Select(s => s.InvoiceID).Distinct().Count()),
                  Invoiced = (db.InvoiceDetails.Where(i => i.OrderID == o.OrderID).Select(s => s.TotalPrice).Sum()),
                  Payed = (from py in db.Payments
                           join pti in db.PaymentsToInvoices on py.PaymentID equals pti.PaymentID
                           join inv in db.InvoiceDetails on pti.InvoiceID equals inv.InvoiceID
                           where inv.OrderID == o.OrderID
                           select new { py.PaymentID, py.Amount }).Distinct().Select(s => s.Amount).Sum()
              });
What do you think about the last three fields? Can they be improved in some way? Do they result in performance killer sub-queries?

Linq InnerJoin 4 tables

My goal is to return a list of orders that only contain orderItems that are from a specific merchant. My current solution is to iterate through EVERY order, then through every order item and every listing. I imagine that is not the best practice, but I am having a hard time figuring out how to construct a single query to retrieve merchant specific orders.
I have 4 tables
Merchants(the id field being merchantID)
Orders(the id field is orderID)
orderItems(the id field is orderItemID, and FK listingID)
listings(the id field is listingID, and FK merchantID)
You can use .Any() to help you get to where you want:
var ordersFromMerchant = db.Orders
    .Where(o => o.Items.Any(oi => oi.Listing.merchantID == 10));
I've made assumptions about the names of your navigation properties, but you should be able to adapt this if they don't match.
If you prefer the LINQ query syntax, you can use:
var ordersFromMerchant = from o in db.Orders
                         join oi in db.orderItems on o.orderID equals oi.orderID
                         join l in db.listings on oi.listingID equals l.listingID
                         where l.merchantID == 10
                         select o;
I don't know your structure, but something like this should work:
var query = from o in orders
            join oi in orderItems on o.id equals oi.orderID
            join l in listings on oi.listingID equals l.id
            where l.merchantID == merchantID
            select o;
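One difference between the two styles is worth checking on real data: a join produces one row per matching order item, so an order with several items from the same merchant appears several times, whereas Any() yields each order once. A minimal in-memory sketch, with hypothetical IDs and tuple shapes standing in for the four tables:

```csharp
using System;
using System.Linq;

// Hypothetical data: order 100 has two items from merchant 10, order 101 has one from merchant 20.
var listings = new[]
{
    (ListingId: 1, MerchantId: 10),
    (ListingId: 2, MerchantId: 10),
    (ListingId: 3, MerchantId: 20),
};
var orderItems = new[]
{
    (OrderId: 100, ListingId: 1),
    (OrderId: 100, ListingId: 2),
    (OrderId: 101, ListingId: 3),
};
var orderIds = new[] { 100, 101 };
var merchantId = 10;

// Any(): an order qualifies if at least one of its items belongs to the merchant.
var viaAny = orderIds
    .Where(o => orderItems.Any(oi => oi.OrderId == o &&
        listings.Any(l => l.ListingId == oi.ListingId && l.MerchantId == merchantId)))
    .ToList();

// Join: one result row per matching item, so order 100 shows up twice.
var viaJoin = (from o in orderIds
               join oi in orderItems on o equals oi.OrderId
               join l in listings on oi.ListingId equals l.ListingId
               where l.MerchantId == merchantId
               select o).ToList();

Console.WriteLine($"Any: {viaAny.Count} row(s), join: {viaJoin.Count} row(s)");
```

Adding .Distinct() to the join version brings it back to one row per order.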

Alternative to nesting when performing a left join and multiple inner joins

Consider the following fictitious scenario:
How would I go about getting a list of all the categories (distinct or otherwise, it doesn't matter) for each customer, even if a customer hasn't ordered any products?
Also assume that we don't have navigation properties, so we'll need to use manual joins.
This is my attempt which uses nesting:
var customerCategories = from c in context.Customers
                         join o in context.Orders on c.CustomerId equals o.CustomerId into orders
                         select new
                         {
                             CustomerName = c.Name,
                             Categories = (from o in orders
                                           join p in context.Products on o.ProductId equals p.ProductId
                                           join cat in context.Category on p.CategoryId equals cat.CategoryId
                                           select cat)
                         };
Is there a different (possibly better way) to achieve the same outcome?
Alternative: Multiple Left (Group) Joins
var customerCategories = from customer in context.Customers
                         join o in context.Orders on customer.CustomerId equals o.CustomerId into orders
                         from order in orders.DefaultIfEmpty()
                         join p in context.Products on order.ProductId equals p.ProductId into products
                         from product in products.DefaultIfEmpty()
                         join cat in context.Categories on product.CategoryId equals cat.CategoryId into categories
                         select new
                         {
                             CustomerName = customer.Name,
                             Categories = categories
                         };
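Another shape that avoids nesting inside the projection is to flatten the inner joins first and then do a single group join from customers onto that flat result. The sketch below uses LINQ to Objects with made-up rows (all names and IDs are hypothetical); whether a given provider translates the same shape efficiently is something to verify against the generated SQL.

```csharp
using System;
using System.Linq;

// Hypothetical data: Ann has two orders, Bob has none.
var customers = new[] { (CustomerId: 1, Name: "Ann"), (CustomerId: 2, Name: "Bob") };
var orders = new[] { (CustomerId: 1, ProductId: 10), (CustomerId: 1, ProductId: 11) };
var products = new[] { (ProductId: 10, CategoryId: 100), (ProductId: 11, CategoryId: 101) };
var categories = new[] { (CategoryId: 100, Name: "Books"), (CategoryId: 101, Name: "Games") };

// Step 1: inner joins only, producing (customer id, category name) pairs.
var orderedCategories =
    from o in orders
    join p in products on o.ProductId equals p.ProductId
    join cat in categories on p.CategoryId equals cat.CategoryId
    select (o.CustomerId, Category: cat.Name);

// Step 2: one group join keeps every customer, with an empty list when nothing matches.
var customerCategories = customers
    .GroupJoin(
        orderedCategories,
        c => c.CustomerId,
        oc => oc.CustomerId,
        (c, ocs) => new { CustomerName = c.Name, Categories = ocs.Select(x => x.Category).ToList() })
    .ToList();

foreach (var row in customerCategories)
    Console.WriteLine($"{row.CustomerName}: {string.Join(", ", row.Categories)}");
```

Bob still appears in the result, with an empty category list, which is the left-join behaviour the question asks for.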
I recreated your table structure and added some data so that I could get a better idea what you were trying to do. I found a couple of ways to accomplish what you want but I'm just going to add this method. I think it's the most concise and I think it's pretty clear.
Code
var summaries = Customers.GroupJoin(Orders,
        cst => cst.Id,
        ord => ord.CustomerId,
        (cst, ord) => new { Customer = cst, Orders = ord.DefaultIfEmpty() })
    .SelectMany(c => c.Orders.Select(o => new
    {
        CustomerId = c.Customer.Id,
        CustomerName = c.Customer.Name,
        Categories = Categories.Where(cat => cat.Id == c.Customer.Id)
    }));
If you need all categories, couldn't you just use:
Categories = (from cat in context.Category
              select cat)

Linq orderby calculation

I have an SQL query that I built for a tool a while ago and I'm remaking the tool in MVC and using LINQ to Entities.
I can't seem to figure out how to sort my list of Brands by weighting my Cars by man hours and their testing value.
Here's the SQL query I had in the old tool:
SELECT Brand.ID, SUM(Car.EstManHours) - SUM(Car.EstManHours) * CAST(AVG(1.00 * TestingStatus.Value) AS DECIMAL(9 , 2)) / 100 AS Weighting
FROM TestingStatus INNER JOIN Car ON TestingStatus.ID = Car.StatusID
    INNER JOIN Team ON Car.TeamID = Team.TeamID
    RIGHT OUTER JOIN Brand
        LEFT OUTER JOIN SubCategory ON Brand.ID = SubCategory.BrandID ON Car.SubCategoryID = SubCategory.ID
WHERE (Car.IsPunted = 'False')
GROUP BY Brand.YearID, Brand.FeatID
HAVING (Brand.YearID = @BrandYearID)
ORDER BY Weighting DESC
I've tried this, but whether I use descending or ascending, the order of the list doesn't change; it stays sorted by Id:
var brands = (from b in _context.Brands
              join s in _context.SubCategorys on b.Id equals s.BrandId
              join c in _context.Cars on s.Id equals c.SubCategoryId
              where (b.YearId == yearId && c.IsPunted == false)
              orderby (c.ManHoursEst - (c.ManHoursEst * c.TestingStatu.Value / 100)) descending
              select b).Distinct().ToList();
Would appreciate help on this conversion!
Thanks.
EDIT:
I'm now trying to get the order by and group by to work correctly.
The following query lists tons of duplicates and doesn't order properly, as I don't think my weighting is calculated correctly.
var brands = (from b in _context.Brands
              join s in _context.SubCategorys on b.Id equals s.BrandId
              join c in _context.Cars on s.Id equals c.SubCategoryId
              where (b.YearId == yearId && c.IsPunted == false)
              let weighting = c.ManHoursEst - (c.ManHoursEst * c.TestingStatu.Value / 100)
              orderby weighting descending
              group b by b.Id).SelectMany(x => x).ToList();
Any ideas?
Distinct does not preserve sorting. That is your problem.
You could do a group by like in your SQL to mimic the Distinct and perform everything server side.
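As a rough illustration of that suggestion, the sketch below groups first, computes one weighting per brand the way the SQL does (SUM of hours minus SUM of hours times AVG of status over 100), and only then orders. It runs on made-up in-memory rows with hypothetical field names, and it leaves out the CAST to DECIMAL(9, 2) rounding from the original SQL.

```csharp
using System;
using System.Linq;

// Hypothetical rows: one per car, already joined to its brand; StatusValue is 0..100.
var cars = new[]
{
    (BrandId: 1, ManHoursEst: 100m, StatusValue: 50m),
    (BrandId: 1, ManHoursEst: 40m, StatusValue: 0m),
    (BrandId: 2, ManHoursEst: 200m, StatusValue: 25m),
    (BrandId: 3, ManHoursEst: 10m, StatusValue: 100m),
};

var ranked = cars
    .GroupBy(c => c.BrandId)                      // one row per brand, no Distinct needed
    .Select(g => new
    {
        BrandId = g.Key,
        // Remaining hours: total estimate scaled down by the average completion percentage.
        Weighting = g.Sum(c => c.ManHoursEst)
                    - g.Sum(c => c.ManHoursEst) * g.Average(c => c.StatusValue) / 100m
    })
    .OrderByDescending(x => x.Weighting)          // order last, on the aggregated value
    .ToList();

foreach (var row in ranked)
    Console.WriteLine($"Brand {row.BrandId}: {row.Weighting}");
```

With this data the order is brand 2 (150), brand 1 (105), brand 3 (0). Because the ordering is the final step and nothing runs after it, the sort is not discarded.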
