Query tree structure - c#

I'm working with Entity Framework, but I can use other approaches if need be.
Here's the case: I have a SQL Server database with a schema similar to:
A        B         C        AhasB    AhasC
_______  ________  _______  _______  _______
AId      BId       CId      AId      AId
...      BTxt      CTxt     BId      CId
         BParent   ...
         ...
Where ... means other columns not important to the problem.
Tables C and AhasC are there to keep data during a lengthy process and are cleared on process completion, so I always start with both empty.
Now, the process gets a lot of data (1000+ records) from online sources and stores it in C. After C is filled, I want to fill table AhasC based on the following:
INSERT INTO AhasC (AId, CId) VALUES (
    SELECT A.AId, C.CId
    FROM A, B, C, AhasB
    WHERE A.AId = AhasB.AId AND B.BId = AhasB.BId AND
          C.CTxt IN (
              SELECT D.BTxt
              FROM B AS D
              WHERE D.BId = B.BId OR ??
          )
)
Before I explain what I need in ??, let me run through what I have here:
I want to insert into table AhasC the pair A.AId, C.CId, so that in all pairs, C.CTxt is the same as a B.BTxt that is connected to A in AhasB.
Moreover (and this is where the ?? comes in) I also want it to match the B.BTxt of any parent of B.
Example:
A          B
_______    ____________________________________
AId = 1    BId = 1, BTxt = 'a', BParent = Null
AId = 2    BId = 2, BTxt = 'b', BParent = 1
AId = 3    BId = 3, BTxt = 'c', BParent = 2
AId = 4    BId = 4, BTxt = 'x', BParent = Null
C                      AhasB
_____________________  ________________
CId = 1, CTxt = 'b'    AId = 1, BId = 3
CId = 2, CTxt = 'z'    AId = 3, BId = 4
This should result in:
AhasC
________________
AId = 1, CId = 1
So again, AhasC must connect A and C if A is connected to a B that either has BTxt equal to CTxt, or whose parent (or grandparent and so on) has a BTxt that is the same as CTxt.
I hope I didn't overcomplicate my explanation here :p
EDIT1: as per #dotctor's comments, here's an image of my real schema (not that I think it will add much to the question)
A = Contatos
B = Termos
C = ConcursosPublicos
AhasB = TermosContatos
AhasC = ConcursosContatos
A.AId = Contatos.Id
B.BId = Termos.Id
C.CId = ConcursosPublicos.Id
B.BTxt = Termos.Area
C.CTxt = ConcursosPublicos.Area
B.BParent = Termos.Pai
And here's my real code doing this work presently:
public static void Connect(ProgressBar progress)
{
    lock (Locker)
    using (var ctx = new ConcursosContainer())
    {
        int i = 0;
        IList<Contatos> contatos = ctx.Contatos.ToList();
        progress.Invoke((MethodInvoker) (() =>
        {
            progress.Value = 0;
            progress.Maximum = contatos.Count;
        }));
        foreach (Contatos contato in contatos)
        {
            Console.WriteLine(contato.Id);
            List<Termos> tree = GetTree(ctx, contato.Id).SelectMany(x => x.ToArray()).ToList();
            List<int> attr = ctx.ConcursosContatos.Where(x => x.ContatoId == contato.Id).Select(x => x.ConcursoId).ToList();
            IList<ConcursosPublicos> concursosPublicos = ctx.ConcursosPublicos.Where(x => !attr.Contains(x.Id)).ToList();
            foreach (ConcursosPublicos concursosPublico in concursosPublicos)
            {
                if (tree.Any(termo => (termo.Tipo == concursosPublico.TipoConc) && concursosPublico.Area.Trim().EndsWith(termo.Area)))
                {
                    ctx.ConcursosContatos.Add(new ConcursosContatos
                    {
                        ContatoId = contato.Id,
                        ConcursoId = concursosPublico.Id
                    });
                    i++;
                }
                if (i == 9)
                {
                    ctx.SaveChanges();
                    i = 0;
                }
            }
            progress.Invoke((MethodInvoker) (progress.PerformStep));
        }
        if (i > 0)
            ctx.SaveChanges();
    }
}
private static IEnumerable<Stack<Termos>> GetTree(ConcursosContainer ctx, int id)
{
    var res = new List<Stack<Termos>>();
    IQueryable<Termos> terms = ctx.Termos.Where(x => ctx.TermosContatos.Any(y => (y.ContatoId == id) && (y.TermoId == x.Id)));
    foreach (Termos term in terms)
    {
        var stack = new Stack<Termos>();
        if (term.Pai.HasValue)
            AddParent(ctx, stack, term);
        stack.Push(term);
        res.Add(stack);
    }
    return res;
}
private static void AddParent(ConcursosContainer ctx, Stack<Termos> stack, Termos term)
{
    Termos pai = ctx.Termos.First(x => x.Id == term.Pai.Value);
    if (pai.Pai.HasValue)
        AddParent(ctx, stack, pai);
    stack.Push(pai);
}
This code does the job, but for 1000+ rows in ConcursosPublicos and 7000+ rows in Contatos (with Contatos set to grow in the future) it can take 15 to 20 hours to complete. Since this is a daily process, I need a more efficient way to fill in ConcursosContatos.

You need some recursion to get the family tree on your B table. In SQL Server you can do this with a CTE:
;with chld as (
    select B.BId, B.BTxt, B.BParent
    from dbo.B as B
    union all
    select chld.BId, b1.BTxt, b1.BParent
    from dbo.B as b1
    inner join chld
        on b1.BId = chld.BParent
)
select BId, BTxt from chld option (maxrecursion 32767)
The result set:
BId BTxt
1 a
2 b
3 c
4 x
3 b
3 a
2 a
If this isn't correct, no need to go any further. Otherwise, you can join this to the other tables as needed to populate your AhasC table.
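If you would rather stay in C#, the same expansion can be sketched in memory: load B once, walk each row's parent chain through a dictionary, then join the expanded rows to AhasB and C. This is only a rough sketch under assumptions. The tuple shapes and the names ExpandAncestors and BuildAhasC are made up for illustration, and it uses the plain equality from the simplified schema rather than the EndsWith/Tipo check in the real code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AncestorMatchSketch
{
    // Returns (BId, BTxt) for every B row and each of its ancestors,
    // the same shape of result as the recursive CTE above.
    public static List<(int BId, string BTxt)> ExpandAncestors(
        IEnumerable<(int BId, string BTxt, int? BParent)> bRows)
    {
        var byId = bRows.ToDictionary(b => b.BId);
        var result = new List<(int BId, string BTxt)>();
        foreach (var start in byId.Values)
        {
            var visited = new HashSet<int>(); // guards against cyclic parent links
            int? currentId = start.BId;
            while (currentId.HasValue
                   && visited.Add(currentId.Value)
                   && byId.TryGetValue(currentId.Value, out var current))
            {
                result.Add((start.BId, current.BTxt));
                currentId = current.BParent;
            }
        }
        return result;
    }

    // Joins AhasB to the expanded B rows and then to C on equal text.
    public static List<(int AId, int CId)> BuildAhasC(
        IEnumerable<(int AId, int BId)> aHasB,
        IEnumerable<(int BId, string BTxt)> expanded,
        IEnumerable<(int CId, string CTxt)> cRows)
    {
        return (from ab in aHasB
                join e in expanded on ab.BId equals e.BId
                join c in cRows on e.BTxt equals c.CTxt
                select (ab.AId, c.CId))
               .Distinct()
               .ToList();
    }

    public static void Main()
    {
        // Sample data from the question.
        var b = new (int BId, string BTxt, int? BParent)[]
        {
            (1, "a", null), (2, "b", 1), (3, "c", 2), (4, "x", null)
        };
        var aHasB = new[] { (AId: 1, BId: 3), (AId: 3, BId: 4) };
        var c = new[] { (CId: 1, CTxt: "b"), (CId: 2, CTxt: "z") };

        foreach (var pair in BuildAhasC(aHasB, ExpandAncestors(b), c))
            Console.WriteLine($"AId = {pair.AId}, CId = {pair.CId}"); // AId = 1, CId = 1
    }
}
```

The point of the sketch is that B is read once instead of issuing one query per parent lookup, which is where the original loop is likely to lose most of its time.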

Related

linq group by two columns and get only rows with same group by values

I want to group rows by two columns (Parent_Id and Name) using LINQ and get only the parents whose rows all share the same Name.
Child
---------
Id  Parent_Id  Name
1   1          c1
2   1          c2
3   2          c1   <-----
4   2          c1   <-----
5   3          c2
6   3          c3
7   4          c4   <-----
As you can see, for Parent_Id 1 and 3 the Name values are different, so I don't want those rows.
The result I want is like
Parent_Id Name
2 c1
4 c4
What I have tried is
from c in Child
group c by new
{
    c.Parent_Id,
    c.Name
} into gcs
select new Child_Model()
{
    Parent_Id = gcs.Key.Parent_Id,
    Name = gcs.Key.Name
};
But it returns all rows.
As you describe it, you should group by Parent_Id only and keep the groups that contain exactly one distinct Name:
var result = children
    .GroupBy(c => c.Parent_Id)
    .Where(g => g.Select(t => t.Name).Distinct().Count() == 1)
    .Select(g => new
    {
        Parent_Id = g.Key,
        Name = g.Select(c => c.Name).First()
    });
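To check that approach against the sample table, here is a small in-memory sketch. The tuple shape and the method name ParentsWithOneName are assumptions made for the example; against a real database the grouping would run on the entity set instead.

```csharp
using System;
using System.Linq;

public static class SingleNameGroupsSketch
{
    // Keeps only the parents whose children all carry the same Name.
    public static (int ParentId, string Name)[] ParentsWithOneName(
        (int Id, int ParentId, string Name)[] children)
    {
        return children
            .GroupBy(c => c.ParentId)
            .Where(g => g.Select(c => c.Name).Distinct().Count() == 1)
            .Select(g => (ParentId: g.Key, Name: g.First().Name))
            .ToArray();
    }

    public static void Main()
    {
        // Rows of the Child table from the question.
        var children = new[]
        {
            (Id: 1, ParentId: 1, Name: "c1"),
            (Id: 2, ParentId: 1, Name: "c2"),
            (Id: 3, ParentId: 2, Name: "c1"),
            (Id: 4, ParentId: 2, Name: "c1"),
            (Id: 5, ParentId: 3, Name: "c2"),
            (Id: 6, ParentId: 3, Name: "c3"),
            (Id: 7, ParentId: 4, Name: "c4")
        };

        foreach (var row in ParentsWithOneName(children))
            Console.WriteLine($"{row.ParentId} {row.Name}"); // 2 c1, then 4 c4
    }
}
```

Note that a parent with a single child (Parent_Id 4) passes the filter too, which matches the expected result in the question.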
Reduced to final edit as per Gert Arnold's request:
var result = from r in (from c in children
                        where !children.Any(cc => cc.Id != c.Id &&
                                                  cc.Parent_Id == c.Parent_Id &&
                                                  cc.Name != c.Name)
                        select new {
                            Parent_Id = c.Parent_Id,
                            Name = c.Name
                        }).Distinct().ToList()
             select new Child_Model
             {
                 Parent_Id = r.Parent_Id,
                 Name = r.Name
             };
// Note: Max returns the largest group size (an int), not the matching rows.
var myModel = Child.GroupBy(c => $"{c.Parent_Id}|{c.Name}",
        (k, list) => new Child_Model {
            Parent_Id = list.First().Parent_Id,
            Name = list.First().Name,
            Count = list.Count() })
    .Max(cm => cm.Count);
You can add a condition to filter result (groupName.Count() > 1):
from c in childs
group c by new { c.Parent_Id, c.Name } into gcs
where gcs.Count() > 1
select new { gcs.Key.Parent_Id, gcs.Key.Name }

How To Find Highest Value In A Row And Return Column Header and its value

Imagine a row of 5 numeric values in an Entity Framework database, how would I retrieve the top 2 columns of that row including the name of the column and their values? Preferably using LINQ.
For example:
a b c d e
0 4 5 9 2
The top 2 values are 9 and 5. I would like to retrieve the values and the column names, c and d.
A more practical example:
var row = table.Where(model => model.Title.Contains(a.Title));
This line will give me a single row with many numeric values.
I would like something as follows,
row.list().OrderByDescendingOrder().top(2);
I don't know how you do this in linq, but here is a SQL Server query:
select t.*, v2.*
from t cross apply
     (select max(case when seqnum = 1 then val end) as val1,
             max(case when seqnum = 1 then col end) as col1,
             max(case when seqnum = 2 then val end) as val2,
             max(case when seqnum = 2 then col end) as col2
      from (select v.*, row_number() over (order by val desc) as seqnum
            from (values ('a', t.a), ('b', t.b), ('c', t.c), ('d', t.d), ('e', t.e)
                 ) v(col, val)
           ) v
     ) v2;
EDIT:
Of course, you can do this with massive case expressions to get the maximum value:
select t.*,
       (case when a >= b and a >= c and a >= d and a >= e then a
             when b >= c and b >= d and b >= e then b
             when c >= d and c >= e then c
             when d >= e then d
             else e
        end) as max_value,
       (case when a >= b and a >= c and a >= d and a >= e then 'a'
             when b >= c and b >= d and b >= e then 'b'
             when c >= d and c >= e then 'c'
             when d >= e then 'd'
             else 'e'
        end) as max_value_col
from t;
The problem is extending this to the second value, particularly if there are duplicate values.
It looks like you want to pivot this data, but using Linq instead of T-SQL. (or some other SQL dialect)
The basic pattern to do this is to use a SelectMany to transform each row to
an array of key/value pairs that you can do an OrderByDescending on.
A somewhat generic pattern for this is below:
// The values that we want to query.
// In your case, it's essentially the table that you're querying.
// I used an anonymous class for brevity.
var values = new[] {
    new { key = 99, a = 0, b = 4, c = 5, d = 9, e = 2 },
    new { key = 100, a = 0, b = 5, c = 3, d = 2, e = 10 }
};
// The query. I prefer to use the linq query syntax
// for actual SQL queries, but you should be able to translate
// this to the lambda format fairly easily.
var query = (from v in values
             // Transform each value in the object/row
             // to a name/value pair. We include the key so that we
             // can distinguish different rows.
             // Because we need this query to be translated to SQL,
             // we have to use an anonymous class.
             from column in new[] {
                 new { key = v.key, name = "a", value = v.a },
                 new { key = v.key, name = "b", value = v.b },
                 new { key = v.key, name = "c", value = v.c },
                 new { key = v.key, name = "d", value = v.d },
                 new { key = v.key, name = "e", value = v.e }
             }
             // Group the same row values together
             group column by column.key into g
             // Inner select to grab the top two values from
             // each row
             let top2 = (
                 from value in g
                 orderby value.value descending
                 select value
             ).Take(2)
             // Grab the results from the inner select
             // as a single-dimensional array
             from topValue in top2
             select topValue);
// Collapse the query to actual values.
var results = query.ToArray();
foreach (var value in results) {
    Console.WriteLine("Key: {0}, Name: {1}, Value: {2}",
        value.key,
        value.name,
        value.value);
}
However, since you have a single row, the logic becomes much simpler:
// The value that was queried
var value = new { key = 99, a = 0, b = 4, c = 5, d = 9, e = 2 };
// Build a list of columns and their corresponding values.
// You could even use reflection to build this list.
// Additionally, you could use C# 7 tuples if you prefer.
var columns = new[] {
    new { name = "a", value = value.a },
    new { name = "b", value = value.b },
    new { name = "c", value = value.c },
    new { name = "d", value = value.d },
    new { name = "e", value = value.e }
};
// Order the list by value descending, and take the first 2.
var top2 = columns.OrderByDescending(v => v.value).Take(2).ToArray();
foreach (var result in top2) {
    Console.WriteLine("Column: {0}, Value: {1}", result.name, result.value);
}
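The reflection idea mentioned in the comments could look roughly like the sketch below. It is only an illustration: the helper name TopColumns and the ignoredProperties parameter are assumptions, it only looks at int properties, and it runs on an object already loaded into memory rather than being translated to SQL.

```csharp
using System;
using System.Linq;

public static class TopColumnsSketch
{
    // Reads every public int property of the row via reflection and returns
    // the 'count' largest ones as (Name, Value) pairs.
    public static (string Name, int Value)[] TopColumns(
        object row, int count, params string[] ignoredProperties)
    {
        return row.GetType()
            .GetProperties()
            .Where(p => p.PropertyType == typeof(int)
                        && !ignoredProperties.Contains(p.Name))
            .Select(p => (Name: p.Name, Value: (int)p.GetValue(row)))
            .OrderByDescending(c => c.Value)
            .Take(count)
            .ToArray();
    }

    public static void Main()
    {
        var row = new { key = 99, a = 0, b = 4, c = 5, d = 9, e = 2 };

        // "key" is skipped because it identifies the row and is not a value column.
        foreach (var column in TopColumns(row, 2, "key"))
            Console.WriteLine($"Column: {column.Name}, Value: {column.Value}");
        // Column: d, Value: 9
        // Column: c, Value: 5
    }
}
```

Be aware that when two columns hold the same value, which one wins depends on the order in which reflection returns the properties, and that order is not guaranteed.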
So you have a collection of items, which can be sorted on one of its properties and you want the two items in the collection with the largest value for this sort property?
var result = myItems
.OrderByDescending(myItem => myItem.MyProperty)
.Take(2);
In words: Order the complete collection in a descending order of MyProperty and take the first two items from the result, which will be the two with the largest value for MyProperty.
This will return two complete myItem objects. Usually it is not a good idea to transfer more properties from objects to local memory than you actually plan to use. Use a Select to make sure that only the values you plan to use are transferred:
var bestSellingProducts = products
    .OrderByDescending(product => product.Orders.Count())
    .Select(product => new
    {   // select only the properties you plan to use, for instance
        Id = product.Id,
        Name = product.Name,
        Stock = product.Stock,
        Price = product.Price,
    })
    .Take(2);

LINQ left join + default if empty + anonymous type + group by

I wonder what the best solution is for the given problem, simplified here:
I have two locally stored SQL tables that I want to left join using DefaultIfEmpty, and then I need to group the data.
I don't want to check for (obj == null) before accessing obj.column, which throws when the join found no match for a given row.
Data
LeftTable     RightTable    OUTPUT
A  B  C       A  B  Z       A  B  C  Z
1  1  1       1  1  5       1  1  1  5
1  2  2       1  2  6       1  2  2  6
5  6  7                     5  6  7  null
Code
var RightTable = from row in Source
                 where row.X > 10
                 select new { // anonymous type that I want to keep
                     A = row.AAA,
                     B = row.BBB,
                     Z = row.ZZZ
                 };
var test = from left in LeftTable
           from right in RightTable
               .Where(right => right.A == left.A
                            && right.B == left.B)
               .DefaultIfEmpty( /* XXXXX */ ) //<-- this line is interesting
           group left by new {
               left.A,
               left.B,
               left.C,
               right.Z //<-- this will throw null exception error
           } into g   // but I don't want to change it to
           select g;  // Z = (right != null) ? right.Z : (string) null
Question:
Can I fill an argument in DefaultIfEmpty with anything that I can dynamically get from this code?
I know I can create a helper type like the one below, use it in place of the anonymous type in the RightTable select, and pass it to DefaultIfEmpty like:
DefaultIfEmpty(new Helper())
but I don't want to do that, as I have to deal with 20 to 30+ columns in the real-life scenario.
public class Helper {
    public string A;
    public string B;
    public string C;
}
Thanks a lot for your time if you read until here. Hope to get some solution here.
Thanks.
I think the code says everything:
var LeftTable = new[]
    {
        new { A = 1, B = 1, C = 1 },
        new { A = 1, B = 2, C = 2 },
        new { A = 5, B = 6, C = 7 }
    }
    .ToList();
var RightTable = new[]
    {
        new { A = 1, B = 1, Z = 5 },
        new { A = 1, B = 2, Z = 6 }
    }
    .ToList();
var query = (from left in LeftTable
             join right in RightTable
                 on new { left.A, left.B } equals new { right.A, right.B }
                 into JoinedList
             from right in JoinedList.DefaultIfEmpty(new { A = 0, B = 0, Z = 0 })
             group left by new
             {
                 left.A,
                 left.B,
                 left.C,
                 right.Z
             } into g
             select g)
            .ToList();
I had the same problem as you and had to find a way around it. I came up with something like this (very simplified example):
var toReturn = from left in bd.leftys
               join rights in myAnonimousGroupedList on left.Id equals rights.leftId into joinings
               from right in joinings.DefaultIfEmpty()
               select new {
                   left.A,
                   left.B,
                   left.C,
                   // Z = right != null ? right.Z : 0
                   Z = joinings.Any() ? right.Z : 0 // see what i did here?
               };
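Another option, at least for in-memory collections, is the null-conditional operator (C# 6 and later): with a plain DefaultIfEmpty() the unmatched right side is null, and right?.Z then yields a nullable value instead of throwing. The sketch below reuses the sample data from the question; the class and method names are made up for illustration. As far as I know the ?. operator is not allowed inside expression trees, so this works for LINQ to Objects but not for a query that has to be translated to SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LeftJoinGroupSketch
{
    // Left-joins the two sample tables and returns the distinct group keys.
    public static List<(int A, int B, int C, int? Z)> GroupKeys()
    {
        var leftTable = new[]
        {
            new { A = 1, B = 1, C = 1 },
            new { A = 1, B = 2, C = 2 },
            new { A = 5, B = 6, C = 7 }
        };
        var rightTable = new[]
        {
            new { A = 1, B = 1, Z = 5 },
            new { A = 1, B = 2, Z = 6 }
        };

        var query = from left in leftTable
                    join right in rightTable
                        on new { left.A, left.B } equals new { right.A, right.B }
                        into matches
                    from right in matches.DefaultIfEmpty() // null when nothing matched
                    group left by new { left.A, left.B, left.C, Z = right?.Z } into g
                    select (g.Key.A, g.Key.B, g.Key.C, g.Key.Z);

        return query.ToList();
    }

    public static void Main()
    {
        foreach (var key in GroupKeys())
        {
            string z = key.Z.HasValue ? key.Z.Value.ToString() : "null";
            Console.WriteLine($"{key.A} {key.B} {key.C} {z}");
        }
        // 1 1 1 5
        // 1 2 2 6
        // 5 6 7 null
    }
}
```

This keeps the anonymous type on the right side as it is, so no helper class with 20 or 30 columns is needed.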

Select Single Element from Jagged Array

I'm working on a problem that's making my brain melt although I don't think it should be this hard. My example is long so I'll try to keep my question short!
I have an Array object that contains some elements that are also Arrays. For example:
customerAddresses = new customer_address[]
{
new // address #1
{
customer_id = 6676979,
customer_address_seq = 1,
customer_address_match_codes = new []
{
new
{
customer_address_seq = 1,
customer_id = 6676979,
customer_match_code_id = 5
}
}
},
new // address #2
{
customer_id = 6677070,
customer_address_seq = 1,
customer_address_match_codes = new []
{
new
{
customer_address_seq = 1,
customer_id = 6677070,
customer_match_code_id = 4
},
new
{
customer_address_seq = 1,
customer_id = 6677070,
customer_match_code_id = 5
},
new
{
customer_address_seq = 1,
customer_id = 6677070,
customer_match_code_id = 3
}
}
},
new // address #3
{
customer_id = 6677070,
customer_address_seq = 2,
customer_address_match_codes = new []
{
new
{
customer_address_seq = 2,
customer_id = 6677070,
customer_match_code_id = 4
},
new
{
customer_address_seq = 2,
customer_id = 6677070,
customer_match_code_id = 5
}
}
}
};
As you can see, the Array contains a number of address records, with one record per combination of customer_id and customer_address_seq. What I'm trying to do is find the best matching customer_address according to the following rules:
There must be customer_match_code_id equal to 4 and there must be one equal to 5
If there is a customer_match_code_id equal to 3, then consider that customer_address a stronger match.
According to the above rules, the 2nd customer_address element is the "best match". However, the last bit of complexity in this problem is that there could be multiple "best matches". How I need to handle that situation is by taking the customer_address record with the minimum customer_id and minimum customer_address_seq.
I was thinking that using LINQ would be my best bet, but I'm not experienced enough with it, so I just keep spinning my wheels.
Had to make a change to your class so that you are actually assigning your one collection to something:
customer_address_match_codes = new customer_address_match_code[]
{
    new
    {
        customer_address_seq = 1,
        customer_id = 6676979,
        customer_match_code_id = 5
    }
}
And then here is the LINQ that I've tested and does what you specify:
var result = (from c in customerAddresses
              let isMatch = c.customer_address_match_codes
                                .Where(cu => cu.customer_match_code_id == 4).Any() &&
                            c.customer_address_match_codes
                                .Where(cu => cu.customer_match_code_id == 5).Any()
              let betterMatch = isMatch && c.customer_address_match_codes
                                    .Where(cu => cu.customer_match_code_id == 3).Any() ? 1 : 0
              where isMatch == true
              orderby betterMatch descending, c.customer_id, c.customer_address_seq
              select c)
             .FirstOrDefault();
I've worked up an example using your data with anonymous types here: http://ideone.com/wyteM
Not tested and not the same names but this should get you going
customer cb = null;
customer[] cs = new customer[] { new customer() };
foreach (customer c in cs.OrderBy(x => x.id).ThenBy(y => y.seq))
{
    if (c.addrs.Any(x => x.num == "5"))
    {
        if (c.addrs.Any(x => x.num == "3"))
        {
            if (cb == null) cb = c;
            if (c.addrs.Any(x => x.num == "2"))
            {
                cb = c;
                break;
            }
        }
    }
}
This sounds like a job for LINQ
var bestMatch = (from address in DATA
                 where address.customer_address_match_code.Any(
                     x => x.customer_match_code_id == 4)
                 where address.customer_address_match_code.Any(
                     x => x.customer_match_code_id == 5)
                 select address).OrderBy(
                     x => x.customer_address_match_code.Where(
                              y => y.customer_match_code_id >= 3)
                          .OrderBy(y => y.customer_match_code_id)
                          .First()
                          .customer_match_code_id).FirstOrDefault();
My theory is this: select the addresses that have both a customer_match_code_id == 4 and a customer_match_code_id == 5. Then sort them by the lowest customer_match_code_id they have that is at least 3, and take the very first one. If there is a customer_match_code_id that equals 3, then that address is selected; if not, another one is selected. If nothing matches both 4 and 5, then null is returned.
Untested.
Seems quite straightforward in LINQ:
var query =
    from ca in customerAddresses
    where ca.customer_address_match_codes.Any(
        mc => mc.customer_match_code_id == 4)
    where ca.customer_address_match_codes.Any(
        mc => mc.customer_match_code_id == 5)
    // Strongest match first, then lowest customer_id, then lowest customer_address_seq.
    orderby ca.customer_address_match_codes.Any(
                mc => mc.customer_match_code_id == 3) descending,
            ca.customer_id,
            ca.customer_address_seq
    select ca;
var result = query.Take(1);
How does that look?
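For anyone who wants to try the rules on the sample data, here is a compact method-syntax sketch. It flattens each address to a tuple with a plain int[] of match codes, which is a simplification assumed for the example (the tuple shape and the name BestMatch are not from the question). The ordering is: has code 3 first, then lowest customer id, then lowest address sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BestAddressSketch
{
    // Returns the best matching address, or null when no address has both 4 and 5.
    public static (int CustomerId, int Seq, int[] Codes)? BestMatch(
        IEnumerable<(int CustomerId, int Seq, int[] Codes)> addresses)
    {
        return addresses
            .Where(a => a.Codes.Contains(4) && a.Codes.Contains(5))
            .OrderByDescending(a => a.Codes.Contains(3)) // true sorts before false
            .ThenBy(a => a.CustomerId)
            .ThenBy(a => a.Seq)
            .Select(a => ((int CustomerId, int Seq, int[] Codes)?)a)
            .FirstOrDefault();
    }

    public static void Main()
    {
        // The three addresses from the question, reduced to their match codes.
        var addresses = new[]
        {
            (CustomerId: 6676979, Seq: 1, Codes: new[] { 5 }),
            (CustomerId: 6677070, Seq: 1, Codes: new[] { 4, 5, 3 }),
            (CustomerId: 6677070, Seq: 2, Codes: new[] { 4, 5 })
        };

        var best = BestMatch(addresses);
        if (best.HasValue)
            Console.WriteLine($"{best.Value.CustomerId} / {best.Value.Seq}"); // 6677070 / 1
        else
            Console.WriteLine("no match");
    }
}
```

Using OrderByDescending followed by ThenBy keeps the three sort keys in one explicit priority order, which is easier to read than several separate orderings.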

How to get a sum of children values on one LINQ

This is the structure I have:
Program
- Description, etc...
Action
- Program_Id, Description, etc..
Cost
- Action_Id, Value1, Value2, Value3
One Action can have multiple Costs.
What I need is a query that groups these values by Program, like:
"Program name" | Total of Value1 | Total of Value 2 | Total of the program
This is my effort so far:
var ListByPrograma = from a in db.Actions
                     join c in db.Costs on a.Id equals c.Action_Id
                     group a by a.Program into p
                     select new
                     {
                         Program = p.Key,
                         actionsQuantity = p.Count(),
                         totalValue1 = p.Costs.????
                         totalValue2 = ?,
                         totalByProgram = ?
                     };
Does something like this work?
var ListByPrograma = from a in db.Actions
                     join c in db.Costs on a.Id equals c.Action_Id
                     group new { a, c } by a.Program into p
                     select new
                     {
                         Program = p.Key,
                         // counts joined (action, cost) rows, not distinct actions
                         actionsQty = p.Count(),
                         totalValue1 = p.Sum(y => y.c.Value1),
                         totalValue2 = p.Sum(y => y.c.Value2),
                         totalValue3 = p.Sum(y => y.c.Value3)
                     };
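If you also need the program total and a count of distinct actions, something along these lines might work. This is an in-memory sketch with made-up sample numbers, and it assumes "total of the program" means Value1 + Value2 + Value3 summed over all of the program's costs. The tuple shapes and the name Summarize are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ProgramTotalsSketch
{
    public static List<(string Program, int Actions, decimal TotalValue1,
                        decimal TotalValue2, decimal TotalValue3, decimal TotalByProgram)>
        Summarize(
            IEnumerable<(int Id, string Program)> actions,
            IEnumerable<(int ActionId, decimal Value1, decimal Value2, decimal Value3)> costs)
    {
        var query = from a in actions
                    join c in costs on a.Id equals c.ActionId
                    group c by a.Program into p
                    select (Program: p.Key,
                            // distinct actions that have at least one cost row
                            Actions: p.Select(c => c.ActionId).Distinct().Count(),
                            TotalValue1: p.Sum(c => c.Value1),
                            TotalValue2: p.Sum(c => c.Value2),
                            TotalValue3: p.Sum(c => c.Value3),
                            TotalByProgram: p.Sum(c => c.Value1 + c.Value2 + c.Value3));
        return query.ToList();
    }

    public static void Main()
    {
        // Hypothetical sample data: two actions in P1, one in P2.
        var actions = new[] { (Id: 1, Program: "P1"), (Id: 2, Program: "P1"), (Id: 3, Program: "P2") };
        var costs = new[]
        {
            (ActionId: 1, Value1: 10m, Value2: 1m, Value3: 0m),
            (ActionId: 1, Value1: 20m, Value2: 2m, Value3: 0m),
            (ActionId: 2, Value1: 5m,  Value2: 0m, Value3: 1m),
            (ActionId: 3, Value1: 7m,  Value2: 7m, Value3: 7m)
        };

        foreach (var row in Summarize(actions, costs))
            Console.WriteLine($"{row.Program} | {row.TotalValue1} | {row.TotalValue2} | {row.TotalByProgram}");
        // P1 | 35 | 3 | 39
        // P2 | 7 | 7 | 21
    }
}
```

Because this is an inner join, actions without any cost rows do not contribute to the count or the totals.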
