Should I apply counting in DB or in Code? - c#

I have the following code, where I'm trying to get the number of rows in the same data set that match various values.
My question is whether I should get the counts in C# code over an IEnumerable or by querying an IQueryable against the database.
Which one is more efficient: multiple database round trips, or filtering and counting an IEnumerable in memory?
public List<Tuple<string, int>> CalismaVeIzinleriHesapla(long personelId, DateTime baslangic, DateTime bitis)
{
var hesaplamalar = new List<Tuple<string, int>>();
var puantajList = puantajlar.Where(p => p.PersonelId == personelId && (p.Tarih >= baslangic && p.Tarih <= bitis));
var haftaTatili = puantajList.Where(p => p.Secenek.Deger == "Ht").Count();
var resmiTatil = puantajList.Where(p => p.Secenek.Deger == "Rt").Count();
var yillikIzin = puantajList.Where(p => p.Secenek.Deger == "Yi").Count();
var odenecekRapor = puantajList.Where(p => p.Secenek.Deger == "R+").Count();
var dogumIzni = puantajList.Where(p => p.Secenek.Deger == "Di").Count();
var olumIzni = puantajList.Where(p => p.Secenek.Deger == "Öi").Count();
var evlilikIzni = puantajList.Where(p => p.Secenek.Deger == "Ei").Count();
var odenmeyecekRapor = puantajList.Where(p => p.Secenek.Deger == "R-").Count();
var ucretsizIzin = puantajList.Where(p => p.Secenek.Deger == "Üi").Count();
var devamsizlik = puantajList.Where(p => p.Secenek.Deger == "D").Count();
return hesaplamalar;
}

In your case, querying and counting in the database is more efficient.
Something like this would do it in a single query:
puantajlar
.Where(p => p.PersonelId == personelId && (p.Tarih >= baslangic && p.Tarih <= bitis))
.GroupBy(x => x.Secenek.Deger)
.Select(group => new { group.Key, Count = group.Count() })

My question is whether I should get the count in C# code with an IEnumerable or by querying an IQueryable in the DB
If you only need the row counts, then the counting should be done in the database, not in memory. If you count in memory by pulling the whole list from the database first, you waste server memory unnecessarily and pay for it in performance.

Complexity and performance both depend on your situation. If the data set is small, performance hardly matters, but sometimes you have to make a decision based on your situation.
As written, your code should connect to the database and run a separate count query on each line.
Counting the same rows in the database in one shot is clearly more efficient, so you can do something like:
select p.Secenek.Deger,
....
sum(case when p.Secenek.Deger = 'Ht' then 1 else 0 end) haftaTatili,
sum(case when p.Secenek.Deger = 'Rt' then 1 else 0 end) resmiTatil
.....
from puantajlar p
group by p.Secenek.Deger
Or you can do it more efficiently by grouping them in one shot, as @amd mentioned:
puantajlar
.Where(p => p.PersonelId == personelId && (p.Tarih >= baslangic && p.Tarih <= bitis))
.GroupBy(x => x.Secenek.Deger)
.Select(group => new { group.Key, Count = group.Count() })
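The grouped query above returns one row per code, so it still has to be turned into the `List<Tuple<string, int>>` that the original method returns. A minimal sketch of that step, assuming a hypothetical flattened `Puantaj` record (the original reads `p.Secenek.Deger` through a navigation property), an in-memory `AsQueryable()` source in place of the real database set, and no null `Deger` values:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the entity; the original reads p.Secenek.Deger.
public record Puantaj(long PersonelId, DateTime Tarih, string Deger);

public static class PuantajSayim
{
    public static List<Tuple<string, int>> Hesapla(
        IQueryable<Puantaj> puantajlar, long personelId, DateTime baslangic, DateTime bitis)
    {
        // One grouped query; a database provider would typically translate it to a single GROUP BY.
        var sayilar = puantajlar
            .Where(p => p.PersonelId == personelId && p.Tarih >= baslangic && p.Tarih <= bitis)
            .GroupBy(p => p.Deger)
            .Select(g => new { g.Key, Count = g.Count() })
            .ToDictionary(x => x.Key, x => x.Count);

        // Codes with no rows are absent from the dictionary, so they are reported as 0.
        var kodlar = new[] { "Ht", "Rt", "Yi", "R+", "Di", "Öi", "Ei", "R-", "Üi", "D" };
        return kodlar
            .Select(k => Tuple.Create(k, sayilar.TryGetValue(k, out var n) ? n : 0))
            .ToList();
    }
}
```

Only the small grouped result crosses the wire; the per-code lookups then run in memory against a dictionary with at most one entry per code.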

Related

LINQ Query Multiple Group and count of latest record - Oracle DB

I tried to divide the LINQ queries into three (total, success, fail), but so far only the "Total" query is working. Please help me get the "Success" and "Fail" columns. (A transaction has multiple statuses, and we have to check the last status of each transaction and destination.)
Note: you need to group by ProcessTime, TransactionId and Destination, check whether the last status is success or fail, and then apply the count. (We are using Oracle as the backend.)
LINQ for Total count
var query = (from filetrans in context.FILE_TRANSACTION
join route in context.FILE_ROUTE on filetrans.FILE_TRANID equals route.FILE_TRANID
where
filetrans.PROCESS_STRT_TIME >= fromDateFilter && filetrans.PROCESS_STRT_TIME <= toDateFilter
select new { PROCESS_STRT_TIME = DbFunctions.TruncateTime((DateTime)filetrans.PROCESS_STRT_TIME), filetrans.FILE_TRANID, route.DESTINATION }).
GroupBy(p => new { p.PROCESS_STRT_TIME, p.FILE_TRANID, p.DESTINATION });
var result = query.GroupBy(x => x.Key.PROCESS_STRT_TIME).Select(x => new { x.Key, Count = x.Count() }).ToDictionary(a => a.Key, a => a.Count);
Check this solution. If it gives wrong result, then I need more details.
var fileTransQuery =
from filetrans in context.AFRS_FILE_TRANSACTION
where accountIds.Contains(filetrans.ACNT_ID) &&
filetrans.PROCESS_STRT_TIME >= fromDateFilter && filetrans.PROCESS_STRT_TIME <= toDateFilter
select filetrans;
var routesQuery =
from filetrans in fileTransQuery
join route in context.AFRS_FILE_ROUTE on filetrans.FILE_TRANID equals route.FILE_TRANID
select route;
var lastRouteQuery =
from d in routesQuery.GroupBy(route => new { route.FILE_TRANID, route.DESTINATION })
.Select(g => new
{
g.Key.FILE_TRANID,
g.Key.DESTINATION,
ROUTE_ID = g.Max(x => x.ROUTE_ID)
})
from route in routesQuery
.Where(route => d.FILE_TRANID == route.FILE_TRANID && d.DESTINATION == route.DESTINATION && d.ROUTE_ID == route.ROUTE_ID)
select route;
var recordsQuery =
from filetrans in fileTransQuery
join route in lastRouteQuery on filetrans.FILE_TRANID equals route.FILE_TRANID
select new { filetrans.PROCESS_STRT_TIME, route.CRNT_ROUTE_FILE_STATUS_ID };
var result = recordsQuery
.GroupBy(p => DbFunctions.TruncateTime((DateTime)p.PROCESS_STRT_TIME))
.Select(g => new TrendData
{
TotalCount = g.Sum(x => x.CRNT_ROUTE_FILE_STATUS_ID != 7 && x.CRNT_ROUTE_FILE_STATUS_ID != 8 ? 1 : 0),
SucccessCount = g.Sum(x => x.CRNT_ROUTE_FILE_STATUS_ID == 7 ? 1 : 0),
FailCount = g.Sum(x => failureStatus.Contains(x.CRNT_ROUTE_FILE_STATUS_ID) ? 1 : 0),
Date = g.Min(x => x.PROCESS_STRT_TIME)
})
.OrderBy(x => x.Date)
.ToList();

Slow Performance on searching through List in LINQ

I have a table with more than 200,000 records for any particular month.
Getting the records from the table is not a problem and works as expected, but searching through the fetched records is very slow.
var listEmpShiftDetails =ctx.tblTO_ShiftSchedule
.Where(m => m.CompanyId == companyId &&
m.ShiftDate >= fromdate &&
m.ShiftDate <= todate)
.Select(m => m).ToList();
Records fetched from database around: 200 000
var data = (from a in ctx.tblEmployee
join b in ctx.tblTO_Entry on a.Id equals b.EmployeeId
where a.CompanyId == companyId && b.CompanyId == companyId &&
(b.Entry_Date >= fromDate && b.Entry_Date <= toDate)
select new { a, b }).ToList();
Note: no database calls are made in the code below; all the data is fetched above.
Linq Query to fetch one by one record
foreach (var item in data) // Data consist of employee details 3k Records
{
if (listEmpShiftDetails
.Any(m => m.EmployeeId == item.a.Id &&
m.ShiftDate == item.b.Entry_Date))
{
var shiftDetails = listEmpShiftDetails
.Where(m => m.EmployeeId == item.a.Id &&
m.ShiftDate ==item.b.Entry_Date)
.Select(m => m)
.FirstOrDefault();
//Other Calculations
}
}
The two lookups above (Any and Where) take too much time to execute; below is the output from Visual Studio. How can I improve the performance?
Profiler Output
// listEmpShiftDetails: around 200,000 (2 lakh) records fetched from the database
foreach (var item in data) // Data consist of employee details 3k Records
{
var selectedItem = listEmpShiftDetails.FirstOrDefault(m => m.EmployeeId == item.Id &&
m.ShiftDate == item.Entry_Date);
if (selectedItem != null)
{
// Other Calculations
}
}
There is no need to run the same search several times. You just need the first matching item, or null if nothing matches. I hope the code above gives you better performance.
What comes to mind after looking at your query:
It carries extra load; there is no need for the separate if () check.
If your "listEmpShiftDetails" is an IQueryable, then listEmpShiftDetails.Any() is OK. But if it is a List, it hampers performance.
It is worth spending five minutes reading about Count vs. Any performance.
Keep your query simple.
var shiftDetails = listEmpShiftDetails.FirstOrDefault(m => m.EmployeeId == item.Id && m.ShiftDate == item.Entry_Date);
To start with, do it this way to avoid executing the query twice:
foreach (var item in data) // Data consist of employee details 3k Records
{
var shiftDetails = listEmpShiftDetails.Where(m => m.EmployeeId == item.Id && m.ShiftDate == item.Entry_Date).FirstOrDefault();
if (shiftDetails != null)
{
//Other Calculations
}
}
Next, it appears that you're doing some sort of join. It would help to see what makes up data, so that we could suggest a way to improve the time significantly.
It's possible that this might give you some improvement:
var query =
(
from a in ctx.tblEmployee.Where(x => x.CompanyId == companyId)
join b in ctx.tblTO_Entry.Where(x => x.CompanyId == companyId) on a.Id equals b.EmployeeId
where b.Entry_Date >= fromDate
where b.Entry_Date <= toDate
join m in ctx.tblTO_ShiftSchedule.Where(x => x.CompanyId == companyId) on new
{
a.Id,
b.Entry_Date
} equals new
{
Id = m.EmployeeId,
Entry_Date = m.ShiftDate
} into g
from m2 in g.Where(x => x.ShiftDate >= fromDate).Where(x => x.ShiftDate <= toDate).Take(1)
select m2
).ToList();
foreach (var shiftDetails in query)
{
//Other Calculations
}
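The answers above still scan the 200,000-item list once per entry when the work stays in memory. Another option, not proposed in the answers, is to index the list once by (EmployeeId, ShiftDate) and then look each entry up by key. A minimal sketch, assuming hypothetical trimmed-down `Shift` and `Entry` records in place of the real entity types:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down shapes of the shift and entry rows.
public record Shift(int EmployeeId, DateTime ShiftDate, string Name);
public record Entry(int EmployeeId, DateTime EntryDate);

public static class ShiftMatcher
{
    public static List<(Entry Entry, Shift Shift)> Match(
        IEnumerable<Shift> shifts, IEnumerable<Entry> entries)
    {
        // Build the index once instead of scanning the whole list for every entry.
        // GroupBy + First keeps the first shift per key, mirroring FirstOrDefault.
        var index = shifts
            .GroupBy(s => (s.EmployeeId, s.ShiftDate))
            .ToDictionary(g => g.Key, g => g.First());

        var matches = new List<(Entry Entry, Shift Shift)>();
        foreach (var entry in entries)
        {
            // A hash lookup replaces Any(...) followed by Where(...).FirstOrDefault().
            if (index.TryGetValue((entry.EmployeeId, entry.EntryDate), out var shift))
                matches.Add((entry, shift));
        }
        return matches;
    }
}
```

The key comparison uses exact DateTime equality, the same as `m.ShiftDate == item.b.Entry_Date` in the question, so time components must match for a hit.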

Improve performance in linq query - trying to change the query from subquery to join

The LINQ query below is very expensive. It runs inside a loop that I cannot avoid, and it has to be done in C#, which I also cannot avoid. There is a lot of logic before and after the query. I want to check whether I can change anything in the query to improve performance at least a little.
lstDataTable.Where(i => i.Field<int>("ALLL_Snapshot_ID") == 20 &&
i.Field<int>("ALLL_Analysis_Segment_Group_Column_ID") == 5 &&
i.Field<DateTime>("OriginationDate") > startingSnapshotDate &&
i.Field<DateTime>("OriginationDate") <= endingSnapshotDate &&
snapshotDataWithDate.Select(j => j.Field<string>
("MaturityDateBorrowerIdNoteNumberKey")).Contains(i.Field<string>
("MaturityDateBorrowerIdNoteNumberKey")) &&
snapshotDataWithDate.Select(j => j.Field<string>
("OriginationDateBorrowerIdNoteNumberKey")).Contains(i.Field<string>
("OriginationDateBorrowerIdNoteNumberKey")))
.Select(i => i.Field<Decimal>("BalanceOutstanding") + i.Field<Decimal>
("UndisbursedCommitmentAvailability")).Sum();
Here lstDataTable and snapshotDataWithDate are both IEnumerable<DataRow>.
I tried the query above using a join, but it does not join properly: the two results differ by a large amount. Below is the query I tried using join:
(from p in lstDataTable
join t in snapshotDataWithDate on p.Field<string>
("MaturityDateBorrowerIdNoteNumberKey") equals t.Field<string>
("MaturityDateBorrowerIdNoteNumberKey") &&
p.Field<string>("OriginationDateBorrowerIdNoteNumberKey") equals
t.Field<string>("OriginationDateBorrowerIdNoteNumberKey")
where p.Field<int>("ALLL_Analysis_Segment_Group_Column_ID") ==
SegmentGroupCECLSurvivalRateObj.ALLL_Segment_Group_Column_ID &&
p.Field<DateTime>("OriginationDate") > startingSnapshotDate &&
p.Field<DateTime>("OriginationDate") <= endingSnapshotDate
select p.Field<Decimal>("BalanceOutstanding") + p.Field<Decimal>
("UndisbursedCommitmentAvailability")).Sum();
Try this query, I have changed some expressions in where clause.
lstDataTable.Where(i => i.Field<int>("ALLL_Snapshot_ID") == 20 &&
i.Field<int>("ALLL_Analysis_Segment_Group_Column_ID") == 5 &&
i.Field<DateTime>("OriginationDate") > startingSnapshotDate &&
i.Field<DateTime>("OriginationDate") <= endingSnapshotDate &&
snapshotDataWithDate.Any(j => j.Field<string>
("MaturityDateBorrowerIdNoteNumberKey") == i.Field<string>
("MaturityDateBorrowerIdNoteNumberKey")) &&
snapshotDataWithDate.Any(j => j.Field<string>
("OriginationDateBorrowerIdNoteNumberKey") == i.Field<string>
("OriginationDateBorrowerIdNoteNumberKey")))
.Select(i => i.Field<Decimal>("BalanceOutstanding") + i.Field<Decimal>
("UndisbursedCommitmentAvailability")).Sum();
Perhaps pulling out the Field accesses will provide a small amount of optimization?
var snapshotDataConvertedMDB = snapshotDataWithDate.Select(r => r.Field<string>("MaturityDateBorrowerIdNoteNumberKey")).ToList();
var snapshotDataConvertedODB = snapshotDataWithDate.Select(r => r.Field<string>("OriginationDateBorrowerIdNoteNumberKey")).ToList();
var ans = lstDataTable
.Select(r => new {
ALLL_Snapshot_ID = r.Field<int>("ALLL_Snapshot_ID"),
ALLL_Analysis_Segment_Group_Column_ID = r.Field<int>("ALLL_Analysis_Segment_Group_Column_ID"),
OriginationDate = r.Field<DateTime>("OriginationDate"),
MaturityDateBorrowerIdNoteNumberKey = r.Field<string>("MaturityDateBorrowerIdNoteNumberKey"),
OriginationDateBorrowerIdNoteNumberKey = r.Field<string>("OriginationDateBorrowerIdNoteNumberKey"),
BalanceOutstanding = r.Field<Decimal>("BalanceOutstanding"),
UndisbursedCommitmentAvailability = r.Field<Decimal>("UndisbursedCommitmentAvailability")
})
.Where(i => i.ALLL_Snapshot_ID == 20 &&
i.ALLL_Analysis_Segment_Group_Column_ID == 5 &&
i.OriginationDate > startingSnapshotDate &&
i.OriginationDate <= endingSnapshotDate &&
snapshotDataConvertedMDB.Contains(i.MaturityDateBorrowerIdNoteNumberKey) &&
snapshotDataConvertedODB.Contains(i.OriginationDateBorrowerIdNoteNumberKey))
.Select(i => i.BalanceOutstanding + i.UndisbursedCommitmentAvailability)
.Sum();
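Pulling the keys into lists still leaves each `Contains` as a linear scan. A further step, not taken in the answers above, is to hold the keys in hash sets. A minimal sketch, assuming a hypothetical typed `Loan` record in place of the `DataRow` fields, with the snapshot-id and date filters left out for brevity; `ToHashSet` needs .NET Core 2.0+ or .NET Framework 4.7.2+:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical typed stand-in for the DataRow fields used in the query.
public record Loan(string MaturityKey, string OriginationKey, decimal Balance, decimal Undisbursed);

public static class SnapshotSum
{
    public static decimal Total(IEnumerable<Loan> loans, IEnumerable<Loan> snapshot)
    {
        var snapshotList = snapshot.ToList();

        // Hash sets make each Contains an expected O(1) lookup instead of a list scan.
        var maturityKeys = snapshotList.Select(s => s.MaturityKey).ToHashSet();
        var originationKeys = snapshotList.Select(s => s.OriginationKey).ToHashSet();

        // Each key is checked independently, as in the original Where clause.
        // A join on both keys would instead require one snapshot row to match both,
        // and would repeat a loan once per matching snapshot row.
        return loans
            .Where(l => maturityKeys.Contains(l.MaturityKey)
                     && originationKeys.Contains(l.OriginationKey))
            .Sum(l => l.Balance + l.Undisbursed);
    }
}
```

The comment about joins is also a likely reason the join attempt in the question produced a very different total: its matching rule is not the same as two independent `Contains` checks.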

Map Linq groupby count to C# entity

I have this SQL query that performs a GROUP BY on a single field and then counts the rows in each group. So far so good.
select type, count(*)
from myTable
group by type
//Result
//TypeA = 5
//TypeB = 3
However, I am having trouble performing this query with Linq as I need to map the outcome of Count() to a specific entity.
The entity I want to map the count to:
public class MyEntity {
public int TypeACount {get; set;}
public int TypeBCount {get; set;}
}
The LINQ query I currently use:
MyEntity test = data
.GroupBy(c => c.type)
.Select(g => new MyEntity (){
TypeACount = g.Where(d => d.type == "A").Count(),
TypeBCount = g.Where(d => d.type == "B").Count()
});
Extra info
Based on some answers, a little extra info. My original plan was to use following.
var firstResults = session.Query<MyEntity>()
.Where(/* several date filter conditions */)
.ToList();
return new MyEntity() {
TypeACount = firstResults.Where(s => s.type == "A").Count(),
TypeBCount = firstResults.Where(s => s.type == "B").Count()
};
This works, but the table queried is rather large and the query took quite some time. Based on a colleague's feedback, I was asked whether the query could be done in one part instead of being split in two, the idea being that the counting logic would stay in SQL rather than in C#. I don't know if that would actually be faster, but that is what I am trying to figure out.
You should map after you get the information
var results = data
.Where(c => c.TypeOfUsage == "A" || c.TypeOfUsage == "B")
.GroupBy(c => c.TypeOfUsage)
.Select(g => new
{
Type = g.Key,
Count = g.Count()
}).ToList();
MyEntity test = new MyEntity
{
TypeACount = results.FirstOrDefault(d => d.Type == "A")?.Count ?? 0,
TypeBCount = results.FirstOrDefault(d => d.Type == "B")?.Count ?? 0
};
Or if you don't have C# 6
var a = results.FirstOrDefault(d => d.Type == "A");
var b = results.FirstOrDefault(d => d.Type == "B");
MyEntity test = new MyEntity
{
TypeACount = a == null ? 0 : a.Count,
TypeBCount = b == null ? 0 : b.Count
};
Another option would be to use a constant group by.
MyEntity test= data
.Where(c => c.TypeOfUsage == "A" || c.TypeOfUsage == "B")
.GroupBy(c => 1)
.Select(g => new MyEntity
{
TypeACount = g.Where(d => d.TypeOfUsage == "A").Count(),
TypeBCount = g.Where(d => d.TypeOfUsage == "B").Count()
}).Single();
This would be more like the following SQL
select
sum(case when typeOfUseage = 'A' then 1 else 0 end) AS TypeACount
, sum(case when typeOfUseage = 'B' then 1 else 0 end) AS TypeBCount
from myTable
Why not the classic way? I do not see a reason for GroupBy or Select in your query:
var entity = new MyEntity();
entity.TypeACount = data.Count(a => a.TypeOfUsage == "A");
entity.TypeBCount = data.Count(b => b.TypeOfUsage == "B");

Group by Linq vs Transact Sql

I have this SQL query
select GrupoEmpaque,NumIdConceptoEmpaque,sum(NumCantidadEmpaques)
from Movimientos_Pedidos
where StrIdDocumento = '009000PV00000000000000599' and (GrupoEmpaque is null or GrupoEmpaque = 0 )
group by GrupoEmpaque , NumIdConceptoEmpaque
It returns:
NULL 338 25
On the other side I have this LINQ query. Pedido already contains only the '009000PV00000000000000599' data:
var EmpaquesItemUnico = Pedido.Movimientos_Pedidos
.GroupBy(x => x.NumIdConceptoEmpaque)
.Select(x => new { GrupoEmpaque = x.FirstOrDefault().GrupoEmpaque, TipoEmpaque = x.FirstOrDefault().Merlin_ConceptosFacturacionEmpaque, Cantidad = x.Sum(y => y.NumCantidadEmpaques) })
.Where(x => x.GrupoEmpaque == 0 || x.GrupoEmpaque == null);
But now the results are:
NULL 338 28
Now my questions are:
Why does T-SQL return 25 while LINQ returns 28?
How can I make both statements return the same result?
You have to filter the results before projecting, and the group-by clauses of your T-SQL and LINQ queries are not the same:
var EmpaquesItemUnico = Pedido.Movimientos_Pedidos
.GroupBy(x => new
{
NumIdConceptoEmpaque =x.NumIdConceptoEmpaque,
GrupoEmpaque = x.GrupoEmpaque
}
)
.Where(x => x.Key.GrupoEmpaque == 0 || x.Key.GrupoEmpaque == null)
// now project here
.Select(x=> new
{
NumIdConceptoEmpaque = x.Key.NumIdConceptoEmpaque,
GrupoEmpaque = x.Key.GrupoEmpaque,
Sum = x.Sum(y => y.NumCantidadEmpaques)
});
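To see why the two groupings can disagree, here is a small in-memory sketch. The row shape and the sample values are invented, chosen only so that the totals come out as 28 and 25 like in the question; the real data may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape with invented sample values.
public record Movimiento(int? GrupoEmpaque, int NumIdConceptoEmpaque, int NumCantidadEmpaques);

public static class AgrupacionDemo
{
    public static readonly List<Movimiento> Filas = new()
    {
        new Movimiento(null, 338, 20),
        new Movimiento(null, 338, 5),
        new Movimiento(1, 338, 3),   // same concept, but a different GrupoEmpaque
    };

    // Groups by one column, then filters on the first row's GrupoEmpaque:
    // the row with GrupoEmpaque = 1 is already inside the summed group.
    public static int SumaAgrupandoUnaColumna() => Filas
        .GroupBy(x => x.NumIdConceptoEmpaque)
        .Select(g => new { g.First().GrupoEmpaque, Cantidad = g.Sum(y => y.NumCantidadEmpaques) })
        .Where(x => x.GrupoEmpaque == 0 || x.GrupoEmpaque == null)
        .Sum(x => x.Cantidad);

    // Groups by both columns, as the SQL does, and filters on the group key.
    public static int SumaAgrupandoDosColumnas() => Filas
        .GroupBy(x => new { x.NumIdConceptoEmpaque, x.GrupoEmpaque })
        .Where(g => g.Key.GrupoEmpaque == 0 || g.Key.GrupoEmpaque == null)
        .Sum(g => g.Sum(y => y.NumCantidadEmpaques));
}
```

With these rows, `SumaAgrupandoUnaColumna()` yields 28 and `SumaAgrupandoDosColumnas()` yields 25, because only the second version keeps the `GrupoEmpaque = 1` row out of the sum.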
