I have a method that needs to fetch statistics data from multiple databases. The key idea is that each table holds a DBName; from it I drill down to the client's main DB by calling a stored proc with the desired database name. Finally, I drill down a second time to get data from the client's project database.
To sum it up:
I get the list of all my cloud users.
For each user, I fetch their clients by calling a stored proc on the user's main DB (marked as userClients in the code).
For each client, I fetch its statistics by calling a stored proc on the client's project DB.
It takes about 5-6 seconds to execute for very little data.
public List<CloudAnalysisDTO> GetCloudAnalysisForPeriod(DateTime FromDate, DateTime ToDate)
{
var users = FindAll();
List<CloudAnalysisDTO> resultsList = new List<CloudAnalysisDTO>();
HashSet<string> userclients = new HashSet<string>();
using (var db = new ProjSQLDataContext(conn))
{
foreach (var user in users)
{
if (user.ID == 0)
continue;
var ids = string.Join(",", db.UserClients.Where(uc => uc.UserId == user.ID).Select(uc => uc.ClientId.ToString()).ToArray());
var mainDB = user.MainDB;
if (mainDB.Length == 0 || ids.Length == 0)
continue;
List<CloudAnalysisDTO> userClients =
db.ExecuteQuery<CloudAnalysisDTO>(@"EXEC CloudUsersAnalysis {0},{1}", mainDB, ids).ToList<CloudAnalysisDTO>();
List<CloudAnalysisDTO> needRemove = new List<CloudAnalysisDTO>();
foreach (var client in userClients)
{
if (!userclients.Contains(user.MainDB + client.ClientID.ToString()))
userclients.Add(user.MainDB + client.ClientID.ToString());
else
{
needRemove.Add(client);
continue;
}
ClientAnalysisDTO clientAnalysisDTO =
db.ExecuteQuery<ClientAnalysisDTO>(@"EXEC CloudClientAnalysis {0},{1},{2}", client.ProjectDB, FromDate, ToDate).SingleOrDefault<ClientAnalysisDTO>();
if (clientAnalysisDTO != null)
{
client.ClientAnalysisDTO = clientAnalysisDTO;
}
client.UserID = user.ID;
client.MainDB = user.MainDB;
}
foreach (var removeDTO in needRemove)
{
userClients.Remove(removeDTO);
}
if (userClients != null && userClients.Count > 0)
resultsList.AddRange(userClients);
}
}
return resultsList;
}
Any ideas on what I can do to improve performance?
The first thing I would do is enable .NET tracing and write a line to the trace log before and after each call.
https://msdn.microsoft.com/en-us/library/zs6s4h68(v=vs.110).aspx
This line makes me suspect that you are implicitly running an "in" clause in one of the queries, which might not perform well:
var ids = string.Join(",", db.UserClients.Where(uc => uc.UserId == user.ID).Select(uc => uc.ClientId.ToString()).ToArray());
Once you have found the weak performer (the line above is just my guess), the next step is to enable database profiling to determine where new indexes or database maintenance are needed.
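The "write a line before and after each call" idea can be sketched with a small timing wrapper. This is only an illustration: the CallTimer helper and the label are hypothetical names, and Thread.Sleep stands in for the real stored-procedure call.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class CallTimer
{
    // Runs the given call, writes its duration to the trace log,
    // and returns whatever the call returned.
    public static T Time<T>(string label, Func<T> call)
    {
        var stopwatch = Stopwatch.StartNew();
        try
        {
            return call();
        }
        finally
        {
            stopwatch.Stop();
            Trace.WriteLine(label + " took " + stopwatch.ElapsedMilliseconds + " ms");
        }
    }
}

public static class TimingDemo
{
    public static void Main()
    {
        // Send trace output to the console for this demo.
        Trace.Listeners.Add(new ConsoleTraceListener());

        // Thread.Sleep is a stand-in for a database call (assumption for the demo).
        int rows = CallTimer.Time("CloudUsersAnalysis (simulated)", () =>
        {
            Thread.Sleep(50);
            return 3;
        });
        Console.WriteLine("rows: " + rows);
    }
}
```

Wrapping each ExecuteQuery call this way shows whether the time goes into the per-user query, the per-client query, or the ids lookup.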
I have two different LINQ expressions that reference the same column in the database. One works just fine, but the other throws an invalid identifier exception (ORA-00904).
Most of the questions I've found feature raw SQL queries with some syntax errors. Others have to do with the entity model, but since one of the queries doesn't run into the issue, I'm not convinced the issue is with the model.
The one that works:
public List<DateTime> GetAvailableDates()
{
var retData = new List<DateTime>();
using (var context = new CASTDbContext())
{
var result = context.SomeDataEntity.Select(x => x.CAPTURE_DATE).Distinct().ToList();
if(result != null && result.Count > 0)
{
retData = result;
}
}
return retData;
}
The one that doesn't work:
public List<SomeDataModel> GetSomeDataByDate(DateTime date)
{
var retData = new List<SomeDataModel>();
using (var context = new SomeDbContext())
{
var result = context.SomeDataEntity
.Where( y => DbFunctions.TruncateTime(y.CAPTURE_DATE) == date.Date).ToList(); // the line that's throwing the exception
if (result != null && result.Count > 0)
{
foreach (var item in result)
{
retData.Add(mapper.Map<SomeDataModel>(item));
}
}
}
return retData;
}
The issue ended up being a different part of the model, but here is some info on Oracle pitfalls:
The first query worked fine because it only referenced one specific field that had a matching column in the database (Oracle doesn't care about the rest of the model in that instance for some reason).
The second query didn't work because it was trying to pull every column from the table, and there was one field missing from the model.
I am writing a small program that takes a .csv file with about 45k rows as input. I am trying to compare the contents of this file with the contents of a table in a database (SQL Server through Dynamics CRM using Xrm.Sdk, if it makes a difference).
In my current program (which takes about 25 minutes to compare; the file and database are identical here, both 45k rows with no differences), I have all existing records from the database in a DataCollection<Entity>, which inherits Collection<T> and implements IEnumerable<T>.
In my code below I am filtering using the Where method and then applying logic based on the count of matches. The Where seems to be the bottleneck here. Is there a more efficient approach than this? I am by no means a LINQ expert.
foreach (var record in inputDataLines)
{
var fields = record.Split(',');
var fund = fields[0];
var bps = Convert.ToDecimal(fields[1]);
var withdrawalPct = Convert.ToDecimal(fields[2]);
var percentile = Convert.ToInt32(fields[3]);
var age = Convert.ToInt32(fields[4]);
var bombOutTerm = Convert.ToDecimal(fields[5]);
var matchingRows = existingRecords.Entities.Where(r => r["field_1"].ToString() == fund
&& Convert.ToDecimal(r["field_2"]) == bps
&& Convert.ToDecimal(r["field_3"]) == withdrawalPct
&& Convert.ToDecimal(r["field_4"]) == percentile
&& Convert.ToDecimal(r["field_5"]) == age);
entitiesFound.AddRange(matchingRows);
if (matchingRows.Count() == 0)
{
rowsToAdd.Add(record);
}
else if (matchingRows.Count() == 1)
{
if (Convert.ToDecimal(matchingRows.First()["field_6"]) != bombOutTerm)
{
rowsToUpdate.Add(record);
entitiesToUpdate.Add(matchingRows.First());
}
}
else
{
entitiesToDelete.AddRange(matchingRows);
rowsToAdd.Add(record);
}
}
EDIT: I can confirm that all existingRecords are in memory before this code is executed. There is no IO or DB access in the above loop.
Himbrombeere is right: you should execute the query first and put the result into a collection before you use Any, Count, AddRange or any other method that will execute the query again. In your code it's possible that the query is executed 5 times in every loop iteration.
Watch out for the term deferred execution in the documentation. If a method is implemented that way, it can be used to construct a LINQ query (so you can chain it with other methods and end up with a query). Only methods that don't use deferred execution, like Count, Any, ToList (or a plain foreach), will actually execute it. If you don't want the whole query to be executed every time and you need to access its result multiple times, it's better to store the result in a collection (e.g. with ToList).
However, you could use a different approach which should be much more efficient: a Lookup<TKey, TElement>, which is similar to a dictionary and can be used with an anonymous type as key:
var lookup = existingRecords.Entities.ToLookup(r => new
{
fund = r["field_1"].ToString(),
bps = Convert.ToDecimal(r["field_2"]),
withdrawalPct = Convert.ToDecimal(r["field_3"]),
percentile = Convert.ToInt32(r["field_4"]), // int, so the key type matches the int used in the loop below
age = Convert.ToInt32(r["field_5"])
});
Now you can access this lookup in the loop very efficiently.
foreach (var record in inputDataLines)
{
var fields = record.Split(',');
var fund = fields[0];
var bps = Convert.ToDecimal(fields[1]);
var withdrawalPct = Convert.ToDecimal(fields[2]);
var percentile = Convert.ToInt32(fields[3]);
var age = Convert.ToInt32(fields[4]);
var bombOutTerm = Convert.ToDecimal(fields[5]);
var matchingRows = lookup[new {fund, bps, withdrawalPct, percentile, age}].ToList();
entitiesFound.AddRange(matchingRows);
if (matchingRows.Count() == 0)
{
rowsToAdd.Add(record);
}
else if (matchingRows.Count() == 1)
{
if (Convert.ToDecimal(matchingRows.First()["field_6"]) != bombOutTerm)
{
rowsToUpdate.Add(record);
entitiesToUpdate.Add(matchingRows.First());
}
}
else
{
entitiesToDelete.AddRange(matchingRows);
rowsToAdd.Add(record);
}
}
Note that this works even if the key does not exist (an empty list is returned).
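A minimal, self-contained sketch of the lookup idea follows. The Row class and its two key properties are made up for the demo and are not the CRM entities from the question. One detail worth noting: two anonymous-type instances only count as the same key type when their property names, order and types match exactly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Row
{
    public string Fund { get; set; }
    public int Age { get; set; }
}

public static class LookupDemo
{
    // Builds the lookup once, then answers every probe with a hash lookup
    // instead of scanning all existing rows per probe.
    public static List<int> MatchCounts(IEnumerable<Row> existing, IEnumerable<Row> probes)
    {
        var lookup = existing.ToLookup(r => new { r.Fund, r.Age });
        // An unknown key yields an empty sequence, not an exception.
        return probes.Select(p => lookup[new { p.Fund, p.Age }].Count()).ToList();
    }

    public static void Main()
    {
        var existing = new List<Row>
        {
            new Row { Fund = "A", Age = 30 },
            new Row { Fund = "A", Age = 30 },
            new Row { Fund = "B", Age = 40 },
        };
        var probes = new List<Row>
        {
            new Row { Fund = "A", Age = 30 },
            new Row { Fund = "Z", Age = 1 },
        };
        Console.WriteLine(string.Join(",", MatchCounts(existing, probes))); // prints "2,0"
    }
}
```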
Add a ToList after your Convert.ToDecimal(r["field_5"]) == age);-line to force an immediate execution of the query.
var matchingRows = existingRecords.Entities.Where(r => r["field_1"].ToString() == fund
&& Convert.ToDecimal(r["field_2"]) == bps
&& Convert.ToDecimal(r["field_3"]) == withdrawalPct
&& Convert.ToDecimal(r["field_4"]) == percentile
&& Convert.ToDecimal(r["field_5"]) == age)
.ToList();
The Where doesn't actually execute your query, it just prepares it. The actual execution is deferred until the results are enumerated. In your case that happens when calling Count, which iterates the entire collection of items. If the first condition fails, the second one is checked, leading to a second iteration of the complete collection when Count is called again. On top of that, you execute the query a third time when calling matchingRows.First().
When you force immediate execution, you execute the query only once and thus iterate the entire collection only once, which will decrease your overall time.
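To make the effect visible, here is a small sketch that counts how often the Where predicate runs with and without ToList. The numbers and the predicate are made up for the demo; only the pattern (Count, First, Count on the same query) mirrors the code in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Returns how many times the predicate was evaluated.
    public static int CountPredicateCalls(bool materialize)
    {
        var source = Enumerable.Range(1, 100).ToList();
        int calls = 0;

        IEnumerable<int> matches = source.Where(n =>
        {
            calls++;
            return n % 10 == 0;
        });

        if (materialize)
        {
            // Runs the query exactly once and stores the results.
            matches = matches.ToList();
        }

        int count = matches.Count();      // deferred: scans all 100 items
        int first = matches.First();      // deferred: scans until the first match (10 items)
        int countAgain = matches.Count(); // deferred: scans all 100 items again

        return calls;
    }

    public static void Main()
    {
        Console.WriteLine("deferred: " + CountPredicateCalls(false));     // prints "deferred: 210"
        Console.WriteLine("materialized: " + CountPredicateCalls(true));  // prints "materialized: 100"
    }
}
```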
Another option, which is basically along the same lines as the other answers, is to prepare your data first, so that you're not repeatedly calling things like r["field_2"] (which are relatively slow to look up).
This is a (1) clean your data, (2) query/join your data, (3) process your data approach.
Do this:
(1)
var inputs =
inputDataLines
.Select(record =>
{
var fields = record.Split(',');
return new
{
fund = fields[0],
bps = Convert.ToDecimal(fields[1]),
withdrawalPct = Convert.ToDecimal(fields[2]),
percentile = Convert.ToInt32(fields[3]),
age = Convert.ToInt32(fields[4]),
bombOutTerm = Convert.ToDecimal(fields[5]),
record
};
})
.ToArray();
var entities =
existingRecords
.Entities
.Select(entity => new
{
fund = entity["field_1"].ToString(),
bps = Convert.ToDecimal(entity["field_2"]),
withdrawalPct = Convert.ToDecimal(entity["field_3"]),
percentile = Convert.ToInt32(entity["field_4"]),
age = Convert.ToInt32(entity["field_5"]),
bombOutTerm = Convert.ToDecimal(entity["field_6"]),
entity
})
.ToArray()
.GroupBy(x => new
{
x.fund,
x.bps,
x.withdrawalPct,
x.percentile,
x.age
}, x => new
{
x.bombOutTerm,
x.entity,
});
(2)
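var query =
from i in inputs
// Group join ("into"): inputs without any matching entity are kept with an empty match list,
// so the Count() == 0 branch in step (3) can actually be reached.
join e in entities on new { i.fund, i.bps, i.withdrawalPct, i.percentile, i.age } equals e.Key into groups
select new { input = i, matchingRows = groups.SelectMany(g => g).ToList() };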
(3)
foreach (var x in query)
{
entitiesFound.AddRange(x.matchingRows.Select(y => y.entity));
if (x.matchingRows.Count() == 0)
{
rowsToAdd.Add(x.input.record);
}
else if (x.matchingRows.Count() == 1)
{
if (x.matchingRows.First().bombOutTerm != x.input.bombOutTerm)
{
rowsToUpdate.Add(x.input.record);
entitiesToUpdate.Add(x.matchingRows.First().entity);
}
}
else
{
entitiesToDelete.AddRange(x.matchingRows.Select(y => y.entity));
rowsToAdd.Add(x.input.record);
}
}
I suspect that this will be among the fastest approaches presented.
I'm new to Entity Framework. Below is my code.
In my code I create an instance of my DB context and then build a query, queryForAuthentication, that joins two tables, conDb.SystemMasters and conDb.SystemAdminMasters. Will it hit the database twice, or how is this handled? I want to know when Entity Framework actually hits the database.
QuizzrEntities conDb = new QuizzrEntities();
List<OnLoginData> lstOnLogoonData = new List<OnLoginData>();
string userpassWordHash = string.Empty;
var queryForAuthentication =from systemObj in conDb.SystemMasters
where systemObj.StaffPin == AdminLoginInput.StaffPin
join admin in conDb.SystemAdminMasters on systemObj.SystemId equals admin.SystemID
select new
{
admin.PasswordSalt,
admin.PasswordHash,
systemObj.StaffPin,
admin.UserName,
admin.SystemID
};
if (queryForAuthentication.Count() > 0)
{
CheckStaffPin = true;
var GetUserUsingUsernamePasword = queryForAuthentication.Where(u => u.UserName.ToLower() == AdminLoginInput.UserName.ToLower());
if (GetUserUsingUsernamePasword.ToList().Count == 1)
{
checkuserName = true;
string DBPasswordSalt = queryForAuthentication.ToList()[0].PasswordSalt,
DBPasswordHash = queryForAuthentication.ToList()[0].PasswordHash,
StaffPin = queryForAuthentication.ToList()[0].StaffPin;
userpassWordHash = Common.GetPasswordHash(AdminLoginInput.Password, DBPasswordSalt);
if ((DBPasswordHash == userpassWordHash) && (AdminLoginInput.StaffPin.ToLower() == StaffPin.ToLower()))
{
checkPassword = true;
CheckStaffPin = true;
}
else if (DBPasswordHash == userpassWordHash)
{
checkPassword = true;
}
else if (AdminLoginInput.StaffPin.ToLower() == StaffPin.ToLower())
{
CheckStaffPin = true;
}
}
}
It hits the database whenever you fire a query, and a query is fired whenever you perform an operation such as ToList, First or FirstOrDefault. Until then it only builds the query.
Try this code:
QuizzrEntities conDb = new QuizzrEntities();
List<OnLoginData> lstOnLogoonData = new List<OnLoginData>();
string userpassWordHash = string.Empty;
var queryForAuthentication =(from systemObj in conDb.SystemMasters
where systemObj.StaffPin == AdminLoginInput.StaffPin
join admin in conDb.SystemAdminMasters on systemObj.SystemId equals admin.SystemID
select new
{
PasswordSalt= admin.PasswordSalt,
PasswordHash= admin.PasswordHash,
StaffPin= systemObj.StaffPin,
UserName= admin.UserName,
SystemID = admin.SystemID
}).FirstOrDefault();
if (queryForAuthentication != null)
{
// ... your code ...
}
Entity Framework also works on a SQL query basis. Only once you materialize the result with .ToList() are the records held locally; until then the variable is still a DbQuery. If you open the Results View in the debugger, it executes the query and returns the data.
If you keep working with the query without materializing it, it is executed at the point where you finally ask for the result.
If you are going to process the data locally, you can cut the link between LINQ and SQL by calling .ToList(). The query then runs only once, although the materialized objects take more memory than the query itself.
var queryForAuthentication = (from systemObj in conDb.SystemMasters
where systemObj.StaffPin == AdminLoginInput.StaffPin
join admin in conDb.SystemAdminMasters on systemObj.SystemId equals admin.SystemID
select new
{
admin.PasswordSalt,
admin.PasswordHash,
systemObj.StaffPin,
admin.UserName,
admin.SystemID
}).ToList(); // ToList() executes the query and fetches the data
// Check against the in-memory collection
if (queryForAuthentication.Count > 0)
{
// The data is already in memory, so this filter is applied to the in-memory collection, not the database.
var GetUserUsingUsernamePasword = queryForAuthentication
.Where(u => u.UserName.ToLower() == AdminLoginInput.UserName.ToLower());
}
I want to modify SQL data using LINQ, but when I do, I run into trouble.
I get this error:
New transaction is not allowed because there are other threads running in the session.
and this is my LINQ in C#:
for (int i = 1; i < st.Count(); i++)
{
string Lesson = i.ToString();
var query = db.YearCourse_105008
.Where(o => o.Day == Day && o.Lesson == Lesson)
.Select(o => new
{
O_GUID = o.OpenCourseGUID,
Lesson = o.Lesson
});
foreach (var item in query)
{
var PK = db.YearCourse_105008.Find(item.O_GUID);
PK.Day = Day;
PK.RC_MAJORCODE = st[i];
PK.Lesson = i.ToString();
db.Entry(PK).State = EntityState.Modified;
db.SaveChanges();
}
}
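Since query is an IQueryable, iterating it with foreach streams rows from the database while the loop body runs, and calling db.SaveChanges() inside the loop tries to start a new transaction while that reader is still open. Materialize the results into an in-memory collection (e.g. a List) before iterating: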
foreach (var item in query.ToList()) // --> here the source should be IEnumerable collection
{
var PK = db.YearCourse_105008.Find(item.O_GUID);
PK.Day = Day;
PK.RC_MAJORCODE = st[i];
PK.Lesson = i.ToString();
db.Entry(PK).State = EntityState.Modified;
db.SaveChanges();
}
or convert query results from IQueryable to IEnumerable first:
var query = db.YearCourse_105008
.Where(o => o.Day == Day && o.Lesson == Lesson)
.Select(o => new
{
O_GUID = o.OpenCourseGUID,
Lesson = o.Lesson
}).ToList();
NB: db.SaveChanges() can also be moved outside the foreach loop to save all modified entries at once, so you don't need to save on every iteration.
Related problems:
New transaction is not allowed because there are other threads running in the session LINQ To Entity
SqlException from Entity Framework - New transaction is not allowed because there are other threads running in the session
Entity Framework - "New transaction is not allowed because there are other threads running in the session"
I have a query in EF that checks a list of string values for existence in another table.
Please see the query below for more details.
Code
List<string> ItmsStock = item.Select(ds => ds.ItemNum).ToList(); // Currently, This List items count is 80,000 records.
this.Db.Database.CommandTimeout = 180;
var existsStckList = Db.Stocktakes.Where(ds => ItmsStock.Contains(ds.ItemNo)).Select(ds => ds.ItemNo).ToList();
item.RemoveAll(ds => existsStckList.Contains(ds.ItemNum));
var ItmsExists = Db.Items.Where(ds => ItmsStock.Contains(ds.ItemNo)).Select(ds => ds.ItemNo).ToList();
ItmsExists = Db.Stocktakes.Where(ds => !ItmsExists.Contains(ds.ItemNo)).Select(ds => ds.ItemNo).ToList();
I searched the internet and found that the generated SQL uses IN to check for existence, so the limit on IN causes the problem. My question is: how can I efficiently perform the above actions without using a for loop?
I would appreciate any help.
Edit
Previously, I had the below code. After facing the performance issue with the below code, I wrote the above one.
foreach (var stockitems in item)
{
if (Db.Stocktakes.Any(a => a.ItemNo == stockitems.ItemNum))
{
StockResult ss = new StockResult();
ss.ItemNumber = stockitems.ItemNum;
ss.FileName = stockitems.FileName;
Stockres.Add(ss);
}
else if (!Db.Stocktakes.Any(a => a.ItemNo == stockitems.ItemNum) && Db.Items.Any(a => a.ItemNo == stockitems.ItemNum))
{
var ItemNo = stockitems.ItemNum;
var AdminId = Convert.ToInt32(Session["AccId"]);
var CreatedOn = System.DateTime.Now;
int dbres = Db.Database.ExecuteSqlCommand("insert into Stocktake values({0},{1},{2})", ItemNo, AdminId, CreatedOn);
Db.SaveChanges();
totalcount = totalcount + 1;
}
else
{
StockResult sss = new StockResult();
sss.ItemNumber = stockitems.ItemNum;
sss.FileName = stockitems.FileName;
Stockitemsdup.Add(sss);
}
}
Thanks.
Issue batches of 1000 item IDs to the database, or use native SQL and submit a table-valued parameter, or a temp table filled with SqlBulkCopy.
I'm surprised you got this particular message. The parameter limit is about 2000 parameters. Your query should have been rejected.
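The batching option could look roughly like the sketch below. The InBatches helper is a hypothetical name, and the HashSet named stocktakeTable is only an in-memory stand-in for the Stocktakes table; with EF, the marked line would be the Contains query from the question, run once per batch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Batching
{
    // Splits a list into consecutive batches of at most batchSize items.
    public static IEnumerable<List<T>> InBatches<T>(IReadOnlyList<T> items, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));
        for (int start = 0; start < items.Count; start += batchSize)
        {
            int size = Math.Min(batchSize, items.Count - start);
            var batch = new List<T>(size);
            for (int i = start; i < start + size; i++)
            {
                batch.Add(items[i]);
            }
            yield return batch;
        }
    }
}

public static class BatchingDemo
{
    public static void Main()
    {
        var itemNumbers = Enumerable.Range(1, 2500).Select(n => "ITEM" + n).ToList();

        // In-memory stand-in for the Stocktakes table (assumption for the demo).
        var stocktakeTable = new HashSet<string> { "ITEM5", "ITEM1500", "ITEM2500", "ITEM9999" };

        var existing = new HashSet<string>();
        int batches = 0;
        foreach (var batch in Batching.InBatches(itemNumbers, 1000))
        {
            batches++;
            // With EF this would be the per-batch query, roughly:
            // Db.Stocktakes.Where(ds => batch.Contains(ds.ItemNo)).Select(ds => ds.ItemNo)
            existing.UnionWith(stocktakeTable.Where(itemNo => batch.Contains(itemNo)));
        }

        Console.WriteLine(batches + " batches, " + existing.Count + " existing items"); // prints "3 batches, 3 existing items"
    }
}
```

Collecting the results in a HashSet also keeps the later item.RemoveAll(ds => existing.Contains(ds.ItemNum)) call cheap, since each membership check is a hash lookup rather than a list scan.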