ORA-00904 Occasional Invalid identifier with linq query - c#

I have two different linq expressions that are referencing the same column in the database. One works just fine, but the other throws an invalid identifier exception (ORA-00904).
Most of the questions I've found involve raw SQL queries with syntax errors. Others have to do with the entity model, but since one of my queries runs without the issue, I'm not convinced the problem is in the model.
The one that works:
public List<DateTime> GetAvailableDates()
{
var retData = new List<DateTime>();
using (var context = new CASTDbContext())
{
var result = context.SomeDataEntity.Select(x => x.CAPTURE_DATE).Distinct().ToList();
if(result != null && result.Count > 0)
{
retData = result;
}
}
return retData;
}
The one that doesn't work:
public List<SomeDataModel> GetSomeDataByDate(DateTime date)
{
var retData = new List<SomeDataModel>();
using (var context = new SomeDbContext())
{
var result = context.SomeDataEntity
.Where( y => DbFunctions.TruncateTime(y.CAPTURE_DATE) == date.Date).ToList(); // the line that's throwing the exception
if (result != null && result.Count > 0)
{
foreach (var item in result)
{
retData.Add(mapper.Map<SomeDataModel>(item));
}
}
}
return retData;
}

The issue ended up being in a different part of the model, but here is some background on this Oracle pitfall:
The first query worked because it projects a single field that has a matching column in the database, so the generated SQL only names that column and the rest of the model is never checked.
The second query failed because it pulls every mapped column from the table, and one field did not line up between the model and the table. ORA-00904 is what Oracle raises when the generated SQL names a column it cannot find.
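The difference between the two queries can be sketched without a database. The helper below is hypothetical (it is not how Entity Framework builds SQL internally, and the table and column names in the usage note are made up); it only illustrates why a single-column projection can succeed while a whole-entity query trips over one bad mapping.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: mimics, very roughly, how an ORM chooses its SELECT list.
public static class SelectListSketch
{
    // Projection: only the requested column is named in the SQL.
    public static string ForProjection(string table, string column) =>
        $"SELECT DISTINCT {column} FROM {table}";

    // Whole-entity query: every mapped column is named, valid or not.
    public static string ForEntity(string table, IEnumerable<string> mappedColumns) =>
        $"SELECT {string.Join(", ", mappedColumns)} FROM {table}";

    // Columns the model maps but the table lacks; each would be an invalid identifier.
    public static List<string> InvalidIdentifiers(
        IEnumerable<string> mappedColumns, IEnumerable<string> tableColumns) =>
        mappedColumns.Except(tableColumns, StringComparer.OrdinalIgnoreCase).ToList();
}
```

For example, with mapped columns ID, CAPTURE_DATE and EXTRA_COL against a table that only has ID and CAPTURE_DATE, InvalidIdentifiers reports EXTRA_COL, which is named by ForEntity but never by ForProjection.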

Related

Obtaining entities from DbSet from a list of matching objects

I'm using Entity Framework Core 6 and I want to find a series of entities in a DbSet. The entities I want to obtain are the ones that match some properties in a list of input objects.
I've tried something like this:
public IEnumerable<MyEntity> FindEntities(IEnumerable<MyEntityDtos> entries)
{
return dbContext.MyDbSet.Where(r => entries.Any(e => e.Prop1 == r.Prop1 && e.Prop2 == r.Prop2));
}
But I get the classic EF Core exception saying that my LINQ cannot be translated to a database query (the problem in particular is the entries.Any(...) instruction).
I know I can just loop over the list of entries and obtain the entities one by one from the DbSet, but that is very slow. I was wondering if there is a more efficient way to do this in EF Core that I don't know about.
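I think this should work (note that it checks Prop1 and Prop2 independently, so it can also return rows whose two values come from different entries):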
public IEnumerable<MyEntity> FindEntities(IEnumerable<MyEntityDtos> entries)
{
var props1=entries.Select(x=>x.Prop1).ToArray();
var props2=entries.Select(x=>x.Prop2).ToArray();
return dbContext.MyDbSet.Where(r => props1.Contains(r.Prop1) && props2.Contains(r.Prop2));
}
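A minimal in-memory sketch of how two independent Contains checks differ from matching on the pair. The tuple shape (an int Prop1 and a string Prop2) is an assumption made for illustration, and this runs over plain collections, not through EF Core.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PairMatchSketch
{
    // Mirrors the two independent Contains checks: Prop1 and Prop2 may come from different entries.
    public static List<(int Prop1, string Prop2)> IndependentMatch(
        IEnumerable<(int Prop1, string Prop2)> rows,
        IEnumerable<(int Prop1, string Prop2)> entries)
    {
        var props1 = entries.Select(e => e.Prop1).ToArray();
        var props2 = entries.Select(e => e.Prop2).ToArray();
        return rows.Where(r => props1.Contains(r.Prop1) && props2.Contains(r.Prop2)).ToList();
    }

    // Requires both values to come from the same entry.
    public static List<(int Prop1, string Prop2)> PairMatch(
        IEnumerable<(int Prop1, string Prop2)> rows,
        IEnumerable<(int Prop1, string Prop2)> entries)
    {
        var pairs = new HashSet<(int Prop1, string Prop2)>(entries);
        return rows.Where(r => pairs.Contains(r)).ToList();
    }
}
```

With entries (1, "a") and (2, "b"), the row (1, "b") passes IndependentMatch but not PairMatch, so the independent version may need a second, in-memory filter if exact pairs matter.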
In the end, I've done this:
public static IEnumerable<MyEntity> GetRangeByKey(this DbSet<MyEntity> dbSet, IEnumerable<MyEntity> toFind)
{
    // De-duplicate through a HashSet, then copy to an array so the keys can be sliced by index.
    var keys = new HashSet<string>(toFind.Select(e => e.Id)).ToArray();
    IEnumerable<MyEntity> result = Enumerable.Empty<MyEntity>();
    for (int i = 0; i < keys.Length; i += 1000)
    {
        var keyChunk = keys[i..Math.Min(i + 1000, keys.Length)];
        // Contains over a local array is the form that gets translated to a SQL IN clause.
        var res = dbSet.Where(x => keyChunk.Contains(x.Id));
        result = result.Concat(res);
    }
    return result;
}
Basically I collect the keys to find in a HashSet and use them in a Where query, which is translated to a SQL IN clause and is quite fast. I do it in chunks because there is a maximum number of values you can put in an IN clause before the DB engine refuses it.
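The chunking step on its own can be sketched as below. The class and method names are hypothetical, and the chunk size is left as a parameter because the actual limit depends on the database engine.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeyChunkSketch
{
    // Splits the distinct keys into arrays of at most chunkSize elements.
    public static List<string[]> ChunkKeys(IEnumerable<string> keys, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        var distinct = keys.Distinct().ToArray();
        var chunks = new List<string[]>();
        for (int i = 0; i < distinct.Length; i += chunkSize)
        {
            // The last chunk may be shorter than chunkSize.
            int length = Math.Min(chunkSize, distinct.Length - i);
            var chunk = new string[length];
            Array.Copy(distinct, i, chunk, 0, length);
            chunks.Add(chunk);
        }
        return chunks;
    }
}
```

Each returned array can then be used in its own Where(x => chunk.Contains(x.Id)) query. On .NET 6 and later, Enumerable.Chunk does the same splitting in a single call.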

Most efficient way to search enumerable

I am writing a small program that takes a .csv file with about 45k rows as input. I am trying to compare the contents of this file with the contents of a database table (SQL Server through Dynamics CRM using Xrm.Sdk, if it makes a difference).
My current program takes about 25 minutes to do the comparison (the file and the database are identical here, both 45k rows with no differences). All existing database records are held in a DataCollection<Entity>, which inherits from Collection<T> and implements IEnumerable<T>.
In the code below I filter using the Where method and then branch on the number of matches. The Where seems to be the bottleneck here. Is there a more efficient approach than this? I am by no means a LINQ expert.
foreach (var record in inputDataLines)
{
var fields = record.Split(',');
var fund = fields[0];
var bps = Convert.ToDecimal(fields[1]);
var withdrawalPct = Convert.ToDecimal(fields[2]);
var percentile = Convert.ToInt32(fields[3]);
var age = Convert.ToInt32(fields[4]);
var bombOutTerm = Convert.ToDecimal(fields[5]);
var matchingRows = existingRecords.Entities.Where(r => r["field_1"].ToString() == fund
&& Convert.ToDecimal(r["field_2"]) == bps
&& Convert.ToDecimal(r["field_3"]) == withdrawalPct
&& Convert.ToDecimal(r["field_4"]) == percentile
&& Convert.ToDecimal(r["field_5"]) == age);
entitiesFound.AddRange(matchingRows);
if (matchingRows.Count() == 0)
{
rowsToAdd.Add(record);
}
else if (matchingRows.Count() == 1)
{
if (Convert.ToDecimal(matchingRows.First()["field_6"]) != bombOutTerm)
{
rowsToUpdate.Add(record);
entitiesToUpdate.Add(matchingRows.First());
}
}
else
{
entitiesToDelete.AddRange(matchingRows);
rowsToAdd.Add(record);
}
}
EDIT: I can confirm that all existingRecords are in memory before this code is executed. There is no IO or DB access in the above loop.
Himbrombeere is right: you should execute the query first and put the result into a collection before you use Any, Count, AddRange or any other method that will execute the query again. In your code the query may be executed up to 5 times in every loop iteration.
Look out for the term deferred execution in the documentation. If a method is implemented that way, it can be used to construct a LINQ query (so you can chain it with other methods, and at the end you have a query). Only methods that don't use deferred execution, like Count, Any, ToList (or a plain foreach), will actually execute it. If you don't want the whole query to be executed every time and you have to access it multiple times, it's better to store the result in a collection (for example with ToList).
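A small sketch that makes the repeated execution visible by counting how often the Where predicate runs. The class and method names are made up for illustration; the data is a plain int array rather than CRM entities.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSketch
{
    // The query is never materialized, so Count() and FirstOrDefault() each re-run the predicate.
    public static int PredicateCallsWithoutToList(int[] source)
    {
        int calls = 0;
        IEnumerable<int> query = source.Where(x => { calls++; return x > 0; });
        _ = query.Count();          // enumerates the whole source
        _ = query.FirstOrDefault(); // enumerates again, until the first match
        return calls;
    }

    // ToList() runs the predicate once per element; later reads hit the list.
    public static int PredicateCallsWithToList(int[] source)
    {
        int calls = 0;
        List<int> results = source.Where(x => { calls++; return x > 0; }).ToList();
        _ = results.Count;
        _ = results.FirstOrDefault();
        return calls;
    }
}
```

For the input { 1, 2, 3 } the first method evaluates the predicate 4 times (3 for Count, 1 for FirstOrDefault) and the second only 3 times. With 45k rows and several enumerations per loop iteration, that difference adds up quickly.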
However, you could use a different approach which should be much more efficient: a Lookup<TKey, TElement>, which is similar to a dictionary and can be used with an anonymous type as key:
var lookup = existingRecords.Entities.ToLookup(r => new
{
fund = r["field_1"].ToString(),
bps = Convert.ToDecimal(r["field_2"]),
withdrawalPct = Convert.ToDecimal(r["field_3"]),
// int, not decimal: the key must have the same property types as the key built in the loop
percentile = Convert.ToInt32(r["field_4"]),
age = Convert.ToInt32(r["field_5"])
});
Now you can access this lookup in the loop very efficiently.
foreach (var record in inputDataLines)
{
var fields = record.Split(',');
var fund = fields[0];
var bps = Convert.ToDecimal(fields[1]);
var withdrawalPct = Convert.ToDecimal(fields[2]);
var percentile = Convert.ToInt32(fields[3]);
var age = Convert.ToInt32(fields[4]);
var bombOutTerm = Convert.ToDecimal(fields[5]);
var matchingRows = lookup[new {fund, bps, withdrawalPct, percentile, age}].ToList();
entitiesFound.AddRange(matchingRows);
if (matchingRows.Count() == 0)
{
rowsToAdd.Add(record);
}
else if (matchingRows.Count() == 1)
{
if (Convert.ToDecimal(matchingRows.First()["field_6"]) != bombOutTerm)
{
rowsToUpdate.Add(record);
entitiesToUpdate.Add(matchingRows.First());
}
}
else
{
entitiesToDelete.AddRange(matchingRows);
rowsToAdd.Add(record);
}
}
Note that this will work even if the key does not exist (an empty sequence is returned).
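A self-contained sketch of the same idea with a reduced key. The tuple fields (Fund, Bps, Age) and the method name are assumptions chosen for illustration; the real code would key on all five fields.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LookupSketch
{
    // Builds the lookup once, then answers each query with a hash lookup instead of a scan.
    public static List<int> CountMatchesForEach(
        IEnumerable<(string Fund, decimal Bps, int Age)> existing,
        IEnumerable<(string Fund, decimal Bps, int Age)> queries)
    {
        // Anonymous types compare by value, so they can serve as composite keys.
        var lookup = existing.ToLookup(r => new { fund = r.Fund, bps = r.Bps, age = r.Age });

        var counts = new List<int>();
        foreach (var q in queries)
        {
            // Same property names, types and order as the lookup key; a missing key yields an empty sequence.
            var matches = lookup[new { fund = q.Fund, bps = q.Bps, age = q.Age }].ToList();
            counts.Add(matches.Count);
        }
        return counts;
    }
}
```

The key built inside the loop has to use the same property names, types and order as the key passed to ToLookup. Otherwise the compiler treats it as a different anonymous type and the indexer call does not compile.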
Add a ToList call after the line ending in Convert.ToDecimal(r["field_5"]) == age) to force immediate execution of the query:
var matchingRows = existingRecords.Entities.Where(r => r["field_1"].ToString() == fund
&& Convert.ToDecimal(r["field_2"]) == bps
&& Convert.ToDecimal(r["field_3"]) == withdrawalPct
&& Convert.ToDecimal(r["field_4"]) == percentile
&& Convert.ToDecimal(r["field_5"]) == age)
.ToList();
Where doesn't actually execute your query, it only prepares it. Execution is deferred until the result is enumerated, and it happens again for every enumeration. In your case that is AddRange and then Count, which iterates the entire collection of items. If the first condition fails, the second one is checked, which means another iteration of the complete collection when calling Count. In that case you execute the query yet again when calling matchingRows.First().
When you force immediate execution, the query is executed only once, so the entire collection is iterated only once, which will decrease your overall time.
Another option, which is basically along the same lines as the other answers, is to prepare your data first, so that you're not repeatedly calling things like r["field_2"] (which are relatively slow to look up).
This is a (1) clean your data, (2) query/join your data, (3) process your data approach.
Do this:
(1)
var inputs =
inputDataLines
.Select(record =>
{
var fields = record.Split(',');
return new
{
fund = fields[0],
bps = Convert.ToDecimal(fields[1]),
withdrawalPct = Convert.ToDecimal(fields[2]),
percentile = Convert.ToInt32(fields[3]),
age = Convert.ToInt32(fields[4]),
bombOutTerm = Convert.ToDecimal(fields[5]),
record
};
})
.ToArray();
var entities =
existingRecords
.Entities
.Select(entity => new
{
fund = entity["field_1"].ToString(),
bps = Convert.ToDecimal(entity["field_2"]),
withdrawalPct = Convert.ToDecimal(entity["field_3"]),
percentile = Convert.ToInt32(entity["field_4"]),
age = Convert.ToInt32(entity["field_5"]),
bombOutTerm = Convert.ToDecimal(entity["field_6"]),
entity
})
.ToArray()
.GroupBy(x => new
{
x.fund,
x.bps,
x.withdrawalPct,
x.percentile,
x.age
}, x => new
{
x.bombOutTerm,
x.entity,
});
(2)
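var query =
from i in inputs
// group join: inputs without any matching entity are kept, with an empty matchingRows
join e in entities on new { i.fund, i.bps, i.withdrawalPct, i.percentile, i.age } equals e.Key into groups
select new { input = i, matchingRows = groups.SelectMany(g => g).ToList() };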
(3)
foreach (var x in query)
{
entitiesFound.AddRange(x.matchingRows.Select(y => y.entity));
if (x.matchingRows.Count() == 0)
{
rowsToAdd.Add(x.input.record);
}
else if (x.matchingRows.Count() == 1)
{
if (x.matchingRows.First().bombOutTerm != x.input.bombOutTerm)
{
rowsToUpdate.Add(x.input.record);
entitiesToUpdate.Add(x.matchingRows.First().entity);
}
}
else
{
entitiesToDelete.AddRange(x.matchingRows.Select(y => y.entity));
rowsToAdd.Add(x.input.record);
}
}
I would suspect that this will be among the fastest approaches presented.
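One detail worth checking in any join-based version is whether inputs without a match survive the join, since the "add" branch depends on them. A plain join in LINQ is an inner join and drops them; a group join (join ... into) keeps every input. The sketch below uses string keys and hypothetical method names to show the difference.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class GroupJoinSketch
{
    // Inner join against grouped data: inputs with no matching group disappear.
    public static int InnerJoinCount(string[] inputs, string[] existing) =>
        (from i in inputs
         join g in existing.GroupBy(x => x) on i equals g.Key
         select i).Count();

    // Group join: one result per input, with zero matches where nothing was found.
    public static List<int> MatchCountsPerInput(string[] inputs, string[] existing) =>
        (from i in inputs
         join e in existing on i equals e into matches
         select matches.Count()).ToList();
}
```

For inputs { "a", "b", "c" } and existing values { "a", "a", "c" }, the inner join yields 2 results (the "b" input is gone), while the group join yields the counts 2, 0 and 1.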

Method that fetch data from multiple databases running slow

I have a method that needs to fetch statistics from multiple databases. The key idea is that each table holds a DBName, and from it I drill down to the client's main DB by calling a stored proc with the desired database name. Finally, I drill down a second time to get data from the client's project database.
To sum it up:
I get the list of all my cloud users.
For each User I fetch his Clients by using stored proc on his main DB -> Marked as userClients.
For each Client I fetch his Statistic by using stored proc on the Clients Project DB.
It takes about 5-6 secs to execute for very little data.
public List<CloudAnalysisDTO> GetCloudAnalysisForPeriod(DateTime FromDate, DateTime ToDate)
{
var users = FindAll();
List<CloudAnalysisDTO> resultsList = new List<CloudAnalysisDTO>();
HashSet<string> userclients = new HashSet<string>();
using (var db = new ProjSQLDataContext(conn))
{
foreach (var user in users)
{
if (user.ID == 0)
continue;
var ids = string.Join(",", db.UserClients.Where(uc => uc.UserId == user.ID).Select(uc => uc.ClientId.ToString()).ToArray());
var mainDB = user.MainDB;
if (mainDB.Length == 0 || ids.Length == 0)
continue;
List<CloudAnalysisDTO> userClients =
db.ExecuteQuery<CloudAnalysisDTO>(@"EXEC CloudUsersAnalysis {0},{1}", mainDB, ids).ToList<CloudAnalysisDTO>();
List<CloudAnalysisDTO> needRemove = new List<CloudAnalysisDTO>();
foreach (var client in userClients)
{
if (!userclients.Contains(user.MainDB + client.ClientID.ToString()))
userclients.Add(user.MainDB + client.ClientID.ToString());
else
{
needRemove.Add(client);
continue;
}
ClientAnalysisDTO clientAnalysisDTO =
db.ExecuteQuery<ClientAnalysisDTO>(@"EXEC CloudClientAnalysis {0},{1},{2}", client.ProjectDB, FromDate, ToDate).SingleOrDefault<ClientAnalysisDTO>();
if (clientAnalysisDTO != null)
{
client.ClientAnalysisDTO = clientAnalysisDTO;
}
client.UserID = user.ID;
client.MainDB = user.MainDB;
}
foreach (var removeDTO in needRemove)
{
userClients.Remove(removeDTO);
}
if (userClients != null && userClients.Count > 0)
resultsList.AddRange(userClients);
}
}
return resultsList;
}
Any ideas on what I can do to improve performance?
First thing I would do is enable .NET tracing, and write a line to the tracelog before and after each call.
https://msdn.microsoft.com/en-us/library/zs6s4h68(v=vs.110).aspx
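A minimal sketch of that before-and-after tracing, using Stopwatch and Trace from System.Diagnostics. The Timed helper and its label parameter are hypothetical names, not part of any library.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Runs the call, writes a trace line before and after it, and returns the call's result.
    public static T Timed<T>(string label, Func<T> action)
    {
        Trace.WriteLine($"{label}: start");
        var stopwatch = Stopwatch.StartNew();
        try
        {
            return action();
        }
        finally
        {
            // Written even when the call throws, so slow failures are visible too.
            stopwatch.Stop();
            Trace.WriteLine($"{label}: {stopwatch.ElapsedMilliseconds} ms");
        }
    }
}
```

Each database call in the loop could then be wrapped, for example Timed("CloudUsersAnalysis", () => db.ExecuteQuery<CloudAnalysisDTO>(...).ToList()), to see which of the per-user and per-client calls accounts for the 5-6 seconds.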
This line makes me suspect that one of the queries may end up running an "in" clause, which might perform poorly:
var ids = string.Join(",", db.UserClients.Where(uc => uc.UserId == user.ID).Select(uc => uc.ClientId.ToString()).ToArray());
Once you find the weak performer (the line above is just my guess), the next step is to enable database profiling to determine where new indexes or database maintenance are needed.

Matching objects by property name and value using Linq

I need to be able to match an object to a record by matching property names and values using a single Linq query. I don't see why this shouldn't be possible, but I haven't been able to figure out how to make this work. Right now I can do it using a loop but this is slow.
Here's the scenario:
I have tables set up that store records of any given entity by putting their primary keys into an associated table with the key's property name and value.
If I have a random object at run-time, I need to be able to check whether a copy of that object exists in the database. To do that, I check whether the object has property names that match all of the keys of a record in the database (this would mean that they are the same type of object) and then whether the values for each of the keys match, which gives me the same record.
Here's how I got it to work using a loop (simplified a bit):
public IQueryable<ResultDataType> MatchingRecordFor(object entity)
{
var result = Enumerable.Empty<ResultDataType>();
var records = _context.DataBaseRecords.AsQueryable(); // IQueryable, so the filtered query can be assigned back below
var entityType = entity.GetType();
var properties = entityType.GetProperties().Where(p => p.PropertyType.Namespace == "System");
foreach (var property in properties)
{
var name = property.Name;
var value = property.GetValue(entity);
if (value != null)
{
var matchingRecords = records.Where(c => c.DataBaseRecordKeys.Any(k => k.DataBaseRecordKeyName == name && k.DataBaseRecordValue == value.ToString()));
if (matchingRecords.Count() > 0)
{
records = matchingRecords;
}
}
}
result = (from c in records
from p in c.DataBaseRecordProperties
select new ResultDataType()
{
ResultDataTypeId = c.ResultDataTypeID,
SubmitDate = c.SubmitDate,
SubmitUserId = c.SubmitUserId,
PropertyName = p.PropertyName
});
return result.AsQueryable();
}
The last statement joins a property table related to the database record with information on all of the properties.
This works well enough for a single record, but I'd like to get rid of that loop so that I can speed things up enough to work on many records.
using System.Reflection;
public IQueryable<ResultDataType> MatchingRecordFor(object entity)
{
var records = _context.DataBaseRecords;
var entityType = entity.GetType();
var properties = entityType.GetProperties().Where(p => p.PropertyType.Namespace == "System");
Func<KeyType, PropertyInfo, bool> keyMatchesProperty =
(k, p) => p.Name == k.DataBaseRecordKeyName && p.GetValue(entity).ToString() == k.DataBaseRecordValue;
var result =
from r in records
where r.DataBaseRecordKeys.All(k => properties.Any(pr => keyMatchesProperty(k, pr)))
from p in r.DataBaseRecordProperties
select new ResultDataType()
{
ResultDataTypeId = r.ResultDataTypeId,
SubmitDate = r.SubmitDate,
SubmitUserId = r.SubmitUserId,
PropertyName = p.PropertyName
};
return result.AsQueryable();
}
Hopefully I got that query language right. You'll have to benchmark it to see if it's more efficient than your original approach.
edit: This is wrong, see comments
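One plausible reason a query like the one above fails is that a Func delegate and reflection calls cannot be translated to SQL by a LINQ provider. The matching rule itself (every stored key must have a same-named property with the same value) can still be sketched in memory. The names below are hypothetical, and stored keys are modelled as plain name/value pairs rather than the DataBaseRecordKeys entity.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class KeyMatchSketch
{
    // Reads the entity's simple (System namespace) properties once, as name -> string value.
    public static Dictionary<string, string> PropertyValues(object entity) =>
        entity.GetType().GetProperties()
            .Where(p => p.PropertyType.Namespace == "System" && p.GetIndexParameters().Length == 0)
            .Select(p => new { p.Name, Value = p.GetValue(entity) })
            .Where(x => x.Value != null)
            .ToDictionary(x => x.Name, x => x.Value.ToString());

    // True when every stored key is matched by a property with the same name and value.
    public static bool AllKeysMatch(
        IEnumerable<KeyValuePair<string, string>> recordKeys,
        IReadOnlyDictionary<string, string> propertyValues) =>
        recordKeys.All(k => propertyValues.TryGetValue(k.Key, out var value) && value == k.Value);
}
```

Reading the property values once up front avoids repeating the reflection work for every record. Note that AllKeysMatch returns true for a record with no keys at all, so such records would need to be excluded separately.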

Entity Query in Query, is it possible

I'm not really sure how to ask this question. I need to create an object (I believe it is called a projection) that holds the result of one query and, based on that result, also the matching object from a second table.
This is a C# WCF Service for a Website we are building with HTML5, JS, and PhoneGap.
EDIT: Getting an error on the ToList (see code) - "The method or operation is not implemented."
EDIT3: Changing the entity object company_deployed_files to IQueryable and removing the FirstOrDefault caused a different exception: Message = "The 'Distinct' operation cannot be applied to the collection ResultType of the specified argument.\r\nParameter name: argument"
Background: This is a somewhat messed-up entity model, as it was developed for PostgreSQL and I don't have access to any tools to update the model except by hand. In addition, some design issues with the database would not allow for a great model even if we did. In other words, my two tables don't have key constraints (in the entity model) that would let me perform a join in the entity model. If someone can show me how to add them, that honestly might be the best solution, but I would need some help with that.
But getting the below code to work would be a great solution.
public List<FileIDResult> GetAllFileIDFromDeviceAndGroup ( int deviceID, int groupID)
{
List<FileIDResult> returnList = null;
using (var db = new PgContext())
{
IQueryable<FileIDResult> query = null;
if (deviceID > 0)
{
var queryForID =
from b in db.device_files
where b.device_id == deviceID
select new FileIDResult
{
file_id = b.file_id,
file_description = b.file_description,
company_deployed_files = (from o in db.company_deployed_files
where o.file_id == b.file_id
select o).FirstOrDefault(),
IsDeviceFile = true
};
if (query == null)
{
query = queryForID;
}
else
{
// unreachable: query is always null at this point
}
}
if (groupID > 0)
{
var queryForfileID =
from b in db.group_files
where b.group_id == groupID
select new FileIDResult
{
file_id = b.file_id,
file_description = b.file_description,
company_deployed_files = (from o in db.company_deployed_files
where o.file_id == b.file_id
select o).FirstOrDefault(),
IsDeviceFile = false
};
if (query != null)
{
// query is not null here: a device query exists, so combine the two
query = query.Union(queryForfileID);
}
else
{
// query is null here: only the group query applies
query = queryForfileID;
}
}
//This query.ToList(); is failing - "The method or operation is not implemented."
returnList = query.ToList ();
}
return returnList;
}
Edit 2
The ToList is throwing an exception.
I'm 98% sure it is the lines: company_deployed_files = (from o in db.company_deployed_files where o.file_id == b.file_id select o).FirstOrDefault()
End Edit 2
