Incomprehensible problem between MongoDB and Oracle (dynamic assembly) - c#

I've got a problem similar to Oracle DataAccess related: "The invoked member is not supported in a dynamic assembly."
I'm getting the same Oracle exception ("the invoked member is not supported in a dynamic assembly").
However, I don't think the Oracle version is the problem, since the program worked fine before and the Oracle version hasn't changed. The exception is related to the support of a custom Oracle data type (SdoGeometry).
[OracleCustomTypeMappingAttribute("MDSYS.SDO_GEOMETRY")]
public class SdoGeometry : OracleCustomTypeBase<SdoGeometry> { ... }
Note: this worked well until now.
The program reads some data from an Oracle database, computes on it, and stores the results in a MongoDB database. Since some recent development on the MongoDB side (what's the connection? I'm getting there), I'm getting the exception in some cases.
So, the program works as follows:
1. If there is data in MongoDB, the program checks it
2. The program selects data in Oracle and prepares it
3. It stores data in MongoDB
The (sometimes) failing operation is in step 2: it consists of adding the column type names to the output data, like this:
private void ReadTypes(Dictionary<string, string> types, DbDataReader reader)
{
    if (types.Count == 0)
    {
        int fieldCount = reader.FieldCount;
        for (int i = 0; i < fieldCount; i++)
        {
            string fieldName = reader.GetName(i);
            try
            {
                string fieldType = reader.GetFieldType(i).Name;
                types[fieldName] = fieldType;
            }
            catch (Exception e)
            {
                // "the invoked member is not supported in a dynamic assembly"
                // only thrown for the SdoGeometry type
            }
        }
    }
}
OracleDataReader.GetFieldType(i) fails for the SdoGeometry type (our custom data type) only when step 1 is executed (i.e., when some MongoDB operations run first). I've identified the responsible operation; it's:
mongoEvents = mongoEvents_.Find(e => e.Identifier.Table.Equals(table)).ToList();
(By moving an if-true-return block from line to line, in a part of the application that I did not show, I identified that it was this operation that produced the error.)
This operation extracts from Mongo the data already archived for the current Oracle table, using the MongoDB API (IMongoCollection.Find). So:
If I comment out this line and return an empty list (or a list with a manually inserted object), there is no more exception. Well...
But what is strange is this:
// I've replaced the previous statement.
// It works: no Mongo data is returned, but that is independent of step 2,
// which in any case retrieves its data from the Oracle database.
// (MongoDataEvent is one of our classes which defines the structure of the archived data.)
mongoEvents = new List<MongoEvent.MongoDataEvent>();
Okay, but if, instead of that, I add this statement after the previous one:
mongoEvents = mongoEvents_.Find(e => e.Identifier.Table.Equals(table)).ToList();
mongoEvents = new List<MongoEvent.MongoDataEvent>();
Okay, it's useless, but when I empty the list after performing the Find method, the exception appears again (not while calling Find, but later, while calling GetFieldType), even though the list is empty.
So... I have no idea what's going on. Any ideas? Thanks.

I found a solution!
I simply added, as the first statement of my program, a dummy Oracle query that selects an SdoGeometry-typed field from an arbitrarily chosen table.
I think this forces the SdoGeometry type to load.
So, as I understand it, the bug was happening because the MongoDB data was queried before the Oracle SdoGeometry type was loaded; there was some problem with assembly loading (?).
Still, this solution works perfectly! Every time I run it, it works, and as soon as I comment out the query forcing SdoGeometry to load, the error occurs again. Additionally, the MongoDB data looks correct.
I don't understand all the details, but it works!
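As a rough illustration of that workaround, here is a minimal sketch of what such a warm-up query might look like with ODP.NET; the connection string variable and the MY_TABLE / GEOM names are placeholders, not the actual schema from the question:
using Oracle.DataAccess.Client;

// Hypothetical warm-up query: selecting an SDO_GEOMETRY column once at startup
// forces the SdoGeometry custom type mapping to be resolved before any MongoDB
// work happens. Table and column names are placeholders.
using (var connection = new OracleConnection(oracleConnectionString))
using (var command = new OracleCommand("SELECT GEOM FROM MY_TABLE WHERE ROWNUM = 1", connection))
{
    connection.Open();
    using (var reader = command.ExecuteReader())
    {
        if (reader.Read())
        {
            var dummy = reader.GetValue(0); // value is discarded; only the type load matters
        }
    }
}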

Related

Why does calling SetDataBinding on a VSTO ListObject result in an exception (0x800A03EC) at runtime?

I am attempting to build a DataTable based on an Excel ListObject at runtime from a VSTO add-in. The basic approach is that I have a UDF that watches for a reference to the ListObject and, on encountering it, essentially takes ownership of that object by building a corresponding strongly typed DataTable, binding the table to the ListObject, and operating on the table. The result should be that I can then operate on the DataTable in the future without first confirming whether it actually reflects the data in Excel, and the user can tweak values as needed.
What I'm finding is that the code below throws an exception with code 0x800A03EC (and no explanation) on the call to SetDataBinding. Does anyone know why this is?
// Using statements mapping the interop namespace to Excel, and the VSTO one (Excel.Tools) to VSTO.
// Called from an ExcelDNA add-in...
public object MyUDF(params object[] args)
{
    if (!_DataDict.ContainsKey(args[0]))
    {
        Excel.ListObject listObject = FindListObject(args[0]); // Finds the appropriate object
        VSTO.ListObject vstoListObject = Globals.Factory.GetVstoObject(listObject);
        DataTable data = ReadListObjectToTable(listObject);
        vstoListObject.SetDataBinding(data); // Exception here
        _DataDict.Add(args[0], data);
    }
    // Retrieve data by key and operate on it.
}
If I skip data binding, then everything works, but the data stays unbound: changes made in the worksheet aren't reflected in the table. If I insist on binding, then I get an exception. What is wrong?
SetDataBinding seems to make calls against Excel that are forbidden during the process of executing a calculation (but perfectly legal at other times).
Therefore, instead of performing the binding during the calculation, use a self-detaching event handler on Application.SheetSelectionChange, and perform the binding there. I.e.:
Excel.AppEvents_SheetSelectionChangeEventHandler bindToData = null;
bindToData = (s, e) =>
{
    xlApp.SheetSelectionChange -= bindToData;
    vstoList.SetDataBinding(clone);
};
xlApp.SheetSelectionChange += bindToData;
The result is that everything binds up neatly.

Linq to SQL and using ToList, how many DB calls are made?

I've got a function like this:
public List<Code> GetCodes()
{
    return _db.Codes.Select(p => p).ToList();
}
I have 160,000 records in this table, which contains 2 columns.
I then retrieve the list as follows:
List<Code> CodeLastDigits = CodeDataProvider.GetCodes();
And then loop through it:
foreach (var postCodeLastDigit in PostCodeLastDigits)
I am just trying to understand how many times a call is made to the database to retrieve those records, and want to make sure it only happens once.
LINQ will delay the call to the database until you give it a reason to go.
In your case, you are giving it a reason with the ToList() method.
Once you call ToList() you will have all the items in memory and won't be hitting the database again.
You didn't mention which DB platform you are using, but if it is SQL Server, you can use SQL Server Profiler to watch your queries to the database. This is a good way to see how many calls are made and what SQL is being run by LINQ to SQL. As @acfrancis notes below, LINQPad can also do this.
For SQL Server, here is a good tutorial on creating a trace.
When you call ToList(), it's going to hit the database. In this case, it appears you'll just be hitting the database once to populate your CodeLastDigits. As long as you aren't hitting the database again in your foreach, you should be good.
As long as you have the full version of SQL Server, you can run SQL Server Profiler while going through your code to see what's happening on the database.
Probably once. But the more complete answer is that it depends, and you should be familiar with the case where even a simple access pattern like this one can result in many, many round trips to the database.
It's not likely a table named codes contains any complex types, but if it did, you'd want to watch out. Depending on how you access the properties of a code object, you could incur extra hits to the database if you don't use LoadWith properly.
Consider an example where you have a Code object, which contains a CodeType object (also mapped to a table) in a class structure like this:
class Code {
    CodeType type;
}
If you don't load CodeType objects together with Code objects, LINQ to SQL will contact the database on every iteration of your loop if CodeType is referenced inside the loop, because it lazy-loads the object only when needed. Watch out for this, and do some research on the LoadWith<> method and its use to make sure you're not running into this situation.
foreach (var x in PostCodeDigits)
{
    Console.WriteLine(x.type); // each access to x.type can trigger an extra query if it is lazy-loaded
}
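As a rough sketch of the eager-loading approach, assuming a LINQ to SQL DataContext named db and that the Code class exposes its CodeType as an association property (the names here are illustrative, not from the original code):
using System.Data.Linq;

// Ask the DataContext to fetch CodeType rows together with Code rows, so
// iterating the results does not fire one extra query per Code instance.
var options = new DataLoadOptions();
options.LoadWith<Code>(c => c.CodeType);
db.LoadOptions = options;   // must be assigned before the first query on this context

var codes = db.Codes.ToList();          // single round trip, CodeType included
foreach (var code in codes)
{
    Console.WriteLine(code.CodeType);   // no additional database hits here
}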

db4o does not delete the record

Good day! I'm trying db4o and I'm facing this problem: I cannot delete records:
using (IObjectServer server = Db4oClientServer.OpenServer(HttpContext.Current.Server.MapPath("~/transfers.data"), 0))
{
    using (IObjectContainer client = server.OpenClient())
    {
        var keyValuePair = (from KeyValuePair<DateTime, Transfer> d in client
                            where d.Key < DateTime.Now.AddHours(-3)
                            select d);
        client.Delete(keyValuePair.First());
        client.Commit();
    }
}
After this code runs, the number of objects (KeyValuePair<DateTime, Transfer>) in the database is unchanged.
This will not work! The reason is that a KeyValuePair is a value type, which means it has no identity. However, db4o manages objects by their identity! Now C# happily boxes any value type to an object, but that's useless for db4o, since it won't find any object with the given identity in the database.
You ran into an annoying corner case between .NET and db4o behavior. Basically there is no nice workaround for this, especially since db4o doesn't have an API to delete an object by its internal id =(.
For the future: don't store KeyValuePairs (or any struct) on their own, only as part of another object (and use db4o 8.1, which has a bug fix for structs never being deleted). That avoids this issue.
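As a rough sketch of the "store it as part of another object" advice, assuming a hypothetical TransferRecord wrapper class (not part of the original code):
using Db4objects.Db4o;
using Db4objects.Db4o.Linq;

// Hypothetical reference-type wrapper: unlike KeyValuePair, instances of this
// class have identity, so db4o can find and delete the stored object.
public class TransferRecord
{
    public DateTime Timestamp { get; set; }
    public Transfer Transfer { get; set; }
}

// Later, deleting works because the retrieved object has an identity db4o knows.
using (IObjectContainer client = server.OpenClient())
{
    var old = (from TransferRecord r in client
               where r.Timestamp < DateTime.Now.AddHours(-3)
               select r).FirstOrDefault();
    if (old != null)
    {
        client.Delete(old);
        client.Commit();
    }
}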

new objects added during long loop

We currently have a production application that runs as a Windows service. Many times this application will end up in a loop that can take several hours to complete. We are using Entity Framework for .NET 4.0 for our data access.
I'm looking for confirmation that if we load new data into the system after this loop is initialized, it will not result in items being added to the loop itself. When the loop is initialized, we are looking for data "as of" that moment. Although I'm relatively certain that this will work exactly like using ADO and looping over the data (the loop only cycles through data that was present at the time of initialization), I am looking for confirmation for my co-workers.
Thanks in advance for your help.
// Update: here's some sample code in C#. The question is the same: will the enumeration change if new items are added to the table that EF is querying?
IEnumerable<myobject> myobjects = (from o in db.theobjects where o.id == myID select o);
foreach (myobject obj in myobjects)
{
    //perform action on obj here
}
It depends on your precise implementation.
Once a query has been executed against the database then the results of the query will not change (assuming you aren't using lazy loading). To ensure this you can dispose of the context after retrieving query results--this effectively "cuts the cord" between the retrieved data and that database.
Lazy loading can result in a mix of "initial" and "new" data; however once the data has been retrieved it will become a fixed snapshot and not susceptible to updates.
You mention this is a long running process; which implies that there may be a very large amount of data involved. If you aren't able to fully retrieve all data to be processed (due to memory limitations, or other bottlenecks) then you likely can't ensure that you are working against the original data. The results are not fixed until a query is executed, and any updates prior to query execution will appear in results.
I think your best bet is to change the logic of your application so that when the "loop" logic is deciding whether to do another iteration or exit, you take the opportunity to load the newly added items into the list. See the pseudo code below:
var repo = new Repository();
while (repo.HasMoreItemsToProcess())
{
    var entity = repo.GetNextItem();
}
Let me know if this makes sense.
The easiest way to assure that this happens - if the data itself isn't too big - is to convert the data you retrieve from the database to a List<>, e.g., something like this (pulled at random from my current project):
var sessionIds = room.Sessions.Select(s => s.SessionId).ToList();
And then iterate through the list, not through the IEnumerable<> that would otherwise be returned. Converting it to a list triggers the enumeration, and then throws all the results into memory.
If there's too much data to fit into memory, and you need to stick with an IEnumerable<>, then the answer to your question depends on various database and connection settings.
I'd take a snapshot of the IDs to be processed (quickly and as a transaction), then work that list in the fashion you're doing today.
In addition to accomplishing the goal of not changing the sample mid-stream, this also gives you the ability to extend your solution to track status on each item as it's processed. For a long-running process, this can be very helpful for progress reporting, restart / retry capabilities, etc.
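A minimal sketch of that snapshot-of-IDs approach, assuming a hypothetical MyDbContext with a TheObjects set and an Id key (all names here are illustrative, not from the original code):
// 1. Snapshot the IDs up front; ToList() executes the query immediately,
//    so rows added to the table later never join the work list.
List<int> idsToProcess;
using (var db = new MyDbContext())
{
    idsToProcess = db.TheObjects
                     .Where(o => o.GroupId == myID)
                     .Select(o => o.Id)
                     .ToList();
}

// 2. Work through the fixed list, loading each item fresh as it is processed.
foreach (var id in idsToProcess)
{
    using (var db = new MyDbContext())
    {
        var item = db.TheObjects.FirstOrDefault(o => o.Id == id);
        if (item == null) continue;   // the row was deleted since the snapshot
        // perform the action on item here, update its status, etc.
        db.SaveChanges();
    }
}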

How do I speed up DbSet.Add()?

I have to import about 30k rows from a CSV file into my SQL database; sadly, this takes 20 minutes.
Troubleshooting with a profiler shows me that DbSet.Add is taking the most time, but why?
I have these Entity Framework Code-First classes:
public class Article
{
    // About 20 properties, each property doesn't store excessive amounts of data
}

public class Database : DbContext
{
    public DbSet<Article> Articles { get; set; }
}
For each item in my for loop I do:
db.Articles.Add(article);
Outside the for loop I do:
db.SaveChanges();
It's connected to my local SQL Express server, but I guess nothing is written until SaveChanges is called, so the server probably isn't the problem...
As per Kevin Ramen's comment (Mar 29):
I can confirm that setting db.Configuration.AutoDetectChangesEnabled = false makes a huge difference in speed.
Running Add() on 2,324 items took 3 min 15 sec on my machine by default; disabling the auto-detection resulted in the operation completing in 0.5 sec.
http://blog.larud.net/archive/2011/07/12/bulk-load-items-to-a-ef-4-1-code-first-aspx
I'm going to add to Kervin Ramen's comment by saying that if you are only doing inserts (no updates or deletes) then you can, in general, safely set the following properties before doing any inserts on the context:
DbContext.Configuration.AutoDetectChangesEnabled = false;
DbContext.Configuration.ValidateOnSaveEnabled = false;
I was having a problem with a once-off bulk import at my work. Without setting the above properties, adding about 7500 complicated objects to the context was taking over 30 minutes. Setting the above properties (so disabling EF checks and change tracking) reduced the import down to seconds.
But, again, I stress only use this if you are doing inserts. If you need to mix inserts with updates/deletes you can split your code into two paths and disable the EF checks for the insert part and then re-enable the checks for the update/delete path. I have used this approach successfully to get around the slow DbSet.Add() behaviour.
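A rough sketch of that insert-only fast path, with the checks restored afterwards (the finally block and the articlesToImport collection are illustrative additions, not code from the question):
using (var db = new Database())
{
    // Speed up a pure-insert batch by turning off change detection and validation.
    db.Configuration.AutoDetectChangesEnabled = false;
    db.Configuration.ValidateOnSaveEnabled = false;
    try
    {
        foreach (var article in articlesToImport)
        {
            db.Articles.Add(article);
        }
        db.SaveChanges();
    }
    finally
    {
        // Re-enable the checks before doing any updates/deletes on this context.
        db.Configuration.AutoDetectChangesEnabled = true;
        db.Configuration.ValidateOnSaveEnabled = true;
    }
}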
Each item in a unit-of-work has overhead, as it must check (and update) the identity manager, add to various collections, etc.
The first thing I would try is batching into, say, groups of 500 (change that number to suit), starting with a fresh (new) object context each time; otherwise you can reasonably expect telescoping performance. Breaking it into batches also prevents a megalithic transaction bringing everything to a stop.
Beyond that: SqlBulkCopy. It is designed for large imports with minimal overhead. It isn't EF, though.
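A minimal sketch of that batching idea, reusing the Database context from the question (the batch size and the articlesToImport collection are placeholders):
const int batchSize = 500;

// Process in chunks, using a fresh context per batch so the change tracker
// never grows large enough to slow every subsequent Add() down.
foreach (var batch in articlesToImport
    .Select((article, index) => new { article, index })
    .GroupBy(x => x.index / batchSize, x => x.article))
{
    using (var db = new Database())
    {
        foreach (var article in batch)
        {
            db.Articles.Add(article);
        }
        db.SaveChanges();   // one transaction per batch instead of one megalithic one
    }
}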
There is an extremely easy to use and very fast extension here:
https://efbulkinsert.codeplex.com/
It's called "Entity Framework Bulk Insert".
The extension itself is in the namespace EntityFramework.BulkInsert.Extensions, so to make the extension method visible, add a using directive:
using EntityFramework.BulkInsert.Extensions;
And then you can do this:
context.BulkInsert(entities);
BTW, if you do not wish to use this extension for some reason, you could also try, instead of running db.Articles.Add(article) for each article, building a list of several articles each time and then using AddRange (new in EF version 6, along with RemoveRange) to add them to the DbContext together.
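A brief sketch of that AddRange variant (EF 6), again with articlesToImport as a placeholder collection:
using (var db = new Database())
{
    // AddRange marks all entities as Added in one pass, which avoids the
    // per-call change-detection cost of calling Add() thousands of times.
    db.Articles.AddRange(articlesToImport);
    db.SaveChanges();
}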
I haven't really tried this, but my approach would be to use the ODBC driver to load the file into a DataTable and then pass the table to a SQL stored procedure as a table-valued parameter.
For the first part, try:
http://www.c-sharpcorner.com/UploadFile/mahesh/AccessTextDb12052005071306AM/AccessTextDb.aspx
For the second part try this for SQL procedure:
http://www.builderau.com.au/program/sqlserver/soa/Passing-table-valued-parameters-in-SQL-Server-2008/0,339028455,339282577,00.htm
Then create a SqlCommand object in C# and add to its Parameters collection a SqlParameter whose SqlDbType is SqlDbType.Structured.
Well, I hope it helps.
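For the SqlCommand part, a minimal sketch might look like this (the procedure name dbo.ImportArticles, the table type dbo.ArticleTableType, and the articleTable / connectionString variables are assumptions for illustration):
using System.Data;
using System.Data.SqlClient;

using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand("dbo.ImportArticles", connection))
{
    command.CommandType = CommandType.StoredProcedure;

    // articleTable is the DataTable previously loaded from the CSV file via the ODBC driver.
    var parameter = command.Parameters.AddWithValue("@Articles", articleTable);
    parameter.SqlDbType = SqlDbType.Structured;
    parameter.TypeName = "dbo.ArticleTableType";   // user-defined table type on the server

    connection.Open();
    command.ExecuteNonQuery();
}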
