EF and Async - Weird live scenario - c#

I'm implementing async all over my cloud-based project.
I'm now trying to figure out why my TransactionScope keeps crashing randomly. The messages are "Cannot perform this operation because there are pending operations", "The operation is not valid for the state of the transaction", and other similar errors.
I say that it crashes randomly, because if you retry the operation, it works eventually...
At first I implemented the TransactionScopeAsyncFlowOption.Enabled overload... the fail ratio decreased.
Then I made the whole operation use the same DbContext (the previous developers created a new one for each CRUD operation: selecting a user? new context for you! Now you want the sales of that user? Let me get that using a new context! Create a new sale? Let's do that on this new context here... and so on). The fail ratio declined even further.
Then I decided to await as soon as possible (previously I was firing some queries at the start of the operation and only awaiting right before using the result). That significantly improved the fail ratio.
Now I got a message in my logs that signaled an FK mismatch... That's really weird because this is a very solid app and getting FK logic wrong is a very basic mistake. Looking at the log, I see something like "Error for client CLIENT_A. Complete message: (bla bla bla) The conflict occurred in database db_CLIENT_B"!!!
In my multi-tenant app, each tenant has its own database, so CLIENT_A should have problems only with db_CLIENT_A. We are very meticulous about this.
That is a really serious problem. It means that either the Unity container is giving out the wrong instance of DbContext (it's configured for a single instance per request) or there's a serious problem going on regarding async/await and parallel, distinct operations... I think it could be a mix, taking into account that DbContext is not thread-safe and neither is Resolve, although Resolve is only getting called once (the resolve for DbContext happens very early in the pipeline).
So my question is: what can I do to figure this out?
PS: In the last 7 days, I have 5 logs for this. The switching might have happened more times, but if the other database has a compatible FK... well, I will hear about that in a couple of days when managers start running financial reports...

This is caused by Unity. This happens when I call 'Resolve' within an async scope.
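For anyone hitting the same symptoms, here is a minimal sketch of the two mitigations described in the question (one DbContext per request, and a transaction scope that flows across awaits). MyDbContext, Users and SaleService are placeholder names, and the Unity namespaces depend on the Unity version in use:

using System;
using System.Threading.Tasks;
using System.Transactions;
using Unity;            // Unity 5+; older versions use Microsoft.Practices.Unity
using Unity.Lifetime;

public class SaleService
{
    private readonly MyDbContext _db;   // placeholder for the real context type

    public SaleService(MyDbContext db)
    {
        _db = db;
    }

    // One DbContext per resolution scope/request instead of one per CRUD call.
    public static void Register(IUnityContainer container)
    {
        container.RegisterType<MyDbContext>(new HierarchicalLifetimeManager());
        // With the Unity ASP.NET bootstrapper packages, PerRequestLifetimeManager is another option.
    }

    public async Task CreateSaleAsync(int userId)
    {
        // AsyncFlowOption.Enabled lets the ambient transaction flow across awaits.
        using (var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
        {
            var user = await _db.Users.FindAsync(userId);   // await right away rather than later
            // ... create the sale on the same _db context ...
            await _db.SaveChangesAsync();
            scope.Complete();
        }
    }
}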

Related

Terminating a .net-core WebApp if there is a critical error

In order to make deployment easier, I want my application to create pre-defined roles in the database on its own.
If for some reason it is the first time running against the current database but it fails to create the roles (e.g. there was a connection error), the application immediately terminates with an Environment.Exit(-1);.
Is this a bad practice? If yes, why? What are my alternatives?
EDIT for clarity: I'm logging every exception/error with log4net.
Yes, it is bad practice; however, it depends on who uses this. Are you the only user, or are other people going to be disturbed by a force quit with no explanation?
Showing an alert/message with a little explanation or an error code would be useful in a lot of cases. Log the error you get from the database if possible, and if the users of your app know what to do with it, you can also share it in the alert/message.
As long as you know why the application terminated, it could be fine. If this is in production, you should also get an alert when it happens.
At the very least, you should log the error. For a DB connection, you can add logic to try to reconnect either for some time or forever depending on your requirements. Depending on the error you get back from the database, you can either try again or not.
In my applications, I take the approach of retrying forever, but I make sure to log any errors I get back, such as bad credentials or timeouts. It is also good to have a third system monitoring for errors to alert you.
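As a rough illustration of that retry-and-log approach (instead of Environment.Exit), something like the following could work. The RoleSeeder class, the createRoles delegate and the fixed 5-second delay are assumptions for the sketch, not a prescribed implementation:

// Sketch of a retry-forever startup seeder that logs every failure with log4net.
using System;
using System.Threading.Tasks;
using log4net;

public class RoleSeeder
{
    private static readonly ILog Log = LogManager.GetLogger(typeof(RoleSeeder));

    public async Task SeedRolesAsync(Func<Task> createRoles)
    {
        var delay = TimeSpan.FromSeconds(5);
        while (true)
        {
            try
            {
                await createRoles();   // e.g. insert the pre-defined roles if missing
                Log.Info("Pre-defined roles created (or already present).");
                return;
            }
            catch (Exception ex)       // connection errors, bad credentials, timeouts...
            {
                Log.Error("Failed to create roles; retrying in " + delay, ex);
                await Task.Delay(delay);
                // Optionally back off, or give up and alert after N attempts.
            }
        }
    }
}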

ASP.NET Core Return HTTP Response And Continue Background Worker With Same Context

Sorry ahead of time, this is a bit of a lengthy setup/question. I am currently working on an API using C# ASP.NET Core 2.1. I have a POST endpoint which takes about 5-10 seconds to execute (which is fine). I need to add functionality which could take a considerable amount of time to execute; in my current load testing it takes an additional 3 minutes. To be honest, production could take a bit longer because I can't really get a good answer as to how many of these things we can expect to have to process. From a UX perspective, it is not acceptable to wait this long while the front end is blocked on the existing POST request, so the response needs to be returned before this extra work completes in order to maintain an acceptable UX.
All services are set up as transient using the default ASP.NET Core DI container. This application is using EF Core, which is set up in the same fashion as the services (sorry, I am not at work right now and forget the exact verbiage within the Startup file).
I first tried to just create a background worker, but after the response was sent to the client, internal objects would start to be disposed (i.e. the entity DbContext) and it would eventually throw errors when continuing to execute code using said context (which makes sense, since they were being disposed).
I was able to get a background worker mostly working by using the injected IServiceScopeFactory (default ASP.NET Core implementation). All my code executes successfully until I try saving to the DB. We have overridden the SaveChangesAsync() method so that it automatically updates the CreatedByName, CreatedTimestamp, UpdatedByName, and UpdatedTimestamp properties on the currently tracked entities. Since this logic runs on an object created from the IServiceScopeFactory, it does not share the same HttpContext and therefore does not populate CreatedByName and UpdatedByName correctly (it tries to set these to null, but the DB columns do not accept null).
Right before I left work, I created something that seemed to work, but it feels very dirty. Instead of using the IServiceScopeFactory within my background worker to create a new scope, I created an impersonated request using the WebClient object, pointed at an endpoint within the same API that was currently executing. This did allow the response to be sent back to the client in a timely manner, and it did continue executing the new functionality on the server (updating my entities correctly).
I apologize, I am not currently at work and cannot provide code examples at this moment, but if it is required in order to fully answer this post, I will put some on later.
Ideally, I would like to be able to start my request, process the logic within the existing POST, send the response back to the client, and continue executing the new functionality using the same context (including the HttpContext, which contains identity information). My question is: can this be done without creating an impersonated request? Can this be accomplished with a background worker using the same context as the original thread (I know that sounds a bit weird)? Is there another approach that I am completely missing? Thanks ahead of time.
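From memory, the scoped background attempt looks roughly like the sketch below (OrderDto, _orderService and MyDbContext are approximations, not the actual code); the comment marks where the overridden SaveChangesAsync falls over because HttpContext is gone:

// Rough reconstruction of the IServiceScopeFactory approach described above.
// (GetRequiredService needs Microsoft.Extensions.DependencyInjection.)
[HttpPost]
public async Task<IActionResult> Post(OrderDto dto)
{
    var result = await _orderService.ProcessAsync(dto);   // existing 5-10 second work

    _ = Task.Run(async () =>
    {
        // New scope so the scoped/transient services are not disposed with the request.
        using (var scope = _serviceScopeFactory.CreateScope())
        {
            var db = scope.ServiceProvider.GetRequiredService<MyDbContext>();
            // ... the extra ~3 minutes of processing ...
            await db.SaveChangesAsync();   // fails: the audit-field override reads
                                           // HttpContext, which is null on this thread
        }
    });

    return Ok(result);   // response goes back to the client right away
}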
Look into Hangfire; it's a pretty easy-to-use library for background tasks.
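A minimal sketch of what that could look like here, assuming the Hangfire.AspNetCore package is wired up in Startup; ISlowFollowUpJob, OrderDto and the argument names are placeholders. The key point is that the user name is captured from HttpContext up front and passed to the job as plain data, so the audit fields no longer depend on HttpContext:

// In Startup (sketch): services.AddHangfire(cfg => cfg.UseSqlServerStorage(connString));
//                      app.UseHangfireServer();

[HttpPost]
public async Task<IActionResult> Post(OrderDto dto)
{
    var result = await _orderService.ProcessAsync(dto);      // existing 5-10 second work

    var userName = User.Identity?.Name;                       // capture identity now
    BackgroundJob.Enqueue<ISlowFollowUpJob>(j => j.RunAsync(result.Id, userName));

    return Ok(result);                                        // respond immediately
}

public class SlowFollowUpJob : ISlowFollowUpJob
{
    private readonly MyDbContext _db;                          // resolved by Hangfire from DI

    public SlowFollowUpJob(MyDbContext db) { _db = db; }

    public async Task RunAsync(int orderId, string userName)
    {
        // ... the long-running processing ...
        // Audit fields are set from the userName argument instead of HttpContext.
        await _db.SaveChangesAsync();
    }
}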

How can I prevent 'System.Transactions.TransactionException' error when using NServiceBus

My program makes use of NServiceBus as the service bus.
Now, when I run a certain part of my program, it fires a command to initiate a process. This process involves data lookup from a database (with a lot of data) by 3 separate handlers (classes) in the program, so in some sense they are happening in parallel: these 3 classes each receive and handle the same command, then start their work.
Searching through similar posts on Stack Overflow, I've come across a number of suggestions, including increasing the transaction timeout in both the app config and machine.config. I've done this to no avail.
This post made me realise it could be an issue with NServiceBus and MSDTC.
I've also attached the Visual Studio debugger to the program's process and witnessed the exception taking place at every point where I'm querying a repository class - which queries the database.
System.Transactions.TransactionException occurred
  HResult=-2146233087
  Message=The operation is not valid for the state of the transaction.
  Source=System.Transactions
  StackTrace:
     at System.Transactions.TransactionState.EnlistVolatile(InternalTransaction tx, IEnlistmentNotification enlistmentNotification, EnlistmentOptions enlistmentOptions, Transaction atomicTransaction)
  InnerException:
I'm tempted to just put a try/catch everywhere, but that's me getting desperate, and I'd be ignoring a lot of data.
Please, any ideas?
All responses will be appreciated.
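For reference, the timeout change I tried can also be set in code when creating the scope; this is only a sketch of that idea (the isolation level and duration are arbitrary examples), and whatever value is passed is still capped by machine.config's maxTimeout:

// Sketch: setting an explicit transaction timeout on the ambient transaction.
using System;
using System.Transactions;

var options = new TransactionOptions
{
    IsolationLevel = IsolationLevel.ReadCommitted,
    Timeout = TimeSpan.FromMinutes(5)   // still capped by machine.config maxTimeout
};

using (var scope = new TransactionScope(TransactionScopeOption.Required, options))
{
    // ... repository queries that enlist in the ambient transaction ...
    scope.Complete();
}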

The Mystery of the Vanishing EF Call

Today I got an emergency call from the users on our ASP.NET production system. Some users (not all) were unable to enter certain data. The user posted the data, and the system then froze; the call never returned.
We tried to repro the problem on the QA system (which has a fresh restore of production data), and could not. I then ran from my dev environment and connected directly to the production DB, masquerading as one of the affected users. Again, no problem. Conclusion: there must be some kind of issue in the production environment, probably somewhere in the IIS process that's hosting the website.
So I fired up Visual Studio on the production server, and attached to the IIS process (Kids, don't do this at home!), set a breakpoint in the offending code, logged in as the user, and attempted to save the data. Hit the breakpoint and stepped line by line, until I hit a line of code like this:
try
{
    ...
    using (var db = new MyDataContext())
    {
        ...
        var fooToUpdate = db.Foos.Single(f => f.ID == fooId); // <-- THIS LINE
        ...
    }
}
catch (Exception ex)
{
    // some error logging
}
After hitting "step" on that line, the thread simply vanished. Disappeared without a trace. I put a sniffer on the database, and no query was fired; needless to say there was no DB locking involved. No exception was thrown. The code entered Entity Framework and never left.
The way the data is set up, every user has a different, unique fooId for every day, so no other user will have the same fooId. Most users were able to load their Foo, but a select handful of users failed consistently to load their personal Foo. I tried running the query to load the Foo in an SSMS window; no trouble at all. The only time it fails is in this particular IIS process, on the production server.
Now, I could just recycle the app pool or restart IIS, and that would probably paper over the problem. But something similar happened a week ago, and we couldn't trace it then, either. So we reset IIS then, hoping the problem would go away. And it did, for a week. And now it's back.
Does anyone have any ideas how it is possible for a thread to simply vaporize like this? Is Norman Bates hiding behind the EF door?
Given the fact that the thread did not magically vaporize, we can speculate about some of the more likely options:
The debugger had a hard time following the production code compiled in Release mode. Just because debugging Release code works 90% of the time, don't fall under the illusion that it is dependable. Optimized code can very quickly throw the debugger off the track of actual execution. When this happens, it will look like the thread just vanished.
Assuming the thread does legitimately enter the call and not return (which seems to be supported by the original complaint of the application "freezing"), the most likely scenario is a deadlock of some type. Entity Framework deadlocks are not common, but not unheard of either. The most common issues I'm aware of usually involve TransactionScope or CommittableTransaction. Are you using any transactions in the omitted code sections?
Turns out that the EF part was a red herring after all. I went and downloaded Telerik's JustDecompile and JustCode, in the hope of stepping into the EF code, but when I stepped into that line, I found myself not in the Single() extension method, but inside one of my own method calls - one that I thought had executed on the previous line. Evidently my code was not perfectly in sync with the version in production.
LESSON 1: If you attach to a process, your execution point may not be where you think it is, if your code is not identical to the code that was compiled into that process.
So anyway, now that I could step into the code without decompiling anything, the first thing I noticed was:
lock (_lockObj)
{
    ...
}
And when I tried to step into it, it froze there. Smoking gun.
So somewhere, some other thread is locking this object. I looked at other places where the lock is taken, which led to a spaghetti of dependencies, along with another code-locked segment containing several DB calls and even a transaction boundary. It could be a code lock / DB transaction deadlock, though a brief scan of the code inside the DB transaction failed to pick up any contenders for blocking anything else within the life of the transaction. Plus, there's the evidence of the DB not showing any blocking or open transactions. Rather, it may just be the fact of a few hundred queued-up long-running processes, all inside code locks inside code locks, until in the end it all looks something like the West Side Highway at 17:05 on a Friday, with a jackknifed trailer truck lying across 3 lanes approaching the GW bridge.
LESSON 2: Code locks are dangerous, not only - but especially - when used in conjunction with DB transactions. Try to find ways to make your code thread safe without using code locks. And if you really must use code locks, make sure you get in and out as quickly as possible. Don't give your thread a magazine to read while it's occupying the only stall, so to speak.
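A minimal sketch of that advice, assuming the shared state being protected is a simple in-memory dictionary (Foo, Touched and _cache are placeholders built on the snippet above): do the slow EF/DB work with no lock held, and keep the locked section down to the quick in-memory update.

// Sketch: keep DB work outside the lock; only guard the shared-state mutation.
private static readonly object _lockObj = new object();
private static readonly Dictionary<int, Foo> _cache = new Dictionary<int, Foo>();

public void UpdateFoo(int fooId)
{
    Foo fooToUpdate;
    using (var db = new MyDataContext())
    {
        // Slow work (queries, SaveChanges, transactions) happens with no lock held.
        fooToUpdate = db.Foos.Single(f => f.ID == fooId);
        fooToUpdate.Touched = DateTime.UtcNow;   // placeholder update
        db.SaveChanges();
    }

    lock (_lockObj)
    {
        // Only the quick in-memory update is inside the lock.
        _cache[fooId] = fooToUpdate;
    }
}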

nhibernate transaction management in asp.net WebForms

I've been doing some research on transaction management in NHibernate for ASP.NET (WebForms) applications. Most articles I found tend to favour the transaction-per-request approach. I understand session-per-request and am totally in favour of it; however, I don't exactly understand the reasoning behind transaction-per-request.
My applications use a version field to do concurrency checks. If a problem such as a StaleObjectException (or anything else) occurs, it is thrown once you call Transaction.Commit(). My issue is that if this happens at the end of the request, you cannot easily know which method actually contained the problematic code, and it is extremely difficult to recover from the issue.
For example, with a StaleObjectException, normally one just needs to re-run the method with the updated data.
Any ideas on this, and any practices for transaction management? I tend to favour opening as many transactions as possible, based on the business logic. The issue with multiple transactions is where to actually begin/end each transaction.
The transaction-per-request paradigm makes sense when you step back and think of the underlying nature of a web application. It is a request/response system, nothing more. A user submits a request for "something", which the application translates into performing an action or a series of actions, then the application sends a response to the user to indicate its current updated state.
In this model, each request for action to the application can (arguably should) be atomic. After all, from the perspective of the user, conceptually what failed when there was an error? The request failed. Upon such a failure, the application should respond in two ways:
Roll back any partial changes associated with processing the request so that the persisted data isn't left in a "partial" or "undefined" state.
Inform the user (in the response) that the request has failed in some way.
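In WebForms that usually ends up looking something like the sketch below in Global.asax; this is a rough outline only, assuming a session-per-request setup with a web current_session_context_class configured and a MySessionFactory holder (a placeholder name):

// Rough sketch of session/transaction-per-request for WebForms.
using System;
using System.Web;
using NHibernate;
using NHibernate.Context;

public class Global : HttpApplication
{
    protected void Application_BeginRequest(object sender, EventArgs e)
    {
        var session = MySessionFactory.Instance.OpenSession();   // placeholder factory holder
        session.BeginTransaction();
        CurrentSessionContext.Bind(session);
    }

    protected void Application_EndRequest(object sender, EventArgs e)
    {
        var session = CurrentSessionContext.Unbind(MySessionFactory.Instance);
        if (session == null) return;
        try
        {
            if (Server.GetLastError() == null && session.Transaction.IsActive)
                session.Transaction.Commit();    // the whole request succeeded -> commit
            else
                session.Transaction.Rollback();  // anything failed -> roll everything back
        }
        finally
        {
            session.Dispose();
        }
    }
}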
My issue is that if this happens at the end of the request, you cannot easily know which method actually contained the problematic code, and it is extremely difficult to recover from the issue.
Can you show an example with code? I wonder if the problem might be addressed by other elements of the application's design than the transaction structure. Perhaps the processing of requests isn't properly atomic or encapsulated.
(Comments indicate that the below may be more personal opinion, and that this is normal behavior for NHibernate.)
For example, with a StaleObjectException, normally one just needs to re-run the method with the updated data.
While I'm certainly no expert on NHibernate, I'm wary of what's being said here. Generally, when an exception occurs, simply "trying again" is very often not a good way to handle it.
