Random error: Attempted to read or write protected memory - c#

We have a C# .NET application using WCF services, deployed on our production server as a Windows Service. One module is responsible for creating shape files (*.shp, *.dbf) covering the smaller area where the workers will be working that day, and sending them down to a PDA.
To write the shape files, we use a third party dll, NetTopologySuite
GisSharpBlog.NetTopologySuite.IO.ShapefileWriter
which is also in C#. (I am not sure whether any DLL it references uses unmanaged code.)
The system might work fine for a while, say for a week. Then suddenly we get an exception saying
Attempted to read or write protected memory.
This is often an indication that other memory is corrupt.
from the Write method, where we write the geometry collection to shape files.
sfw.Write(FileName, new GeometryCollection(gc.ToArray()));
(GeometryCollection is also from a third party dll, GeoAPI.dll)
This error brings down the whole service and leaves it non-functional. We then just restart the service and run the same data again, and it works fine for another week until it crashes again. It happens only in production and at random times. We have not been able to find the cause of the issue.
Many forums suggest that it might be because of memory leaks in some unmanaged code. But we couldn't find which one.
We are also ready to rewrite the part that creates new shape files.
Please help me to resolve this issue.
Let me know if more details are required. Thanks in advance.

In my experience, that message was the result of a memory leak. This is what I'd do if I were in your situation, especially since you are working with a third-party DLL.
1) Monitor your WCF server and see what is going on with DLLHost.exe and the aspnet services in the Task Manager. I have a feeling that your third-party DLL has a memory leak that causes these two services to bloat and reach the limit of your server's memory. That would explain why it works for a while and then suddenly stops working.
2) Identify a good schedule for recycling your server's memory and application pool. Since the issue is recurring, you might want to do this every midnight or when no one is actively using it.
3) Write good error-logging code to know exactly what is happening at the time it bogs down. I would put the following information in the error logs: the parameters that you are passing, the user who encountered the problem, etc. This is so you will know exactly what is happening.
4) Check the Event Viewer as maybe there is some information in there that can pinpoint the problem.
5) After doing 1 through 4, I would call your third-party DLL vendor and see what they can do to help you. You might need to provide the information that you collected in items 1 through 4 above.
Good luck and I hope this will help.

I think you have some unmanaged code in the third-party libraries that is accessing an address protected by the system or used by other applications.

You have an access violation (a pointer to memory that does not belong to your application's address space, including the null address, 0x0) in one of your third-party DLLs.
Or else it may be some unmanaged COM object you're using that causes this error.

The random nature of this error suggests to me that it may be a matter of threads. Specifically, the Write method of ShapefileWriter might have been called and delayed in a thread, and then you called Close. The delayed Write method then tries to write to a closed (and protected) file, which could result in the error you see.
This is purely speculation since there's not much code to make a better guess, but I've experienced this issue using video writing libraries, so it might be the same in your case.
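One way to test the threading hypothesis is to force every write through a single lock and see whether the crashes stop. The following is only a minimal sketch: `SerializedWriter` is a hypothetical helper, and the `Action<string>` delegate is a stand-in for the real `sfw.Write(...)` call, whose thread-safety guarantees are not known from the question.

```csharp
using System;

// Hypothetical wrapper: funnels every write through one lock so that
// no two threads can be inside the underlying writer at the same time.
public sealed class SerializedWriter
{
    private readonly object _gate = new object();

    // Stand-in for the real call, e.g. name => sfw.Write(name, geometry).
    private readonly Action<string> _write;

    public SerializedWriter(Action<string> write)
    {
        _write = write ?? throw new ArgumentNullException(nameof(write));
    }

    public void Write(string fileName)
    {
        // Only one caller at a time gets past this point.
        lock (_gate)
        {
            _write(fileName);
        }
    }
}
```

If the crashes disappear with all writes serialized like this, that points at concurrent use of the writer rather than at a leak.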

Check to make sure you don't have threads within threads. That is what happened when I encountered this error. See this link for more information: Attempted to read or write protected memory. This is often an indication that other memory is corrupt

Related

Might GC.Collect() be warranted in this particular case?

Disclaimer: Yes, I know that the general answer to whether or not to use GC.Collect() is a resounding "NO!". This is the first time in several years of programming that I ever consider using it at all.
Well then, here's the situation: We have developed a C# scripting tool based on the Microsoft.CodeAnalysis.CSharp.Scripting libraries (v3.6.0). It's a Winform GUI with editor etc., not unlike others out there. We use it for the validation of integrated circuits, meaning that its primary task is interfacing lab equipment such as power supplies, pattern generators, meters and the like. For the communication to said instruments we predominantly rely on National Instrument's VISA framework, albeit not exclusively. Some devices are controlled directly via DLLs from their respective manufacturers. In general, this system is working beautifully and by now it is successfully used by quite a lot of design engineers who do not know the first thing about the intricacies of .NET and C#.
At this point I should explain that the user can simply write a method (i.e. on "top-level") and then execute it. The Roslyn-part behind this is that the input is fed to CSharpScript.Create() and then compiled. The execution of a method is done via Script.ContinueWith("method name"). Inside of such a method the user can construct an object like, say, new VISA("connection string"), which connects to the device and then communicate with the device via this object. Nothing forces him or her to care about disposing the object (i.e. closing the connection).
Now, the problem is this: recently, very sporadic crashes of the GUI application have occurred with no feedback at all from the system - the form just closes and that's it. By trial and error we are currently 99% sure that if all connection objects are explicitly disposed within a method, the crashes do not occur. So, rewriting the method to something like this fixes the problem:
using(var device = new VISA("connection string"))
{
device.Query("IDN?");
}
The reason why I look into the GC's direction at all is that there is no discernible correlation to any actions from the user. The guys might run such methods for an hour without a problem and then, when scrolling in the editor, when no method is currently being executed, the GUI closes without comment. And that's why I'd like to get some input from people more knowledgeable about Roslyn and the GC:
Are there known issues with this scripting library and GC? (I would very much assume that there aren't)
Since the explicit disposal of objects seems to prevent the issue, might this be one of the extremely rare situations where the use of GC.Collect() is warranted? (Admittedly, I could not yet test whether that also prevents the problem, since I am working from home.)
Any ideas what can cause a .NET application to crash without any kind of feedback and how to obtain more information about such a crash? (the scripting engine is a separate DLL, as are the device drivers; the GUI only handles the graphics)
I am fully aware that this is a rather vague description of the problem with very little source code. This is because the application comprises quite a lot of source code and I have no idea what might be relevant here. Also, all namespaces in the above text refer to Microsoft.CodeAnalysis.CSharp.Scripting, except for VISA, which is self-defined. Obviously, I will gladly answer any follow-up questions to get to the bottom of this.
Thanks in advance.
Short answer: No. It's not only not warranted, it's completely missing the actual issue.
Further explanation: @canton7 instantly hit the nail on the head when writing
I'd argue that your application shouldn't crash even if a finalizer does end up being called
The root issue hid inside a 3rd party DLL in form of an, at the very least, suboptimal implementation of IDisposable. Once I zoomed in on that, it was rather easy to produce a workaround for that.
My original question is so very misguided that I'd like to state the one that I should have asked:
How do I trace a crash of my C# application when my application's logging does not show anything?
This question has been answered comprehensively in a number of posts. In my case, the crash could be seen in the Windows event log.
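As an illustration of that restated question, a last-chance handler can at least record managed exceptions that escape all other logging. This is a minimal sketch, not the poster's actual fix: `CrashLog` and the log path are hypothetical names, while `AppDomain.CurrentDomain.UnhandledException` and `TaskScheduler.UnobservedTaskException` are standard .NET events. It will not see every crash (a failure inside native driver code may bypass it), so the Windows event log remains the fallback.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical last-chance logger for exceptions nothing else caught.
public static class CrashLog
{
    // Assumed location; pick whatever suits the application.
    public static string LogPath = Path.Combine(Path.GetTempPath(), "crash.log");

    // Call once, as early as possible during startup.
    public static void Install()
    {
        // Raised for unhandled exceptions on any managed thread,
        // including the finalizer thread.
        AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
            Append("Unhandled", e.ExceptionObject);

        // Raised when a faulted Task is collected without being observed.
        TaskScheduler.UnobservedTaskException += (sender, e) =>
            Append("UnobservedTask", e.Exception);
    }

    public static void Append(string kind, object error)
    {
        File.AppendAllText(
            LogPath,
            $"{DateTime.UtcNow:o} [{kind}] {error}{Environment.NewLine}");
    }
}
```

In a WinForms application, `Application.ThreadException` can be hooked in the same way for exceptions raised on the UI thread.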

COM Add-in: Resolve the error DisconnectedContext in WinWord.exe

I built an add-on to Microsoft Word. When the user clicks a button, it runs a number of processes that export a list of Microsoft Word documents to Filtered HTML. This works fine.
Where the code falls down is in processing large amounts of files. After the file conversions are done and I call the next function, the app crashes and I get this information from Visual Studio:
Managed Debugging Assistant 'DisconnectedContext' has detected a problem in 'C:\Program Files\Microsoft Office\root\Office16\WINWORD.EXE'.
Additional information: Transition into COM context 0x56255b88 for
this RuntimeCallableWrapper failed with the following error: System
call failed. (Exception from HRESULT: 0x80010100
(RPC_E_SYS_CALL_FAILED)). This is typically because the COM context
0x56255b88 where this RuntimeCallableWrapper was created has been
disconnected or it is busy doing something else. Releasing the
interfaces from the current COM context (COM context 0x56255cb0). This
may cause corruption or data loss. To avoid this problem, please
ensure that all COM contexts/apartments/threads stay alive and are
available for context transition, until the application is completely
done with the RuntimeCallableWrappers that represents COM components
that live inside them.
After some testing, I realized that if I simply remove all the code after the file conversions, there are no problems. To resolve this, I place the remainder of my code in yet another button.
The problem is I don't want to give the user two buttons. After reading various other threads, it sounds like my code has a memory or threading issue. The answers I am reading do not help me truly understand what to do next.
I feel like this is what I want to do:
1- Run conversion.
2- Close thread/cleanup memory issue from conversion.
3- Continue running code.
Unfortunately, I really don't know how to do #2 or if it is even possible. Your help is very much appreciated.
or it is busy doing something else
The managed debugging assistant diagnostic you got is pretty gobbledygooky but that's the part of the message that accurately describes the real problem. You have a firehose problem, the 3rd most common issue associated with threading. The mishap is hard to diagnose because this goes wrong inside the Word plumbing and not your code.
Trying not to commit the same gobbledygook sin myself, what goes wrong is that the interop calls you make into the Office program are queued, waiting for their turn to get executed. The underlying "system call" that the error code hints at is PostMessage(). Wherever there is a queue, there is a risk that the queue gets too large. That happens when the producer (your program) is adding items to the queue far faster than the consumer (the Office program) removes them. The firehose problem. Unless the producer slows down, the queue will grow without bounds and something is going to fail if it is allowed to grow endlessly; at a minimum the process runs out of memory.
It is not allowed to get close to that problem. The underlying queue that PostMessage() uses is protected by the OS. Windows fails the call when the queue already contains 10,000 messages. That's a fatal error that RPC does not know how to recover from, or rather should not try to recover from. Something is amiss and it isn't pretty. It returns an error code to your program to tell you about it. That's RPC_E_SYS_CALL_FAILED. Nothing much better happens in your program, the CLR doesn't know how to recover from it either, nor does your code. So the show is over, the interop call you made got lost and was not executed by Word.
Finding a completely reliable workaround for this awkward problem is not that straightforward. Beware that this can happen on any interop call, so catching the exception and trying again is pretty drastically impractical. But do keep in mind that the Q+D fix is very simple. The plain problem is that your program is running too fast; slowing it down with a Thread.Sleep() or Task.Delay() call is quite crude but will always fix the issue. Well, assuming you delay enough.
I think, but don't know for a fact because nobody ever posts repro code, that this issue is also associated with using a console mode app or a worker thread in your program. If it is a console mode app then try applying the [STAThread] attribute to your Main() method. If it is a worker thread then call Thread.SetApartmentState() before starting the thread, but beware it is very important to also create the Application interface on that worker thread. Not otherwise a workaround for an add-in.
If neither of those workarounds is effective, or they are too impractical, then consider that you can automagically slow your program down, and ensure the queue is emptied, by occasionally reading something back from the Office program. Something silly, any property getter call will do. Necessarily you can't get the property value until the Office program catches up. That can still fail; there is also a 60 second time-out on the interop call. But that's something you can fix: you can call CoRegisterMessageFilter() in your program to install a callback that runs when the timeout trips. Very gobbledygooky as well, but the cut-and-paste code is readily available.
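The read-back idea can be sketched as a small counter that triggers a blocking call every so many interop calls. This is only an illustration under assumptions: `ReadBackThrottle` and its `interval` parameter are hypothetical, and the `Action` passed in stands for whatever cheap property getter on the Office object you choose to read.

```csharp
using System;

// Hypothetical helper: after every `interval` interop calls, run a cheap
// read-back so the caller blocks until the consumer has caught up.
public sealed class ReadBackThrottle
{
    private readonly int _interval;
    private readonly Action _readBack;
    private int _count;

    public ReadBackThrottle(int interval, Action readBack)
    {
        if (interval <= 0) throw new ArgumentOutOfRangeException(nameof(interval));
        _interval = interval;
        _readBack = readBack ?? throw new ArgumentNullException(nameof(readBack));
    }

    // Call once after each interop call made into the Office program.
    // Not thread-safe; intended for use from the single calling thread.
    public void AfterCall()
    {
        _count++;
        if (_count % _interval == 0)
        {
            // For example: reading any property of the Office object.
            _readBack();
        }
    }
}
```

A suitable interval has to be found by experiment; the only firm bound given above is that the queue must stay well under 10,000 pending messages.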

How to debug an application which suddenly terminates without any feedback?

The application uses Xamarin.Android, which may be a big problem in itself. The problem is that sometimes it just quits (process is being terminated) and there's nothing in the log that can be associated with it. (although I guess that it's related to running out of memory, but I can't yet prove it — according to DDMS, most of the times all is OK, and if Xamarin.Android uses another pool of memory, then I don't know how to measure it)
I've searched the code base for "Environment.Exit" and, of course, didn't find anything.
What are the options for finding the culprit of such thing?
You could try to use the garbage collector by yourself. Just run
Runtime.getRuntime().gc();
The Runtime instance has also a method to read the free memory space. So you could figure out by yourself whether it's a memory problem.
EDIT:
Oh I read that Xamarin uses the C# language. But I'm quite sure that C# has similar methods.
When you say log, are you referring to an application log, or the device log?
When tracking down these sorts of bugs, I've always found aLogCat invaluable.
I open it, clear all the current logs, then use my application up to the point where it crashes. Then I quickly go back to aLogCat, pause it and scroll up to where the error is - it's usually found in the nearest red/orange blocks.
There's a blog post here about how I found attributes left out by the Xamarin linker using this method.

Even using sgen on my service class still results in agonizingly slow constructor

So I'm trying to speed up our application's startup time -- and I've identified a major bottleneck to work on. Each of our webservice client classes takes forever and a day to instantiate. Some investigation revealed this is entirely due to the SoapHttpClientProtocol running GenerateXMLMappings. I started searching for information on this and found this SO post: Slow SoapHttpClientProtocol constructor
I was ready to sound the trumpets since my issues mirrored what was talked about there to the letter. I went through every step listed in the first post to use sgen to pre-generate a serializer dll, and then removed the various tags from the code and built that into a normal dll which I referenced in the application as a normal reference (as opposed to a web reference). However, after all this, I don't see any difference when profiling the application. Tons of time is still soaked up doing GenerateXMLMappings as part of the SoapHttpClientProtocol constructor.
I have verified that it is in fact using my custom webservice client dll. I have also verified that it is at least looking for the XmlSerializers dll (if I do not include the file I can see a filenotfound is spit up about it).
Does anyone have detailed info about how the SoapHttpClientProtocol constructor decides what it needs to do? This is a really frustrating problem because the whole process seems to be blackboxed with no good way to see what is actually going on internally.
Thanks in advance for any help -- I'm completely against a wall on this one.
I hit this every so often. I'll be happy to guess, but guesses are usually wrong.
To find what the problem really is I just run the app under the IDE and pause it a few times while it's being slow, to see what it's doing. That's this technique.
OK, here are the guesses, which I've seen but for you are probably wrong.
Fetching strings from resources during load.
Notifications gone mad while building data structure.
Initializing 3rd-party grids/controls, even with empty data.
Parsing/Writing XML more than you thought.
Zipping/Unzipping more than you thought.

IIS hosted web service method call randomly dies

We have an IIS hosted web method which is randomly dying on us about 10% of the time. In trying to debug this we've added Log.Debug() messages in front of every real code line and it appears to be dying on random lines.
Has anyone seen this or have an idea on how to debug this?
[Additional Details]
We've spent a lot of time looking at it and have discovered the following...
We have a separate self-hosted WCF Service that accesses the same database and lives on the same machine. When it is under heavy load the web method croaks every time. If it's not under load then things usually work fine (but not 100%).
High CPU doesn't seem to be part of the problem. We ran a small app that created a high cpu load and the web service did not die.
The web service dies when we either new up an XmlSerializer (without doing the sgen precomp) OR have NHibernate create a SessionFactory. The only two things these have in common are that they 1) seem like things people commonly do and 2) seem like they would be fairly intensive.
We've added a Global.asax to try to capture Application_End and Application_Error but neither event gets fired. This to me implies that we're not dealing with a normal application pool resetting?
Sounds like it might be a threading issue. You are using informative debug messages -- you should try to reproduce the issue while running the debugger and breaking on all exceptions. Make sure you check all the windows logs for information on why the app pool crashed.
Per comment: It's hard to say, but many things can cause a thread to appear to "just die." Memory issues: are you doing any interop? Improper marshaling: are you touching data on another thread? But I will play the probabilities and ask if you're sure you're handling any exception that might be happening and logging it. Are you sure you are not gobbling up an exception and not reporting it? Somewhere down low? Is this a permissions issue? Are you running partial trust or on a low-privilege user account?
Figured it out.. two problems really..
We added Global.asax but it didn't get copied over which explains why we weren't seeing any messages. We fixed this and found out that...
Our WCF log was being written out to the bin directory of the IIS Web Service. In retrospect this is kind of silly since the WS is an old school web service. The WCF stuff is in the same directory only for some reason that is unknown to us since the initial person who set things up is gone..
Lesson learned.. Somewhere there is a message that explains everything.. you just have to find it.
