I have a problem with an application that I wrote in .NET/C#. It consists of a server which manages a few other machines and runs tests on them. It is a Windows Forms application. In order to run tests with proper error handling, for each machine I have two threads: one for running tests and one that pings it continuously. Each machine has a running queue in which the tasks to be run on that particular machine are stored.
The issue is that after some time, when more than a few tasks are present in the queue, the memory the process consumes (as shown in Process Explorer and Task Manager) gradually increases from about 50-100 MB to 1.6-1.8 GB. At about this limit almost every transaction with the remote machines (file copy on a share, remote WMI access) fails with either "Not enough storage" or "Out of memory". I tried some tools in order to localize the leak, and the closest I got was with .NET Memory Profiler. That wasn't of great help, because the largest amount of memory was residing in "Private Data - Unidentified". I'm guessing this is unmanaged data, because I can account for every other piece of data from my program, down to each string and int and every instance of them.
Can anyone suggest a tool I can use in order to properly localize the leak and maybe fix it? It would help me a lot if I could find out which DLL (from my app) or thread is using that memory, or at least somehow view what is in that memory.
Note: A lot of posts are out there about the two exceptions, "Not enough storage" and "Out of memory". Most of them suggest increasing the IRPStackSize on the 'server' machine (in my case, the clients). I have an IRPStackSize of 50 (0x32) on all of the machines, including the server.
EDIT
Regarding the comments: yes, I do maintain a log, but nothing strange shows up in it. Using a memory profiler I discovered that the .NET side of my application uses about 20 MB of memory while the unmanaged part is well over 1 GB. With the help of WinDbg I found out what resides in (most of) that extra memory. In order to access the machines and run different tests on them I use WMI, for which I have a wrapper. Everything I use is being disposed (via using statements, and in some cases by actually calling the Dispose method). Strangely though, the memory is filled with clones of this wrapper class. Does anyone know why a class would end up cloned in memory like this?
Note: the rate at which the memory usage increases is about 5 MB/s, so it builds up quickly rather than over a long period of time. I also wonder why it is not being freed by the garbage collector. I am using C# classes to work with WMI, not COM or unmanaged code directly. Also, among the objects on the heap I see a lot of data belonging to wmiutils and CWbemError. Oddly enough, Google doesn't even know the word (no results for CWbemError).
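For completeness, the disposal pattern my wrapper follows looks roughly like this (the machine name and query below are placeholders for illustration, not my actual code):

using System;
using System.Management;

public static class WmiProbe
{
    // Queries a remote machine over WMI and disposes every System.Management
    // object deterministically: the searcher, the result collection and each
    // individual result object all implement IDisposable.
    public static void LogOperatingSystem(string machineName)
    {
        ManagementScope scope = new ManagementScope(@"\\" + machineName + @"\root\cimv2");
        scope.Connect();

        ObjectQuery query = new ObjectQuery("SELECT Caption FROM Win32_OperatingSystem");
        using (ManagementObjectSearcher searcher = new ManagementObjectSearcher(scope, query))
        using (ManagementObjectCollection results = searcher.Get())
        {
            foreach (ManagementObject os in results)
            {
                using (os)   // each result object is itself IDisposable
                {
                    Console.WriteLine(os["Caption"]);
                }
            }
        }
    }
}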
Related
I am having problems figuring out the source of a memory problem in a complex C#-based Windows service. Unfortunately the problem does not occur all the time and I still don't know exactly which conditions cause it to happen. Sometimes when I check the system resources used by the service, it is taking up multiple gigabytes of memory, to the point where it throws OutOfMemoryExceptions everywhere because there isn't any memory left.
I have a paid version of .NET Memory Profiler available but so far it has been useless because the whole system becomes slow and unstable when the service uses too much memory so I cannot attach the memory profiler to the application.
The solution of the application consists of more than 30 individual projects and hundreds of thousands of lines of code, so there is no way for me to find the source of the problem by simply looking through the source code.
So far the only thing I was able to do is creating a memory dump (.dmp file) of the process while it was using a lot of memory. Is there a way to analyze this dump or anything else that would help me narrow down the source of this problem?
If you could identify some central methods in the main classes of your legacy projects and you have some kind of logging already in place, you could log the total memory (managed and unmanaged, if your application opens such resources) by calling
Process.GetCurrentProcess().PrivateMemorySize64
At least that would give you a feeling for whether the memory problem is "diffuse", e.g. caused by not releasing objects for garbage collection, or whether it occurs only in certain use cases (a jump in memory consumption when a certain action happens). Then you could nail it down with more logging and by investigating the corresponding code sections. It's tedious, but when you cannot use code instrumentation, as you have said, I find it effective. If you want to analyze a specific situation with a memory dump, you can use WinDbg for analyzing it, but that takes some effort to learn the first time and would be a separate topic (see https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/debugger-download-tools).
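A minimal sketch of that logging idea; the checkpoint label and the Console output stand in for whatever logging you already have in place:

using System;
using System.Diagnostics;

public static class MemoryLog
{
    // Records the process's private memory at a named checkpoint so that jumps
    // in consumption can be correlated with specific actions in the log.
    public static void LogPrivateBytes(string checkpoint)
    {
        using (Process process = Process.GetCurrentProcess())
        {
            long megabytes = process.PrivateMemorySize64 / (1024 * 1024);
            Console.WriteLine("{0:O} {1}: {2} MB private", DateTime.Now, checkpoint, megabytes);
        }
    }
}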
I've got an application that:
Targets C# 6
Targets .net 4.5.2
Is a Windows Forms application
Builds in AnyCPU mode because it...
Utilizes old 32-bit libraries (which use unmanaged memory) that cannot be upgraded to 64-bit
Uses DevExpress, a third party control vendor
Processes many gigabytes of data daily to produce reports
After a few hours of use in jobs that have many plots, the application eventually runs out of memory. I've spent quite a long time cleaning up many leaks found in the code and have gotten the project to a state where, in the worst case, it may be using upwards of 400,000 K of memory at any given time, according to performance counters. Processing this data has not yielded any issues at this point, since data is processed in jagged arrays, preventing any issues with the Large Object Heap.
Last time this happened the user was using ~305,000K of memory. The application is so "out of memory" that the error dialog cannot even draw the error icon in the MessageBox that comes up, the space where the icon would usually be is all black.
So far I've done the following to clean this up:
Windows Forms utilize the Disposed event to ensure that resources are cleaned up; Dispose is called manually when required
Business objects utilize IDisposable to remove references
Verified cleanup using ANTS memory profiler and SciTech memory profiler.
The low memory usage suggests this is not the case, but I wanted to see if anything helpful turned up; it did not
Utilized the GCSettings.LargeObjectHeapCompactionMode property to remove any fragmentation from processing data that may be fragmented in the Large Object Heap (LOH); a sketch of this call follows this list
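For reference, the LOH compaction from the last item is requested roughly like this (available since .NET 4.5.1; the setting applies only to the next blocking full collection and then resets to Default):

using System;
using System.Runtime;

public static class LohMaintenance
{
    public static void CompactLargeObjectHeap()
    {
        // Request LOH compaction on the next blocking full GC, then trigger it.
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
    }
}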
Nearly every article that I've used to get to this point suggests that out of memory actually means out of contiguous address space, and given the amount that's in use, I agree with this. I'm not sure what to do at this point, since from what I understand (and am probably very wrong about) the garbage collector clears this up to make room as the process moves along, with the exception of the LOH, which is now cleaned up manually using the LargeObjectHeapCompactionMode property introduced in .NET 4.5.1.
What am I missing here? I cannot build for 64-bit due to the old 32-bit libraries that contain proprietary algorithms we do not have access to, so we cannot even dream of producing a 64-bit version. Are there any modes in these profilers I should be using to identify exactly what is growing out of control here?
If this address space cannot be cleared up, does this mean that all C# applications will eventually run "out of memory" because of this?
Nearly every article that I've used to get to this point suggests that out of memory actually means out of contiguous address space and given the amount that's in use, I agree with this.
This is a reasonable hypothesis, but even reasonable hypotheses can be wrong. Yours probably is wrong. What should you do?
Test it with science. That is, look for evidence that falsifies your hypothesis. You want to assume that it is anything else, and be forced by the evidence you've gathered to conclude that your hypothesis is not false.
So:
at the point where your application runs out of memory, is it actually out of contiguous free pages of the necessary size? It sure sounds like your observations do not indicate that this is true, so the hypothesis is probably false.
What is other evidence that the hypothesis might be false?
"After a few hours of use in jobs that have many plots, the application eventually runs out of memory."
"Uses DevExpress, a third party control vendor"
"the error dialog cannot even draw the error icon in the MessageBox"
None of this sounds like an out of memory problem. This sounds like a third party control library leaking OS handles for graphics objects. Unfortunately, such leaks usually surface as "out of memory" errors and not "out of handles" errors.
So, that's a new hypothesis. Look for evidence for and against this hypothesis too. You're doing a good job by using a memory profiler. Use a handle profiler next.
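If a handle profiler is not immediately to hand, one cheap way to gather evidence for this new hypothesis is to log the GDI and USER object counts that Task Manager can also display. A hedged sketch using the Win32 GetGuiResources function:

using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

public static class GuiHandleMonitor
{
    private const uint GR_GDIOBJECTS = 0;   // count of GDI objects owned by the process
    private const uint GR_USEROBJECTS = 1;  // count of USER objects owned by the process

    [DllImport("user32.dll")]
    private static extern uint GetGuiResources(IntPtr hProcess, uint uiFlags);

    // A steadily climbing GDI count while the UI is in use is a strong sign of
    // leaked graphics handles rather than leaked managed memory.
    public static void Log()
    {
        using (Process process = Process.GetCurrentProcess())
        {
            uint gdi = GetGuiResources(process.Handle, GR_GDIOBJECTS);
            uint user = GetGuiResources(process.Handle, GR_USEROBJECTS);
            Console.WriteLine("GDI objects: {0}, USER objects: {1}", gdi, user);
        }
    }
}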
If this address space cannot be cleared up, does this mean that all C# applications will eventually run "out of memory" because of this?
Nope. The GC does a good job of cleaning up managed memory; lots of applications have no problem running forever without leaking.
I am working to diagnose a series of OutOfMemoryException problems within an application of ours. This is an internal 32-bit (x86) OWIN-hosted WebAPI that runs within a console application and talks to a series of hardware components in parallel. For that reason it's creating around 20 instances of a library, and the sharp increase in "virtual size" memory matches when those instances are created.
From the output of Process Explorer and dotMemory, it does not appear that we're allocating that much actual memory within this application.
From reading many, many SO answers I think I understand that our problem is either from fragmentation within the Gen 0, Gen 1, Gen 2 and LOH heaps, or we're possibly bumping into the 2 GB addressable memory limit for a 32-bit process running on Windows 7. This application works in batches where it collects a bunch of data from hardware devices, creates collections in memory to aggregate that data into a single object, and then saves it to be retrieved by a client app. This activity is the cause of the spikes in the dotMemory visual, but these data structures are not enormous, which I think the dotMemory chart shows.
Looking at the heaps has shown they rarely grow beyond 10-15MB in size, and I don't see much evidence that the LOH is growing too large or being severely fragmented. I'm really struggling with how to proceed to better understand what's happening here.
So my question is two-fold:
Is it conceivable that we could be hitting that 2GB limit for virtual memory, and that's a cause for these memory exceptions?
If that is a possible cause then am I right in thinking a 64-bit build would get around that?
We are exploring moving to a 64-bit build, but that would require updating some low-level libraries we use to also be 64-bit. It's certainly an option we will explore eventually (if not sooner), but we're trying to understand this situation better before investing the time required.
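For reference, a rough sketch of how the committed-versus-virtual numbers could be logged from inside the process while we investigate (a 32-bit process that is not large address aware starts failing as the virtual size approaches 2 GB, even when the committed private bytes look modest):

using System;
using System.Diagnostics;

public static class AddressSpaceCheck
{
    public static void Log()
    {
        using (Process process = Process.GetCurrentProcess())
        {
            // Private bytes = memory actually committed; virtual size = address space reserved.
            Console.WriteLine("64-bit process : {0}", Environment.Is64BitProcess);
            Console.WriteLine("Private bytes  : {0} MB", process.PrivateMemorySize64 / (1024 * 1024));
            Console.WriteLine("Virtual size   : {0} MB", process.VirtualMemorySize64 / (1024 * 1024));
        }
    }
}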
Update after setting the LARGEADDRESSAWARE flag
Based on a recommendation I set that flag on the binary and interestingly saw the virtual size jump immediately to nearly 3 GB. I don't know if I should be alarmed by that?!
I will monitor the application with this configuration for the next several hours.
In my case the advice provided by @ThomasWeller was indeed correct, and enabling the "large address aware" flag has allowed this application to run for several days without throwing memory exceptions.
I have an event driven app that I was tasked with maintaining.
About 100 events run every 30 seconds, on separate timers. Over time the events alias into a constant stream of about 1-3 events per second.
Memory usage does not appear dependent on the number of events firing in any given second.
Each event polls data from a web service, checks the data using a LINQ2SQL DataContext against the previously polled data (I do not dispose or null out the DataContext when done), and if the data is different, updates the database and pushes the new data as an XML message to a receiver service via TCP.
This app appears to have a memory leak which
only manifests after 30+ minutes of running (either debug or release)
won't manifest when profiling [I'm using .NET Memory Profiler 4.5]
Characteristics:
On startup the program uses ~30 MB. As time progresses, the memory usage shown in Task Manager will begin pogoing, first only slightly, between 50 and 150 MB, and eventually gets worse, oscillating between 200 MB and 1 GB+. When this happens, it happens a few times within a second or two, then settles down at ~150 MB for the next 10-20 or so seconds.
I've been trying to catch this behavior in action using memory profiling. So far I've been unsuccessful, I can't get the app to pogo or oscillate in memory usage anywhere near like I can when the profiler isn't watching.
However, I've been noticing a square-wave sort of pattern in the memory usage as the garbage collector's generation 1 and 2 collections run that looks very similar to what I see in Task Manager, except the memory usage oscillations in the square wave are 10 MB wide instead of 800 MB+ (200 MB to 1 GB+). Now, according to Google Images, garbage collection in a properly functioning app looks more like a sawtooth wave than a square wave.
I frankly don't see any way that my app could be pogoing between 200MB and 1GB+ of memory usage within a second and NOT be spiking the CPU to 100%.
I have read about some problems that can arise from the interaction of garbage collection and event handling, but I have several paths I could investigate and am trying to narrow down which one to spend time on. I'm still pretty slow at .NET and haven't developed the "intuition" I have for embedded devices running C, which generally helps me filter what I should investigate first.
What it FEELS like is that perhaps some event handlers are losing and re-gaining references to [massive amounts of data] (I don't know how this could even happen?), seeing as memory usage appears to spike back up to 1 GB soon after the garbage collector runs and drops memory usage back to 200 MB.
Previous versions of this app did not have these problems. Two changes I have made since then include
utilizing LINQ2SQL instead of our own data manager (which had an ADORecordSetHelper object we utilized to execute hardcoded SQL statements)
changing the piece of software we use to send the TCP XML messages to a receiver.
Due to the simplicity of what we're doing in #2, it COULD be the source of the problem, but this memory usage behavior makes me think otherwise.
I guess my main questions at this point are
Should I be calling Dispose on my LINQ2SQL DataContexts before I return from the method I create them in? (A sketch of this pattern follows this list.)
Should I null them out instead?
if an exception were occurring somewhere in a method after creation of a DataContext, could it cause the DataContext to be kept in memory indefinitely?
if I store a result from a LINQ query in a value type (i.e. int, not var), is it loaded lazily then, or only when the variable is used?
how possible is it for event-driven frameworks to hypothetically lose and regain references?
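For reference, the disposal pattern question 1 is asking about would look roughly like this (the Poll table, its columns and the connection string are placeholders for illustration, not our real schema):

using System;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;

[Table(Name = "Polls")]
public class Poll
{
    [Column(IsPrimaryKey = true)] public int Id;
    [Column] public int DeviceId;
    [Column] public DateTime PolledAt;
    [Column] public string Payload;
}

public static class PollComparer
{
    public static bool HasChanged(string connectionString, int deviceId, string newPayload)
    {
        // The DataContext is disposed before the method returns,
        // even if an exception is thrown inside the block.
        using (DataContext db = new DataContext(connectionString))
        {
            string previous = db.GetTable<Poll>()
                                .Where(p => p.DeviceId == deviceId)
                                .OrderByDescending(p => p.PolledAt)
                                .Select(p => p.Payload)
                                .FirstOrDefault();   // the query runs here, not lazily later

            return previous != newPayload;
        }
    }
}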
edit: the events have instance-based subscriptions as discussed here and are never unsubscribed for the life of the app.
edit2: I finally managed to catch it in the profiler; it appears to be a 200 MB System.String that's being created somehow. Thanks everyone for ruling out GC behavior.
Most of the time, memory leaks are caused by weird references between objects (events and delegates are also included here).
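As a small illustration of the event/delegate case: as long as a long-lived publisher still holds a handler that points at a subscriber, that subscriber (and everything it references) stays reachable; removing the handler, typically in Dispose, breaks the reference. A generic sketch, not tied to your code:

using System;

public class Publisher
{
    public event EventHandler DataPolled;

    public void RaiseDataPolled()
    {
        EventHandler handler = DataPolled;
        if (handler != null)
        {
            handler(this, EventArgs.Empty);
        }
    }
}

public class Subscriber : IDisposable
{
    private readonly Publisher _publisher;

    public Subscriber(Publisher publisher)
    {
        _publisher = publisher;
        _publisher.DataPolled += OnDataPolled;   // the publisher now references this instance
    }

    private void OnDataPolled(object sender, EventArgs e)
    {
        // handle the event
    }

    public void Dispose()
    {
        _publisher.DataPolled -= OnDataPolled;   // without this, the subscriber stays reachable
    }
}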
What I think you could try is the following:
Run the application and reproduce the issue. When the private working set of memory hits a very high value, right click the process on task manager and select "Create Dump File". This will be a lot less intrusive than profiling the application live.
Download WinDBG and run it.
Open the memory dump by going to the File menu and selecting Open dump file (I cannot remember exactly what the name of the menu option is... it should be easy to spot though).
Run the following commands:
.symfix
.loadby sos clr
!dumpheap -type [YourAssemblyNameSpacePrefix] -stat
The last command will give you statistics for all the instances in memory of your own types (filtered by your namespace prefix) rather than CLR types. Look at the types which have a very high number of instances and try to see if anything doesn't look right.
If you see a very high number of objects of the same type run the following command which will show you all instances' addresses:
!dumpheap -type [TheFullObjectTypeName]
You will need to select one single instance address. Now run the following command to see the references to that instance:
!gcroot [InstanceAddress]
Repeat the !gcroot step a few times for different instances so that you can confirm the leak is coming from the same place, or to help you identify what is causing those instances not to be collected (i.e. still being referenced by other objects).
If you don't see anything weird with your own types, change the earlier !dumpheap command to: !dumpheap -stat. This way you are not filtering by type and you will also see CLR types and third-party libraries' types.
This is a little bit complex, but hopefully I was able to give you a method that helps you find memory leaks.
I have a C#.NET service running in production. The service functions as a TCP server to which clients register and make requests against. In looking at the Task Manager, it appears to be leaking about 10MB/day. I don't seem to notice these in dev (perhaps because of far less traffic and client activity). In searching around I've read that the Task Manager can be seriously wrong, but I'm not sure how accurate this is or in what circumstances the TM would display incorrect information.
To solve this problem I need to monitor memory consumption more closely. The problem is that the leak only seems to appear in production, where the deployed service was built for Release. Also, since it's a service, it can't be run directly from VS with an attached profiler/debugger, so I'm not sure how to best pinpoint the problem with something more precise than TM.
Any group wisdom would be much appreciated, thanks.
EDIT:
I've added perfmon counters for the privates bytes of the service (7MB to start out) as well as CLR mem in all heaps (30MB to start out)
Task Manager puts the total memory at ~37 MB, so this seems to make sense.
The first part of this is to let the service go for a day and check out my counters again.
If my private bytes get huge but the CLR memory stays roughly static, this would indicate an unmanaged leak. If both get huge, then it's a managed leak.
Thanks guys.
Your first task is figuring out if the process is actually leaking memory. You can do this with perfmon by measuring the Private Bytes counter:
http://www.goldstarsoftware.com/papers/CapturingVirtualBytesToALogFile.pdf
If the graph is consistently rising (for, say, half an hour) you have a memory leak. You can then use other counters to figure out whether this is a .NET leak (.NET memory counters), though this is unlikely. I find that in most of these cases, there is a COM component that is being invoked but not released.
If you truly have a memory leak (and this isn't just variable memory usage), the process will shut down with an out of memory exception after running for a while.
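If it helps to capture the same numbers from code (for example from a small watchdog) rather than the perfmon UI, the two counters mentioned above can be read like this; the instance name is usually the executable name without extension, though it can differ when several instances are running:

using System;
using System.Diagnostics;

public static class LeakCounters
{
    // Reads the OS-level private bytes and the managed heap total for one
    // process instance, so the two can be compared over time.
    public static void Log(string instanceName)
    {
        using (var privateBytes = new PerformanceCounter("Process", "Private Bytes", instanceName))
        using (var clrHeaps = new PerformanceCounter(".NET CLR Memory", "# Bytes in all Heaps", instanceName))
        {
            Console.WriteLine("Private Bytes        : {0:F1} MB", privateBytes.NextValue() / (1024 * 1024));
            Console.WriteLine(".NET bytes all heaps : {0:F1} MB", clrHeaps.NextValue() / (1024 * 1024));
        }
    }
}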
You need one of the memory profilers below in order to monitor it:
http://www.jetbrains.com/profiler/
http://www.red-gate.com/products/dotnet-development/ants-memory-profiler/
There are other choices, but these are very capable, and you can profile a remote application's memory with them (at least JetBrains's solution handles that).
Follow this guide: http://blogs.msdn.com/b/tess/archive/2008/03/25/net-debugging-demos-lab-7-memory-leak.aspx
It goes over exactly what you're describing: a memory leak in production. As was mentioned, you first have to determine whether it's unmanaged or managed code that's leaking, using perfmon and Private Bytes.
In general, make sure you're wrapping networking objects in using statements so that they're properly disposed.
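A minimal illustration of that advice (the host, port and message are placeholders):

using System.Net.Sockets;
using System.Text;

public static class TcpSender
{
    public static void Send(string host, int port, string message)
    {
        // Both the client and its stream are released deterministically here,
        // even if Write throws.
        using (var client = new TcpClient(host, port))
        using (NetworkStream stream = client.GetStream())
        {
            byte[] payload = Encoding.UTF8.GetBytes(message);
            stream.Write(payload, 0, payload.Length);
        }
    }
}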
A workflow I often use for managed memory leaks is to start the server on a test machine and hit it with a known number of connections (say 123,456 connections). Then take a memory snapshot by going to Task Manager, right-clicking on the process name and selecting 'Create dump'. Open this dump with WinDbg and SOS and run the command !dumpheap -stat. Look for objects that have a multiple of 123,456 instances. Should these objects still be in memory? If not, run !gcroot on an instance of those objects to find out why they're still in memory.
Get a dump of the memory when it's in a leak state using Task Manager: right-click on the process and select 'Create dump file'. You can also use ProcDump, which gives you more options.
Use the SOS extension in either WinDbg or Visual Studio to inspect the memory.