I call a piece of unmanaged C++ code from my C# application
to calculate the fast Fourier transform of a discrete-time signal.
I make the call something like this:
IntPtr ptr = ComputeFFTW(packetSig, packetSig.Length, (int)samplFrequency, (int)fftPoints);
unsafe
{
    double* dPtr = (double*)ptr;
    for (int l = 0; l < fftData.Length; l++)
    {
        fftData[l] = dPtr[l];
    }
}
Although this snippet works and gives me the desired results, I can see a performance hit (possibly a memory leak) while the calculation is in progress. The CLR does not seem to reclaim the local double variables, and my application consumes a considerable amount of RAM.
Can anyone suggest where I might be going wrong?
I ran my application under the ANTS Memory Profiler, and the snapshot shows that the Double[] objects occupy more than 150 MB of memory. Is this normal behaviour?
Class Name Live Size (bytes) Live Instances
Double[] 150,994,980 3
Any help is appreciated in this regard
Srivatsa
Since the C++ function allocates the memory, you have to free that chunk manually from your C# application (free the pointer); the CLR does not track it. A better way to invoke unmanaged code is to allocate all the variables and memory chunks (temporary parameters too) in your C# application and pass them to your C++ code as parameters. That way you won't have any memory issues with your unmanaged code.
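To illustrate the caller-allocates approach on the native side, here is a minimal sketch. The name ComputeSpectrumInto and its signature are made up for this example (this is not the asker's ComputeFFTW), and a naive DFT stands in for the real FFT. The point is only that the function writes into a buffer the caller owns, so there is nothing left to free afterwards.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical export: the caller owns both buffers, so nothing allocated
// here outlives the call and there is nothing to free afterwards.
// Writes the magnitudes of a naive DFT of `input` into `output`.
// Returns the number of bins written, or -1 if the arguments are invalid.
extern "C" int ComputeSpectrumInto(const double* input, int inputLength,
                                   double* output, int outputLength)
{
    if (input == nullptr || output == nullptr || inputLength <= 0 || outputLength <= 0)
        return -1;

    const double pi = 3.14159265358979323846;
    // Never write more bins than the caller's output buffer can hold.
    const int bins = outputLength < inputLength ? outputLength : inputLength;
    assert(bins <= outputLength);

    for (int k = 0; k < bins; ++k) {
        double re = 0.0, im = 0.0;
        for (int n = 0; n < inputLength; ++n) {
            const double angle = -2.0 * pi * k * n / inputLength;
            re += input[n] * std::cos(angle);
            im += input[n] * std::sin(angle);
        }
        output[k] = std::sqrt(re * re + im * im);
    }
    return bins;
}
```

On the C# side you would then declare the output parameter as a double[] and pass fftData directly, so the marshaller pins it for the duration of the call.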
You can use the Marshal.Copy(IntPtr, Double[], Int32, Int32) method to copy the array of double values from the unmanaged ptr to the managed fftData array:
IntPtr ptr = ComputeFFTW(packetSig, packetSig.Length, (int)samplFrequency,(int)fftPoints);
Marshal.Copy(ptr, fftData, 0, fftData.Length);
If ComputeFFTW returns a pointer to dynamically allocated memory, you need to release it after use. Do this in unmanaged code: add a function such as Release and pass ptr to it.
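A rough sketch of what such a pair of exports could look like. AllocateResult, ReleaseResult and LiveResultCount are illustrative names I made up, not part of the asker's DLL, and the counter is only a debugging aid (it is not thread-safe).

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Debugging aid: buffers handed out and not yet released (not thread-safe).
static int g_liveBuffers = 0;

// Hypothetical export: allocates a zero-filled result buffer on the native heap.
extern "C" double* AllocateResult(int length)
{
    if (length <= 0) return nullptr;
    double* buffer = new (std::nothrow) double[static_cast<std::size_t>(length)]();
    if (buffer != nullptr) ++g_liveBuffers;
    return buffer;
}

// Hypothetical export: frees a buffer with the same runtime that allocated it.
extern "C" void ReleaseResult(double* buffer)
{
    if (buffer == nullptr) return;
    assert(g_liveBuffers > 0);
    delete[] buffer;  // matches the new[] in AllocateResult
    --g_liveBuffers;
}

// Hypothetical export: lets the caller check for leaked buffers.
extern "C" int LiveResultCount() { return g_liveBuffers; }
```

From C# you would call the release function in a finally block (or from Dispose) right after Marshal.Copy, so that every returned pointer is released exactly once.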
Related
I need to create an array that is aligned to a 64-byte boundary. I need to do this because I'm calling a DLL that uses AVX, which requires the data to be aligned. Essentially, I need to do this in C#:
void* ptr = _aligned_malloc(64 * 1024, 64);
int8_t* memory_ptr = (int8_t*)ptr;
I'm pretty sure I can't create an array on such a boundary naturally in C#. So one option is to create a byte array that is x+64 long, and then 'create' an array that overlays it, but with an offset at the required boundary.
The problem is how do I accomplish this, and not have a memory leak? (Memory leaking is the reason I'd rather not use the DLL to create a reference to the array and pass it to C#. Unless there is a good way to do so?)
Using the helpful answers below, this is what I have, hopefully it helps others:
public class Example : IDisposable
{
    private ulong memory_ptr;
    public unsafe Example()
    {
        memory_ptr = (ulong)NativeMemory.AlignedAlloc(0x10000, 64);
    }
    public unsafe Span<byte> Memory => new Span<byte>((void*)memory_ptr, 0x10000);
    public unsafe void Dispose()
    {
        // Memory from AlignedAlloc must be released with AlignedFree, not Free
        NativeMemory.AlignedFree((void*)memory_ptr);
        memory_ptr = 0; // makes a second Dispose call a no-op
    }
}
As mentioned, .NET 6 has NativeMemory.AlignedAlloc. You need to make sure to call AlignedFree otherwise you could get a leak.
void* a = default;
try
{
    a = NativeMemory.AlignedAlloc(size * sizeof(long), 64);
    var span = new Span<long>(a, size);
    // fill span
    // call DLL with span
}
finally
{
    NativeMemory.AlignedFree(a);
}
A pinned GCHandle is another option for older versions of .NET. You then need to calculate the starting aligned offset with the following code, where alignment would be 64 in your case.
var ptr = (long)handle.AddrOfPinnedObject();
var offset = (int) ((ptr + alignment - 1) / alignment * alignment - ptr) / sizeof(long);
Again, you need to make sure to call handle.Free in a finally block.
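The same rounding arithmetic, written out as a small native sketch so the steps are easier to follow. BytesToAlign and FirstAligned are helper names invented for this illustration. The idea is to over-allocate by one alignment's worth of bytes and then skip forward to the first aligned address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of bytes to skip from `address` to reach the next multiple of
// `alignment` (0 if already aligned). `alignment` must be a power of two.
std::size_t BytesToAlign(std::uintptr_t address, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    // Round up to the next multiple of alignment, then take the difference.
    const std::uintptr_t aligned = (address + alignment - 1) / alignment * alignment;
    return static_cast<std::size_t>(aligned - address);
}

// Returns the first suitably aligned byte inside an over-allocated buffer.
// The buffer must be at least (payload size + alignment) bytes long.
unsigned char* FirstAligned(unsigned char* buffer, std::size_t alignment)
{
    return buffer + BytesToAlign(reinterpret_cast<std::uintptr_t>(buffer), alignment);
}
```

For example, an address of 1000 with a 64-byte alignment needs 24 bytes of padding to reach 1024. The C# line above additionally divides by sizeof(long) because it wants an element index into a long[] rather than a byte count.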
To avoid the memory leak, first you need to pin the array. Pinning prevents the object pointed to from moving on the garbage-collected heap.
There's an example of something similar to what you're doing here.
However, that example doesn't go far enough as it only pins without controlling the initial memory allocation. To also prevent the memory leak, instead use GCHandle.Alloc with GCHandleType.Pinned. Like this.
I have a C++ DLL which reads video frames from a camera. These frames are allocated in the DLL and returned via pointer to the caller (a C# program).
When C# is done with a particular frame of video, it needs to clean it up. The DLL interface and memory management are wrapped in a disposable class in C# so it's easier to control things. However, it seems like the memory doesn't get freed. The memory footprint of my process grows and grows, and in less than a minute I get allocation errors in the C++ DLL because there isn't any memory left.
The video frames are a bit over 9 MB each. There is a lot of code, so I'll simply provide the allocation/deallocations/types/etc.
First : Allocation in C++ of raw buffer for the camera bytes.
dst = new unsigned char[mFrameLengthInBytes];
Second : transfer of the raw pointer back across the DLL boundary as an unsigned char * and into an IntPtr in C#
IntPtr pFrame = VideoSource_GetFrame(mCamera, ImageFormat.BAYER);
return new VideoFrame(pFrame, .... );
So now the IntPtr is passed into the CTOR of the VideoFrame class. Inside the CTOR the IntPtr is copied to an internal member of the class as follows :
IntPtr dataPtr;
public VideoFrame(IntPtr pDataToCopy, ...)
{
...
this.dataPtr = pDataToCopy;
}
My understanding is that this is a shallow copy and the class now references the original data buffer. The frame is used/processed/etc. Later, the VideoFrame class is disposed and the following is used to clean up the memory.
Marshal.FreeHGlobal(this.dataPtr);
I suspect the problem is that... dataPtr is an IntPtr and C# has no way to know that the underlying buffer is actually 9 MB, right? Is there a way to tell it how much memory to release at that point? Am I using the wrong C# free method? Is there one specifically for this sort of situation?
You need to call the corresponding "free" method in the library you're using.
Memory allocated via new belongs to the C++ runtime's heap, and calling FreeHGlobal on it won't work; FreeHGlobal only releases memory obtained from Marshal.AllocHGlobal. You need to call (one way or another) delete[] on the memory.
If this is your own library then create a function (eg VideoSource_FreeFrame) that deletes the memory. Eg:
void VideoSource_FreeFrame(unsigned char *buffer)
{
    delete[] buffer;
}
And then call this from C#, passing in the IntPtr you got back.
In C++ you need to call delete[] dst; (the buffer was allocated with new[], so plain delete is wrong). That means you need to provide an API that the C# code can call, like FreeFrame(...), which does exactly that.
I agree with the first answer. Do NOT free it in C# code using any magical, liturgical incantations. Write a method in C++ that frees the memory, and call it from your C# code. Do NOT get into the habit of allocating memory in one heap (native) and freeing it in another (managed); that's just bad news.
Remember one of the rules from the book Effective C++: allocate memory in the constructor, and deallocate it in the destructor. And if you can't do it in the destructor, do it in an in-class method, not some global (or even worse) friend function.
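A minimal sketch of that constructor/destructor rule applied to a frame buffer, exposed through an opaque handle. FrameBuffer, Frame_Create and Frame_Destroy are hypothetical names for illustration only; the asker's VideoSource API is not shown in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical frame wrapper: the constructor allocates, the destructor frees.
class FrameBuffer {
public:
    explicit FrameBuffer(std::size_t lengthInBytes)
        : data_(new unsigned char[lengthInBytes]()), length_(lengthInBytes)
    {
        assert(lengthInBytes > 0);
    }

    unsigned char* data() { return data_.get(); }
    std::size_t size() const { return length_; }

private:
    std::unique_ptr<unsigned char[]> data_;  // delete[] runs automatically
    std::size_t length_;
};

// Hypothetical exports: C# holds the returned pointer as an opaque IntPtr.
extern "C" FrameBuffer* Frame_Create(int lengthInBytes)
{
    if (lengthInBytes <= 0) return nullptr;
    return new (std::nothrow) FrameBuffer(static_cast<std::size_t>(lengthInBytes));
}

extern "C" void Frame_Destroy(FrameBuffer* frame)
{
    delete frame;  // runs ~FrameBuffer, which releases the pixel data
}
```

The disposable C# wrapper would then call Frame_Destroy from Dispose instead of Marshal.FreeHGlobal, so allocation and deallocation both happen inside the DLL.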
Please have a look at the following C# code:
double* ptr;
fixed (double* vrt_ptr = &vertices[0])
{
    fixed (int* tris_ptr = &tris[0])
    {
        ptr = compute(vrt_ptr, 5, (double*)tris_ptr, 5);
        // compute() is a native C++ function
    }
}
Debug.Log("Vertices Recieved: " + *ptr);
/* and so on */
I am getting a garbage value from *ptr. I suspect that the array assigned to ptr by compute is not retained outside the fixed block. Is that so, or is it due to some other problem?
This is not valid code, the garbage collector can only update the value of the vrt_ptr and tris_ptr variables. But the unmanaged code uses a copy of these pointers, the value of the copy cannot be updated by the GC. So if a garbage collection occurs while the unmanaged code is running, possible for example when other threads in the program trigger a collection, then the unmanaged code will read garbage data through the pointer copy. Very hard to diagnose, it doesn't happen very often.
You must pin the vertices and tris arrays. In your case the pinvoke marshaller already does this for you if you simply pass the arrays directly instead of using fixed. Fix:
double* ptr = compute(vertices, 5, tris, 5);
Adjust the pinvoke declaration accordingly, replacing double* with double[].
You'll now also have to deal with the likely reason you wrote this code in the first place. Casting an int[] to a double[] is never valid, and it is the likely reason you got a garbage result well before that GC disaster could strike. If you can't change the declaration of tris for some reason, then you must create a double[] from it before the call.
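To make the int-versus-double point concrete, here is a small native sketch (helper names are made up, and it assumes a 32-bit int and a 64-bit double). The first function shows what the callee effectively reads when int data is passed off as double*; the second does the element-wise conversion that is actually needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

static_assert(sizeof(double) == 2 * sizeof(int),
              "sketch assumes a 32-bit int and a 64-bit double");

// What native code sees when an int buffer is passed off as double*:
// the bytes of two adjacent ints are read as a single double.
// `values` must point to at least two ints.
double FirstPairReadAsDouble(const int* values)
{
    assert(values != nullptr);
    double reinterpreted;
    std::memcpy(&reinterpreted, values, sizeof(double));
    return reinterpreted;
}

// Element-wise conversion: each int becomes the double with the same value.
std::vector<double> ToDoubles(const int* values, std::size_t count)
{
    assert(values != nullptr || count == 0);
    return std::vector<double>(values, values + count);
}
```

In C# the equivalent of ToDoubles is simply building a new double[] from tris (for example with Array.ConvertAll) and passing that array to compute.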
I would like to calculate how many bytes my function fills so that I can inject it into another process using CreateRemoteThread(). Once I know the number of bytes, I can write them into the remote process using the function's pointer. I have found an article online (see http://www.codeproject.com/KB/threads/winspy.aspx#section_3, chapter III) where they do the following in C++ :
// ThreadFunc
// Notice: - the code being injected;
//Return value: password length
static DWORD WINAPI ThreadFunc (INJDATA *pData)
{
//Code to be executed remotely
}
// This function marks the memory address after ThreadFunc.
static void AfterThreadFunc (void) {
}
Then they calculate the number of bytes ThreadFunc fills using :
const int cbCodeSize = ((LPBYTE) AfterThreadFunc - (LPBYTE) ThreadFunc);
Using cbCodeSize they allocate memory in the remote process for the injected ThreadFunc and write a copy of ThreadFunc to the allocated memory:
pCodeRemote = (PDWORD) VirtualAllocEx( hProcess, 0, cbCodeSize, MEM_COMMIT, PAGE_EXECUTE_READWRITE );
if (pCodeRemote == NULL)
__leave;
WriteProcessMemory( hProcess, pCodeRemote, &ThreadFunc, cbCodeSize, &dwNumBytesXferred );
I would like to do this in C#. :)
I have tried creating delegates, getting their pointers, and subtracting them like this:
// Thread proc, to be used with Create*Thread
public delegate int ThreadProc(InjectionData param);
//Function pointer
ThreadFuncDeleg = new ThreadProc(ThreadFunc);
ThreadFuncPtr = Marshal.GetFunctionPointerForDelegate(ThreadFuncDeleg);
//FunctionPointer
AfterThreadFuncDeleg = new ThreadProc(AfterThreadFunc);
IntPtr AfterThreadFuncDelegPtr= Marshal.GetFunctionPointerForDelegate(AfterThreadFuncDeleg);
//Number of bytes
int cbCodeSize = (AfterThreadFuncDelegPtr.ToInt32() - ThreadFuncPtr.ToInt32())*4 ;
It just does not seem right, as I get a static number no matter what I do with the code.
My question is, if possible, how does one calculate the number of bytes a function's code fills in C#?
Thank you in advance.
I don't think it is possible, due to dynamic optimization and code generation in .NET. You can try to measure the IL code length, but trying to measure the machine-dependent code length will fail in the general case.
By 'fail' I mean that you can't get a correct, meaningful size dynamically with this technique.
Of course, you can study how NGEN and the JIT compiler work, and the PDB structure, and try to measure from there. You can also determine the size of your code by inspecting the generated machine code in VS, for example.
How to see the Assembly code generated by the JIT using Visual Studio
If you really need to determine the size, start with NET Internals and Code Injection / NET Internals and Native Compiling, but I can't imagine why you would ever want it.
Be aware that all the internals of how the JIT works are subject to change, so any solution that depends on them can be broken by a future version of .NET.
If you want to stick with IL: check the Profiling Interfaces (CLR Profiling API), and two somewhat old articles: Rewrite MSIL Code on the Fly with the .NET Framework Profiling API and No Code Can Hide from the Profiling API in the .NET Framework 2.0. There are also some topics about the CLR Profiling API here on SO.
But the simplest way to explore an assembly is the Reflection API; you want MethodBody there. Check the Length of the array returned by MethodBody.GetILAsByteArray and you'll have the method's length in bytes of IL.
I'm developing an application which currently has hundreds of objects created.
Is it possible to determine (or approximate) the memory allocated by an object (class instance)?
You could use a memory profiler like
.NET Memory Profiler (http://memprofiler.com/)
or
CLR Profiler (free) (http://clrprofiler.codeplex.com/)
A coarse way, in case you want to know what is happening with a particular object, could be this:
// Measure starting point memory use
long GC_MemoryStart = System.GC.GetTotalMemory(true);
// Allocate a new byte array of 20000 elements (about 20000 bytes)
byte[] MyByteArray = new byte[20000];
// Obtain measurements after creating the new byte[]
long GC_MemoryEnd = System.GC.GetTotalMemory(true);
// Ensure that the Array stays in memory and doesn't get optimized away
System.GC.KeepAlive(MyByteArray);
// Approximate size of the allocation, in bytes
long GC_MemoryUsed = GC_MemoryEnd - GC_MemoryStart;
Process-wide figures could be obtained like this:
long Process_MemoryStart = 0;
Process MyProcess = System.Diagnostics.Process.GetCurrentProcess();
Process_MemoryStart = MyProcess.PrivateMemorySize64;
hope this helps ;)
The ANTS memory profiler will tell you exactly how much is allocated for each object/method/etc.
Here's a related post where we discussed determining the size of reference types.
You can also use WinDbg and either the SOS or SOSEX WinDbg extensions (SOSEX is like SOS with a lot more commands and some existing ones improved). The command you would use to analyze an object at a particular memory address is !objsize
One VERY important item to remember is that !objsize only gives you the size of the class itself and DOES NOT necessarily include the size of the aggregate objects contained inside the class - I have no idea why it doesn't do this as it is quite frustrating and misleading at times.
I've created 2 Feature Suggestions on the Connect website that ask for this ability to be included in Visual Studio. Please vote for the items if you would like to see them added as well!
https://connect.microsoft.com/VisualStudio/feedback/details/637373/add-feature-to-debugger-to-view-an-objects-memory-footprint-usage
https://connect.microsoft.com/VisualStudio/feedback/details/637376/add-feature-to-debugger-to-view-an-objects-rooted-references
EDIT:
I'm adding the following to clarify some info from the answer provided by Charles Bretana:
the OP asked about the size of an 'object' not a 'class'. An object is an instance of a class. Maybe this is what you meant?
The memory allocated for an object does not include the JITted code. The JIT code lives in its own 'JIT Code Heap'.
The JIT only compiles code on a method by method basis - not at a class level. So if a method never gets called for a class, it is never JIT compiled and thus never has memory allocated for it on the JIT Code Heap.
As an aside, there are about 8 different heaps that the CLR uses:
Loader Heap: contains CLR structures and the type system
High Frequency Heap: statics, MethodTables, FieldDescs, interface map
Low Frequency Heap: EEClass, ClassLoader and lookup tables
Stub Heap: stubs for CAS, COM wrappers, P/Invoke
Large Object Heap: memory allocations that require more than 85,000 bytes
GC Heap: user allocated heap memory private to the app
JIT Code Heap: memory allocated by mscoree (Execution Engine) and the JIT compiler for managed code
Process/Base Heap: interop/unmanaged allocations, native memory, etc
HTH
Each "class" requires enough memory to hold all of its JIT-compiled code for all of its members that have been called by the runtime (although if you don't call a method for quite some time, the CLR can release that memory and re-JIT it if you call it again), plus enough memory to hold all static variables declared in the class. This memory is allocated only once per class, no matter how many instances of the class you create.
For each instance of the class that you create (and that has not been garbage collected), you can approximate the memory footprint by adding up the memory used by each instance field:
reference variables (refs to other objects) take 4 or 8 bytes (32-bit or 64-bit process)
Int16, Int32, Int64 take 2, 4, or 8 bytes, respectively...
a string variable takes extra storage for some metadata elements (plus the size of the address pointer)
In addition, each reference variable in an object could also be considered to "indirectly" include the memory taken up on the heap by the object it points to, although you would probably want to count that memory as belonging to that object not the variable that references it...
etc. etc.
To get a general sense for the memory allocation in your application, use the following sos command in WinDbg
!dumpheap -stat
Note that !dumpheap only gives you the bytes of the object type itself, and doesn't include the bytes of any other object types that it might reference.
If you want to see the total held bytes (the sum of the bytes of all objects referenced by your object) of a specific object type, use a memory profiler like dotTrace - http://www.jetbrains.com/profiler/
If you can - Serialize it!
Dim myObjectSize As Long
Dim ms As New IO.MemoryStream
Dim bf As New Runtime.Serialization.Formatters.Binary.BinaryFormatter()
bf.Serialize(ms, myObject)
myObjectSize = ms.Position
There is the academic question of What is the size of an object at runtime? And that is interesting, but it can only be properly answered by a profiler that is attached to the running process. I spent quite a while looking at this recently and determined that there is no generic method that is accurate and fast enough that you would ever want to use it in a production system. Simple cases like arrays of numerical types have easy answers, but beyond this the best answer would be Don't bother trying to work it out. Why do you want to know this? Is there other information available that could serve the same purpose?
In my case I ended up wanting to answer this question because I had various data that were useful, but could be discarded to free up RAM for more critical services. The poster boys here are an Undo Stack and a Cache.
Eventually I concluded that the right way to manage the size of the undo stack and the cache was to query for the amount of available memory (it's a 64-bit process so it is safe to assume it is all available) and then allow more items to be added if there is a sufficiently large buffer of RAM and require items to be removed if RAM is running low.
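The policy described above can be sketched roughly as follows. This is only an illustration of the idea, not the code I actually used: BudgetedUndoStack is an invented name, and the available-memory probe is passed in by the caller because how you query free RAM is platform-specific.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Undo stack that discards its oldest entries when free memory runs low.
class BudgetedUndoStack {
public:
    // `availableBytes` is a caller-supplied probe for free memory (hypothetical);
    // `reserveBytes` is the buffer of free RAM that must be kept available.
    BudgetedUndoStack(std::function<std::size_t()> availableBytes, std::size_t reserveBytes)
        : availableBytes_(std::move(availableBytes)), reserveBytes_(reserveBytes)
    {
        assert(availableBytes_);
    }

    // Adds an entry, then gives memory back if the reserve is no longer met.
    void Push(std::string state)
    {
        entries_.push_back(std::move(state));
        Trim();
    }

    // Drops the oldest entries while free memory is below the reserve,
    // always keeping the most recent entry.
    void Trim()
    {
        while (entries_.size() > 1 && availableBytes_() < reserveBytes_)
            entries_.pop_front();
    }

    std::size_t Count() const { return entries_.size(); }

private:
    std::function<std::size_t()> availableBytes_;
    std::size_t reserveBytes_;
    std::deque<std::string> entries_;
};
```

The same shape works for a cache: nothing ever needs to know the exact size of an individual object, only whether the process still has enough headroom.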
For any Unity Dev lurking around for an answer, here's a way to compare two different class memory allocations inspired by #varun's answer:
void Start()
{
    var totalMemory = System.GC.GetTotalMemory(false);
    var class1 = new Class1[100000];
    System.GC.KeepAlive(class1);
    for (int i = 0; i < 100000; i++)
    {
        class1[i] = new Class1();
    }
    var newTotalMemory = System.GC.GetTotalMemory(false);
    Debug.Log($"Class1: {newTotalMemory} - {totalMemory} = {newTotalMemory - totalMemory}");
    var class2 = new Class2[100000];
    System.GC.KeepAlive(class2);
    for (int i = 0; i < 100000; i++)
    {
        class2[i] = new Class2(10, 10);
    }
    var newTotalMemory2 = System.GC.GetTotalMemory(false);
    Debug.Log($"Class2: {newTotalMemory2} - {newTotalMemory} = {newTotalMemory2 - newTotalMemory}");
}