How to get around Marshal.Copy (32bit) length limit? - c#

I'm trying to move data back and forth between managed (C#) and unmanaged (C++ Win32) code. Marshal.Copy works just fine until the data sets grow beyond 2 GB, because Marshal.Copy takes a signed 32-bit int (2 GB limit) for its length parameter.
Any idea how to get around this? Currently I use AllocHGlobal(IntPtr) on the managed side and .ToPointer() on the unmanaged side. If I can't use Marshal.Copy to move large data (> 2 GB) back and forth, what can I use?

My first reaction was: why are you copying 2GB+ of data?
Perhaps your application constraints won't allow it, but it seems to me that if your data set is larger than what the framework allows, you should not be looking for tricks to get around the framework. How about another method of access altogether?
There are numerous ways around this problem. For starters, you could wrap the memory in a stream and pull the data into the unmanaged code. You could also create your own interface to bring the data in piecemeal. Memory-mapped files come to mind as well.
Without knowing the specific constraints of the application (maybe you cannot change the unmanaged code), I would suggest finding another method rather than working around the framework.
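A sketch of the piecemeal idea, assuming you can extend the native DLL: export a chunked fill function and drive it with lengths that stay below the 2 GB limit. FillChunk and CopyInChunks are hypothetical names for illustration; on the managed side, the same loop would advance an IntPtr by the chunk size and call Marshal.Copy once per chunk.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>

// Hypothetical exported entry point: copies `count` bytes of native data,
// starting at `offset`, into the caller-supplied buffer. The managed side
// would P/Invoke this repeatedly with chunk sizes below int.MaxValue.
extern "C" void FillChunk(const std::uint8_t* src, std::uint8_t* dst,
                          std::uint64_t offset, std::uint64_t count)
{
    std::memcpy(dst, src + offset, static_cast<std::size_t>(count));
}

// Driving loop as it would look on the native side; the C# mirror of this
// loop advances an IntPtr and calls Marshal.Copy per chunk.
void CopyInChunks(const std::uint8_t* src, std::uint8_t* dst,
                  std::uint64_t total,
                  std::uint64_t chunk = 1ull << 30) // 1 GB, well under 2 GB
{
    for (std::uint64_t off = 0; off < total; off += chunk)
        FillChunk(src, dst + off, off, std::min<std::uint64_t>(chunk, total - off));
}
```

The only requirement is that each individual transfer fits in a signed 32-bit length; the total moved can be arbitrarily large.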

Related

Marshal C# struct into a byte[]

Is there a way to serialize a C# structure, annotated with [StructLayout], into a managed byte array, i.e. a byte[], either premade or freshly allocated?
I can see marshaling to unmanaged memory then copying, but that's ugly.
Check out MemoryMarshal.Cast<TFrom, TTo>(). It will easily allow you to convert byte/short/int/long arrays to structures and back.
https://learn.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.memorymarshal?view=netcore-3.1
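On the native side, the equivalent of that cast is a plain memcpy of a trivially copyable struct into a premade byte buffer, with no extra allocation. A minimal C++ sketch (Packet, ToBytes, and FromBytes are made-up names for illustration):

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// A POD struct with explicit byte layout, mirroring a C# struct
// declared with [StructLayout(LayoutKind.Sequential, Pack = 1)].
#pragma pack(push, 1)
struct Packet {
    std::uint16_t id;
    std::uint32_t length;
};
#pragma pack(pop)
static_assert(std::is_trivially_copyable_v<Packet>, "must be memcpy-safe");

// Serialize into a caller-provided buffer; same reinterpretation
// MemoryMarshal.Cast performs on the managed side.
void ToBytes(const Packet& p, std::uint8_t* out) { std::memcpy(out, &p, sizeof p); }

Packet FromBytes(const std::uint8_t* in)
{
    Packet p;
    std::memcpy(&p, in, sizeof p);
    return p;
}
```

The static_assert is the native analogue of the "blittable" requirement MemoryMarshal.Cast imposes.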
In my experience, mixing managed and unmanaged data is all about clearly defining the transition from one space to the other.
When I have had the requirement of going from native to managed, or the other way, the first step has always been to copy the data to the 'target' space and then forward it.
I assume that you are already familiar with the interop services, since you mentioned copying and [StructLayout].
https://learn.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.marshal.ptrtostructure?view=netcore-3.1#System_Runtime_InteropServices_Marshal_PtrToStructure_System_IntPtr_System_Object_
If you find a better way please do tell

Working with strings between C# and C++

I have a problem that I think is very common among low-level developers who start working with high-level languages. I did some research but didn't find a suitable solution to my problem.
I have a client-server application and I want to re-create the whole client from scratch using C#, but as I want to keep the server as it is, the new client has to follow the existing protocols for sending and receiving data packets.
If the packets contained only unmanaged types (byte, int, etc.) it would be fine, but my problem comes up when I have to work with strings between the managed client in C# and the unmanaged server in C/C++.
I have already found some solutions to this problem, but I really don't know which is the best, or whether there is a better solution that I didn't notice.
The first approach is to use the fixed keyword to create structs that represent each packet of the application. This way C# allows me to do my pointer casts (from byte* to packet_structure* and vice versa) easily and at very low cost (I guess), without worrying about marshaling and other expensive methods. I think this approach is particularly bad because it forces me to use an sbyte buffer where the packet should contain a string, and then cast those bytes into the managed world to be processed properly.
The other approach I tried was to declare the struct of each packet as managed, but marked with MarshalAs attributes so the Marshal methods can convert the data. I think this approach is more expensive, considering that the application is a game and the client I'm writing is a real-time application; making extensive use of marshaling on each packet could be very expensive.
Is there any simple and clean solution, something like the C style, that my inexperience is hiding from me? In C a simple pointer cast would be enough to solve the problem.
I'm in love with C#, but I'm veeeery afraid I couldn't do this on this platform :(
Could anyone please shed some light? :/
Thanks very much for your time; any kind of help will be VERY appreciated.
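For reference, one common shape for the native side of such a protocol, assuming the server uses fixed-size, NUL-terminated char fields inside packed structs. LoginPacket and its fields are invented for illustration; the matching C# declaration would typically mark the string field with [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)].

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical wire format on the C/C++ server: every byte position is
// fixed, so both sides agree on the layout with #pragma pack(1).
#pragma pack(push, 1)
struct LoginPacket {
    std::uint16_t opcode;
    char          name[32]; // fixed-size, NUL-terminated string field
};
#pragma pack(pop)

LoginPacket MakeLogin(const char* userName)
{
    LoginPacket p{};            // zero-fill so the name field is padded with NULs
    p.opcode = 0x01;
    std::strncpy(p.name, userName, sizeof p.name - 1); // last byte stays NUL
    return p;
}
```

With this layout, the string cost is a single bounded copy per packet; there is no per-field marshaling beyond the char-array conversion.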

What is unsafe code in C# and why would you use it? [duplicate]

I read a question today about safe and unsafe code, and then read about it on MSDN, but I still don't understand it. Why would you want to use pointers in C#? Is this purely for speed?
There are three reasons to use unsafe code:
APIs (as noted by John)
Getting actual memory address of data (e.g. access memory-mapped hardware)
Most efficient way to access and modify data (time-critical performance requirements)
Sometimes you'll need pointers to interface your C# to the underlying operating system or other native code. You're strongly discouraged from doing so, as it is "unsafe" (natch).
There will be some very rare occasions where your performance is so CPU-bound that you need that minuscule extra bit of performance. My recommendation would be to write those CPU-intensive pieces in a separate module in assembler or C/C++, export an API, and have your .NET code call that API. A possible additional benefit is that you can put platform-specific code in the unmanaged module and leave the .NET code platform-agnostic.
I tend to avoid it, but there are some times when it is very helpful:
for performance working with raw buffers (graphics, etc)
needed for some unmanaged APIs (also pretty rare for me)
for cheating with data
As an example of the last, I maintain some serialization code. Writing a float to a stream without using BitConverter.GetBytes (which creates an array each time) is painful, but I can cheat:
float f = ...;
int i = *(int*)&f;
Now I can use shifts (>>) etc. to write i much more easily than I could write f (the bytes will be identical to those from BitConverter.GetBytes, and I now control the endianness by how I apply the shifts).
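The same bit reinterpretation on the native side can be written without undefined behavior via memcpy (C++20 also offers std::bit_cast); a small sketch:

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret a float's bits as a 32-bit integer without aliasing UB;
// equivalent to the C# *(int*)&f trick above, and compilers typically
// reduce the memcpy to a single register move.
std::uint32_t FloatBits(float f)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");
    std::uint32_t i;
    std::memcpy(&i, &f, sizeof i);
    return i;
}
```

For example, FloatBits(1.0f) yields 0x3F800000, the IEEE 754 encoding of 1.0.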
There is at least one managed .NET API that often makes using pointers unavoidable. See SecureString and Marshal.SecureStringToGlobalAllocUnicode.
The only way to get the plain-text value of a SecureString is to use one of the Marshal methods to copy it to unmanaged memory.

Passing an array pointer for placing in a hashmap in C++

I'm kinda new to C++ (coming from C#).
I'd like to pass an array of data to a function (as a pointer).
void someFunc(uint8_t* data)
{
    // add this data to a hashmap (key comes from elsewhere)
    table[key] = data; // table: e.g. std::unordered_map<Key, uint8_t*>
}
This data will be added into a hashmap (some key-value based object).
In C#, I could just add the passed reference to a dictionary and be done with it.
Can the same be done in C++? Or must I create a COPY of the data and only add that to the data structure that stores it?
I have seen this pattern in some code examples, but I am not 100% sure why it is needed, or whether it can be avoided at certain times.
Not sure where your key is coming from... but std::map and std::unordered_map are probably what you are looking for.
The underlying data structure of std::map is a binary tree, while std::unordered_map is a hash table.
Note that std::unordered_map was added in the C++11 standard.
It all depends on how you pass the data and how it is created. If the data is created on the heap (using new), you can just put the pointer or reference to it in your table. On the other hand, if the function takes the argument by value, you will need to make a copy at some point, because if you store the address of a temporary, bad things will happen :).
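One way to sidestep the dangling-pointer problem the answer above warns about is to let the container own the bytes outright, e.g. a map of vectors. A minimal sketch (the names table and someFunc are illustrative):

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// The map owns its values, so there is no stale address to worry about:
// when an entry is erased, its bytes are freed automatically (RAII).
std::unordered_map<int, std::vector<std::uint8_t>> table;

void someFunc(int key, const std::uint8_t* data, std::size_t len)
{
    // one copy at the ownership boundary; afterwards the map owns the data
    table.emplace(key, std::vector<std::uint8_t>(data, data + len));
}
```

If the data is already heap-allocated and you want zero copies, storing a std::unique_ptr as the mapped type achieves the same ownership guarantee.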
As for what data structure to use, and how they work, I've found one of the best references is cppreference
Heap allocation should be reserved for special cases. Stack allocation is faster and easier to manage. You should read up on RAII (very important), and try to find material on dynamic vs. automatic memory allocation.
I just found an article specifically about going from C# to C++, so I figured it'd be perfect for you. Good luck; C++ can be one of the more difficult languages to learn, so don't assume anything will work the same as it does in C#. MSDN also has a nice C# vs. C++ comparison.

How to share a big byte array between C++ and C#

I need to share a huge (many megabytes) byte array between a C++ program residing in a DLL and a C# program.
I need realtime performance, so it is very important I can share it between the two in an efficient way, so making a new copy for it in C# every time after the data is manipulated in C++ is not an option, yet the examples I have found so far seems to depend on this.
Is it possible to share the array in an efficient way? And if so, how?
In current versions of .NET, any multi-megabyte array will end up on the large object heap and never move. However, to be safe, you should pin the array as fejesjoco said. Then the C++ code can save a pointer into the .NET array and update it in-place.
Use memory mapped file. System.IO.MemoryMappedFiles in .NET and CreateFileMapping in C++.
The .NET marshaler may pass a copy rather than a reference. To prevent that, call GCHandle.Alloc(array, GCHandleType.Pinned); then you can get the address of the pinned object and pass it to the DLL as a pointer.
In VB.NET, I've accomplished this by passing a reference to the first byte. There's no type checking to ensure that what's passed is really an array of the proper size, but as long as the routine is running, the array will be pinned.
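The native side of the pinning approach might look like this sketch: the DLL receives the pinned array's address and mutates it in place, with no copies (ProcessBuffer is a hypothetical export):

```cpp
#include <cstddef>
#include <cstdint>

// Receives the address of the pinned managed byte[] (obtained via
// GCHandle.AddrOfPinnedObject on the C# side) and updates it in place.
// Because the GC cannot move a pinned array, the pointer stays valid
// until the handle is freed.
extern "C" void ProcessBuffer(std::uint8_t* data, std::size_t len)
{
    for (std::size_t i = 0; i < len; ++i)
        data[i] ^= 0xFF; // in-place change, visible to C# immediately
}
```

No marshaling occurs per call; both sides read and write the same physical memory.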