I am receiving a buffer, and from it I want to create a new buffer (concatenating bytes as prefix, infix and postfix) and later send it to a socket.
E.g.:
Initial buffer: "aaaa"
Final buffer: "$4\r\naaaa\r\n" (Redis RESP protocol - Bulk Strings)
How can I transform the Span to Memory? (I don't know whether I should use stackalloc, given that I don't know how big the input buffer is; I figured it would be faster.)
private static readonly byte[] RESP_BULK_ID = BitConverter.GetBytes('$');
private static readonly byte[] RESP_FOOTER = Encoding.UTF8.GetBytes("\r\n");

static Memory<byte> GetNodeSpan(in ReadOnlyMemory<byte> payload) {
    ReadOnlySpan<byte> payloadHeader = BitConverter.GetBytes(payload.Length);
    Span<byte> result = stackalloc byte[
        RESP_BULK_ID.Length +
        payloadHeader.Length +
        RESP_FOOTER.Length +
        payload.Length +
        RESP_FOOTER.Length
    ];
    Span<byte> cursor = result;
    RESP_BULK_ID.CopyTo(cursor);
    cursor = cursor.Slice(RESP_BULK_ID.Length);
    payloadHeader.CopyTo(cursor);
    cursor = cursor.Slice(payloadHeader.Length);
    RESP_FOOTER.CopyTo(cursor);
    cursor = cursor.Slice(RESP_FOOTER.Length);
    payload.Span.CopyTo(cursor);
    cursor = cursor.Slice(payload.Span.Length);
    RESP_FOOTER.CopyTo(cursor);
    return new Memory<byte>(result.AsBytes()); // ? cannot convert from Span to Memory, and can't return a Span because it can be referenced outside of scope
}
P.S.: Should I use old-school for loops instead of CopyTo?
Memory<T> is designed to have some managed object (for example an array) as its target. Converting Memory<T> to Span<T> simply creates a span over the target object's memory. But the opposite conversion is not possible: because a Span<T> can point to memory that does not belong to any managed object (unmanaged memory, the stack, etc.), it cannot be directly converted to a Memory<T>. (There is actually a way to do this, but it involves implementing your own MemoryManager<T> similar to NativeMemoryManager; it is unsafe and dangerous, and I'm pretty sure it is not what you want.)
Using stackalloc is a bad idea here for two reasons:
Since you don't know the size of the payload in advance, you could easily get a StackOverflowException if the payload is too big.
(As the comment in your source code already suggests) it is a terrible idea to return something allocated on the stack of the current method; it would likely result in either corrupted data or an application crash.
The only way to return a result on the stack would require the caller of GetNodeSpan to stackalloc memory in advance, convert it to a Span<T> and pass it in as an additional argument. The problem is that (1) the caller of GetNodeSpan would have to know how much to allocate, and (2) it still would not help you convert Span<T> to Memory<T>.
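For completeness, that caller-allocates pattern usually looks something like the sketch below (the TryWriteNode name and shape are illustrative, not part of your code); the caller decides where the destination lives and the method reports how much it wrote. Note that this still does not give you a Memory<byte>.

// Illustrative sketch only: a Try-style writer that fills a caller-provided Span<byte>.
// It reuses the RESP_BULK_ID / RESP_FOOTER fields from the question.
static bool TryWriteNode(ReadOnlySpan<byte> payload, Span<byte> destination, out int written)
{
    ReadOnlySpan<byte> payloadHeader = BitConverter.GetBytes(payload.Length);
    written = RESP_BULK_ID.Length + payloadHeader.Length + RESP_FOOTER.Length
            + payload.Length + RESP_FOOTER.Length;
    if (destination.Length < written)
    {
        written = 0;
        return false; // caller did not allocate enough
    }

    Span<byte> cursor = destination;
    RESP_BULK_ID.CopyTo(cursor);   cursor = cursor.Slice(RESP_BULK_ID.Length);
    payloadHeader.CopyTo(cursor);  cursor = cursor.Slice(payloadHeader.Length);
    RESP_FOOTER.CopyTo(cursor);    cursor = cursor.Slice(RESP_FOOTER.Length);
    payload.CopyTo(cursor);        cursor = cursor.Slice(payload.Length);
    RESP_FOOTER.CopyTo(cursor);
    return true;
}

// The caller owns the buffer, so stackalloc is legal here (for small payloads only):
// Span<byte> buffer = stackalloc byte[64];
// if (TryWriteNode(payload.Span, buffer, out int length)) { /* use buffer.Slice(0, length) */ }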
So to store the result, you will need an object allocated on the heap. The simple solution is to allocate a new array instead of using stackalloc. Such an array can then be used to construct a Span<T> (used for copying) as well as a Memory<T> (used as the method result):
static Memory<byte> GetNodeSpan(in ReadOnlyMemory<byte> payload)
{
    ReadOnlySpan<byte> payloadHeader = BitConverter.GetBytes(payload.Length);
    byte[] result = new byte[RESP_BULK_ID.Length +
                             payloadHeader.Length +
                             RESP_FOOTER.Length +
                             payload.Length +
                             RESP_FOOTER.Length];
    Span<byte> cursor = result;
    // ...
    return new Memory<byte>(result);
}
The obvious drawback is that you have to allocate new array for each method call. To avoid this, you can use memory pooling, where allocated arrays are reused:
static IMemoryOwner<byte> GetNodeSpan(in ReadOnlyMemory<byte> payload)
{
    ReadOnlySpan<byte> payloadHeader = BitConverter.GetBytes(payload.Length);
    var result = MemoryPool<byte>.Shared.Rent(
        RESP_BULK_ID.Length +
        payloadHeader.Length +
        RESP_FOOTER.Length +
        payload.Length +
        RESP_FOOTER.Length);
    Span<byte> cursor = result.Memory.Span;
    // ...
    return result;
}
Please note that this solution returns IMemoryOwner<byte> (instead of Memory<T>). The caller can access the Memory<T> through the IMemoryOwner<T>.Memory property and must call IMemoryOwner<byte>.Dispose() to return the array to the pool when the memory is no longer needed. The second thing to notice is that MemoryPool<byte>.Shared.Rent() can return an array that is longer than the required minimum. Your method will therefore probably also need to return the actual length of the result (for example as an out parameter), because IMemoryOwner<byte>.Memory.Length can report more than was actually copied into the result.
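Usage at the call site might then look roughly like the following sketch, inside an async method. This assumes a hypothetical variant of the method extended with that out parameter, and a Socket from System.Net.Sockets; neither is in the code above.

// Hypothetical signature assumed here:
//   static IMemoryOwner<byte> GetNodeSpan(in ReadOnlyMemory<byte> payload, out int written)
using (IMemoryOwner<byte> owner = GetNodeSpan(payload, out int written))
{
    // Slice, because the rented buffer may be longer than what was actually written.
    ReadOnlyMemory<byte> frame = owner.Memory.Slice(0, written);
    await socket.SendAsync(frame, SocketFlags.None);
} // Dispose() returns the rented array to MemoryPool<byte>.Shared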
P.S.: I would expect a for loop to be marginally faster only when copying very short arrays (if at all), where you can save a few CPU cycles by avoiding a method call. But Span<T>.CopyTo() uses an optimized method that can copy several bytes at once and (I strongly believe) uses special CPU instructions for copying blocks of memory, so it should be much faster.
Related
I need to create an array that is aligned to a 64-byte boundary. I need to do this because I'm calling a DLL that uses AVX, which requires the data to be aligned. Essentially I need to do this in C#:
void* ptr = _aligned_malloc(64 * 1024, 64);
int8_t* memory_ptr = (int8_t*)ptr;
I'm pretty sure I can't create an array on such a boundary naturally in C#. So one option is to create a byte array that is x+64 long, and then 'create' an array that overlays it, but with an offset at the required boundary.
The problem is how I accomplish this without a memory leak. (Memory leaking is the reason I'd rather not have the DLL create the array and pass a reference to it back to C#. Unless there is a good way to do so?)
Using the helpful answers below, this is what I have, hopefully it helps others:
public class Example : IDisposable
{
    private ulong memory_ptr;

    public unsafe Example()
    {
        // 0x10000 bytes = 64 * 1024, aligned to a 64-byte boundary
        memory_ptr = (ulong)NativeMemory.AlignedAlloc(0x10000, 64);
    }

    public unsafe Span<byte> Memory => new Span<byte>((void*)memory_ptr, 0x10000);

    public unsafe void Dispose()
    {
        // Memory from AlignedAlloc must be released with AlignedFree
        NativeMemory.AlignedFree((void*)memory_ptr);
    }
}
As mentioned, .NET 6 has NativeMemory.AlignedAlloc. You need to make sure to call AlignedFree otherwise you could get a leak.
void* a = default;
try
{
a = NativeMemory.AlignedAlloc(size * sizeof(long), 64);
var span = new Span<long>(a, size);
// fill span
// call DLL with span
}
finally
{
NativeMemory.AlignedFree(a);
}
A pinned GCHandle is another option for older versions of .NET. You then need to calculate the starting aligned offset with the following code, where alignment would be 64 in your case.
var ptr = (long)handle.AddrOfPinnedObject();
var offset = (int) ((ptr + alignment - 1) / alignment * alignment - ptr) / sizeof(long);
Again you need to make sure to call handle.Free in a finally.
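Put together, the over-allocate-and-pin approach looks roughly like this sketch (the sizes and the SendBuffer-style usage are illustrative only):

const int alignment = 64;
int size = 1024; // number of longs the native code needs

// Over-allocate by one alignment block so an aligned start always exists inside the array.
long[] buffer = new long[size + alignment / sizeof(long)];
GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
try
{
    var ptr = (long)handle.AddrOfPinnedObject();
    var offset = (int)((ptr + alignment - 1) / alignment * alignment - ptr) / sizeof(long);

    // buffer[offset] is the first element sitting on a 64-byte boundary.
    IntPtr aligned = handle.AddrOfPinnedObject() + offset * sizeof(long);
    // fill buffer[offset .. offset + size - 1], then pass `aligned` to the DLL
}
finally
{
    handle.Free();
}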
To avoid the memory leak, first you need to pin the array. Pinning prevents the object pointed to from moving on the garbage-collected heap.
There's an example of something similar to what you're doing here.
However, that example doesn't go far enough as it only pins without controlling the initial memory allocation. To also prevent the memory leak, instead use GCHandle.Alloc with GCHandleType.Pinned. Like this.
I'm using a library which has a function SendBuffer(IntPtr pointer, int size) that takes an IntPtr as a parameter.
var list = new List<float>{3, 2, 1};
IntPtr ptr = list.getPointerToInternalArray();
SendBuffer(ptr, list.Count);
How to get IntPtr from the array stored in List<T> (and/or T[])?
If this is a P/Invoke call into unmanaged code, you should retrieve the pinned address of the buffer (to prevent the GC from relocating it) and pass that to the method:
// use an array as a buffer
float[] buffer = new float[] { 3, 2, 1 };

// pin it to a fixed address:
GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
try
{
    // retrieve the address as a pointer and use it to call the native method
    SendBuffer(handle.AddrOfPinnedObject(), buffer.Length);
}
finally
{
    // free the handle so GC can collect the buffer again
    handle.Free();
}
The array is sent every frame and it's big
In that case it might be warranted to access the internal backing array that List uses. This is a hack and brittle in the face of future .NET versions. That said .NET uses a very high compatibility bar and they probably would not change a field name in such a core type. Also, for performance reasons it is pretty much guaranteed that List will always use a single backing array for its items. So although this is a high risk technique it might be warranted here.
Or, better yet, write your own List that you control and that you can get the array from. (Since you seem to be concerned with perf I wonder why you are using List<float> anyway because accessing items is slower compared to a normal array.)
Get the array, then use fixed (float* ptr = array) SendBuffer((IntPtr)ptr, length) to pin it and pass it without copying memory.
There is no need to use the awkward and slow GCHandle type here. Pinning with fixed uses an IL feature that makes this super fast; it should be near zero cost.
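On recent runtimes (.NET 5 and later) there is also a supported way to reach the list's backing storage without the reflection hack: CollectionsMarshal.AsSpan gives you a Span<float> over the list's internal array, which you can then pin with fixed. A sketch, assuming the SendBuffer(IntPtr, int) signature from the question:

using System.Runtime.InteropServices; // CollectionsMarshal

// The span points at the list's own storage; it is valid only while the list is not resized.
Span<float> items = CollectionsMarshal.AsSpan(list);
unsafe
{
    fixed (float* p = items)
    {
        SendBuffer((IntPtr)p, items.Length);
    }
}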
The internal backing array of a List<T> is not exposed through its public API (and is usually larger than Count), so the straightforward approach is to create a local array copy using ToArray in order for this to work.
Once you have, there are a couple of options.
First you can use the fixed keyword to pin the array and get a pointer to it:
float[] buffer = theList.ToArray();
unsafe
{
fixed (float* p = buffer)
{
IntPtr ptr = (IntPtr)p;
SomeFunction(ptr);
}
}
Alternatively you can tell the garbage collector to fix the data in memory until you're done with the operation, like this:
GCHandle pinned = GCHandle.Alloc(buffer, GCHandleType.Pinned);
IntPtr ptr = pinned.AddrOfPinnedObject();
SomeFunction(ptr);
pinned.Free();
(Or see taffer's answer with more error handling).
In both cases you need to finish with the value before returning, so you can't use either method to get an IntPtr to the array as a return value. Doing it this way minimizes the opportunity for that pointer to be used for evil.
I have a function which generates and returns a MemoryStream. After generation the size of the MemoryStream is fixed; I don't need to write to it anymore, only read it out: write it to a MailAttachment or to a database, for example.
What is the best way to hand the object around? MemoryStream or Byte Array? If I use MemoryStream I have to reset the position after read.
If you have to hold all the data in memory, then in many ways the choice is arbitrary. If you have existing code that operates on Stream, then MemoryStream may be more convenient, but if you return a byte[] you can always just wrap that in a new MemoryStream(blob) anyway.
It might also depend on how big it is and how long you are holding it for; MemoryStream can be oversized, which has advantages and disadvantages. Forcing it to a byte[] may be useful if you are holding the data for a while, since it will trim off any excess; however, if you are only keeping it briefly, it may be counter-productive, since it will force you to duplicate most (at an absolute minimum: half) of the data while you create the new copy.
So; it depends a lot on context, usage and intent. In most scenarios, "whichever works, and is clear and simple" may suffice. If the data is particularly large or held for a prolonged period, you may want to deliberately tweak it a bit.
One additional advantage of the byte[] approach: if needed, multiple threads can access it safely at once (as long as they are reading) - this is not true of MemoryStream. However, that may be a false advantage: most code won't need to access the byte[] from multiple threads.
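To make the oversizing trade-off above concrete: ToArray() always produces a right-sized copy, while TryGetBuffer() hands back the stream's internal (possibly larger) buffer without copying, for streams that expose their buffer (the parameterless MemoryStream constructor does). A small sketch:

var ms = new MemoryStream();
// ... write the data ...

byte[] exact = ms.ToArray(); // new, trimmed copy; exact.Length == ms.Length

// No copy, but the segment may be backed by a larger internal array:
if (ms.TryGetBuffer(out ArraySegment<byte> segment))
{
    // segment.Array is the internal buffer; only segment.Count bytes are valid data
}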
A MemoryStream is meant to be read and written as a stream: it maintains a position pointer and only simulates random access through seeking, so it is not designed for jumping to an arbitrary element at any time.
A byte array allows random access to any element for as long as it is alive.
As its name suggests, a MemoryStream also lives entirely in memory, and because it is backed by a byte[] its capacity is limited to about 2 GB.
In short: use a byte[] if you need to access the data by index. Otherwise, a MemoryStream is designed for feeding the data to something that expects a Stream as input when all you have is bytes in memory.
Use a byte[]: it's a fixed-size object, which makes allocation and cleanup simple, and it carries essentially no overhead, especially since you don't need the functions of the MemoryStream. Furthermore, you want that stream disposed of as soon as possible so it can release any unmanaged resources it may be using.
I'm sending some packets of data across the network and they arrive in byte[]s, lets say the structure is
[int, int, byte, int]
If this were C++ I would declare a struct* and point it at the byte[]. I'm doing this project in C#, and I'm not sure whether it is worth it given the marshalling overhead, or whether there is a better way to handle it in C#; I'm all ears.
Update, for clarity:
Basically, what he is doing
Marshaling a Byte array to a C# structure
Except I'm wondering if it is worth it.
I think marshaling is the best option. You could parse the byte array by yourself using BitConverter, but that would require more work on your part and is not as flexible.
The only real reason to do it that way would be to squeeze every last bit of performance out of the system. In my opinion, you're better off writing it using BitConverter to make sure it's working. Then, if getting the data is a performance bottleneck, consider doing the marshaling.
For example, given a struct:
struct MyStruct
{
    private int f1;
    private int f2;
    private byte f3;
    private int f4;

    public MyStruct(int i1, int i2, byte b1, int i4)
    {
        f1 = i1;
        f2 = i2;
        f3 = b1;
        f4 = i4;
    }

    // assume there are public get accessors
}
Then you can create a new one from the buffer with:
var s = new MyStruct(BitConverter.ToInt32(buff, 0),
                     BitConverter.ToInt32(buff, 4),
                     buff[8],                    // a single byte can be read directly
                     BitConverter.ToInt32(buff, 9));
That's a whole lot easier to write and verify than the marshaling, and probably will be fast enough for your needs.
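If the BitConverter version ever does become the bottleneck and you want to try the marshaling route the linked question describes, a hedged sketch looks like this (the MyPacket name, field order and Pack = 1 are assumptions that must match your actual wire format):

using System.Runtime.InteropServices;

// Pack = 1 so the byte field is not padded; the wire layout is assumed here.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct MyPacket
{
    public int F1;
    public int F2;
    public byte F3;
    public int F4;
}

static MyPacket ReadPacket(byte[] buff)
{
    GCHandle handle = GCHandle.Alloc(buff, GCHandleType.Pinned);
    try
    {
        return Marshal.PtrToStructure<MyPacket>(handle.AddrOfPinnedObject());
    }
    finally
    {
        handle.Free();
    }
}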
Well, I guess everyone has their own 'favourite' way. When receiving protocol units over a byte stream in any OO language, I usually feed every received byte into a 'ProtocolUnit' class instance by calling its bool addByte() method. A state machine in the class handles the bytes and sanity-checks the assembled fields. If a ProtocolUnit has been received in its entirety, addByte() returns true to tell the caller that a PDU has been correctly assembled. Usually the instance is then queued off to whatever is going to handle it and a new ProtocolUnit is created (or depooled) so it can start assembling the next PDU.
It's implicit that the start of a message can be identified so that, in case of an error, the state machine can either reset itself, dumping the erroneous data, or return true from the addByte() call with a suitable errorMessage set that the caller can check to decide what to do (e.g. if the errorMessage property is "" then queue to the handler, else queue to the error logger).
I'm sure that you consider this a massive overkill, but it works for me :)
Rgds,
Martin
P.S.: try to avoid protocols where the length is transmitted at the start and is the only way to identify message start/end. This is very fragile and prone to blowing up, especially over unreliable transports like UDP. Even with TCP, I have known a x****x router that would occasionally add a null to packets...
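To make the ProtocolUnit idea above concrete, here is a stripped-down sketch. The class shape, the names, and the fixed [int, int, byte, int] layout from the question are purely illustrative; a real implementation would also handle start-of-message detection and error recovery:

class ProtocolUnit
{
    // Fixed-size PDU for illustration: [int, int, byte, int] = 13 bytes.
    private readonly byte[] _buffer = new byte[13];
    private int _count;

    public string ErrorMessage { get; private set; } = "";

    // Feed one received byte; returns true once a complete PDU has been assembled.
    public bool AddByte(byte b)
    {
        _buffer[_count++] = b;
        if (_count < _buffer.Length)
            return false;

        // Whole unit received: decode and sanity-check the fields.
        int f1 = BitConverter.ToInt32(_buffer, 0);
        int f2 = BitConverter.ToInt32(_buffer, 4);
        byte f3 = _buffer[8];
        int f4 = BitConverter.ToInt32(_buffer, 9);
        if (f1 < 0)
            ErrorMessage = "f1 out of range"; // example sanity check only
        // hand f1..f4 to whatever consumes the PDU

        _count = 0; // ready to assemble the next PDU
        return true;
    }
}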
As an alternative to BitConverter, wrap every byte[] in a MemoryStream, and extract the fields using a BinaryReader. Similar, but the stream maintains the offsets for you.
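A hedged sketch of what that looks like for the [int, int, byte, int] packet above (requires System.IO):

using var reader = new BinaryReader(new MemoryStream(buff));
int f1 = reader.ReadInt32();
int f2 = reader.ReadInt32();
byte f3 = reader.ReadByte();
int f4 = reader.ReadInt32(); // the stream tracks the offsets for you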
In C#, does the following save any memory?
private List<byte[]> _stream;
public object Stream
{
get
{
if (_stream == null)
{
_stream = new List<byte[]>();
}
return _stream;
}
}
Edit: sorry, I guess I should have been more specific.
Specifically, I mean using object instead of List<byte[]> as the property's type... I thought that would kind of give itself away, because it's a weird thing to do.
It saves a very small amount of memory; an empty List<byte[]> only takes up a few dozen bytes.
The reason why is that your reference variable _stream only needs to allocate enough memory to hold a reference to an object. Once an object is allocated, it will take up a certain amount of memory which may grow or shrink over time, such as when new byte[]s are added to the List. However the memory taken up by the reference to that object will remain the same size.
This is simpler and less prone to corner cases that cause you headaches:
private List<byte[]> _stream = new List<byte[]>();
public object Stream
{
get
{
return _stream;
}
}
Although, in most cases it's not really optimal to return references to private members when they are collections/arrays, etc. Better to return _stream.AsReadOnly().
Save memory compared to what?
byte[][] _stream;
maybe? Then no, a List<T> will take up more memory since it is an array at its heart (which isn't necessarily exactly the size of its contents, but usually larger) and some statekeeping needs to be done too.
That is lazy loading. You create the stream only when someone requests it; it will not create the stream (in your case a list) unless it is required.
One might say that it saves some memory because it doesn't use any until required, so before the stream is first used there is no memory allocated for it.
If your edit means you are asking whether using the object keyword instead of List<byte[]> as the type of the property saves memory: no, it doesn't. And your if block only saves a negligible amount of memory (and CPU at instantiation) until the first time the property is called, while making that first call slightly slower. Consider returning null instead if that makes sense for the property. And, as another answerer suggested, it may be better to keep the property read-only unless you want other classes altering it. In general, I'd say attempts at optimization like this are mostly misguided and make your code less maintainable.
Are you sure a Stream wouldn't be just a byte[] or a List of byte? Or even better, a MemoryStream? :) I think you are somewhat confused, so a bigger example and some scenario details will help a lot.
What are objects really
I'd suggest thinking of objects as structs, and of object references as pointers to that structure.
If you instantiate an object you are reserving memory for a 'struct' with all its fields (plus a reference to the class it implements), plus all memory reserved by the constructor (other objects, arrays, etc.).
In a List you are reserving memory for state keeping (I don't know exactly how it's implemented in C#) and the initial internal array, maybe of ten references. So, if you count, it's something like this (assuming a 32-bit runtime; I'm not a .NET specialist):
pointer to class: 4 bytes
pointer to array: 4 bytes
array of initialCapacity references: 40 bytes
So in my estimation it's about 48 bytes. But it depends on the implementation.
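If you'd rather measure than estimate, GC.GetAllocatedBytesForCurrentThread (available on .NET Core 3.0 and later) gives a rough per-thread figure. Note that the default List<T> constructor may defer allocating the internal array, so the number can come out smaller than the estimate above, and JIT or other allocations on the thread can skew it:

long before = GC.GetAllocatedBytesForCurrentThread();
var list = new List<byte[]>();
long after = GC.GetAllocatedBytesForCurrentThread();
// Rough figure only; treat it as an approximation.
Console.WriteLine($"Empty List<byte[]> allocated ~{after - before} bytes");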
As SoloBold says: most of the time it's not worth it.