I'm trying to read text files in a very optimized way. Right now, I'm looking at StreamReader.Read(Span<char>) and StreamReader.Read(char[], int, int).
I know Span<>s were designed to be faster by not allocating data on the heap. But I don't understand the benefit offered by using a Span<> here.
Don't both versions require the read characters to be copied one by one into my buffer? So does Span<> offer any advantage here?
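For concreteness, here is a minimal sketch of calling both overloads side by side (the file path and buffer size are placeholders, and the Read(Span<char>) overload requires .NET Core 2.1 or later):

using System;
using System.IO;

class ReadComparison
{
    static void Main()
    {
        using var reader = new StreamReader("input.txt"); // placeholder path

        // Array overload: reads into a slice of a heap-allocated char[].
        char[] arrayBuffer = new char[4096];
        int readViaArray = reader.Read(arrayBuffer, 0, arrayBuffer.Length);

        // Span overload: the buffer can be stack-allocated or wrap an existing
        // array, so the API shape itself forces no extra allocation.
        Span<char> spanBuffer = stackalloc char[4096];
        int readViaSpan = reader.Read(spanBuffer);

        Console.WriteLine($"{readViaArray} / {readViaSpan} chars read");
    }
}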
Related
I've researched a bit, and it seems the common wisdom is that structs should be under 16 bytes because otherwise they incur a performance penalty for copying. With C# 7 and ref returns it became quite easy to avoid copying structs altogether. I assume that as the struct size gets smaller, passing by ref has more overhead than just copying the value.
Is there a rule of thumb about when passing structs by value becomes faster than by ref? What factors affect this? (Struct size, process bitness, etc.)
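For illustration, here is roughly what the ref-return pattern mentioned above looks like (the struct and its fields are invented for this example):

struct Particle
{
    public float X, Y, Z;
    public float VelX, VelY, VelZ;
}

static class ParticleStore
{
    static readonly Particle[] particles = new Particle[1024];

    // Returns a reference into the array, so the caller can read or mutate
    // the element in place without copying the 24-byte struct.
    public static ref Particle Get(int index) => ref particles[index];
}

// Usage:
//   ref Particle p = ref ParticleStore.Get(42);
//   p.X += 1f;   // mutates the array element directly, no copies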
More context
I'm working on a game with the vast majority of data represented as contiguous arrays of structs for maximum cache-friendliness. As you might imagine, passing structs around is quite common in such a scenario. I'm aware that profiling is the only real way of determining the performance implications of something. However, I'd like to understand the theoretical concepts behind it and hopefully write code with that understanding in mind and profile only the edge cases.
Also, please note that I'm not asking about best practices or the sanity of passing everything by ref. I'm aware of "best practices" and implications and I deliberately choose not to follow them.
Addressing the "duplicate" tag
Performance of pass by value vs. pass by reference in C# .NET - This question discusses passing a reference type by ref, which is completely different from what I'm asking.
In .Net, when if ever should I pass structs by reference for performance reasons? - The second question touches on the subject a bit, but it's about a specific struct size.
To answer the questions from Eric Lippert's article:
Do you really need to answer that question? Yes I do. Because it'll affect how I write a lot of code.
Is that really the bottleneck? Probably not. But I'd still like to know since that's the data access pattern for 99% of the program. In my mind this is similar to choosing the correct data structure.
Is the difference relevant? It is. Passing large structs by ref is faster. I'm just trying to understand the limits of this.
What is this “faster” you speak of? As in giving less work to the CPU for the same task.
Are you looking at the big picture? Yes. As previously stated, it affects how I write the whole thing.
I know I could measure a lot of different combinations. And what would that tell me? That X is faster than Y on my combination of [.NET version, process bitness, OS, CPU]. What about Linux? What about Android? What about iOS? Should I benchmark all permutations on all possible hardware/software combinations?
I don't think that's a viable strategy. Therefore I ask here where hopefully someone who knows a lot about CLR/JIT/ASM/CPU can tell me how that works so I can make informed decisions when writing code.
The answer I'm looking for is similar to the aforementioned 16 byte guideline for struct sizes with the explanation why.
Generally, passing by reference should be faster.
When you pass a struct by reference, you are only passing a pointer to the struct, which is a 32- or 64-bit integer.
When you pass a struct by value, the entire struct is copied into the callee's stack frame (or into registers).
Unless the struct is very small, for example the size of an int, passing by reference is faster.
Note that passing by value does not involve the OS or any heap allocation; the cost is purely the CPU time spent copying the struct on every call and the extra pressure those copies put on the cache.
If you pass structs around by reference, they can be of any size; you are still dealing with an 8-byte pointer (x64 assumed). For the highest performance you need a CPU-cache-friendly design, which is called Data-Driven Design.
Games often use a particular Data-Driven Design called an Entity Component System (ECS). See the book Pro .NET Memory Management by Konrad Kokosa, Chapter 14.
The basic idea is that your game entities (e.g. Movable, Car, Plane, ...) share common properties, such as a position, which is stored for all entities in one contiguous array. If you need to increment the position of 1K entities you just look up each entity's index into the position array and update it there. This provides the best possible data locality. If everything were stored in classes, the CPU prefetcher would be defeated because it would have to chase a separate object reference for each instance.
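A minimal sketch of that idea (the component layout and names are invented for illustration): positions for all entities live in one contiguous array, and the update loop walks it linearly.

struct Position
{
    public float X, Y, Z;
}

class World
{
    // One contiguous array per component type.
    readonly Position[] positions = new Position[1024];

    public void MoveAllRight(float dx)
    {
        // Sequential memory access; the CPU prefetcher handles this pattern well.
        for (int i = 0; i < positions.Length; i++)
        {
            positions[i].X += dx;
        }
    }
}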
See this Intel post about some reference architecture: https://software.intel.com/en-us/articles/get-started-with-the-unity-entity-component-system-ecs-c-sharp-job-system-and-burst-compiler
There are plenty of Entity Component Systems out there, but so far I have seen none that uses ref structs as its main working data structure. The reason is that all the popular ones have existed much longer than C# 7.2, where ref structs were introduced.
I finally found the answer. The cutoff point is System.IntPtr.Size. In Microsoft's own words, from Write safe and efficient C# code:
Add the in modifier to pass an argument by reference and declare your design intent to pass arguments by reference to avoid unnecessary copying. You don't intend to modify the object used as that argument.
This practice often improves performance for readonly value types that are larger than IntPtr.Size. For simple types (sbyte, byte, short, ushort, int, uint, long, ulong, char, float, double, decimal and bool, and enum types), any potential performance gains are minimal. In fact, performance may degrade by using pass-by-reference for types smaller than IntPtr.Size.
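As a sketch of what that looks like in practice (the struct and its size are made up; anything larger than IntPtr.Size is the interesting case):

public readonly struct Bounds   // 32 bytes, well above IntPtr.Size on both 32- and 64-bit
{
    public readonly double MinX, MinY, MaxX, MaxY;

    public Bounds(double minX, double minY, double maxX, double maxY)
    {
        MinX = minX; MinY = minY; MaxX = maxX; MaxY = maxY;
    }
}

static class Geometry
{
    // 'in' passes a read-only reference: no 32-byte copy per call.
    // Marking the struct 'readonly' prevents hidden defensive copies.
    public static double Width(in Bounds b) => b.MaxX - b.MinX;

    // The by-value version copies all 32 bytes on every call.
    public static double WidthByValue(Bounds b) => b.MaxX - b.MinX;
}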
The basic building block type of my application decomposes into a type (class or structure) which contains some standard value types (int, bool, etc.) and some arrays of standard value types, where there will be a small (but unknown) number of elements in the collection.
Given that I have many instances of the above building block, I would like to limit the memory usage of my basic type by using an array/collection as a Value Type instead of the standard Reference Type. Part of the problem is that my standard usage will be to have the arrays containing zero, one or two elements in them and the overhead of the array reference type in this scenario is prohibitive.
I have empirically observed and research has confirmed that the array wrapper itself introduces unwanted (by me, in this situation) overhead in each instance.
How do I make a collection a Value Type / Struct in .NET?
Side Note: it is interesting that Apple's Swift language has arrays as value types by default.
Pre-Emptive Comment
I am fully aware that the above is a non-standard way of using the .NET framework and is very bad practice etc...so it's not necessary to comment to that effect. I really just want to know how to achieve what I am asking.
The fixed keyword referenced in the docs seems to be what you're looking for. It has the same constraints on types as structs do, but it does require unsafe.
internal unsafe struct MyBuffer
{
    // Inline 128-char buffer stored directly in the struct, not as a separate array object.
    public fixed char fixedBuffer[128];
}
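For completeness, a minimal sketch of using the buffer declared above (the project must be compiled with unsafe code enabled):

internal static class MyBufferDemo
{
    internal static unsafe void Fill()
    {
        var buffer = new MyBuffer();

        // A local struct lives on the stack, so its fixed buffer can be indexed
        // directly here without a 'fixed' statement.
        for (int i = 0; i < 128; i++)
            buffer.fixedBuffer[i] = 'x';

        // If MyBuffer were a field of a class (movable memory), you would pin it first:
        //   fixed (char* p = someInstance.Buffer.fixedBuffer) { /* use p */ }
    }
}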
If you wanted to also have a fixed array of your struct it would be more complicated. fixed only supports the base value types, so you'd have to drop into manual memory allocation.
A mix of ideas from a DirectBuffer and a BufferPool could work.
If you use a buffer pool then fixing buffers in memory is not a big issue because buffers become effectively long-lived and do not affect GC compaction as much as if you were fixing every new byte[] without a pool.
The DirectBuffer uses the flyweight pattern and adds very little overhead. You could read/write any blittable struct directly using pointers. Besides SBE, Flatbuffers and Cap'n Proto also use such an approach, as far as I understand. In the linked implementation you should change the delegate so that it returns a discarded byte[] to the pool.
A big advantage of such a solution is zero-copy when you need to interop with native code or send data over the network. Additionally, you could allocate a single buffer and work with offsets/lengths in an ArraySegment-like fashion.
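As a rough sketch of the pooled-buffer idea using only BCL pieces (the Header struct, its fields, and the offset are invented for illustration; ArrayPool and MemoryMarshal are the real APIs, available on .NET Core 2.1 or later):

using System;
using System.Buffers;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct Header              // a made-up blittable record layout
{
    public int Length;
    public long Timestamp;
}

static class PooledBufferDemo
{
    public static void Run()
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(4096);
        try
        {
            // Write a struct directly into the pooled bytes at offset 0, no per-field copying.
            var header = new Header { Length = 42, Timestamp = 123456789 };
            MemoryMarshal.Write(buffer.AsSpan(0), ref header);

            // Reinterpret the same bytes back as a Header.
            Header roundTripped = MemoryMarshal.Read<Header>(buffer.AsSpan(0));
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}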
Update:
I have re-read the question and realized that it was specifically about collections as value types. However the main rationale seems to be memory pressure, so this answer could be an alternative solution for memory, even though DirectBuffer is a class.
We have to interop with native code a lot, and in this case it is much faster to use unsafe structs that don't require marshaling. However, we cannot do this when the structs contain fixed size buffers of nonprimitive types.
Why is it a requirement from the C# compiler that fixed size buffers are only of the primitive types? Why can a fixed size buffer not be made of a struct such as:
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
struct SomeType
{
    int Number1;
    int Number2;
}
Fixed size buffers in C# are implemented with a CLI feature called "opaque classes". Section I.12.1.6.3 of Ecma-335 describes them:
Some languages provide multi-byte data structures whose contents are manipulated directly by address arithmetic and indirection operations. To support this feature, the CLI allows value types to be created with a specified size but no information about their data members. Instances of these “opaque classes” are handled in precisely the same way as instances of any other class, but the ldfld, stfld, ldflda, ldsfld, and stsfld instructions shall not be used to access their contents.
The "no information about their data members" and "ldfld/stfld shall not be used" are the rub. The 2nd rule puts the kibosh on structures, you need ldfld and stfld to access their members. The C# compiler cannot provide an alternative, the layout of a struct is a runtime implementation detail. Decimal and Nullable<> are out because they are structs as well. IntPtr is out because its size depends on the bitness of the process, making it difficult for the C# compiler to generate the address for the ldind/stind opcode used to access the buffer. Reference types references are out because the GC needs to be able to find them back and can't by the 1st rule. Enum types have a variable size that depend on their base type; sounds like a solvable problem, not entirely sure why they skipped it.
Which just leaves the ones mentioned by the C# language specification: sbyte, byte, short, ushort, int, uint, long, ulong, char, float, double or bool. Just the simple types with a well defined size.
What is a fixed buffer?
From MSDN:
In C#, you can use the fixed statement to create a buffer with a fixed size array in a data structure. This is useful when you are working with existing code, such as code written in other languages, pre-existing DLLs or COM projects. The fixed array can take any attributes or modifiers that are allowed for regular struct members. The only restriction is that the array type must be bool, byte, char, short, int, long, sbyte, ushort, uint, ulong, float, or double.
I'm just going to quote Mr. Hans Passant in regards to why a fixed buffer MUST be unsafe. You might see Why is a fixed size buffers (arrays) must be unsafe? for more information.
Because a "fixed buffer" is not a real array. It is a custom value type, about the only way
to generate one in the C# language that I know. There is no way for
the CLR to verify that indexing of the array is done in a safe way.
The code is not verifiable either. The most graphic demonstration of
this:
using System;

class Program {
    static unsafe void Main(string[] args) {
        var buf = new Buffer72();
        Console.WriteLine(buf.bs[8]);
        Console.ReadLine();
    }
}

public struct Buffer72 {
    public unsafe fixed byte bs[7];
}
You can arbitrarily access the stack frame in this example. The standard buffer overflow injection technique would be available to malicious code to patch the function return address and force your code to jump to an arbitrary location.
Yes, that's quite unsafe.
Why can't a fixed buffer contain non-primitive data types?
Simon White raised a valid point:
I'm gonna go with "added complexities to the compiler". The compiler would have to check that no .NET specific functionality was applied to the struct that applied to enumerable items. For example, generics, interface implementation, even deeper properties of non-primitive arrays, etc. No doubt the runtime would also have some interop issues with that sort of thing too.
And Ibasa:
"But that is already done by the compiler." Only partly. The compiler can do the checks to see if a type is managed but that doesn't take care of generating code to read/write structs to fixed buffers. It can be done (there's nothing stopping it at CIL level) it just isn't implemented in C#.
Lastly, Mehrdad:
I think it's literally because they don't want you to use fixed-size buffers (because they want you to use managed code). Making it too easy to interop with native code makes you less likely to use .NET for everything, and they want to promote managed code as much as possible.
The answer appears to be a resounding "it's just not implemented".
Why's it not implemented?
My guess is that the cost and implementation time just isn't worth it to them. The developers would rather promote managed code over unmanaged code. It could possibly be done in a future version of C#, but the current CLR lacks a lot of the complexity needed.
An alternative explanation could be the security angle. Given that fixed buffers are vulnerable to all sorts of problems and security risks if implemented poorly in your code, I can see why their use would be discouraged in favor of managed code in C#. Why put a lot of work into something whose use you'd like to discourage?
I understand your point of view... on the other hand, I suppose it could be some kind of forward compatibility reserved by Microsoft. Your code is compiled to MSIL, and it is the business of the specific .NET Framework and OS to lay it out in memory.
I can imagine a new CPU from Intel coming along that requires variables to be laid out on 8-byte boundaries to reach optimal performance. In that case there would be a need, in some future .NET Framework 6 on some future Windows 9, to lay out these structs in a different way. Your example code would then be pressure on Microsoft not to change the memory layout in the future, and hence not to speed up the .NET Framework on modern hardware.
It is only speculation...
Did you try to set FieldOffset? See C++ union in C#.
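For reference, a C-style union can be emulated in C# with explicit layout (the type and field names here are invented for the example):

using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit)]
struct IntFloatUnion
{
    // Both fields start at offset 0, so they overlap like a C++ union.
    [FieldOffset(0)] public int AsInt;
    [FieldOffset(0)] public float AsFloat;
}

// var u = new IntFloatUnion { AsFloat = 1.0f };
// u.AsInt now contains the raw bit pattern of 1.0f (0x3F800000).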
I'm kinda new to C++ (coming from C#).
I'd like to pass an array of data to a function (as a pointer).
void someFunc(byte *data)
{
    // add this data to a hashmap
    Hashtable.put(key, data);
}
This data will be added into a hashmap (some key-value based object).
In C#, I could just add the passed reference to a dictionary and be done with it.
Can the same be done in C++? Or must I create a copy of the data and only add that copy to the data structure that stores it?
I have seen this pattern in some code examples, but i am not 100% sure why it is needed, or whether it can be avoided at certain times.
Not sure where your key is coming from... but the std::map and the std::unordered_map are probably what you are looking for.
The underlying data structure of std::map is a balanced binary tree, while std::unordered_map is a hash table.
Furthermore, std::unordered_map is an addition in the C++11 standard.
It all depends on how you pass the data and how it is created. If the data is created on the heap (by using new) you can just put the pointer or reference you have into your table. On the other hand, if the function takes its argument by value you will need to make a copy at some point, because if you store the address of a temporary, bad things will happen :).
As for what data structure to use, and how they work, I've found one of the best references is cppreference
Heap allocation should be reserved for special cases; stack allocation is faster and easier to manage. You should read up on RAII (very important). For further reading, look into dynamic vs. automatic memory allocation.
I just found this read, specifically about moving from C# to C++, and figured it'd be perfect for you. Good luck; C++ can be one of the more difficult languages to learn, so don't assume anything will work the same as it does in C#. MSDN has a nice C# vs. C++ comparison as well.
I read that System.Drawing.Point is a value type. I do not understand. Why?
There are rules that Microsoft tries to follow about this; they explain them very well in MSDN, see Choosing Between Classes and Structures (the book is even better, as it has a lot of interesting comments).
Even if Point isn't such a good example of this:
A struct should logically represent a single value (in this case a position; it has 2 components, but complex numbers could also be separated into 2 parts and they are prime candidates for being structs).
A struct should have an instance size smaller than 16 bytes. (OK, 2x4 = 8)
A struct should not be boxed frequently. (OK, this one holds)
BUT, a struct should be immutable (here is the part where they don't follow their own rules; I guess micro-optimization won over the rules, which anyway were written later). See the sketch at the end of this answer for the kind of bug this invites.
As I said, I guess the fact that they haven't respected the "immutable" part is partly because there were no rules yet when System.Drawing was written, and partly for speed, as graphic operations could be quite sensitive to this.
I don't know whether they were right to do it; maybe they measured some common algorithms and found that they lost too much performance allocating temporary objects and copying them over. Anyway, such optimizations should only be done after carefully measuring real-world usage of the class/struct.
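To make the immutability point concrete, here is the classic copy pitfall with the mutable Point struct (a minimal sketch):

using System.Collections.Generic;
using System.Drawing;

class MutableStructPitfall
{
    static void Demo()
    {
        var points = new List<Point> { new Point(1, 1) };

        // The indexer returns a copy of the struct; mutating the copy does not
        // touch the element stored in the list.
        Point p = points[0];
        p.X = 42;

        // points[0].X is still 1 here; writing points[0].X = 42 directly would
        // not even compile for a List<Point>.
    }
}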
It's a Structure. Just like DateTime. And structures are value-types.
The reason for this is almost certainly that the System.Drawing.Point (and PointF) types are used for drawing through the .NET GDI(+) wrappers, which requires marshalling. Marshalling value types (i.e. structs) so that the native libraries can use them is faster than marshalling heap-allocated objects (i.e. classes).
From MSDN (Performance Considerations for Run-Time Technologies in the .NET Framework):
One extremely important thing to note is that ValueTypes require no marshalling in interop scenarios. Since marshalling is one of the biggest performance hits when interoperating with native code, using ValueTypes as arguments to native functions is perhaps the single biggest performance tweak you can do.
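As a concrete example of that, a blittable struct can be handed straight to a native call; the usual textbook case is GetCursorPos from user32 (the POINT declaration below mirrors the Win32 struct):

using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
struct POINT
{
    public int X;
    public int Y;
}

static class NativeMethods
{
    // POINT is blittable, so the marshaller does no per-field conversion work.
    [DllImport("user32.dll")]
    public static extern bool GetCursorPos(out POINT lpPoint);
}

// if (NativeMethods.GetCursorPos(out POINT cursor)) { /* use cursor.X, cursor.Y */ }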
Well, I don't specifically know Microsoft's reasons, but it makes sense. It is a small, fixed-size structure containing a couple of integers. I would rather have such a thing allocated on the stack, where it is easy to allocate and easy to free. Making it a class and putting it on the heap means it has to be managed by the GC, which creates a significant amount of overhead for such a trivial thing.
In C#, struct types are value types; structs exist precisely to allow user-defined value types, and that is the case for System.Drawing.Point.