MSDN says that a class that would be 16 bytes or less would be better handled as a struct [citation].
Why is that?
Does that mean that if a struct is over 16 bytes it's less efficient than a class or is it the same?
How do you determine if your class is under 16 bytes?
What restricts a struct from acting like a class? (besides disallowing parameterless constructors)
There are a couple different answers to this question, and it is a bit subjective, but some reasons I can think of are:
Structs are value types; classes are reference types. If you're using 16 bytes for total storage, it's probably not worth creating a memory reference (4 to 8 bytes) for each one.
When you have really small objects, they can often be pushed onto the IL stack, instead of references to the objects. This can really speed up some code, as you're eliminating a memory dereference on the callee side.
There is a bit of extra "fluff" associated with classes in IL, and if your data structure is very small, none of this fluff would be used anyway, so it's just extra junk you don't need.
The most important difference between a struct and a class, though, is that structs are value type and classes are reference type.
By "efficient", they're probably talking about the amount of memory it takes to represent the class or struct.
On the 32-bit platform, allocating an object requires a minimum of 16 bytes. On a 64-bit platform, the minimum object size is 24 bytes. So, if you're looking at it purely from the amount of memory used, a struct that contains less than 16 bytes of data will be "better" than the corresponding class.
But the amount of memory used is not the whole story. Value types (structs) are fundamentally different than reference types (classes). Structs can be inconvenient to work with, and can actually cause performance problems if you're not careful.
The real answer, of course, is to use whichever works best in your situation. In most cases, you'll be much better off using classes.
Check this link, I found it on one of the answers in SO today: .NET Type Internals. You can also try searching SO and Googling for "reference types vs value types" for differences between structs and classes.
What restricts a struct from acting like a class?
There are many differences. You cannot inherit from a struct, for example.
You can't have virtual methods. (A struct can still implement an interface, but using it through an interface reference boxes the struct.) Instance methods in structs can access the struct's private fields, but apart from that they behave a lot like auxiliary "helper" functions (for immutable structs, they sometimes don't even need to access private data). So I find them not nearly as "valuable" as class methods.
Structs are different from classes because they are value types: a local struct variable typically lives on the stack rather than on the heap, and the variable holds the data itself. That means that every time you call a method with the struct as a parameter, a copy is created and passed to the method. That is why large structs are extremely inefficient.
I would actively discourage using structs nevertheless, because they can cause some subtle bugs: e.g. when a method changes a field of a struct it received, the change is not reflected for the caller (because only the copy was changed), which is completely different behavior from classes.
So the 16 bytes I think is a reasonable maximum size of a struct, but still in most cases it is better to have a class. If you still want to create a struct, try to make it immutable at least.
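To make the "only the copy changed" pitfall concrete, here is a minimal sketch. PointStruct, PointClass and Bump are hypothetical names invented for this illustration.

```csharp
using System;

// Hypothetical types, named only for this illustration.
public struct PointStruct { public int X; }
public class PointClass { public int X; }

public static class CopySemanticsDemo
{
    // Receives a copy of the struct, so the caller's value is untouched.
    public static void Bump(PointStruct p) { p.X++; }

    // Receives a copy of the reference, so the caller's object is modified.
    public static void Bump(PointClass p) { p.X++; }

    public static void Main()
    {
        var s = new PointStruct { X = 1 };
        var c = new PointClass { X = 1 };
        Bump(s);
        Bump(c);
        Console.WriteLine(s.X); // 1
        Console.WriteLine(c.X); // 2
    }
}
```

Making the struct immutable avoids the surprise, because there is then no field assignment that could be silently lost.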
This is due to the different way that the CLR handles structs and classes. Structs are value types, which means that local instances typically live on the stack rather than in the managed heap. It is a good rule of thumb to keep structs small, because once you start passing them as method arguments you will incur overhead: structs are copied in their entirety when passed to a method.
Since classes pass a copy of their reference to methods they incur much less overhead when used as method arguments.
The best way to determine the size of your class is to total the number of bytes required by all the members of your class, plus an extra 8 bytes on a 32-bit runtime (16 bytes on 64-bit) for CLR overhead (the sync block index and the reference to the type of the object).
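As far as I know there is no supported API that reports the heap size of a class instance directly, so one rough approach is to measure the fields with an equivalent struct and add the header yourself. This is only a sketch: Payload is a hypothetical type, and the header estimate assumes one pointer-sized slot each for the sync block and the type reference.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical payload: the fields a small class might hold.
[StructLayout(LayoutKind.Sequential)]
public struct Payload
{
    public int A;      // 4 bytes
    public int B;      // 4 bytes
    public double C;   // 8 bytes
}

public static class SizeDemo
{
    // Field bytes as seen by the marshaler; for simple blittable
    // fields like these it matches the managed layout.
    public static int FieldBytes() => Marshal.SizeOf<Payload>();

    // Rough estimate only: fields plus two pointer-sized header slots.
    public static int EstimatedClassBytes() => FieldBytes() + 2 * IntPtr.Size;

    public static void Main()
    {
        Console.WriteLine(FieldBytes());          // 16
        Console.WriteLine(EstimatedClassBytes()); // 24 on 32-bit, 32 on 64-bit
    }
}
```

Padding between fields counts too, so reordering fields of different sizes can change the total.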
In memory, the struct will hold the data directly, while a class will behave more like a pointer. That alone makes an important difference, since passing the struct as a parameter to a method will pass its values (copy them on the stack), while the class will pass the reference to the values. If the struct is big, you will be copying a lot of values on each method call. When it is really small copying the values and using them directly will be probably faster than copying the pointer and having to grab them from another place.
About restrictions: you can't assign it to null (although you can use Nullable<>) and you have to initialize it right away.
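A small sketch of the null restriction and the Nullable<> workaround. Point is a hypothetical struct made up for the example.

```csharp
using System;

// Hypothetical value type used only for this illustration.
public struct Point
{
    public int X;
    public int Y;
}

public static class NullableDemo
{
    public static void Main()
    {
        // Point p = null;                 // does not compile: a struct cannot be null
        Point? maybe = null;               // Nullable<Point> adds a "no value" state
        Console.WriteLine(maybe.HasValue); // False

        maybe = new Point { X = 3, Y = 4 };
        Console.WriteLine(maybe.Value.X);  // 3

        Point zeroed = default(Point);     // every field is zero-initialized
        Console.WriteLine(zeroed.X + zeroed.Y); // 0
    }
}
```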
Copying an instance of a struct takes less time than creating a new instance of a class and copying data from an old one, but class instances can be shared and struct instances cannot. Thus, "structvar1 = structvar2" requires copying to a new struct instance, whereas "classvar1 = classvar2" allows classvar1 and classvar2 to refer to the same class instance (without having to create a new one).
The code to handle the creation of new struct instances is optimized for sizes up to 16 bytes. Larger structs are handled less efficiently. Structs are a win in cases where every variable that holds a struct will hold an independent instance (i.e. there's no reason to expect that any particular two variables will hold identical instances); they are not much of a win (if they're a win at all) in cases where many variables could hold the same instance.
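The two assignments described above can be sketched like this. StructVal and ClassVal are hypothetical types; the variable names follow the ones used in the answer.

```csharp
using System;

public struct StructVal { public int N; }   // hypothetical
public class ClassVal { public int N; }     // hypothetical

public static class AssignmentDemo
{
    public static void Main()
    {
        var structvar2 = new StructVal { N = 1 };
        var structvar1 = structvar2;      // copies the whole value
        structvar1.N = 99;
        Console.WriteLine(structvar2.N);  // 1

        var classvar2 = new ClassVal { N = 1 };
        var classvar1 = classvar2;        // copies only the reference
        classvar1.N = 99;
        Console.WriteLine(classvar2.N);   // 99
        Console.WriteLine(object.ReferenceEquals(classvar1, classvar2)); // True
    }
}
```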
In System.Data.Linq, EntitySet<T> uses a couple of ItemList<T> structs which look like this:
internal struct ItemList<T> where T : class
{
private T[] items;
private int count;
...(methods)...
}
(Took me longer than it should to discover this - couldn't understand why the entities field in EntitySet<T> was not throwing null reference exceptions!)
My question is what are the benefits of implementing this as a struct over a class?
Let's assume that you want to store ItemList<T> in an array.
Allocating an array of value types (struct) will store the data inside the array. If on the other hand ItemList<T> was a reference type (class), only references to ItemList<T> objects would be stored inside the array. The actual ItemList<T> objects would be allocated on the heap. An extra level of indirection is required to reach an ItemList<T> instance, and as it is simply an array combined with a length, it is more efficient to use a value type.
After inspecting the code for EntitySet<T> I can see that no array is involved. However, an EntitySet<T> still contains two ItemList<T> instances. As ItemList<T> is a struct, the storage for these instances is allocated inside the EntitySet<T> object. If a class were used instead, the EntitySet<T> would have contained references pointing to ItemList<T> objects allocated separately.
The performance difference between using one or the other may not be noticeable in most cases, but perhaps the developer decided that he wanted to treat the array and the tightly coupled count as a single value simply because it seemed like the best thing to do.
For small critical internal data structures like ItemList<T>, we often have the choice of using either a reference type or a value type. If the code is written well, switching from one to the other is a trivial change.
We can speculate that a value type avoids heap allocation and a reference type avoids struct copying so it's not immediately clear either way because it depends so much on how it is used.
The best way to find out which one is better is to measure it. Whichever is faster is the clear winner. I'm sure they did their benchmarking and struct was faster. After you've done this a few times your intuition is pretty good and the benchmark just confirms that your choice was correct.
Maybe it's important that...quote about struct from here
The new variable and the original variable therefore contain two separate copies of the same data. Changes made to one copy do not affect the other copy.
Just thinking, don't judge me too hard :)
There are really only two reasons to ever use a struct, and that is either to get value type semantics, or for better performance.
As the struct contains an array, value type semantics doesn't work well. When you copy the struct you get a copy of the count, but you only get a copy of the reference to the array, not a copy of the items in the array. Therefore you would have to use special care whenever the struct is copied so that you don't get inconsistent instances of it.
So, the only remaining valid reason would be performance. There is a small overhead for each reference type instance, so if you have a lot of them there may be a noticeable performance gain.
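The inconsistency you can get when such a struct is copied can be sketched as follows. ItemListSketch<T> is a simplified stand-in I made up; its Add, Count and indexer members are assumptions, not the real ItemList<T> methods.

```csharp
using System;

// Simplified stand-in for ItemList<T>; the members are assumptions.
public struct ItemListSketch<T> where T : class
{
    private T[] items;
    private int count;

    public int Count => count;

    // Deliberately not checked against count, so the demo can peek.
    public T this[int index] => items[index];

    public void Add(T item)
    {
        if (items == null) items = new T[4];
        if (count == items.Length) Array.Resize(ref items, count * 2);
        items[count++] = item;
    }
}

public static class SharedArrayDemo
{
    public static void Main()
    {
        var a = new ItemListSketch<string>();
        a.Add("x");

        var b = a;     // copies count and the array reference, not the items
        b.Add("y");    // writes into the array that a also points to

        Console.WriteLine(a.Count); // 1
        Console.WriteLine(b.Count); // 2
        Console.WriteLine(a[1]);    // y  (visible through a, although a.Count is 1)
    }
}
```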
One nifty feature of such a structure is that you can create an array of them, and you get an array of empty lists without having to initialise each list:
ItemList<string>[] lists = new ItemList<string>[42];
As the items in the array are zero-filled, the count member will be zero and the items member will be null.
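Here is a runnable sketch of that zero-filled behaviour. CountedList is a hypothetical struct with the same shape (an array plus a count); its members are assumptions for illustration.

```csharp
using System;

// Hypothetical struct with the same shape: an array plus a count.
public struct CountedList
{
    private string[] items;
    private int count;

    public int Count => count;
    public bool HasStorage => items != null;

    public void Add(string item)
    {
        if (items == null) items = new string[4];   // allocate lazily
        if (count == items.Length) Array.Resize(ref items, count * 2);
        items[count++] = item;
    }
}

public static class ZeroedArrayDemo
{
    public static void Main()
    {
        var lists = new CountedList[42];            // no per-element initialisation

        Console.WriteLine(lists[0].Count);          // 0
        Console.WriteLine(lists[0].HasStorage);     // False

        lists[0].Add("first");                      // mutates the element in place
        Console.WriteLine(lists[0].Count);          // 1
        Console.WriteLine(lists[1].Count);          // 0
    }
}
```

With a class, the same array would hold 42 null references and each element would need a `new` before use.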
Purely speculating here:
Since the object is fairly small (only has two member variables), it is a good candidate for making it a struct to allow it to be passed as a ValueType.
Also, as @Martin Liversage points out, by being a ValueType it can be stored more efficiently in larger data structures (e.g. as an item in an array), without the overhead of having an individual object and a reference to it.
In our application we have a Queue which is defined as the following:
private static Queue RawQ = new Queue();
Then two different types of objects are put onto the Queue: some are instances of a class (class A) and some are instances of a struct (struct B).
When we process the data from the Queue, we use typeof to check which type (class A or struct B) each item belongs to.
My questions:
for objects from class A, only their references are copied to the Queue and for object from struct B, their values are copied to the Queue, am I right?
for a Queue, some items are references which are small and some items are values which are much bigger (about 408 bytes). Will this waste a lot of memory if the Queue is not small?
do you have a better way to do the same thing?
thanks,
for objects from class A, only their references are copied to the Queue and for object from struct B, their values are copied to the Queue, am I right?
Correct. Actually, when you add a struct B to the queue, it is boxed first. In other words, your B instance is copied onto the managed heap, and a reference to the copy is put on the queue.
for a Queue, some items are references which are small and some items are values which are much bigger (about 408 bytes). Will this waste a lot of memory if the Queue is not small?
Possibly - boxing the B instance takes a copy, which uses more memory than not taking a copy. It depends what happens to the original.
408 bytes is very large for a .NET struct; the general rule of thumb is that structs shouldn't be bigger than 16 bytes. The reason is similar to this: large structs introduce overhead due to copying and boxing.
do you have a better way to do the same thing?
I'd question whether B needs to be a struct in the first place. Another rule of thumb (mine, this time): you probably don't ever need a struct in .NET code.
1. for objects from class A, only their references are copied to the Queue and for object from struct B, their values are copied to the Queue, am I right?
That is correct. Except that value types would be boxed.
2. for a Queue, some items are references which are small and some items are values which are much bigger (about 408 bytes). Will this waste a lot of memory if the Queue is not small?
That is mostly correct. On a 32-bit runtime the boxing will add another 8 bytes (4 for the syncblock and 4 for the type information), so for large structs that is insignificant, but for smaller structs that would represent a larger ratio.
3.do you have a better way to do the same thing?
The best thing to do is convert that large struct into a class. There is no hard rule for knowing when to choose a struct or class based on size, but 32 bytes seems to be a common threshold. Of course, you could easily justify larger structs based on whether you really wanted value-type semantics, but 408 bytes is probably way beyond that threshold. If the type really needs value semantics you could make it an immutable class.
Another change you could make is to use the generic Queue class instead. Value types are not boxed as they would be with the normal Queue. However, you would still be copying that large struct even with the generic version.
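A rough sketch of both queue flavours, under the assumption that A and B are as simple as shown here (the real types in the question are not given). Describe is a hypothetical helper standing in for the typeof checks.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public class A { public int Id; }    // stand-in for the question's class A
public struct B { public int Id; }   // stand-in for the question's struct B

public static class QueueDemo
{
    // Hypothetical helper: type test via pattern matching.
    public static string Describe(object item)
    {
        switch (item)
        {
            case A a: return "class A " + a.Id;
            case B b: return "struct B " + b.Id;   // unboxing copies the value out
            default: return "unknown";
        }
    }

    public static void Main()
    {
        var rawQ = new Queue();
        rawQ.Enqueue(new A { Id = 1 });   // stores the reference
        rawQ.Enqueue(new B { Id = 2 });   // boxes: heap copy, then stores a reference

        while (rawQ.Count > 0)
            Console.WriteLine(Describe(rawQ.Dequeue()));

        var typedQ = new Queue<B>();
        typedQ.Enqueue(new B { Id = 3 }); // no boxing, value is copied into the queue's storage
        B copy = typedQ.Dequeue();        // and copied out again
        Console.WriteLine(copy.Id);       // 3
    }
}
```

Note that Queue<B> can only hold B, so mixing A and B in one generic queue would need a common base type or interface.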
From the C# spec:
Since structs are not reference types, these operations are implemented differently for struct types. When a value of a struct type is converted to type object or to an interface type that is implemented by the struct, a boxing operation takes place.
So, to answer 1) the queue contains boxed structs, not the actual struct values.
The answer to 2) falls out of that, a boxed struct and a reference have the same size in a queue's actual allocation.
For 3), I'd need more information. It would be preferable to have the same type in a queue and have polymorphic operations that are handled both by classes and structs in the appropriate ways. Excessive case statements and typeof() calls suggest that your program is more procedural than object-oriented. Maybe that's what you want, but C# is optimized for an OO approach.
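One way the polymorphic approach could look, purely as a sketch: IQueueItem and Process are names I invented, and A and B here are empty stand-ins for the question's types.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical common contract for everything that goes on the queue.
public interface IQueueItem
{
    string Process();
}

public class A : IQueueItem
{
    public string Process() => "handled A";
}

public struct B : IQueueItem
{
    public string Process() => "handled B";
}

public static class PolymorphicQueueDemo
{
    // No typeof checks: each item knows how to process itself.
    public static List<string> Drain(Queue<IQueueItem> queue)
    {
        var results = new List<string>();
        while (queue.Count > 0)
            results.Add(queue.Dequeue().Process());
        return results;
    }

    public static void Main()
    {
        var queue = new Queue<IQueueItem>();
        queue.Enqueue(new A());
        queue.Enqueue(new B());   // still boxed: converting a struct to an interface type boxes it

        foreach (var line in Drain(queue))
            Console.WriteLine(line);
    }
}
```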
I was trying to double check this, but here's what I believe happens:
The System.Collections.Queue class holds a collection of type Object, which is a reference type. Therefore, when you pass an instance of a struct to your queue it gets boxed as an object. This creates a copy on the heap and provides a reference to it (which is what the Queue sees). So, the Queue itself does not get too big, but if you're doing a lot of these operations, you'll end up (according to Microsoft) with a memory and performance hit over the boxing/unboxing.
See the C# Language Specification for more.
Just wondering why we need structs if a class can do all a struct can and more? Putting value types in a class has no side effects, I think.
EDIT: cannot see any strong reasons to use struct
A struct is similar to a class, with the following key differences:
A struct is a value type, whereas a class is a reference type.
A struct does not support inheritance (other than implicitly deriving from object).
A struct can have all the members a class can, except the following:
A parameterless constructor
A finalizer
Virtual members
A struct is used instead of a class when value type semantics are desirable. Good examples of structs are numeric types, where it is more natural for assignment to copy a value rather than a reference. Because a struct is a value type, each instance does not require instantiation of an object on the heap. This can be important when creating many instances of a type.
Custom value types aren't absolutely necessary - Java does without them, for example. However, they can still be useful.
For example, in Noda Time we're using them pretty extensively, as an efficient way of representing things like instants without the overhead of an object being involved.
I wouldn't say that "class can do all struct can and more" - they behave differently, and should be thought of differently.
Why use a struct when a class works? Because sometimes a class doesn't work.
In addition to the performance reasons mentioned by Reed Copsey (short version: fewer objects that the GC needs to track, allowing the GC to do a better job), there is one place where structures must be used: P/Invoke to functions that require by-value structures or structure members.
For example, suppose you wanted to invoke the CreateProcess() function. Further suppose that you wanted to use a STARTUPINFOEX structure for the lpStartupInfo parameter to CreateProcess().
Well, what's STARTUPINFOEX? This:
typedef struct _STARTUPINFOEX {
STARTUPINFO StartupInfo;
PPROC_THREAD_ATTRIBUTE_LIST lpAttributeList;
} STARTUPINFOEX, *LPSTARTUPINFOEX;
Notice that STARTUPINFOEX contains STARTUPINFO as its first member. STARTUPINFO is a structure.
Since classes are reference types, if we declared the corresponding C# type thus:
[StructLayout(LayoutKind.Sequential)]
class STARTUPINFO { /* ... */ }
class STARTUPINFOEX { public STARTUPINFO StartupInfo; /* ... */ }
The corresponding in-memory layout would be wrong, as STARTUPINFOEX.StartupInfo would be a pointer (4 bytes on ILP32 platforms), NOT a structure (as is required, 68 bytes in size on ILP32 platforms).
So, to support invoking arbitrary functions which accept arbitrary structures (which is what P/Invoke is all about), one of two things is necessary:
(1) Fully support value types. This allows C# to declare a value type for STARTUPINFO which will have the correct in-memory layout for marshaling (i.e. struct support, as C# has).
(2) Some alternate syntax within P/Invokeable structures which would inform the runtime marshaler that this member should be laid out as a value type instead of as a pointer.
(2) is a workable solution (and may have been used in J/Direct in Visual J++; I don't remember), but given that proper value types are more flexible, enable a number of performance optimizations not otherwise achievable, and make sensible use within P/Invoke scenarios, it's no surprise that C# supports value types.
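To see the by-value embedding in action without reproducing the full STARTUPINFO definition, here is a sketch with two made-up structs. InnerInfo and OuterInfo are hypothetical and much smaller than the real Win32 types; only the nesting pattern is the same.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for an embedded structure such as STARTUPINFO.
[StructLayout(LayoutKind.Sequential)]
public struct InnerInfo
{
    public int Size;
    public int Flags;
}

// Hypothetical stand-in for the outer structure: the inner struct is
// laid out inline as the first member, followed by a pointer.
[StructLayout(LayoutKind.Sequential)]
public struct OuterInfo
{
    public InnerInfo Info;
    public IntPtr AttributeList;
}

public static class LayoutDemo
{
    public static void Main()
    {
        Console.WriteLine(Marshal.SizeOf<InnerInfo>());                              // 8
        Console.WriteLine(Marshal.OffsetOf<OuterInfo>("AttributeList").ToInt32());   // 8
        Console.WriteLine(Marshal.SizeOf<OuterInfo>() == 8 + IntPtr.Size);           // True
    }
}
```

The pointer member starts right after the 8 bytes of the embedded struct, which is the layout a native function expecting a nested structure would read.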
In general, use a class.
Only use a struct when you absolutely need value type semantics.
Structs are also often required for performance reasons. Arrays of structs take quite a bit less memory, and get much better cache locality than an array of object references. This is very important if you're working with something like a rendering system, and need to generate 5 million vertices, for example.
For details, see Rico Mariani's Performance Quiz + Answers.
Structs are useful simply because they are passed by value, which can actually be useful in certain algorithms. This is actually something that classes CAN'T do.
struct ArrayPointer<T>
{
public T[] Array;
public int Offset;
}
You can pass this structure to a method and the method can alter the value of Offset for its own needs. Meanwhile, once you return from the method, it will behave as if Offset never changed.
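A small runnable sketch of that behaviour using the ArrayPointer<T> struct above. SumNext is a hypothetical method added just to show the callee moving Offset.

```csharp
using System;

public struct ArrayPointer<T>
{
    public T[] Array;
    public int Offset;
}

public static class ArrayPointerDemo
{
    // Hypothetical consumer: advances its own copy of Offset while reading.
    public static int SumNext(ArrayPointer<int> p, int howMany)
    {
        int sum = 0;
        for (int i = 0; i < howMany; i++)
            sum += p.Array[p.Offset++];
        return sum;
    }

    public static void Main()
    {
        var p = new ArrayPointer<int> { Array = new[] { 1, 2, 3, 4 }, Offset = 1 };

        Console.WriteLine(SumNext(p, 2)); // 5  (elements 2 and 3)
        Console.WriteLine(p.Offset);      // 1  (the caller's Offset never moved)
    }
}
```

Only Offset is protected this way; the array itself is still shared, so writes to its elements would be visible to the caller.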
Struct Vs Class in C#
C# Struct usage tips?
Why do we need struct? (C#)
For the average application developer, using classes is the norm. On the surface, classes make structs seem unnecessary, but when you dig deeper into the nitty-gritty details they are actually quite different.
See here for some differences between structs and classes.
The most well-known difference is that classes are reference-types and structs are value-types. This is important because it allows a library developer to control how instances of a data-type can be used.
1. Performance. In some cases, using structures gives much better performance.
2. Data immutability. By changing some property of a struct you'll get a new struct. This is very useful in some cases.
3. Better control of the in-memory representation. We can define exactly how the struct is laid out in memory, which allows us to serialize and deserialize binary data quickly and efficiently.
Structs are almost a must for interop with the underlying data structures used by the Win32 API. I suppose that could be a very big reason to have them in .NET.
On a blog entry, Luca Bolognese ponders this idea about structs vs. classes as member fields:
The reason to use a struct is to not allocate an additional object on the stack. This allows this solution to be as 'performant' as simply having coded the fields on the class itself. Or at least I think so ...
How accurate is this?
First, doesn't he mean allocate an additional object on the heap?
And, wouldn't using structs marginally decrease allocation time (especially if the class members were lazy-loaded) at the expense of increasing the memory footprint?
Would you recommend this practice of structs over classes in certain situations?
I think he did mean to write heap, he just wrote stack by mistake.
The question of structs or classes comes up quite often. Here's one example. I think it's best to follow this commonly given advice:
MSDN has the answer: Choosing Between Classes and Structures.
Basically, that page gives you a 4-item checklist and says to use a class unless your type meets all of the criteria.
Do not define a structure unless the type has all of the following characteristics:
* It logically represents a single value, similar to primitive types (integer, double, and so on).
* It has an instance size smaller than 16 bytes.
* It is immutable.
* It will not have to be boxed frequently.
Are there times you shouldn't follow this advice? Maybe, but only after you've proved it by profiling.
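For reference, a type that ticks all four boxes might look something like this. Temperature is a hypothetical example, and the readonly struct modifier assumes C# 7.2 or later.

```csharp
using System;

// Hypothetical example: a single logical value, 8 bytes, immutable.
public readonly struct Temperature
{
    public double Kelvin { get; }

    public Temperature(double kelvin)
    {
        Kelvin = kelvin;
    }

    // "Changing" the value returns a new instance instead of mutating this one.
    public Temperature Add(double delta) => new Temperature(Kelvin + delta);
}

public static class ChecklistDemo
{
    public static void Main()
    {
        var t = new Temperature(300);
        var warmer = t.Add(10);

        Console.WriteLine(t.Kelvin);      // 300
        Console.WriteLine(warmer.Kelvin); // 310
    }
}
```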
I have a type which I am considering making a struct.
It represents a single value
It is immutable
But the problem is, it has 6 int fields.
So which solution should I use for this type?
keep using struct?
change to class?
or pack 6 integers into an array of int, so it has only one field
EDIT
The size of a struct with 6 integer fields is 24 bytes, which is huge to pass around.
The recommended size for a struct is not more than 16 bytes.
It depends on how you are going to use it.
Are you going to allocate a lot of it vs. pass it around a lot?
Is it going to be consumed by 3rd party code? In this case, classes typically give you more flexibility.
Do you want struct vs. class semantics? For example, non-nullable?
Would you benefit from having a couple of pre-created instances of this class that can be reused to represent special cases? Similar to String.Empty. In this case you would benefit from a class.
It is hard to answer just from the information you provided in your question.
Be careful of boxing. If your struct is going to be consumed by a method which expects an Object, it will be coerced into an object and you'll have a potential performance hit.
Here's a reference that explains it in more detail.
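A quick sketch of what the boxing does. Counter is a hypothetical struct; each conversion to object makes a separate heap copy.

```csharp
using System;

public struct Counter { public int Value; }   // hypothetical

public static class BoxingDemo
{
    public static void Main()
    {
        var c = new Counter { Value = 1 };

        object box1 = c;   // boxing: allocates a heap object holding a copy
        object box2 = c;   // boxing again: a second, separate copy

        Console.WriteLine(object.ReferenceEquals(box1, box2)); // False

        c.Value = 5;       // changes only the original variable
        Console.WriteLine(((Counter)box1).Value);              // 1
    }
}
```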
I'd make it a class (#2) and then you wouldn't have to worry about it.
Using an array of six integers (#3) would probably make your code harder to read. A class with descriptive identifiers for each int would be much better.
Without seeing your struct, it's difficult to say anything definitively. But I suspect you should leave this as a struct.
How about a WriteOnce<int[]> ?
I would suggest writing a little benchmark to measure the performance of the different options, this is the only way to know for sure. You may be surprised at the results (I often am).
(I'm assuming that your concern here is performance.)
If the data holder is going to be immutable, the struct-versus-class question will most likely depend upon the average number of references that would exist to each instance. If one has an array of TwentyFourByteStruct[1000], that array will take 24,000 bytes, regardless of whether every element holds a different value, all elements hold the same value, or somewhere in-between. If one has an array of TwentyFourByteClass[1000], that array will take 4,000 or 8,000 bytes (for 32/64-bit systems), and each distinct instance of TwentyFourByteClass which is created will take about 48 bytes. If all of the array elements happen to hold a reference to the same TwentyFourByteClass object, the total will be 4,048 or 8,048 bytes. If all of the array elements hold references to different TwentyFourByteClass objects, the total will be 52,000 or 56,000 bytes.
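The arithmetic above can be reproduced with a few lines. This just replays the figures from the paragraph (including the "about 48 bytes" per object, which is the answer's estimate, not something measured here) and ignores the array's own header.

```csharp
using System;

public static class FootprintDemo
{
    public const int Elements = 1000;
    public const int StructBytes = 24;
    public const int ObjectBytes = 48;   // approximate per-instance figure quoted above

    // Struct array: every element stores its data inline.
    public static int StructArrayBytes() => Elements * StructBytes;

    // Class array: one reference per element plus each distinct object.
    public static int ClassArrayBytes(int referenceBytes, int distinctInstances) =>
        Elements * referenceBytes + distinctInstances * ObjectBytes;

    public static void Main()
    {
        Console.WriteLine(StructArrayBytes());        // 24000
        Console.WriteLine(ClassArrayBytes(4, 1));     // 4048   (32-bit, all shared)
        Console.WriteLine(ClassArrayBytes(8, 1));     // 8048   (64-bit, all shared)
        Console.WriteLine(ClassArrayBytes(4, 1000));  // 52000  (32-bit, all distinct)
        Console.WriteLine(ClassArrayBytes(8, 1000));  // 56000  (64-bit, all distinct)
    }
}
```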
As for run-time performance, the best performance you can get will generally be passing structures by reference. Passing structures by value will require copying them, which can get expensive for structures larger than 16 bytes (.net includes optimizations for structures 16 bytes or smaller), but the cost of a value type by reference is the same whether it is 1 byte or 16,000 bytes.
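A sketch of the three passing styles for a 24-byte struct. SixInts and the method names are hypothetical, and the `in` modifier assumes C# 7.2 or later; this shows the syntax and semantics only, not a benchmark.

```csharp
using System;

// Hypothetical 24-byte struct: six int fields.
public struct SixInts
{
    public int A, B, C, D, E, F;
}

public static class PassingDemo
{
    // By value: all 24 bytes are copied for the call.
    public static int SumByValue(SixInts s) => s.A + s.B + s.C + s.D + s.E + s.F;

    // By read-only reference: only a reference is passed.
    public static int SumByReference(in SixInts s) => s.A + s.B + s.C + s.D + s.E + s.F;

    // By mutable reference: the callee can change the caller's variable.
    public static void Reset(ref SixInts s) { s = default(SixInts); }

    public static void Main()
    {
        var v = new SixInts { A = 1, B = 2, C = 3, D = 4, E = 5, F = 6 };

        Console.WriteLine(SumByValue(v));        // 21
        Console.WriteLine(SumByReference(in v)); // 21

        Reset(ref v);
        Console.WriteLine(SumByValue(v));        // 0
    }
}
```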
In general, when storing more than two pieces of related data I like to make a class that binds them together. Especially if I will be passing them around as a unit.