Is struct field layout consistent with endianness in C#?


When I first learned endianness, I was very confused at how it worked. I finally explained it to myself by the following metaphor:
On a big-endian machine, an int[4] would be arranged like this:
| int[4] |
|int1|int2|int3|int4|
While on little-endian machines, it would be laid out like
| int[4] |
|1tni|2tni|3tni|4tni|
That way the layout of the array would be consistent in memory, while the values themselves would be arranged differently.
Now to the real question: I am writing more optimized versions of BinaryReader and BinaryWriter in my .NET library. One of the problems I have run into is the implementation of Write(decimal). A decimal contains 4 int fields: flags, hi, lo, and mid, in that order. So basically on your typical little-endian machine it would look like this in memory:
| lamiced |
|sgalf|ih|ol|dim|
My question is, how would the CLR arrange the struct on big-endian machines? Would it arrange it so that the basic layout of the decimal would be conserved, like so
| decimal |
|flags|hi|lo|mid|
or would it completely reverse the binary arrangement of the decimal, like
| decimal |
|mid|lo|hi|flags|
?
Don't have a big-endian machine nearby, otherwise I'd test it out myself.
edit: TL;DR does the following code print -1 or 0 on big-endian machines?
struct Pair
{
public int a;
public int b;
}
unsafe static void Main()
{
var p = default(Pair);
p.a = -1;
Console.WriteLine(*(int*)&p);
}

It's not entirely clear what your actual question is.
Regarding the relationship between the layout of fields in a data structure and endianness, there is none. Endianness does not affect how fields in a data structure are laid out, only the order of bytes within a field.
I.e. in answer to this:
does the following code print -1 or 0 on big-endian machines?
… the output will be -1.
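A minimal sketch of that distinction, assuming a sequential-layout copy of the Pair struct from the question. The LayoutDemo class and its method names are hypothetical, and Marshal.OffsetOf reports the marshaled layout, which for a blittable struct like this one should match the in-memory field order. The field offsets are the same on any platform; only the byte order inside each int depends on BitConverter.IsLittleEndian.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct Pair
{
    public int a;
    public int b;
}

// Hypothetical helper, for illustration only.
public static class LayoutDemo
{
    // Byte offset of a named field within Pair. This does not depend on
    // endianness: 'a' comes first, 'b' follows it.
    public static int OffsetOf(string fieldName) =>
        (int)Marshal.OffsetOf<Pair>(fieldName);

    // First byte in memory of the int value 1. This does depend on
    // endianness: 1 on little-endian machines, 0 on big-endian ones.
    public static byte FirstByteOfOne() => BitConverter.GetBytes(1)[0];
}
```

Because p.a sits at offset 0 on either kind of machine, reading an int from the start of the struct yields the whole value of a, whatever the byte order inside it.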
But you seem to be also or instead asking about the effect of endianness on the in-memory representation of the Decimal type. Which is a somewhat different question.
Regarding the endianness of the Decimal in-memory representation, I'm not aware of any requirement that .NET provide consistent implementations of the Decimal type. As commenter Hans Passant points out, there are multiple ways to view the current implementation; either as the CLR code you referenced, or as the more detailed declaration seen in e.g. wtypes.h or OleDb.h (another place a DECIMAL type appears, which has the same format as elsewhere). But in reality, as far as .NET is concerned, you are not promised anything about the in-memory layout of the type.
I would expect, for simplicity in implementation, the fields representing the 3 32-bit mantissa components may be affected by endianness, individually. (The sign and scale are represented as individual bytes, so endianness would not affect those). That is, while the order of the individual 32 bit fields would remain the same — high, low, mid — the bytes within each field will be represented according to the current platform's endianness.
But if Microsoft for some bizarre reason decided they wanted the .NET implementation to deviate from the native implementation (seems unlikely, but let's assume it for the sake of argument) and always use little-endian for the fields even on big-endian platforms, that would be within their rights.
For that matter, they could even rearrange the fields if they wanted to: their current order appears to me to be a concession to the de facto x86 standard of little-endianness, such that on little-endian architectures the combination of low and mid 32-bit values can be treated as a single 64-bit value without swapping words, so if they decided to deviate from the wtypes.h declaration, they might well decide to just make the mantissa a single 96-bit, little-endian or big-endian value.
Again, I'm not saying these actions are in any way likely. Just that they are theoretically possible and are just easy, obvious examples (a subset of all possible examples) of why writing managed code that assumes such private implementation details is probably not a good idea.
Even if you had access to a big-endian machine that could run .NET libraries (*) and so could test the actual behavior, today's current behavior doesn't offer you any guarantees of future behavior.
(*) (I don't even know of any…pure big-endian CPUs are fairly uncommon these days, and I can't think of a single one off the top of my head that is supported by Microsoft as an actual .NET platform.)
So…
I am skeptical that it is practical to author implementations of BinaryReader and BinaryWriter that are observably more optimized than those found in .NET already. The main reason for using these types is to handle I/O, and that necessarily means interacting with external systems that are orders of magnitude slower than the CPU that is handling the actual conversions to and from byte representations (and even the GC operations to support those conversions). Even if the existing Microsoft code were in some way hypothetically inefficient, in practice I doubt it would matter much.
But if you must implement these yourself, it seems to me that the only safe way to deal with the Decimal type is to use the Decimal.GetBits() method and Decimal.Decimal(int[]) constructor. These use clearly-documented, endian-independent mechanisms to convert the Decimal type. They are based on int, the in-memory representation of which will of course vary according to endianness, but your code will never need to worry about that, because it will only have to deal with entire int values, not their byte-wise representations.
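A rough sketch of that approach, assuming you want a fixed little-endian byte order on the wire (that choice, the DecimalCodec class name and the 16-byte format are my assumptions, not anything the framework mandates). It only ever touches whole int values from Decimal.GetBits() and lets BinaryPrimitives handle the byte order explicitly.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper, for illustration only.
public static class DecimalCodec
{
    // Writes 16 bytes: the four ints from GetBits (lo, mid, hi, flags),
    // each in little-endian order regardless of the host CPU.
    public static void Write(decimal value, Span<byte> destination)
    {
        int[] bits = decimal.GetBits(value);
        for (int i = 0; i < 4; i++)
        {
            BinaryPrimitives.WriteInt32LittleEndian(destination.Slice(i * 4, 4), bits[i]);
        }
    }

    // Reads the same 16-byte format back into a decimal.
    public static decimal Read(ReadOnlySpan<byte> source)
    {
        var bits = new int[4];
        for (int i = 0; i < 4; i++)
        {
            bits[i] = BinaryPrimitives.ReadInt32LittleEndian(source.Slice(i * 4, 4));
        }
        return new decimal(bits);
    }
}
```

Usage would be something like: var buffer = new byte[16]; DecimalCodec.Write(123.456m, buffer); decimal roundTripped = DecimalCodec.Read(buffer);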

Related

I just noticed I get different hash codes from objects depending on whether I build for x86 or x64. Can I do that as well?

I noticed that hash codes I got from other objects were different when I built for either x86 or x64.
Up until now I have implemented most of my own hashing functions like this:
int someIntValueA;
int someIntValueB;
const int SHORT_MASK = 0xFFFF;
public override int GetHashCode()
{
return (someIntValueA & SHORT_MASK) + ((someIntValueB & SHORT_MASK) << 16);
}
Will storing the values in a long and getting the hashcode from that give me a wider range as well on 64-bit systems, or is this a bad idea?
public override int GetHashCode()
{
long maybeBiggerSpectrumPossible = someIntValueA + (someIntValueB << 32);
return maybeBiggerSpectrumPossible.GetHashCode();
}

No, that will be far worse.
Suppose your int values are typically in the range of a short: between -30000 and +30000. And suppose further that most of them are near the middle, say, between 0 and 1000. That's pretty typical. With your first hash code you get all the bits of both ints into the hash code and they don't interfere with each other; the number of collisions is zero under typical conditions.
But when you do your trick with a long, then you rely on what the long implementation of GetHashCode does, which is xor the upper 32 bits with the lower 32 bits. So your new implementation is just a slow way of writing int1 ^ int2. Which, in the typical scenario has almost all zero bits, and hence collisions all over the place.
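To make the comparison concrete, here is a small sketch. The HashComparison class and both method names are made up for illustration, and the fold is written out by hand to mirror the xor described above rather than calling long.GetHashCode itself. Note also that the shift needs a cast to long first: in C#, shifting an int by 32 uses only the low five bits of the count, so the value would not move at all.

```csharp
// Hypothetical helper, for illustration only.
public static class HashComparison
{
    private const int ShortMask = 0xFFFF;

    // The original approach: low 16 bits of each value, side by side.
    public static int PackedHash(int a, int b) =>
        (a & ShortMask) + ((b & ShortMask) << 16);

    // The "long" approach: pack into 64 bits, then xor the two halves
    // back together, which for small non-negative inputs is just a ^ b.
    public static int FoldedHash(int a, int b)
    {
        unchecked
        {
            long combined = a + ((long)b << 32);
            return (int)combined ^ (int)(combined >> 32);
        }
    }
}
```

With small inputs, FoldedHash(1, 2) and FoldedHash(2, 1) both come out as 3, while PackedHash gives 131073 and 65538 for the same two pairs.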

The approach you suggest won't make anything any better (quite the opposite).
However…
SpookyHash, for example, is designed to work particularly quickly on 64-bit systems, because when working out the math the author was thinking about what would be fast on a 64-bit system. xxHash has 32-bit and 64-bit variants that are designed to give comparable quality of hash at better speed for 32-bit and 64-bit computation respectively.
The general idea of making use of the differing performance of different arithmetic operations on different machines is a valid one.
And your general idea of making use of a larger intermediary storage in hash calculation is also a valid one as long as those extra bits make their way into subsequent operations.
So at a very general level, the answer is yes, even if your particular implementation fails to come through with that.
Now, in practice, when you're sitting down to write a hashcode implementation should you worry about this?
Well, it depends. For a while I was very bullish about using algorithms like SpookyHash, and it does very well (even on 32-bit systems) when the hash is based on a large amount of source data. But on the other hand it can be better, especially when used with smaller hash-based sets and dictionaries, to be crappy really fast than fantastic slowly. So there isn't a one-size-fits-all answer. With just two input integers your initial solution is likely to beat a heavily avalanching algorithm like xxHash or SpookyHash for many uses. You could perhaps do better if you also had a >> 16 to rotate rather than shift (fun fact: some jitters are optimised for that), but we're not touching on 64- vs 32-bit versions in that at all.
The cases where you do find a big possible improvement with taking a different approach in 64- and 32-bit are where there's a large amount of data to mix in, especially if it's in a blittable form (like string or byte[]) that you can access via a long* or int* depending on framework.
So, generally you can ignore the question of bitness, but if you find yourself thinking "this hashcode has to go through so much stuff to get an answer; can I make it better?" then maybe it's time to consider such matters.
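For what the rotate variant mentioned above might look like, here is one possible sketch. The RotateHash name is hypothetical, and combining with xor is my own choice rather than anything from the question. The point is only that a rotate keeps the upper 16 bits of the second value in play, where a plain mask-and-shift throws them away.

```csharp
// Hypothetical helper, for illustration only.
public static class RotateHash
{
    public static int Combine(int a, int b)
    {
        unchecked
        {
            uint ub = (uint)b;
            // Rotate left by 16: the two halves of b swap places,
            // so no bits are lost.
            uint rotated = (ub << 16) | (ub >> 16);
            return (int)((uint)a ^ rotated);
        }
    }
}
```

For example, Combine(0, 0x10000) is 1, whereas the masked shift in the question would map 0x10000 to 0.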

do datatype choices affect performance?

I have an object model that I use to fill results from a query and that I then pass along to a gridview.
Something like this:
public class MyObjectModel
{
public int Variable1 {get;set;}
public int VariableN {get;set;}
}
Let's say Variable1 holds a count, and I know that the count will never become very large (i.e. the number of upcoming appointments for a certain day). For now, I've declared these as int. Let's say it's safe to assume that someone will book fewer than 255 appointments per day. Will changing the datatype from int to byte affect performance much? Is it worth the trouble?
Thanks

No, performance will not be affected much at all.
For each int you will be saving 3 bytes, or 6 in total for the specific example. Unless you have many millions of these, the savings in memory are very small.
Not worth the trouble.
Edit:
Just to clarify - my answer is specifically about the example code. In many cases the choices will make a difference, but it is a matter of scale and will require performance testing to ensure correct results.
To answer @Filip's comment - There is a difference between compiling an application to 64bit and selecting an isolated data type.

Using an integer variable smaller than an int (System.Int32) will not provide any performance benefits. This is because most integer operations in the CLR will promote the variable to an int prior to performing the operation. int is considered the "natural" integer size on the systems for which the CLR was developed.
Consider the following code:
for (byte appointmentIndex = 0; appointmentIndex < Variable1; appointmentIndex++)
ProcessAppointment(appointmentIndex);
In the compiled code, the comparison (appointmentIndex < Variable1) and the increment (appointmentIndex++) will (most likely) be performed using 32-bit integers. Even if the optimizer uses a smaller data type, the CPU itself will require additional work to use the smaller data type.
If you are storing an array of values, then using a smaller data type could help save space, which might give a performance advantage in some scenarios.
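As a quick illustration of the array case, Buffer.ByteLength reports how many bytes the elements of a primitive array occupy. The StorageDemo wrapper is just a hypothetical name for this sketch, and whether the smaller footprint turns into a measurable speedup is something only profiling can tell you.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class StorageDemo
{
    // Number of bytes taken by the elements of a primitive array
    // (object header and length field not included).
    public static int PayloadBytes(Array values) => Buffer.ByteLength(values);
}
```

For 1000 elements this gives 1000 bytes for a byte[] and 4000 bytes for an int[].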

It will affect the amount of memory allocated for that variable. In my personal opinion, I don't think it's worth the trouble in the example case.
If there were a huge number of variables, or a database table where you could really save, then yes, but not in this case.
Besides, after years of maintenance programming, I can safely say that it's rarely safe to assume an upper limit on anything. If there's even a remote chance that some poor maintenance programmer is going to have to re-write the app because of trying to save a trivial amount of resources, it's not worth the pay-off.

The .NET runtime optimizes the use of Int32 especially for counters etc.
See the related question: .NET Integer vs Int16?

Contrary to popular belief, making your data type smaller does not make access faster. In fact, it's slower. Look at bool: on the CLR evaluation stack it is handled as an int.
This is because internally, your CPU works with native-word-sized registers (32/64 bit these days), and you're forcing it to convert your data back and forth for no reason (well only when writing the result in memory, but it's still a penalty you could easily avoid).
Fiddling with integer widths only affects memory access, and caching specifically. This is the kind of stuff you can only figure out by profiling your application and looking at page fault counters in particular.

I agree with the other answers that performance won't be worth it. But if you're going to do it at all, go with a short instead of a byte. My rule of thumb is to pick the highest number you can imagine, multiply by 10, then use that as the basis to pick your value. So if you can't possibly imagine a value higher than 200, then use 2000 as your basis, which would mean you'd need a short.

Why is writing to a 24-bit struct not atomic (when writing to a 32-bit struct appears to be)?

I am a tinkerer—no doubt about that. For this reason (and very little beyond that), I recently did a little experiment to confirm my suspicion that writing to a struct is not an atomic operation, which means that a so-called "immutable" value type which attempts to enforce certain constraints could hypothetically fail at its goal.
I wrote a blog post about this using the following type as an illustration:
struct SolidStruct
{
public SolidStruct(int value)
{
X = Y = Z = value;
}
public readonly int X;
public readonly int Y;
public readonly int Z;
}
While the above looks like a type for which it could never be true that X != Y or Y != Z, in fact this can happen if a value is "mid-assignment" at the same time it is copied to another location by a separate thread.
OK, big deal. A curiosity and little more. But then I had this hunch: my 64-bit CPU should actually be able to copy 64 bits atomically, right? So what if I got rid of Z and just stuck with X and Y? That's only 64 bits; it should be possible to overwrite those in one step.
Sure enough, it worked. (I realize some of you are probably furrowing your brows right now, thinking, Yeah, duh. How is this even interesting? Humor me.) Granted, I have no idea whether this is guaranteed or not given my system. I know next to nothing about registers, cache misses, etc. (I am literally just regurgitating terms I've heard without understanding their meaning); so this is all a black box to me at the moment.
The next thing I tried—again, just on a hunch—was a struct consisting of 32 bits using 2 short fields. This seemed to exhibit "atomic assignability" as well. But then I tried a 24-bit struct, using 3 byte fields: no go.
Suddenly the struct appeared to be susceptible to "mid-assignment" copies once again.
Down to 16 bits with 2 byte fields: atomic again!
Could someone explain to me why this is? I've heard of "bit packing", "cache line straddling", "alignment", etc.—but again, I don't really know what all that means, nor whether it's even relevant here. But I feel like I see a pattern, without being able to say exactly what it is; clarity would be greatly appreciated.

The pattern you're looking for is the native word size of the CPU.
Historically, the x86 family worked natively with 16-bit values (and before that, 8-bit values). For that reason, your CPU can handle these atomically: it's a single instruction to set these values.
As time progressed, the native element size increased to 32 bits, and later to 64 bits. In every case, an instruction was added to handle this specific amount of bits. However, for backwards compatibility, the old instructions were still kept around, so your 64-bit processor can work with all of the previous native sizes.
Since your struct elements are stored in contiguous memory (without padding, i.e. empty space), the runtime can exploit this knowledge to only execute that single instruction for elements of these sizes. Put simply, that creates the effect you're seeing, because the CPU can only execute one instruction at a time (although I'm not sure if true atomicity can be guaranteed on multi-core systems).
However, the native element size was never 24 bits. Consequently, there is no single instruction to write 24 bits, so multiple instructions are required for that, and you lose the atomicity.

The C# standard (ISO 23270:2006, ECMA-334) has this to say regarding atomicity:
12.5 Atomicity of variable references
Reads and writes of the following data types shall be atomic: bool, char, byte, sbyte, short, ushort, uint, int, float, and reference types. In addition, reads and writes of enum types with an underlying type in the previous list shall also be atomic. Reads and writes of other types, including long, ulong, double, and decimal, as well as user-defined types, need not be atomic. (emphasis mine) Aside from the library functions designed for that purpose, there is no guarantee of atomic read-modify-write, such as in the case of increment or decrement.
Your example X = Y = Z = value is shorthand for 3 separate assignment operations, each of which is defined to be atomic by 12.5. The sequence of 3 operations (assign value to Z, assign Z to Y, assign Y to X) is not guaranteed to be atomic.
Since the language specification doesn't mandate atomicity, while X = Y = Z = value; might be an atomic operation, whether it is or not is dependent on a whole bunch of factors:
the whims of the compiler writers
what code-generation optimization options, if any, were selected at build time
the details of the JIT compiler responsible for turning the assembly's IL into machine language. Identical IL run under Mono, say, might exhibit different behaviour than when run under .NET 4.0 (and that might even differ from earlier versions of .NET).
the particular CPU on which the assembly is running.
One might also note that even a single machine instruction is not necessarily warranted to be an atomic operation: many are interruptible.
Further, visiting the CLI standard (ISO 23271:2006), we find section 12.6.6:
12.6.6 Atomic reads and writes
A conforming CLI shall guarantee that read and write access to properly
aligned memory locations no larger than the native word size (the size of type
native int) is atomic (see §12.6.2) when all the write accesses to a location are
the same size. Atomic writes shall alter no bits other than those written. Unless
explicit layout control (see Partition II (Controlling Instance Layout)) is used to
alter the default behavior, data elements no larger than the natural word size (the
size of a native int) shall be properly aligned. Object references shall be treated
as though they are stored in the native word size.
[Note: There is no guarantee
about atomic update (read-modify-write) of memory, except for methods provided for
that purpose as part of the class library (see Partition IV). (emphasis mine)
An atomic write of a “small data item” (an item no larger than the native word size)
is required to do an atomic read/modify/write on hardware that does not support direct
writes to small data items. end note]
[Note: There is no guaranteed atomic access to 8-byte data when the size of
a native int is 32 bits even though some implementations might perform atomic
operations when the data is aligned on an 8-byte boundary. end note]
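If you actually need two ints to change together, one way to stay inside those guarantees is to pack them into a single long and go through Interlocked, which is one of the class library methods "provided for that purpose". The AtomicPair class below is a hypothetical sketch of that idea, not something from the question.

```csharp
using System.Threading;

// Hypothetical helper, for illustration only.
public sealed class AtomicPair
{
    private long _packed;

    // Publishes both values in one atomic 64-bit write.
    public void Set(int x, int y)
    {
        unchecked
        {
            // x goes in the upper 32 bits, y (zero-extended) in the lower 32.
            long packed = ((long)x << 32) | (uint)y;
            Interlocked.Exchange(ref _packed, packed);
        }
    }

    // Reads both values in one atomic 64-bit read, even in a 32-bit process.
    public (int X, int Y) Get()
    {
        long packed = Interlocked.Read(ref _packed);
        return ((int)(packed >> 32), unchecked((int)packed));
    }
}
```

A reader calling Get() then sees either the old pair or the new pair, never a mix of the two.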

x86 CPU operations take place in 8, 16, 32, or 64 bits; manipulating other sizes requires multiple operations.

The compiler and x86 CPU are going to be careful to move only exactly as many bytes as the structure defines. There are no x86 instructions that can move 24 bits in one operation, but there are single instruction moves for 8, 16, 32, and 64 bit data.
If you add another byte field to your 24 bit struct (making it a 32 bit struct), you should see your atomicity return.
Some compilers allow you to define padding on structs to make them behave like native register sized data. If you pad your 24 bit struct, the compiler will add another byte to "round up" the size to 32 bits so that the whole structure can be moved in one atomic instruction. The downside is that your structure will always occupy about 33% more space in memory (4 bytes instead of 3).
Note that alignment of the structure in memory is also critical to atomicity. If a multibyte structure does not begin at an aligned address, it may span multiple cache lines in the CPU cache. Reading or writing this data will require multiple clock cycles and multiple read/writes even though the opcode is a single move instruction. So, even single instruction moves may not be atomic if the data is misaligned. x86 does guarantee atomicity for native sized read/writes on aligned boundaries, even in multicore systems.
It is possible to achieve memory atomicity with multi-step moves using the x86 LOCK prefix. However, this should be avoided as it can be very expensive in multicore systems: LOCK not only blocks other cores from accessing memory, it also locks the system bus for the duration of the operation, which can impact disk I/O and video operations. LOCK may also force the other cores to purge their local caches.
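In C#, the padding described above can be requested with StructLayout. The sketch below uses hypothetical struct names and only shows the sizes involved. Whether the JIT then copies the padded struct with a single 32-bit move is an implementation detail that nothing guarantees.

```csharp
using System.Runtime.InteropServices;

// Three byte fields, no padding: 24 bits.
[StructLayout(LayoutKind.Sequential)]
public struct ThreeBytes
{
    public byte A, B, C;
}

// Same fields, but the total size is forced up to 4 bytes.
[StructLayout(LayoutKind.Sequential, Size = 4)]
public struct PaddedThreeBytes
{
    public byte A, B, C;
}

// Hypothetical helper, for illustration only.
public static class StructSizes
{
    public static int Unpadded() => Marshal.SizeOf<ThreeBytes>();
    public static int Padded() => Marshal.SizeOf<PaddedThreeBytes>();
}
```

Here Unpadded() reports 3 and Padded() reports 4.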

using uint vs int [closed]

I have observed for a while that C# programmers tend to use int everywhere, and rarely resort to uint. But I have never discovered a satisfactory answer as to why.
If interoperability is your goal, uint shouldn't appear in public APIs because not all CLI languages support unsigned integers. But that doesn't explain why int is so prevalent, even in internal classes. I suspect this is the reason uint is used sparingly in the BCL.
In C++, if you have an integer for which negative values make no sense, you choose an unsigned integer.
This clearly signifies that negative numbers are not allowed or expected, and the compiler will do some checking for you. I also suspect in the case of array indices, that the JIT can easily drop the lower bounds check.
However, when mixing int and uint types, extra care and casts will be needed.
Should uint be used more? Why?

int is shorter to type than uint.

Your observation of why uint isn't used in the BCL is the main reason, I suspect.
UInt32 is not CLS Compliant, which means that it is wholly inappropriate for use in public APIs. If you're going to be using uint in your private API, this will mean doing conversions to other types - and it's typically easier and safer to just keep the type the same.
I also suspect that this is not as common in C# development, even when C# is the only language being used, primarily because it is not common in the BCL. Developers, in general, try to (thankfully) mimic the style of the framework on which they are building - in C#'s case, this means trying to make your APIs, public and internal, look as much like the .NET Framework BCL as possible. This would mean using uint sparingly.

Normally int will suffice. If you can satisfy all of the following conditions, you can use uint:
It is not for a public API (since uint is not CLS compliant).
You don't need negative numbers.
You (might) need the additional range.
You are not using it in a comparison with < 0, as that is never true.
You are not using it in a comparison with >= 0, as that is never false.
The last requirement is often forgotten and will introduce bugs:
static void Main(string[] args)
{
if (args.Length == 0) return;
uint last = (uint)(args.Length - 1);
// This will eventually throw an IndexOutOfRangeException:
for (uint i = last; i >= 0; i--)
{
Console.WriteLine(args[i]);
}
}
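For contrast, here is one way to write the same reverse loop so that the uint counter never has to go below zero. The ReverseLoop class and method name are hypothetical. The trick is to let i run from Length down to 1 and index with i - 1.

```csharp
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class ReverseLoop
{
    public static List<string> InReverse(string[] args)
    {
        var result = new List<string>();
        // i > 0 can become false, unlike i >= 0, so the loop terminates
        // and i - 1 is always a valid index.
        for (uint i = (uint)args.Length; i > 0; i--)
        {
            result.Add(args[i - 1]);
        }
        return result;
    }
}
```

This also handles an empty array without a special case, since the loop body simply never runs.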

1) Bad habit. Seriously. Even in C/C++.
Think of the common for pattern:
for( int i=0; i<3; i++ )
foo(i);
There's absolutely no reason to use a signed integer there. You will never have negative values. But almost everyone will do a simple loop that way, even if it contains (at least) two other "style" errors.
2) int is perceived as the native type of the machine.

I prefer uint to int unless a negative number is actually in the range of acceptable values. In particular, accepting an int param but throwing an ArgumentException if the number is less than zero is just silly--use a uint!
I agree that uint is underused, and I encourage everyone else to use it more.

I program at a lower-level application layer where ints rarely get above 100, so negative values are not an issue (e.g. for i < myname.length() type stuff). It's just an old C habit, and shorter to type, as mentioned above. However, in some cases, when interfacing to hardware where I'm dealing with event flags from devices, the uint is important in cases where a flag may use the leftmost (highest) bit.
Honestly, for 99.9% of my work I could easily use ushort, but int, you know, sounds a lot better than ushort.

I have made a Direct3D 10 wrapper in C# & need to use uint if I want to create very large vertex buffers. Large buffers in the video card can not be represented with a signed int.
UINT is very useful & it is silly to say otherwise. If anyone thinks that just because they have never needed to use uint no one else will, they are wrong.

I think it is just laziness. C# is inherently a choice for development on desktops and other machines with relatively plentiful resources.
C and C++, however, have deep roots in old systems and embedded systems where memory is scarce, so programmers are used to thinking carefully about which datatype to use.
C# programmers are lazy, and since there are enough resources in general, nobody really optimizes memory usage (in general, not always of course). Even if a byte would be sufficient, a lot of C# programmers, including me, just use int for simplicity. Moreover, a lot of API functions accept ints, so it avoids casting.
I agree that choosing the correct datatype is good practice, but I think the main motivation is laziness.
Finally, choosing an integer is more mathematically correct. Unsigned ints don't exist in math (only natural numbers). And since most programmers have a mathematical background, using an integer is more natural.

I think a big part of the reason is that when C first came out most of the examples used int for brevity's sake. We rejoiced at not having to write integer like we did with Fortran and Pascal, and in those days we routinely used them for mundane things like array indices and loop counters. Unsigned integers were special cases for large numbers that needed that last extra bit. I think it's a natural progression that C habits continued into C# and other new languages like Python.

Some languages (e.g. many versions of Pascal) regard unsigned types as representing numeric quantities; an operation between an unsigned type and a signed type of the same size will generally be performed as though the operands were promoted to the next larger type (in some such languages, the largest type has no unsigned equivalent, so such promotion will always be possible).
Other languages (e.g. C) regard N-bit unsigned types as a group which wraps around modulo 2^N. Note that subtracting N from a member of such a group doesn't represent numerical subtraction, but rather yields the group member which, when N is added to it, would yield the original. Arguably, certain operations involving mixtures of signed and unsigned values don't really make sense and should perhaps have been forbidden, but even code which is sloppy with its specifications of things like numeric literals will usually work, and enough code has been written which mixes signed and unsigned types and, despite being sloppy, does work, that the spec isn't apt to change any time soon.
It's a lot easier to work exclusively with signed types than to work out all the intricacies of interactions between signed and unsigned types. Unsigned types are useful when decomposing large numbers out of smaller pieces (e.g. for serialization) or for reconstituting such numbers, but in general it's better to simply use signed numbers for things that actually represent quantities.

I know this is probably an old thread but I wanted to give some clarification.
Let's take an int8: it can store -128 to 127 and uses 1 byte, which gives a total of 127 positive numbers.
When you use an int8, one of the bits is used as the sign bit, which is what makes the negative numbers down to -128 possible.
When you use a uint8, you give the negative range to the positive side, so this allows you to use 255 positive numbers with the same amount of storage, 1 byte.
The only drawback is that you have now lost the capability to use negative values.
Another problem with this is that not all programming languages and databases support it.
The only reason you would use this, in my opinion, is when you need to be efficient, as in game programming, and you have to store large non-negative numbers.
This is why not many programs use it.
The main reason is that storage is not a problem, and you can't use it flexibly with other software, plugins, databases, or APIs. Also, for example, a bank would need negative numbers to store money, etc.
I hope this will help someone.
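In C# terms, int8 and uint8 are sbyte and byte. A tiny sketch (the ByteRanges class is a hypothetical name) showing that both types hold the same 8 bits and only the interpretation of the top bit differs:

```csharp
// Hypothetical helper, for illustration only.
public static class ByteRanges
{
    // Reinterprets the same 8 bits as a signed value (-128 to 127).
    public static sbyte AsSigned(byte raw) => unchecked((sbyte)raw);

    // Reinterprets the same 8 bits as an unsigned value (0 to 255).
    public static byte AsUnsigned(sbyte value) => unchecked((byte)value);
}
```

So AsSigned(255) is -1 and AsSigned(128) is -128, while AsUnsigned(-1) is 255.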

.NET Integer vs Int16?

I have a questionable coding practice.
When I need to iterate through a small list of items whose count limit is under 32000, I use Int16 for my i variable type instead of Integer. I do this because I assume using the Int16 is more efficient than a full blown Integer.
Am I wrong? Is there no effective performance difference between using an Int16 vs an Integer? Should I stop using Int16 and just stick with Integer for all my counting/iteration needs?

You should almost always use Int32 or Int64 (and, no, you do not get credit by using UInt32 or UInt64) when looping over an array or collection by index.
The most obvious reason that it's less efficient is that all array and collection indexes found in the BCL take Int32s, so an implicit cast is always going to happen in code that tries to use Int16s as an index.
The less-obvious reason (and the reason that arrays take Int32 as an index) is that the CIL specification says that all operation-stack values are either Int32 or Int64. Every time you either load or store a value to any other integer type (Byte, SByte, UInt16, Int16, UInt32, or UInt64), there is an implicit conversion operation involved. Unsigned types have no penalty for loading, but for storing the value, this amounts to a truncation and a possible overflow check. For the signed types every load sign-extends, and every store sign-collapses (and has a possible overflow check).
The place that this is going to hurt you most is the loop itself, not the array accesses. For example take this innocent-looking loop:
for (short i = 0; i < 32000; i++) {
...
}
Looks good, right? Nope! You can basically ignore the initialization (short i = 0) since it only happens once, but the comparison (i<32000) and incrementing (i++) parts happen 32000 times. Here's some pseudo-code for what this thing looks like at the machine level:
Int16 i = 0;
LOOP:
Int32 temp0 = Convert_I16_To_I32(i); // !!!
if (temp0 >= 32000) goto END;
...
Int32 temp1 = Convert_I16_To_I32(i); // !!!
Int32 temp2 = temp1 + 1;
i = Convert_I32_To_I16(temp2); // !!!
goto LOOP;
END:
There are 3 conversions in there that are run 32000 times. And they could have been completely avoided by just using an Int32 or Int64.
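You can see the promotion from C# itself: adding 1 to a short produces an int, which is why i++ on a short has to narrow the result again. The sketch below uses hypothetical names and does not try to measure anything. It just shows the typing and that both loop styles compute the same result, so timing claims would still need a real benchmark.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class CounterWidthDemo
{
    // The static type of (short + int literal) is int, not short.
    public static Type TypeOfShortPlusOne()
    {
        short i = 0;
        var next = i + 1;
        return next.GetType();
    }

    // Assumes data.Length <= short.MaxValue; i is widened for the
    // comparison and the index, then narrowed again by i++.
    public static int SumWithShortCounter(int[] data)
    {
        int total = 0;
        for (short i = 0; i < data.Length; i++) total += data[i];
        return total;
    }

    // Same loop with an int counter: no widening or narrowing needed.
    public static int SumWithIntCounter(int[] data)
    {
        int total = 0;
        for (int i = 0; i < data.Length; i++) total += data[i];
        return total;
    }
}
```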
Update: As I said in the comment, I have now in fact written a blog post on this topic: .NET Integral Data Types And You.

According to the below reference, the runtime optimizes performance of Int32 and recommends them for counters and other frequently accessed operations.
From the book: MCTS Self-Paced Training Kit (Exam 70-536): Microsoft® .NET Framework 2.0—Application Development Foundation
Chapter 1: "Framework Fundamentals"
Lesson 1: "Using Value Types"
Best Practices: Optimizing performance with built-in types
The runtime optimizes the performance of 32-bit integer types (Int32 and UInt32), so use those types for counters and other frequently accessed integral variables.
For floating-point operations, Double is the most efficient type because those operations are optimized by hardware.
Also, Table 1-1 in the same section lists recommended uses for each type.
Relevant to this discussion:
Int16 - Interoperation and other specialized uses
Int32 - Whole numbers and counters
Int64 - Large whole numbers

Int16 may actually be less efficient because the x86 instructions for word access take up more space than the instructions for dword access. It will depend on what the JIT does. But no matter what, it's almost certainly not more efficient when used as the variable in an iteration.

The opposite is true.
32 (or 64) bit integers are faster than Int16. In general the native datatype is the fastest one.
Int16 values are nice if you want to make your data structures as lean as possible. This saves space and may improve performance.

Never assume efficiency.
What is or isn't more efficient will vary from compiler to compiler and platform to platform. Unless you actually tested this, there is no way to tell whether int16 or int is more efficient.
I would just stick with ints unless you come across a proven performance problem that using int16 fixes.

Any performance difference is going to be so tiny on modern hardware that for all intents and purposes it'll make no difference. Try writing a couple of test harnesses and run them both a few hundred times, take the average loop completion times, and you'll see what I mean.
It might make sense from a storage perspective if you have very limited resources - embedded systems with a tiny stack, wire protocols designed for slow networks (e.g. GPRS etc), and so on.

Use Int32 on 32-bit machines (or Int64 on 64-bit machines) for fastest performance. Use a smaller integer type if you're really concerned about the space it takes up (may be slower, though).

The others here are correct: only use less than Int32 (for 32-bit code)/Int64 (for 64-bit code) if you need it for extreme storage requirements, or for another level of enforcement on a business object field (you should still have property-level validation in this case, of course).
And in general, don't worry about efficiency until there is a performance problem. And in that case, profile it. And if guessing and checking both ways while profiling doesn't help you enough, check the IL code.
Good question though. You're learning more about how the compiler does its thing. If you want to learn to program more efficiently, learning the basics of IL and how the C#/VB compilers do their job would be a great idea.

I can't imagine there being any significant performance gain on Int16 vs. int.
You save some bits in the variable declaration.
And definitely not worth the hassle when the specs change and whatever you are counting can go above 32767 now and you discover that when your application starts throwing exceptions...

There is no significant performance gain in using a data type smaller than Int32. In fact, I read somewhere that using Int32 will be faster than Int16 because of memory allocation.
