Fastest UInt32 to int conversion in C#

I have a fast bit-level routine that calculates a value and returns a UInt32. I need to store this value in SQL Server in a 32-bit int field. I don't want to increase the size of the field; I just want to store the "bytes" from this function in the int field.
Hundreds of these records are requested at a time, so I need the fastest way to convert a UInt32 to int in a loop. If the left-most bit is set in the UInt32, it should set the sign bit of the int (or do anything "repeatable", really, but the sign bit would probably be easiest).
In other words, I just want the 4 bytes of a UInt32 to become the 4 bytes of a 32-bit int. I could use the BitConverter class, but I'm not sure that's the fastest way. Would it be faster to do this with an unchecked block like this:
UInt32 fld = 4292515959;
unchecked {
return (int)fld;
// -2451337
}
I see the reverse question has been asked here, and was just wondering if the answer would be the same going the other way:
Fastest way to cast int to UInt32 bitwise?

I'd say the unchecked version (like unchecked((int)x)) is the fastest way, since there's no method call. I don't believe there's a faster way.
By the way, UInt32 is just another name for uint... going one way is the same as going another way in terms of performance, so this is really the same as the link you posted.
Edit: I remember observing first-hand a benchmark where checked was faster than unchecked (and no, it wasn't a debug build, it was a release build with optimizations). I don't know why that happened, but in any case, don't expect to gain anything measurable by turning off overflow checking.
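For the loop scenario in the question, a minimal sketch of what that looks like in practice (the sample values and array names are mine, not from the original post):
UInt32[] values = { 4292515959u, 0u, 123456u };   // sample inputs
int[] forSqlServer = new int[values.Length];
unchecked
{
    for (int i = 0; i < values.Length; i++)
    {
        // Same 4 bytes, reinterpreted: 4292515959 comes out as -2451337.
        forSqlServer[i] = (int)values[i];
    }
}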

unchecked((int)x) is required only when casting constants; otherwise checked and unchecked produce the same results (if the code compiles at all).
For example this code
uint data = 4292515959;
int uncheckedData;
int checkedData;
unchecked {
uncheckedData = (int)data;
}
checkedData = (int)data;
Console.WriteLine(data);
Console.WriteLine(uncheckedData);
Console.WriteLine(checkedData);
produces this output
4292515959
-2451337
-2451337
To be more precise, this code compiles (with the same result as unchecked((int)data)):
uint data = 4292515959;
checkedData = (int)data;
This code (note the const) can't be compiled (requires unchecked)
const uint data = 4292515959;
checkedData = (int)data;
This code can't be compiled either (it requires unchecked):
checkedData = (int)4292515959;
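For completeness, a sketch of how the const case compiles once the cast is wrapped in unchecked (same constant as above):
const uint data = 4292515959;
int checkedData = unchecked((int)data);   // compiles; checkedData == -2451337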

What is the fastest way to convert int to char

What is the fastest way to convert int to char?
I need a faster way because
convertingchar = Convert.ToChar(intvalue);
is the slowest part of my program.
I have multiple different int values that have to be converted to char.
My project is very big and I can't post my actual function; that's why I'm posting this test function.
My code so far:
char convertingchar = ' ';
...some code...
public void convert(int intvalue) {
    convertingchar = Convert.ToChar(intvalue);
}
Running a quick performance test between your Convert.ToChar approach and the casting one mentioned in the other answers, over 65535 iterations (char.MaxValue) I find:
Convert: 00:00:00.0005447 total
Cast: 00:00:00.0003663 total
At its best, cast was running for me in about half the time of convert.
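The timing code itself isn't shown above; the following is a rough sketch of that kind of micro-benchmark, looping once over every value up to char.MaxValue (the class and variable names are mine, not from the original test):
using System;
using System.Diagnostics;

class CharConversionBenchmark
{
    static void Main()
    {
        char sink = ' ';

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < char.MaxValue; i++) sink = Convert.ToChar(i);
        sw.Stop();
        Console.WriteLine($"Convert: {sw.Elapsed} total");

        sw.Restart();
        for (int i = 0; i < char.MaxValue; i++) sink = (char)i;
        sw.Stop();
        Console.WriteLine($"Cast: {sw.Elapsed} total");

        Console.WriteLine(sink);   // use the result so the loops aren't optimized away
    }
}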
The implementation of Convert.ToChar, as found in the Reference Source, reveals where the time goes:
public static char ToChar(int value) {
    if (value < 0 || value > Char.MaxValue) throw new OverflowException(Environment.GetResourceString("Overflow_Char"));
    Contract.EndContractBlock();
    return (char)value;
}
While these checks do certainly serve a purpose, your particular use-cases, especially in such a performance-critical situation, may not require them. That's up to you.
A nice alternative to enforcing these checks at runtime would be to use a ushort rather than an int. Obviously that may or may not be attainable, but since it has the same maximum value, you get compile-time checking for what you were previously depending on ToChar to perform.
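A sketch of that ushort variant, assuming the value can be typed as ushort where it is produced:
ushort intvalue = 122;                  // 0..65535 is enforced by the type itself
char convertingchar = (char)intvalue;   // no runtime range check needed; yields 'z'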
Convert.ToChar eventually performs an explicit conversion as (char)value, where value is your int value. Before doing so, it checks to ensure value is in the range 0 to 0xffff, and throws an OverflowException if it is not. The extra method call, value/boundary checks, and OverflowException may be useful, but if not, the performance will be better if you just use (char)value.
This will make sure everything is OK while converting, but it takes some time to make sure of that:
convertingchar = Convert.ToChar(intvalue);
This will convert without making sure everything is OK, so it takes less time:
convertingchar = (char)intvalue;
For example:
Console.WriteLine("(char)122 is {0}", (char)122);
yields:
(char)122 is z
NOTE
Not related to the question directly, but if you feel that the conversion is slow then you might be doing something wrong. The real question is why you need to convert that many ints to chars, and what you are trying to achieve. There might be a better way.
The fastest way, contrary to what the others have noted, is to not run any code at all. In all the other cases there is memory allocated for the int and memory allocated for the char, so the best that can be achieved is to simply copy the int to the char's address.
The following, however, is faster still, since no conversion code is run at all:
[StructLayout(LayoutKind.Explicit)]
public struct Foo
{
    [FieldOffset(0)]
    public int Integer;
    [FieldOffset(0)]
    public char Char;
}
https://msdn.microsoft.com/en-us/library/aa288471%28v=vs.71%29.aspx
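A sketch of how the overlapped struct might be used (it needs using System.Runtime.InteropServices for the attributes; whether this actually beats a plain cast would have to be measured):
Foo foo = default(Foo);
foo.Integer = 122;
char c = foo.Char;         // reads the low two bytes of the int: 'z' on little-endian
Console.WriteLine(c);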

Assigning value from struct to variable fails

Edit 3 describes the narrowed-down problem after debugging.
Edit 4 contains the solution - it's all about a type difference between C and C#.
Today I came across a curious problem. In C I have the following struct:
typedef struct s_z88i2
{
    long Node;
    long DOF;
    long TypeFlag;
    double CValue;
} s_z88i2;
Furthermore I have a function (this is a simplified version):
DLLIMPORT int pass_i2(s_z88i2 inp, int total)
{
    long nkn = 0, ifg = 0, iflag1 = 0;
    double wert = 0;
    int val;
    // Testplace 1
    nkn = inp.Node;
    ifg = inp.DOF;
    iflag1 = inp.TypeFlag;
    wert = inp.CValue;
    // Testplace 2
    return 0;
}
The assigned values are used nowhere - I'm aware of that.
When I reach // Testplace 1 the following statement is executed:
char tmpstr[256];
sprintf(tmpstr,"RB, Node#: %li, DOF#: %li, Type#: %li, Value: %f", inp.Node, inp.DOF, inp.TypeFlag, inp.CValue);
tmpstr is then passed to a message box. It shows - as one would expect - the values of the struct I passed to the function, in a nice and orderly way. Moving on through the function, the values inside the struct get assigned to some variables. On reaching Testplace 2, the following is executed:
sprintf(tmpstr,"RB, Node#: %li, DOF#: %li, Type#: %li, Value: %f",nkn, ifg, iflag1, wert);
Again, tmpstr is passed to a message box. However, this doesn't show what one would expect. The values for Node and Type are still correct. For DOF and Value, the displayed values are 0, which leads me to the conclusion that something is going terribly wrong while assigning the values. I somehow sometimes managed to get a way too long number for Value, which was just as incorrect as 0, but I have not been able to reproduce that mistake during my last tests.
Possible values for inp are e.g. {2,1,1,-451.387}, so the first 1 and -451.387 are forgotten.
Does anyone know what I'm doing wrong or how to fix this?
Many thanks in advance!
Edit:
Changed %i to %li, but the result did not change. Thanks to unwind!
I'm developing this DLL with Dev-Cpp using MinGW (unfortunately), because I wasn't able to convince Visual Studio 2012 Pro to compile this properly, although the documentation of the original source says it is plain ANSI C. This bugs me a bit because I cannot debug this DLL properly with Dev-Cpp. Hence the message boxes.
Edit 2:
As Neil Townsend suggested, I switched to passing a reference. But this also did not cure the problem. When I access the values in my struct directly, everything is fine. When I assign them to variables, some get lost.
A short note on how I'm calling the function: the DLL is to be accessed from C#, so I'm meddling with P/Invoke (as I understand it).
[DllImport("z88rDLL", CallingConvention = CallingConvention.Cdecl)]
public static extern int pass_i2(ref s_z88i2 inp, int total);
is my definition in C#. I have lots of other functions imported and they all work fine; this is the first function where I encounter these problems. I call the function via:
s_z88i2 tmpi2 = FilesZ88.z88i2F.ConstraintsList[i];
int res = SimulationsCom.pass_i2(ref tmpi2, FilesZ88.z88i2F.ConstraintsList.Count);
First I set the struct, then I call the function.
Why oh why does VS have to be so picky when it comes to compiling ANSI C? It certainly would make things easier.
Edit 3:
I can narrow the problem down to sprintf, I think. Having convinced VS to build my DLL, I was able to step through it. It appears that the values are assigned very nicely indeed to the variables they belong in. If, however, I want to print these variables via sprintf, they turn out rather empty (0). Curiously, the value is always 0 and not something else. I'm still interested in why sprintf behaves that way, but I consider my initial problem solved/panic defeated. So thanks everyone!
Edit 4:
As supercat points out below, I had a rethink about type compatibility between C and C#. I was aware that an int in C# evaluates as a long in C. But after double-checking I found that in C my variables are really FR_INT4 (which I kept out of the original question for reasons of clarity => bad idea). Internally FR_INT4 is defined as #define FR_INT4 long long, so effectively a long long. A quick test showed that passing a long from C# gives the best compatibility. So the sprintf issue can maybe be simplified to the question: "What is the format identifier of a long long?"
It is %lli, which is quite simple, actually. So I can announce (drumroll) that my problem really is solved!
sprintf(tmpstr,"RB, Node#: %lli, DOF#: %lli, Typ#: %lli, Wert: %f\n", inp.Node, inp.DOF, inp.TypeFlag, inp.CValue);
returns every value I want. Thank you very much everyone!
Formatting a value of type long with the format specifier %i is not valid. You should use %li.
In C, it is a better approach to pass a reference or pointer to the struct rather than the struct. So:
DLLIMPORT int pass_i2(s_z88i2 *inp, int total) {
    long nkn = 0, ifg = 0, iflag1 = 0;
    double wert = 0;
    int val;
    // Testplace 1
    nkn = inp->Node;
    ifg = inp->DOF;
    iflag1 = inp->TypeFlag;
    wert = inp->CValue;
    // Testplace 2
    return 0;
}
You will need to correct the sprintf lines accordingly; inp.X becomes inp->X. To use this function either:
// Option A - create it in a declaration, fill it, and send a pointer to that
struct s_z88i2 thing;
// fill out thing
// eg. thing.Node = 2;
pass_i2(&thing, TOTAL);
or:
// Option B - create a pointer; create the memory for the struct, fill it, and send the pointer
struct s_z88i2 *thing;
thing = malloc(sizeof(struct s_z88i2));
// fill out thing
// eg thing->Node = 2;
pass_i2(thing, TOTAL);
This way pass_i2 will operate on the struct you send it, and any changes it makes will be there on return from pass_i2.
To clarify this as answered:
My struct actually is:
typedef struct s_z88i2
{
    long long Node;
    long long DOF;
    long long TypeFlag;
    double CValue;
} s_z88i2;
which requires long to be passed from C# (and not int as I previously thought). Through debugging I found out that the assignment of values behaves as it should; the problem was within sprintf. If I use %lli as the format identifier, even this problem is solved.
sprintf(tmpstr,"RB, Node#: %lli, DOF#: %lli, Typ#: %lli, Wert: %f\n", inp.Node, inp.DOF, inp.TypeFlag, inp.CValue);
Is the statement I need to use. So thanks again everyone who contributed!
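For reference, a sketch of the C# declaration that matches this layout; the Sequential layout and the exact attribute usage are my assumptions, not taken from the question:
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct s_z88i2
{
    public long Node;       // C side: FR_INT4, i.e. long long (64 bit)
    public long DOF;
    public long TypeFlag;
    public double CValue;
}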

The new IntPtr.Add method - am I missing the point of the int?

Starting from FW 4.0, the IntPtr structure has the Add method:
public static IntPtr Add(
IntPtr pointer,
int offset
)
Which is great, as it's supposed to address all those questions on IntPtr math we have had (1, 2, probably more).
But why is the offset an int?
Shouldn't it be an IntPtr? I can easily imagine offsetting a 64-bit pointer by a value that is beyond the int range.
For instance, consider Marshal.OffsetOf:
public static IntPtr OffsetOf(
Type t,
string fieldName
)
It returns an IntPtr as the offset to the structure member. Which makes perfect sense! And you cannot easily use this offset with the new Add method. You'd have to cast it to Int64, then call Add several times in a loop.
Also, it seems to kill the very idea of IntPtr.Size being irrelevant to a properly written application. You will have to cast the offset to a particular type, such as Int64, at which point you must start managing the size difference. And imagine what will happen when a 128-bit IntPtr appears.
My question here is, why?
Am I correct in my conclusions, or am I missing the point?
It corresponds to a restriction in the x64 architecture. Relative addressing is limited to a signed 32-bit offset value. Matt Pietrek mentions this in this article (near "Luckily, the answer is no"). This restriction also explains why .NET objects are still limited to 2GB in 64-bit mode. Similarly, in native x64 C/C++ code memory allocations are limited as well. It is not that it is impossible, the displacement could be stored in a 64-bit register, it is just that this would make array indexing a lot more expensive.
The mysterious return type of Marshal.OffsetOf() is probably a corner-case. A managed struct could result in an unmanaged version after applying [StructLayout] and [MarshalAs] that's larger than 2GB.
Yes, this wouldn't map well to some future 128-bit architecture. But it is extraordinarily difficult to prep today's software for an arch when nobody knows what it will look like. Perhaps the old adage fits, 16 Terabytes ought to be enough for anybody. And there's lots of room left to grow beyond that, 2^64 is rather a large number. Current 64-bit processors only implement 2^48. Some seriously non-trivial problems need to be solved before machines can move that close.
If you define only:
public static IntPtr Add(IntPtr pointer, IntPtr offset)
then adding a 32-bit offset to a 64-bit pointer is less readable, IMHO.
Again, if you define
public static IntPtr Add(IntPtr pointer, long offset)
then adding a 64-bit offset to a 32-bit pointer is also bad.
By the way, Subtract returns an IntPtr, so the IntPtr logic is not broken in any way.
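If a larger-than-int offset really is needed, a sketch of the usual workaround is to go through Int64 arithmetic instead of IntPtr.Add (this assumes a 64-bit process, where IntPtr holds 8 bytes):
IntPtr basePtr = new IntPtr(0x10000000);
long bigOffset = 5L * 1024 * 1024 * 1024;                   // 5 GB, beyond int range
IntPtr shifted = new IntPtr(basePtr.ToInt64() + bigOffset); // 64-bit addition, then back to IntPtr

// For offsets that fit in an int, the 4.0 API is enough:
IntPtr small = IntPtr.Add(basePtr, 1024);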

Should I use int or UInt16?

This may be somewhat trivial, but in C# do you prefer int or UInt16 when storing a network port in a variable? The Framework classes use int when dealing with network ports, although UInt16 actually covers the range of valid values.
Signed types (int / short etc., rather than uint / ushort) have the advantage of being CLS compliant, so they are recommended unless you have a good reason not to use them.
Re int vs short - in most cases it is more efficient to compute with int (or uint), since all the operators are optimised for this. If you are only storing and retrieving it then this isn't an issue, of course.
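A sketch of the usual compromise: keep the port in an int (which is what the Framework methods take) and validate the range explicitly. The variable names and listener usage here are just for illustration:
using System;
using System.Net;
using System.Net.Sockets;

int port = 8080;                                               // sample value
if (port < IPEndPoint.MinPort || port > IPEndPoint.MaxPort)    // 0..65535
    throw new ArgumentOutOfRangeException(nameof(port));
var listener = new TcpListener(IPAddress.Loopback, port);      // the BCL takes int here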
If you have a 32-bit processor and you use a 16-bit value (for memory economy), it will be aligned to 32 bits anyway. So I think it is not so important to use a 16-bit uint instead of a 32-bit value.

In C# is there any significant performance difference for using UInt32 vs Int32

I am porting an existing application to C# and want to improve performance wherever possible. Many existing loop counters and array references are defined as System.UInt32, instead of the Int32 I would have used.
Is there any significant performance difference for using UInt32 vs Int32?
The short answer is "No. Any performance impact will be negligible".
The correct answer is "It depends."
A better question is, "Should I use uint when I'm certain I don't need a sign?"
The reason you cannot give a definitive "yes" or "no" with regards to performance is because the target platform will ultimately determine performance. That is, the performance is dictated by whatever processor is going to be executing the code, and the instructions available. Your .NET code compiles down to Intermediate Language (IL or Bytecode). These instructions are then compiled to the target platform by the Just-In-Time (JIT) compiler as part of the Common Language Runtime (CLR). You can't control or predict what code will be generated for every user.
So knowing that the hardware is the final arbiter of performance, the question becomes, "How different is the code .NET generates for a signed versus unsigned integer?" and "Does the difference impact my application and my target platforms?"
The best way to answer these questions is to run a test.
using System;
using System.Diagnostics;

class Program
{
    static void Main(string[] args)
    {
        const int iterations = 100;
        Console.WriteLine($"Signed: {Iterate(TestSigned, iterations)}");
        Console.WriteLine($"Unsigned: {Iterate(TestUnsigned, iterations)}");
        Console.Read();
    }

    private static void TestUnsigned()
    {
        uint accumulator = 0;
        var max = (uint)Int32.MaxValue;
        for (uint i = 0; i < max; i++) ++accumulator;
    }

    static void TestSigned()
    {
        int accumulator = 0;
        var max = Int32.MaxValue;
        for (int i = 0; i < max; i++) ++accumulator;
    }

    static TimeSpan Iterate(Action action, int count)
    {
        var elapsed = TimeSpan.Zero;
        for (int i = 0; i < count; i++)
            elapsed += Time(action);
        return new TimeSpan(elapsed.Ticks / count);
    }

    static TimeSpan Time(Action action)
    {
        var sw = new Stopwatch();
        sw.Start();
        action();
        sw.Stop();
        return sw.Elapsed;
    }
}
The two test methods, TestSigned and TestUnsigned, each perform ~2 billion iterations of a simple increment on a signed and unsigned integer, respectively. The test code runs 100 iterations of each test and averages the results. This should weed out any potential inconsistencies. The results on my i7-5960X compiled for x64 were:
Signed: 00:00:00.5066966
Unsigned: 00:00:00.5052279
These results are nearly identical, but to get a definitive answer, we really need to look at the bytecode generated for the program. We can use ILDASM as part of the .NET SDK to inspect the code in the assembly generated by the compiler.
Here, we can see that the C# compiler favors signed integers and actually performs most operations natively as signed integers and only ever treats the value in-memory as unsigned when comparing for the branch (a.k.a jump or if). Despite the fact that we're using an unsigned integer for both the iterator AND the accumulator in TestUnsigned, the code is nearly identical to the TestSigned method except for a single instruction: IL_0016. A quick glance at the ECMA spec describes the difference:
blt.un.s :
Branch to target if less than (unsigned or unordered), short form.
blt.s :
Branch to target if less than, short form.
Being such a common instruction, it's safe to assume that most modern high-power processors will have hardware instructions for both operations and they'll very likely execute in the same number of cycles, but this is not guaranteed. A low-power processor may have fewer instructions and not have a branch for unsigned int. In this case, the JIT compiler may have to emit multiple hardware instructions (A conversion first, then a branch, for instance) to execute the blt.un.s IL instruction. Even if this is the case, these additional instructions would be basic and probably wouldn't impact the performance significantly.
So in terms of performance, the long answer is "It is unlikely that there will be a performance difference at all between using a signed or an unsigned integer. If there is a difference, it is likely to be negligible."
So then if the performance is identical, the next logical question is, "Should I use an unsigned value when I'm certain I don't need a sign?"
There are two things to consider here: first, unsigned integers are NOT CLS-compliant, meaning that you may run into issues if you're exposing an unsigned integer as part of an API that another program will consume (such as if you're distributing a reusable library). Second, most operations in .NET, including the method signatures exposed by the BCL (for the reason above), use a signed integer. So if you plan on actually using your unsigned integer, you'll likely find yourself casting it quite a bit. This is going to have a very small performance hit and will make your code a little messier. In the end, it's probably not worth it.
TLDR; back in my C++ days, I'd say "Use whatever is most appropriate and let the compiler sort the rest out." C# is not quite as cut-and-dry, so I would say this for .NET: There's really no performance difference between a signed and unsigned integer on x86/x64, but most operations require a signed integer, so unless you really NEED to restrict the values to positive ONLY or you really NEED the extra range that the sign bit eats, stick with a signed integer. Your code will be cleaner in the end.
I don't think there are any performance considerations, other than a possible difference between signed and unsigned arithmetic at the processor level, but at that point I think the differences are moot.
The bigger difference is in CLS compliance, as the unsigned types are not CLS compliant because not all languages support them.
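A sketch of where that bites: once the assembly is marked CLS-compliant, any public member that exposes an unsigned type draws a compiler warning (the class and member names here are made up for illustration):
using System;

[assembly: CLSCompliant(true)]

public class Counters
{
    public uint NextId() => 42u;        // warning CS3002: return type is not CLS-compliant
    public int NextIdSigned() => 42;    // CLS-compliant alternative
}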
I haven't done any research on the matter in .NET, but in the olden days of Win32/C++, if you wanted to cast a "signed int" to a "signed long", the CPU had to run an op to extend the sign. To cast an "unsigned int" to an "unsigned long", it just had to stuff zeros into the upper bytes. The savings were on the order of a couple of clock cycles (i.e., you'd have to do it billions of times to have even a perceivable difference).
There is no difference, performance-wise. Simple integer calculations are well known and modern CPUs are highly optimized to perform them quickly.
These types of optimizations are rarely worth the effort. Use the data type that is most appropriate for the task and leave it at that. If this thing so much as touches a database, you could probably find a dozen tweaks in the DB design, query syntax or indexing strategy that would outweigh a code optimization in C# by several orders of magnitude.
It's going to allocate the same amount of memory either way (although the unsigned one can store a larger value, as it's not reserving space for the sign). So I doubt you'll see a 'performance' difference, unless you use large or negative values that will cause one option or the other to overflow.
This isn't really to do with performance, but rather with the requirements for the loop counter.
Perhaps there were lots of iterations to complete:
Console.WriteLine(Int32.MaxValue); // Max iterations: 2147483647
Console.WriteLine(UInt32.MaxValue); // Max iterations: 4294967295
The unsigned int may be there for a reason.
I've never empathized with the use of int in loops like for (int i = 0; i < bla; i++), and oftentimes I would also like to use unsigned just to avoid checking the range. Unfortunately (both in C++ and, for similar reasons, in C#), the recommendation is not to use unsigned to gain one more bit or to ensure non-negativity:
"Using an unsigned instead of an int to gain one more bit to represent positive integers is almost never a good idea. Attempts to ensure that some values are positive by declaring variables unsigned will typically be defeated by the implicit conversion rules"
page 73 from "The C++ Programming Language" by the language's creator Bjarne Stroustrup.
My understanding (I apologize for not having the source at hand) is that hardware makers also have a bias to optimize for integer types.
Nonetheless, it would be interesting to do the same exercise that @Robear did above, but using int with some positivity assert versus unsigned.
