Why Do Bytes Carryover? - c#

I have been playing with some byte arrays recently (dealing with grayscale images). A byte can have values 0-255. I was modifying the bytes, and came across a situation where the value I was assigning to the byte was outside the bounds of the byte. It was doing unexpected things to the images I was playing with.
I wrote a test and learned that the byte carries over. Example:
private static int SetByte(int y)
{
return y;
}
.....
byte x = (byte) SetByte(-4);
Console.WriteLine(x);
//output is 252
There is a carryover! This happens when we go the other way around as well.
byte x = (byte) SetByte(259);
Console.WriteLine(x);
//output is 3
I would have expected it to set it to 255 in the first situation and 0 in the second. What is the purpose of this carry over? Is it just due to the fact that I'm casting this integer assignment? When is this useful in the real-world?

byte x = (byte) SetByte(259);
Console.WriteLine(x);
//output is 3
The cast of the result of SetByte is applying modulo 256 to your integer input, effectively dropping bits that are outside the range of a byte.
259 % 256 = 3
Why: The implementers chose to keep only the 8 least significant bits and ignore the rest.

When compiling C# you can specify whether the assembly should be compiled in checked or unchecked mode (unchecked is default). You are also able to make certain parts of code explicit via the use of the checked or unchecked keywords.
You are currently using unchecked mode which ignores arithmetic overflow and truncates the value. The checked mode will check for possible overflows and throw if they are encountered.
Try the following:
int y = 259;
byte x = checked((byte)y);
And you will see it throws an OverflowException.
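As a minimal sketch of the two contexts side by side (the class and method names are illustrative and not part of any library):

```csharp
using System;

public static class OverflowContexts
{
    // Unchecked: keeps only the low 8 bits of the value.
    public static byte Truncate(int value) => unchecked((byte)value);

    // Checked: reports failure instead of letting the OverflowException escape.
    public static bool TryNarrow(int value, out byte result)
    {
        try
        {
            result = checked((byte)value);
            return true;
        }
        catch (OverflowException)
        {
            result = 0;
            return false;
        }
    }
}
```

Truncate(259) gives 3 and Truncate(-4) gives 252, matching the outputs in the question, while TryNarrow(259, out b) returns false.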
The behaviour in unchecked mode is to truncate rather than clamp largely for performance reasons: every unchecked cast would otherwise need conditional logic to clamp the value, when most of the time that is unnecessary and clamping can be done manually where it is needed.
Another reason is that clamping would involve a loss of data which may not be desirable. I don't condone code such as the following but have seen it (see this answer):
int input = 259;
var firstByte = (byte)input;
var secondByte = (byte)(input >> 8);
int reconstructed = (int)firstByte + (secondByte << 8);
Assert.AreEqual(reconstructed, input);
If firstByte came out as anything other than 3 this would not work at all.
One of the places I most commonly rely upon numeric carry over is when implementing GetHashCode(), see this answer to What is the best algorithm for an overridden System.Object.GetHashCode by Jon Skeet. It would be a nightmare to implement GetHashCode decently if overflowing meant we were constrained to Int32.MaxValue.
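A rough sketch of that GetHashCode pattern, assuming a hypothetical Pixel type and the commonly used constants 17 and 23; the explicit unchecked block keeps the wraparound even if the assembly is compiled in checked mode:

```csharp
using System;

public struct Pixel
{
    public int X { get; }
    public int Y { get; }

    public Pixel(int x, int y)
    {
        X = x;
        Y = y;
    }

    public override int GetHashCode()
    {
        unchecked // overflow is expected here and simply wraps around
        {
            int hash = 17;
            hash = hash * 23 + X;
            hash = hash * 23 + Y;
            return hash;
        }
    }
}
```

With X and Y near Int32.MaxValue the intermediate products overflow, and the hash is still a perfectly usable int.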

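The method SetByte is irrelevant: casting 259 directly, as in unchecked((byte) 259), also results in 3 (the unchecked is only needed because the compiler rejects an out-of-range constant cast). Narrowing an integral type is implemented by cutting off the high-order bits.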
You can create a custom clamp function:
public static byte Clamp(int n) {
    if (n <= 0) return 0;
    if (n >= 256) return 255;
    return (byte) n;
}

Doing arithmetic modulo 2^n makes it possible for overflow errors in different directions to cancel each other out.
byte under = unchecked((byte) -12); // = 244
byte over = unchecked((byte) 260); // = 4
byte total = (byte) (under + over); // byte + byte is an int, so cast back
Console.WriteLine(total); // prints 248, as intended
If .NET instead had overflows saturate, then the above program would print the incorrect answer 255.

Bounds checking is not performed for a direct type cast (when using (byte)), in order to avoid a performance penalty.
FYI, most operators applied to byte operands, including the bitwise ones, produce an int result. Use Convert.ToByte() and you will get an OverflowException for out-of-range input, which you can handle by assigning 255 to your target.
Or you can write a function to do this check, as another answer suggests.
If performance is key, try adding the attribute [MethodImpl(MethodImplOptions.AggressiveInlining)] to that function.
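Both suggestions might look roughly like this. The helper names are made up for illustration, and mapping negative input to 0 (rather than 255) is an assumption about what is wanted:

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ByteConversion
{
    // Range check via Convert.ToByte; out-of-range input is saturated in the handler.
    public static byte ToByteOrSaturate(int value)
    {
        try
        {
            return Convert.ToByte(value);
        }
        catch (OverflowException)
        {
            return value < 0 ? byte.MinValue : byte.MaxValue;
        }
    }

    // Branch-based alternative that avoids exceptions entirely.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static byte Saturate(int value)
    {
        if (value < byte.MinValue) return byte.MinValue;
        if (value > byte.MaxValue) return byte.MaxValue;
        return (byte)value;
    }
}
```

Exceptions are comparatively expensive, so for per-pixel work the second version is probably the better fit.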

Related

Byte overflow evaluate to zero instead of exception?

static void Main(string[] args)
{
int n;
byte b;
n = 256;
b = (byte) n;
Console.WriteLine(b); //0
}
C# byte range is 0 to 255 and hence I try to cast an int of 256 to byte and see what will happen.
Surprisingly it returns 0 instead of 255 or better yet give me an overflow exception?
UPDATES:
I'm trying it on macos which is Mono, if it matters and .NET Framework 4.7
That is the expected behaviour. If you think about it, 256 is one "1" followed by 8 zeroes in binary. When you take away everything except the least significant 8 bits, you get 8 zeroes, which is the value 0.
From the C# language specification §6.2.1:
For a conversion from an integral type to another integral type, the
processing depends on the overflow checking context (§7.6.12) in which
the conversion takes place:
In a checked context, the conversion succeeds if the value of the source operand is within the range of the destination type, but throws
a System.OverflowException if the value of the source operand is
outside the range of the destination type.
In an unchecked context, the conversion always succeeds, and proceeds as follows.
If the source type is larger than the destination type, then the source value is truncated by discarding its “extra” most significant
bits. The result is then treated as a value of the destination type.
If you want an exception, you can use checked:
b = checked((byte) n);
I would like to complement the previous answer.
Take a look at this:
255 -> 11111111 +
001 -> 00000001
256 -> 100000000
As you can see, 256 in binary needs nine bits, but since your number is only eight bits wide, the leading 1 can't be stored. This leaves 00000000, which is zero.
This is more a theory question than a C#-specific one, but I think it is important to understand.
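The same truncation can be written as an explicit mask, which may make the bit-level picture easier to see. This is only a sketch and the helper names are invented:

```csharp
using System;

public static class LowBits
{
    // Keeps the 8 least significant bits; same result as unchecked((byte)value).
    public static byte LowByte(int value) => (byte)(value & 0xFF);

    // Binary text of the value, to inspect which bits get dropped.
    public static string ToBinary(int value) => Convert.ToString(value, 2);
}
```

LowBits.ToBinary(256) shows the nine-bit pattern from the addition above, and LowBits.LowByte(256) keeps only the eight zeroes.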

Get random double (floating point) value from random byte array between 0 and 1 in C#?

Assume I have an array of bytes which are truly random (e.g. captured from an entropy source).
byte[] myTrulyRandomBytes = MyEntropyHardwareEngine.GetBytes(8);
Now, I want to get a random double precision floating point value, but between the values of 0 and positive 1 (like the Random.NextDouble() function performs).
Simply passing an array of 8 random bytes into BitConverter.ToDouble() can yield strange results, but most importantly, the results will almost never be less than 1.
I am fine with bit-manipulation, but the formatting of floating point numbers has always been mysterious to me. I tried many combinations of bits to apply randomness to and always ended up finding the numbers were either just over 1, always VERY close to 0, or very large.
Can someone explain which bits should be made random in a double in order to make it random within the range 0 and 1?
Though working answers have been given, I'll give another one, that looks worse but isn't:
long asLong = BitConverter.ToInt64(myTrulyRandomBytes, 0);
double number = (double)(asLong & long.MaxValue) / long.MaxValue;
The issue with casting from an ulong to double is that it's not directly supported by hardware, so it compiles to this:
vxorps xmm0,xmm0,xmm0
vcvtsi2sd xmm0,xmm0,rcx ; interpret ulong as long and convert it to double
test rcx,rcx ; add fixup if it was "negative"
jge 000000000000001D
vaddsd xmm0,xmm0,mmword ptr [00000060h]
vdivsd xmm0,xmm0,mmword ptr [00000068h]
Whereas with my suggestion it will compile more nicely:
vxorps xmm0,xmm0,xmm0
vcvtsi2sd xmm0,xmm0,rcx
vdivsd xmm0,xmm0,mmword ptr [00000060h]
Both tested with the x64 JIT in .NET 4, but this applies in general, there just isn't a nice way to convert an ulong to a double.
Don't worry about the bit of entropy being lost: there are only 2^62 doubles between 0.0 and 1.0 in the first place, and most of the smaller doubles cannot be chosen so the number of possible results is even less.
Note that this as well as the presented ulong examples can result in exactly 1.0 and distribute the values with slightly differing gaps between adjacent results because they don't divide by a power of two. You can change them to exclude 1.0 and get a slightly more uniform spacing (but see the first plot below, there is a bunch of different gaps, but this way it is very regular) like this:
long asLong = BitConverter.ToInt64(myTrulyRandomBytes, 0);
double number = (double)(asLong & long.MaxValue) / ((double)long.MaxValue + 1);
As a really nice bonus, you can now change the division to a multiplication (powers of two usually have inverses)
long asLong = BitConverter.ToInt64(myTrulyRandomBytes, 0);
double number = (double)(asLong & long.MaxValue) * 1.08420217248550443400745280086994171142578125E-19;
Same idea for ulong, if you really want to use that.
Since you also seemed interested specifically in how to do it with double-bits trickery, I can show that too.
Because of the whole significand/exponent deal, it can't really be done in a super direct way (just reinterpreting the bits and that's it), mainly because choosing the exponent uniformly spells trouble (with a uniform exponent, the numbers are necessarily clumped preferentially near 0 since most exponents are there).
But if the exponent is fixed, it's easy to make a double that's uniform in that region. That cannot be 0 to 1 because that spans a lot of exponents, but it can be 1 to 2 and then we can subtract 1.
So first mask away the bits that won't be part of the significand:
x &= (1L << 52) - 1;
Put in the exponent (1.0 - 2.0 range, excluding 2)
x |= 0x3ff0000000000000;
Reinterpret and adjust for the offset of 1:
return BitConverter.Int64BitsToDouble(x) - 1;
Should be pretty fast, too. An unfortunate side effect is that this time it really does cost a bit of entropy, because only 52 random bits are used where there could have been 53. This way always leaves the least significant bit zero (the implicit bit steals a bit).
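Put together, the three steps might look like this (the class and method names are placeholders, not part of any library):

```csharp
using System;

public static class UnitDouble
{
    // Maps 8 random bytes to a double in [0, 1) by filling the 52 significand
    // bits of a double whose exponent is fixed to the [1, 2) range.
    public static double FromBytes(byte[] bytes)
    {
        long x = BitConverter.ToInt64(bytes, 0);
        x &= (1L << 52) - 1;      // keep 52 random significand bits
        x |= 0x3ff0000000000000;  // exponent bits for the range [1.0, 2.0)
        return BitConverter.Int64BitsToDouble(x) - 1.0;
    }
}
```

All-zero input maps to 0.0, and all-ones input maps to the largest result, just below 1.0.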
There were some concerns about the distributions, which I will address now.
The approach of choosing a random (u)long and dividing it by the maximum value clearly has a uniformly chosen (u)long, and what happens after that is actually interesting. The result can justifiably be called a uniform distribution, but if you look at it as a discrete distribution (which it actually is) it looks (qualitatively) like this: (all examples for minifloats)
Ignore the "thicker" lines and wider gaps, that's just the histogram being funny. These plots used division by a power of two, so there is no spacing problem in reality, it's only plotted strangely.
Top is what happens when you use too many bits, as happens when dividing a complete (u)long by its max value. This gives the lower floats a better resolution, but lots of different (u)longs get mapped onto the same float in the higher regions. That's not necessarily a bad thing, if you "zoom out" the density is the same everywhere.
The bottom is what happens when the resolution is limited to the worst case (0.5 to 1.0 region) everywhere, which you can do by limiting the number of bits first and then doing the "scale the integer" deal. My second suggestion with the bit hacks does not achieve this, it's limited to half that resolution.
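One way to read "limit the number of bits first, then scale" is to keep 53 bits and multiply by 2^-53. This is a sketch under that assumption, with invented names:

```csharp
using System;

public static class UnitDouble53
{
    // 2^-53, exactly representable as a double.
    private const double TwoToMinus53 = 1.0 / (1UL << 53);

    // Keeps the top 53 bits of the input and scales them into [0, 1),
    // so every possible result is an exact multiple of 2^-53.
    public static double FromBytes(byte[] bytes)
    {
        ulong bits = BitConverter.ToUInt64(bytes, 0);
        return (bits >> 11) * TwoToMinus53;
    }
}
```

Because the shifted value is below 2^53, the conversion to double is exact and the result can never reach 1.0.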
For what it's worth, NextDouble in System.Random scales a non-negative int into the 0.0 .. 1.0 range. The resolution of that is obviously a lot lower than it could be. It also uses an int that cannot be int.MaxValue and therefore scales by approximately 1/(2^31-1) (cannot be represented by a double, so slightly rounded), so there are actually 33 slightly different gaps between adjacent possible results, though the majority of the gaps are the same distance.
Since int.MaxValue is small compared to what can be brute-forced these days, you can easily generate all possible results of NextDouble and examine them, for example I ran this:
const double scale = 4.6566128752458E-10;
double prev = 0;
Dictionary<long, int> hist = new Dictionary<long, int>();
for (int i = 0; i < int.MaxValue; i++)
{
    long bits = BitConverter.DoubleToInt64Bits(i * scale - prev);
    if (!hist.ContainsKey(bits))
        hist[bits] = 1;
    else
        hist[bits]++;
    prev = i * scale;
    if ((i & 0xFFFFFF) == 0)
        Console.WriteLine("{0:0.00}%", 100.0 * i / int.MaxValue);
}
This is easier than you think; it's all about scaling (also true when going from a 0-1 range to some other range).
Basically, if you know that you have 64 truly random bits (8 bytes) then just do this:
double zeroToOneDouble = (double)(BitConverter.ToUInt64(bytes, 0) / (decimal)ulong.MaxValue);
The trouble with this kind of algorithm comes when your "random" bits aren't actually uniformly random. That's when you need a specialized algorithm, such as a Mersenne Twister.
I don't know whether it's the best solution for this, but it should do the job:
ulong asLong = BitConverter.ToUInt64(myTrulyRandomBytes, 0);
double number = (double)asLong / ulong.MaxValue;
All I'm doing is converting the byte array to a ulong which is then divided by its max value, so that the result is between 0 and 1.
To make sure the long value is within the range from 0 to 1, you can apply the following mask:
long longValue = BitConverter.ToInt64(myTrulyRandomBytes, 0);
longValue &= 0x3fefffffffffffff;
The resulting value is guaranteed to lie in the range [0, 1).
Remark. The 0x3fefffffffffffff value is very close to 1 and will be printed as 1, but it is really a bit less than 1.
If you want to make the generated values greater, you could set a number higher bits of an exponent to 1. For instance:
longValue |= 0x03c00000000000000;
Summarizing: example on dotnetfiddle.
If you care about the quality of the random numbers generated, be very suspicious of the answers that have appeared so far.
Those answers that use Int64BitsToDouble directly will definitely have problems with NaNs and infinities. For example, 0x7ff0000000000001, a perfectly good random bit pattern, converts to NaN (and so do thousands of others).
Those that try to convert to a ulong and then scale, or convert to a double after ensuring that various bit-pattern constraints are met, won't have NaN problems, but they are very likely to have distributional problems. Representable floating point numbers are not distributed uniformly over (0, 1), so any scheme that randomly picks among all representable values will not produce values with the required uniformity.
To be safe, just use ToInt32 and use that int as a seed for Random. (To be extra safe, reject 0.) This won't be as fast as the other schemes, but it will be much safer. A lot of research and effort has gone into making RNGs good in ways that are not immediately obvious.
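A sketch of that seeding approach, for what it's worth. The helper name is made up, and replacing a zero seed with 1 is just one way to "reject 0", not something Random requires:

```csharp
using System;

public static class SeededRandom
{
    // Builds a System.Random seeded from the first four entropy bytes.
    public static Random FromBytes(byte[] bytes)
    {
        int seed = BitConverter.ToInt32(bytes, 0);
        if (seed == 0) seed = 1; // assumption: substitute rather than re-draw
        return new Random(seed);
    }
}
```

Note that only 32 of the available 64 random bits end up influencing the result this way.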
Simple piece of code to print the bits out for you.
for (double i = 0; i < 1.0; i+=0.05)
{
var doubleToInt64Bits = BitConverter.DoubleToInt64Bits(i);
Console.WriteLine("{0}:\t{1}", i, Convert.ToString(doubleToInt64Bits, 2));
}
0.05: 11111110101001100110011001100110011001100110011001100110011010
0.1: 11111110111001100110011001100110011001100110011001100110011010
0.15: 11111111000011001100110011001100110011001100110011001100110100
0.2: 11111111001001100110011001100110011001100110011001100110011010
0.25: 11111111010000000000000000000000000000000000000000000000000000
0.3: 11111111010011001100110011001100110011001100110011001100110011
0.35: 11111111010110011001100110011001100110011001100110011001100110
0.4: 11111111011001100110011001100110011001100110011001100110011001
0.45: 11111111011100110011001100110011001100110011001100110011001100
0.5: 11111111011111111111111111111111111111111111111111111111111111
0.55: 11111111100001100110011001100110011001100110011001100110011001
0.6: 11111111100011001100110011001100110011001100110011001100110011
0.65: 11111111100100110011001100110011001100110011001100110011001101
0.7: 11111111100110011001100110011001100110011001100110011001100111
0.75: 11111111101000000000000000000000000000000000000000000000000001
0.8: 11111111101001100110011001100110011001100110011001100110011011
0.85: 11111111101011001100110011001100110011001100110011001100110101
0.9: 11111111101100110011001100110011001100110011001100110011001111
0.95: 11111111101110011001100110011001100110011001100110011001101001

Difference between two large numbers C#

There are already solutions to this problem for small numbers:
Here: Difference between 2 numbers
Here: C# function to find the delta of two numbers
Here: How can I find the difference between 2 values in C#?
I'll summarise the answer to them all:
Math.Abs(a - b)
The problem is that when the numbers are large this gives the wrong answer (because of an overflow). Worse still, if (a - b) = Int32.MinValue then Math.Abs crashes with an exception (because -Int32.MinValue is Int32.MaxValue + 1, which does not fit in an Int32):
System.OverflowException occurred
HResult=0x80131516
Message=Negating the minimum value of a twos complement number is
invalid.
Source=mscorlib
StackTrace: at
System.Math.AbsHelper(Int32 value) at System.Math.Abs(Int32 value)
Its specific nature leads to difficult-to-reproduce bugs.
Maybe I'm missing some well known library function, but is there any way of determining the difference safely?
As suggested by others, use BigInteger as defined in System.Numerics (you'll have to include the namespace in Visual Studio)
Then you can just do:
BigInteger a = new BigInteger();
BigInteger b = new BigInteger();
// Assign values to a and b somewhere in here...
// Then just use included BigInteger.Abs method
BigInteger result = BigInteger.Abs(a - b);
Jeremy Thompson's answer is still valid, but note that the BigInteger type includes its own absolute value method, so there shouldn't be any need for special logic. Also, Math.Abs has no overload that accepts a BigInteger, so it will give you grief if you try to pass one in.
Keep in mind there are caveats to using BigIntegers. If you have a ludicrously large number, C# will try to allocate memory for it, and you may run into out of memory exceptions. On the flip side, BigIntegers are great because the amount of memory allotted to them is dynamically changed as the number gets larger.
Check out the microsoft reference here for more info: https://msdn.microsoft.com/en-us/library/system.numerics.biginteger(v=vs.110).aspx
The question is, how do you want to hold the difference between two large numbers? If you're calculating the difference between two signed long (64-bit) integers, for example, and the difference will not fit into a signed long integer, how do you intend to store it?
long a = +(1L << 62) + 1000; // 1L, not 1: an int shift count wraps at 32 bits
long b = -(1L << 62);
long dif = a - b; // Overflow, bit truncation
The difference between a and b is wider than 64 bits, so when it's stored into a long integer, its high-order bits are truncated, and you get a strange value for dif.
In other words, you cannot store all possible differences between signed integer values of a given width into a signed integer of the same width. (You can only store half of all of the possible values; the other half require an extra bit.)
Your options are to either use a wider type to hold the difference (which won't help you if you're already using the widest long integer type), or to use a different arithmetic type. If you need more than 64 signed bits of precision, you'll probably need to use BigInteger.
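To illustrate the "wider type" option, here is a minimal sketch. The class and method names are hypothetical; the long overload relies on the fact that the magnitude of the difference of two 64-bit signed values always fits in an unsigned 64-bit integer:

```csharp
using System;
using System.Numerics;

public static class SafeDifference
{
    // int inputs: widen to long, where the subtraction cannot overflow.
    public static long AbsDiff(int a, int b) => Math.Abs((long)a - b);

    // long inputs: subtract the smaller from the larger in modular ulong arithmetic.
    public static ulong AbsDiff(long a, long b) =>
        a >= b ? unchecked((ulong)a - (ulong)b) : unchecked((ulong)b - (ulong)a);

    // Anything wider: fall back to BigInteger.
    public static BigInteger AbsDiff(BigInteger a, BigInteger b) => BigInteger.Abs(a - b);
}
```

For example, AbsDiff(long.MaxValue, long.MinValue) yields ulong.MaxValue, which a signed long could not hold.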
The BigInteger was introduced in .Net 4.0.
There are some open source implementations available in lower versions of the .Net Framework, however you'd be wise to go with the standard.
If Math.Abs still gives you grief you can implement the function yourself: if the difference is negative (a - b < 0), simply negate it so the result is non-negative.
Also, have you tried using doubles? They hold much larger values, though with only about 53 bits of integer precision.
Here's an alternative that might be interesting to you, but is very much within the confines of a particular int size. This example uses Int32, and uses bitwise operators to accomplish the difference and then the absolute value. This implementation is tolerant of your scenario where a - b equals the min int value: it naturally returns the min int value (not much else you can do without casting things to a larger data type). I don't think this is as good an answer as using BigInteger, but it is fun to play with if nothing else:
static int diff(int a, int b)
{
    int xorResult = (a ^ b);
    int diff = (a & xorResult) - (b & xorResult);
    return (diff + (diff >> 31)) ^ (diff >> 31);
}
Here are some cases I ran it through to play with the behavior:
Console.WriteLine(diff(13, 14)); // 1
Console.WriteLine(diff(11, 9)); // 2
Console.WriteLine(diff(5002000, 2346728)); // 2655272
Console.WriteLine(diff(int.MinValue, 0)); // Should be 2147483648, but int data type can't go that large. Actual result will be -2147483648.

What is the limit of the Value Type BigInteger in C#?

As described in MSDN BigInteger is :
An immutable type that represents an arbitrarily large integer whose
value in theory has no upper or lower bounds.
As I can see BigInteger is a ValueType, as much as I know, a ValueType must have a maximum size of 16 bytes.
MSDN goes further saying :
an OutOfMemoryException can be thrown for any operation that causes a
BigInteger value to grow too large.
and more :
Although this process is transparent to the caller, it does incur a
performance penalty. In some cases, especially when repeated
operations are performed in a loop on very large BigInteger values
How could it store such big values, as big as double.MaxValue + double.MaxValue ?
I was told that it has ReferenceType objects inside it, but all I can find here in its definition in Visual Studio is ValueTypes.
What's its real limit ? And even if doesn't have one, how can it "as a value type" manage to store all that amount of data ?
As I can see BigInteger is a ValueType, as much as I know, a ValueType must have a maximum size of 16 bytes.
No, that's not true. It's a conventional limit, but it's entirely feasible for a value type to take more than that. For example:
public struct Foo {
private readonly int a, b, c, d, e; // Look ma, 20 bytes!
}
However, I strongly suspect that BigInteger actually includes a reference to a byte array:
public struct BigInteger {
private readonly byte[] data;
// Some other fields...
}
(Moslem Ben Dhaou's answer shows one current implementation using int and uint[], but of course the details of this are intentionally hidden.)
So the value of a BigInteger can still be small, but it can refer to a big chunk of memory - and if there isn't enough memory to allocate what's required when you perform some operation, you'll get an exception.
How could it store such big values, as big as double.MaxValue + double.MaxValue ?
Well BigInteger is for integers, so I wouldn't particularly want to use it for anything to do with double... but fundamentally the limitations are going to be around how much memory you've got and the size of array the CLR can cope with. In reality, you'd be talking about enormous numbers before actually hitting the limit for any specific number - but if you have gazillions of smaller numbers, that obviously has large memory requirements too.
As a confirmation of the answer from Jon Skeet, I looked at the source code of BigInteger. It actually contains two internal fields, as follows:
internal int _sign;
internal uint[] _bits;
_bits is used by almost all private/public methods within the class which are used to read/write the actual data.
_sign is used to keep the sign of the BigInteger.
The private methods are extensively using binary operators and calculations. Here is a small list of constants used in the class that might reflect a bit the limits:
private const int knMaskHighBit = -2147483648;
private const uint kuMaskHighBit = 2147483648U;
private const int kcbitUint = 32;
private const int kcbitUlong = 64;
private const int DecimalScaleFactorMask = 16711680;
private const int DecimalSignMask = -2147483648;
PS: I should have commented on J.S. answer, but a comment is too short. To view the source code, either download it or decompile System.Numerics.dll.
TL;DR: BigInteger maxvalue is 2^68685922272
In .Net 4.7.2 BigInteger uses an uint array for bits.
An uint holds 32bits of data.
An array's max size is defined as internal const int MaxArrayLength = 0X7FEFFFFF;
7FEFFFFF = 2146435071
Now, to calculate: max size of array x capacity of each uint is: 2146435071 x 32 = 68685922272. But that's only the count of the bits in a BigInteger.
Which means BigInteger's max value is: 2^68'685'922'272 which is stupendously large (used ' for easier readability).
If they ever decide to increase the array's max size, then it will also increase the max value for BigInteger.
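The arithmetic above can be checked in a few lines. The names are illustrative, and MaxArrayLength is simply the value quoted above, not something read from the runtime:

```csharp
public static class BigIntegerLimit
{
    public const int MaxArrayLength = 0x7FEFFFFF; // 2146435071, as quoted above
    public const int BitsPerUInt = 32;

    // Computed in long arithmetic: the product does not fit in an int.
    public static long MaxBits => (long)MaxArrayLength * BitsPerUInt;
}
```

Fittingly for this topic, dropping the (long) cast would make the multiplication overflow int.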
I just did some quick experiments on this. Max seems to be around 2^65,000,000,000 but actual practicality 2146435071
I get a System.OverflowException on the below at 0x1F. It overflowed between E FFFF FFE2 and F 7FFF FFE1. (or somewhere between 2^64,424,509,410 and 2^66,571,993,057)
// Test 1
BigInteger test = 1;
for (int i = 0x00; i < 0xFF; i++)
test <<= 0x7FFFFFFF;
// Test 2
BigInteger.Pow((BigInteger)2, 0x7FEFFFF0); // OK - I think - never finished
BigInteger.Pow((BigInteger)2, 0x7FEFFFFF); // Immediate OutOfMemoryException
I should also note that while ~66,571,993,057 seems to be supported, the usefulness is more like 2^2146435071, because power and shift operations don't seem to work with an exponent larger than 2,146,435,071 (for Pow()) or a shift amount of more than 2,147,483,647. Larger shifts can be done, but they would take several rounds, ruining efficiency. The other issue is that everything is slow at those sizes: a single shift was taking about 7 seconds and BigInteger.Pow() took at least 5 minutes.
.Net 5, AMD Threadripper, 32GB RAM, Windows 10 x64

Is there an easy way to convert from 32bit integer to 16bit integer?

I have a 32 bit int and I want to address only the lower half of this variable. I know I can convert to bit array and to int16, but is there any more straight forward way to do that?
If you want only the lower half, you can just cast it: (Int16)my32BitInt
In general, if you're extending/truncating bit patterns like this, then you do need to be careful about signed types - unsigned types may cause fewer surprises.
As mentioned in the comments - if you've enclosed your code in a 'checked' context, or changed your compiler options so that the default is 'checked', then you can't truncate a number like this without an exception being thrown if there are any non-zero bits being discarded - in that situation you'd need to do:
(UInt16)(my32BitInt & 0xffff)
(The option of using signed types is gone in this case, because you'd have to use & 0x7fff which then preserves only 15 bits)
just use this function
Convert.ToInt16()
or just
(Int16)valueasint
You can use an explicit conversion (a cast) to Int16, like this:
(Int16)2;
but be careful when you do that. Because Int16 can't hold all possible Int32 values.
For example this won't work;
(Int16)2147483683;
because Int16 can hold 32767 as its maximum value. You can use the unchecked keyword (C# Reference) in such cases.
If you force an unchecked operation, a cast should work:
int r = 0xF000001;
short trimmed = unchecked((short) r);
This will truncate the value of r to fit in a short.
If the value of r should always fit in a short, you can use a checked cast instead and let an exception be thrown when it does not.
If you need a 16 bit value and you happen to know something specific, such as that the number will never be less than zero, you could use a UInt16 value. That conversion looks like:
int x = 0;
UInt16 value = (UInt16)x;
This covers the full non-negative 16-bit range (0 to 65535).
Well, first, make sure you actually want to have the value signed. uint and ushort are there for a reason. Then:
ushort ret = (ushort)(val & ((1 << 16) - 1));
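For comparison, here is a sketch of the unsigned and signed variants next to each other (helper names invented for illustration). Masking first means the cast also succeeds in a checked context:

```csharp
public static class Halves
{
    // Lower 16 bits as an unsigned value; the mask keeps the cast in range.
    public static ushort Low(int value) => (ushort)(value & 0xFFFF);

    // Lower 16 bits reinterpreted as a signed value; bit 15 becomes the sign.
    public static short LowSigned(int value) => unchecked((short)value);
}
```

The two differ only in interpretation: for 0x0001FFFF, Low gives 65535 while LowSigned gives -1.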
