I'm trying to implement SQL Server Vardecimal decompression. Values are stored as 3 decimal digits per every 10 bits. But during implementation I found some strange behavior in the math. Here is a simple test I made:
private SqlDecimal Test() {
    SqlDecimal mantissa = 0;
    SqlDecimal sign = -1;
    byte exponent = 0x20;
    int numDigits = 0;
    // -999999999999999999999999999999999.99999
    for (int i = 0; i < 13; i++) {
        int temp = 999;
        // equal to mantissa = mantissa * 1000 + temp;
        numDigits += 3;
        int pwr = exponent - (numDigits - 1);
        mantissa += temp * (SqlDecimal)Math.Pow(10, pwr);
    }
    return sign * mantissa;
}
The first two passes are fine; I get
999000000000000000000000000000000
999999000000000000000000000000000
but the third gives
999999998999999999999980020000000
Is this a bug in C# SqlDecimal math, or am I doing something wrong?
This is an issue with how you're constructing the value to add here:
mantissa += temp * (SqlDecimal)Math.Pow(10, pwr);
The problem starts when pwr is 24. You can see this very clearly here:
Console.WriteLine((SqlDecimal) Math.Pow(10, 24));
The output on my box is:
999999999999999980000000
That almost certainly comes from the limited precision of double: Math.Pow returns a double, which only holds about 15-17 significant decimal digits, so 10^24 can't be represented exactly. It's simplest to remove the floating point arithmetic entirely. While it may not be efficient, this is a simple way of avoiding the problem:
static SqlDecimal PowerOfTen(int power)
{
    // Note: only works for non-negative power values at the moment!
    // (To handle negative input, divide by 10 on each iteration instead.)
    SqlDecimal result = 1;
    for (int i = 0; i < power; i++)
    {
        result = result * 10;
    }
    return result;
}
If you then change the line to:
mantissa += temp * PowerOfTen(pwr);
then you'll get the results you expect - at least while pwr is non-negative. It should be easy to extend PowerOfTen to handle negative values as well though.
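As a sketch of that extension (my addition, not part of the original answer): when the power is negative, divide instead of multiplying. Bear in mind SqlDecimal carries at most 38 digits, so very small powers will still lose precision:

static SqlDecimal PowerOfTen(int power)
{
    SqlDecimal result = 1;
    for (int i = 0; i < power; i++)
    {
        result = result * 10;   // positive powers: multiply up
    }
    for (int i = 0; i > power; i--)
    {
        result = result / 10;   // negative powers: divide down
    }
    return result;
}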
Update
Amending the below method to just work with Parse and ToString should improve performance for larger numbers (which would be the general use case for these types):
public static SqlDecimal ToSqlDecimal(this BigInteger bigint)
{
    return SqlDecimal.Parse(bigint.ToString());
}
This trick also works for the double returned by the original Math.Pow call; so you could just do:
SqlDecimal.Parse(string.Format("{0:0}",Math.Pow(10,24)))
Original Answer
Obviously @JonSkeet's answer is best, as it only involves 24 iterations, vs potentially thousands in my attempt. However, here's an alternate solution, which may help out in other scenarios where you need to convert large integers (i.e. System.Numerics.BigInteger) to SqlDecimal, or where performance is less of a concern.
Fiddle Example
//using System.Data.SqlTypes;
//using System.Numerics; //also needs an assembly reference to System.Numerics.dll
public static class BigIntegerExtensions
{
    public static SqlDecimal ToSqlDecimal(this BigInteger bigint)
    {
        SqlDecimal result = 0;
        var longMax = (SqlDecimal)long.MaxValue; // cache the converted value to minimise conversions
        var longMin = (SqlDecimal)long.MinValue;
        while (bigint > long.MaxValue)
        {
            result += longMax;
            bigint -= long.MaxValue;
        }
        while (bigint < long.MinValue)
        {
            result += longMin;
            bigint -= long.MinValue;
        }
        return result + (SqlDecimal)(long)bigint;
    }
}
For your above use case, you could use this like so (uses the BigInteger.Pow method):
mantissa += temp * BigInteger.Pow(10, pwr).ToSqlDecimal();
Related
So, I've been learning C# and testing some simple algorithms out. I made this simple class that exposes a recursive Fibonacci number function. I use memoization (Dynamic Programming) to store previously found numbers. Here's the code:
using Godot;
using System.Collections.Generic;

public class Exercise1 : Node {
    private BigInteger teste = new BigInteger(1);
    private Dictionary<BigInteger, BigInteger> memory = new Dictionary<BigInteger, BigInteger>();

    public override void _Ready() {
        RunBigIntegerCraziness();
    }

    private void RunBigIntegerCraziness() {
        for (int i = 0; i < 31227; i++) {
            GD.Print($"fib number {i} is {fib(new BigInteger(i))}");
        }
    }

    private BigInteger fib(BigInteger n) {
        if (memory.ContainsKey(n)) {
            return memory[n];
        }
        if (n <= 2) {
            memory[n] = 1;
            return 1;
        }
        memory[n - 2] = fib(n - 2);
        memory[n - 1] = fib(n - 1);
        return memory[n - 2] + memory[n - 1];
    }
}
Ignore the "Godot" part. It's just that I was testing this inside a game project. Everything compiles fine, but I can only calculate up to Fibonacci n. 3226. If I go to numbers equal to 3227 and beyond, I get this Exception:
[...]
fib number 3225 is 43217018697618272220345809139733426666656338842625944764401661804465121290773093888861438958973337206110398501101783011185091567135587979099219045977958276652741787987152919489957724618258731270111934419344108965974546742136386635343927537356176338553654798753734888560554135669621772542530920892471422002609630627040146600381068673360870794221630560104764217344676242315795514744073614579107596818134891238017641931792490286597416223216551326908997909707498658766245465906764466010328772845314077258564566442129155001040721886128505146365749238671331993692655687520038382893117763783477305776640877748401894737521738794911907045829607125767696264441933278046913082328818850
fib number 3226 is 69926605145186078778460989556214883682824163521963475389625827936774132010815037784261200216187927277027753726906285824183824754568491676416709800266452379691484582909141867315765704538889919267496081895355700988068705924800720430980434359659981529442293650167261872365958477365094269319478110803029308487644284790516508517647989046631899202985143468253781566270183590285229230335042129126551683888682955813507183937267823895233985645240278207971782178625906849647650415867576295127507836850507509010403410481726883571748090361307264480634218098397060429202475118649538779225621232854604363989464362465170636407301900981359138471646464444082736135056091569488488491377766743
Unhandled Exception:
System.ArithmeticException: Overflow or underflow in the arithmetic operation.
at BigInteger.op_Addition (BigInteger bi1, BigInteger bi2) [0x000fa] in :0
at Exercise1.fib (BigInteger n) [0x000a8] in /Users/rafael/gamedev/godot/mytests/CSharpStudy/study_classes/Exercise1.cs:31
at Exercise1.RunBigIntegerCraziness () [0x00006] in /Users/rafael/gamedev/godot/mytests/CSharpStudy/study_classes/Exercise1.cs:15
at Exercise1._Ready () [0x00001] in /Users/rafael/gamedev/godot/mytests/CSharpStudy/study_classes/Exercise1.cs:10
The terminal process terminated with exit code: 1
Aren't "BigInteger"s supposed to handle pretty high numbers??
In the source of that BigInteger, there is:
// maximum length of the BigInteger in uint (4 bytes)
// change this to suit the required level of precision.
private const int maxLength = 70;
The length is counted in limbs; each limb is 32 bits in this implementation. Additionally, the topmost bit of the topmost limb is treated as a sign bit. Therefore, without changing the source, the maximum number that can be stored in this kind of BigInteger (this limit does not apply to the BigInteger in System.Numerics) is 2^(70*32-1) - 1; in other words, there are 2239 "normal" bits available.
That allows for some fairly big integers, but not big enough: according to Wolfram Alpha fib(3227) requires just over 2239 bits, therefore it does not fit.
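As a quick sanity check (my own addition; FibBitCheck and BitLength are made-up names), the bit counts can be reproduced with the framework's System.Numerics.BigInteger, which has no fixed size limit:

// fib(3226) should fit in the 2239 available bits; fib(3227) should not.
using System;
using System.Numerics;

class FibBitCheck
{
    static void Main()
    {
        BigInteger a = 1, b = 1; // fib(1), fib(2)
        for (int n = 3; n <= 3227; n++)
        {
            BigInteger next = a + b;
            a = b;
            b = next;
            if (n >= 3226)
                Console.WriteLine($"fib({n}) needs {BitLength(b)} bits");
        }
    }

    // Bits required to represent a positive BigInteger.
    static int BitLength(BigInteger value)
    {
        int bits = 0;
        for (; value > 0; value >>= 1)
            bits++;
        return bits;
    }
}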
I have a piece of code that is
// Bernstein hash
// http://www.eternallyconfuzzled.com/tuts/algorithms/jsw_tut_hashing.aspx
ulong result = (ulong)s[0];
for ( int i = 1; i < s.Length; ++i )
{
    result = 33 * result + (ulong)s[i];
}
return (int)result % Buckets.Count;
and the problem is that it sometimes returns negative values. I know the reason is that (int)result can be negative. But I want to coerce it to be non-negative since it's used as an index. Now I realize I could do
int k = (int)result % Buckets.Count;
k = k < 0 ? k*-1 : k;
return k;
but is there a better way?
On a deeper level, why is int used for the index of containers in C#? I come from a C++ background and we have size_t which is an unsigned integral type. That makes more sense to me.
Use
return (int)(result % (ulong)Buckets.Count);
As you sum up values, you eventually reach a value that cannot be expressed as a positive number in a signed 32-bit integer. The cast to int then produces a negative number, and the modulo operation also returns a negative number. If you do the modulo operation first (in ulong), you get a small non-negative number, and the cast to int does no harm because the result is less than Buckets.Count.
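Putting that together with the loop from the question, the whole method body might look like this (my sketch; it assumes Buckets.Count is a positive int, as in the original code):

// Bernstein hash with the modulo done in ulong, so the result is never negative
// and always fits in an int (it is less than Buckets.Count).
ulong result = (ulong)s[0];
for (int i = 1; i < s.Length; ++i)
{
    result = 33 * result + (ulong)s[i];
}
return (int)(result % (ulong)Buckets.Count);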
While you can find a way to cast this to an int properly, I'm wondering why you don't just calculate it as an int from the beginning.
int result = (int)s[0]; // or, if s[0] is already an int, omit the cast
for ( int i = 1; i < s.Length; ++i )
{
    result = 33 * result + (int)s[i];
}
return Math.Abs(result) % Buckets.Count;
As to why C# uses a signed int for indexes, it has to do with cross-language compatibility: unsigned integer types are not CLS-compliant, so the framework's collection APIs avoid them in their public surface.
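For example (my own illustration, using a hypothetical Buckets class): if the assembly opts in to CLS compliance checking, exposing an unsigned index publicly draws a compiler warning:

// Illustration only: unsigned types in public signatures are not CLS-compliant,
// so this indexer produces warning CS3001 when the assembly is marked compliant.
using System;

[assembly: CLSCompliant(true)]

public class Buckets
{
    private readonly object[] items = new object[16];

    public object this[uint index]   // warning CS3001: argument type 'uint' is not CLS-compliant
    {
        get { return items[index]; }
    }
}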
In my code I need to convert string representations of integers to long and double values.
The string representation is a byte array (byte[]). For example, for the number 12345 the string representation is { 49, 50, 51, 52, 53 }.
Currently, I use the following obvious code for conversion to long (and almost the same code for conversion to double):
private long bytesToIntValue()
{
    string s = System.Text.Encoding.GetEncoding("Latin1").GetString(bytes);
    return long.Parse(s, CultureInfo.InvariantCulture);
}
This code works as expected, but I want something better, because it requires converting the bytes to a string first.
In my case, bytesToIntValue() gets called about 12 million times and about 25% of all memory allocations are made in this method.
Naturally, I want to optimize this part: I want to perform the conversions without an intermediate string (more speed, fewer allocations).
What would you recommend? How can I perform conversions without intermediate strings? Is there a faster method to perform conversions?
EDIT:
The byte arrays I am dealing with always contain ASCII-encoded data. Numbers can be negative. For double values, exponential format is allowed. Hexadecimal integers are not allowed.
How can I perform conversions without intermediate strings?
Well you can easily convert each byte to a char. For example - untested:
private static long ConvertAsciiBytesToInt32(byte[] bytes)
{
    long value = 0;
    foreach (byte b in bytes)
    {
        value *= 10L;
        char c = (char) b; // explicit conversion; effectively ISO-8859-1
        if (c < '0' || c > '9')
        {
            throw new ArgumentException("Bytes contains non-digit: " + c);
        }
        value += (c - '0');
    }
    return value;
}
Note that this really does assume it's ASCII (or compatible) - if your byte array is actually UTF-16 (for example) then it will definitely do the wrong thing.
Also note that this doesn't perform any sort of length validation or overflow checking... and it doesn't cope with negative numbers. You could add all of these if you want, but we don't know enough about your requirements to know if it's worth adding the complexity.
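For instance, handling a leading sign could look something like this sketch (my addition; the name ConvertAsciiBytesToInt64 is hypothetical, and there is still no overflow checking):

// Sketch: same digit loop, but with a leading '-' (or '+') allowed.
private static long ConvertAsciiBytesToInt64(byte[] bytes)
{
    bool negative = bytes.Length > 0 && bytes[0] == (byte) '-';
    int start = (negative || (bytes.Length > 0 && bytes[0] == (byte) '+')) ? 1 : 0;

    long value = 0;
    for (int i = start; i < bytes.Length; i++)
    {
        char c = (char) bytes[i];
        if (c < '0' || c > '9')
        {
            throw new ArgumentException("Bytes contains non-digit: " + c);
        }
        value = value * 10 + (c - '0');
    }
    return negative ? -value : value;
}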
I'm not sure that there is an easy way to do that. Please note that it won't work with other encodings; the test below showed on my computer that this is only about 3 times faster (I don't think it's worth it).
The code + test:
//using System;
//using System.Diagnostics;
//using System.Text;
class MainClass
{
    public static void Main(string[] args)
    {
        string str = "12341234";
        byte[] buffer = Encoding.ASCII.GetBytes(str);

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < 1000000; i++)
        {
            long val = BufferToLong.GetValue(buffer);
        }
        Console.WriteLine(sw.ElapsedMilliseconds);

        sw.Restart();
        for (int i = 0; i < 1000000; i++)
        {
            string valStr = Encoding.ASCII.GetString(buffer);
            long val = long.Parse(valStr);
        }
        Console.WriteLine(sw.ElapsedMilliseconds);
    }
}

static class BufferToLong
{
    public static long GetValue(byte[] buffer)
    {
        long number = 0;
        foreach (byte currentByte in buffer)
        {
            char currentChar = (char)currentByte;
            int currentDigit = currentChar - '0';
            number *= 10;
            number += currentDigit;
        }
        return number;
    }
}
In the end, I created a C# version of the strtol function. This function comes with the CRT, and the source code of the CRT ships with Visual Studio.
The resulting method is almost the same as the code provided by @Jon Skeet in his answer, but it also contains some checks for overflow.
In my case, all the changes proved to be very useful in terms of speed and memory.
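For illustration only (ParseAsciiDigits is a hypothetical name, not the poster's actual code), the overflow checks can be as simple as wrapping the accumulation in a checked block:

// Sketch: the core digit loop with overflow detection via checked arithmetic,
// so a too-long input throws OverflowException instead of silently wrapping.
private static long ParseAsciiDigits(byte[] bytes)
{
    long value = 0;
    foreach (byte b in bytes)
    {
        int digit = b - '0';
        if (digit < 0 || digit > 9)
            throw new FormatException("Non-digit byte: " + (char)b);
        checked
        {
            value = value * 10 + digit;
        }
    }
    return value;
}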
What I am looking for is something like PHP's decbin function in C#. That function converts a decimal number to its binary representation as a string.
For example, decbin(21) returns 10101 as the result.
I found this function which basically does what I want, but maybe there is a better / faster way?
var result = Convert.ToString(number, 2);
– Almost the only use for the (otherwise useless) Convert class.
Most ways will be better and faster than the function that you found. It's not a very good example of how to do the conversion.
The built in method Convert.ToString(num, base) is an obvious choice, but you can easily write a replacement if you need it to work differently.
This is a simple method where you can specify the length of the binary number:
public static string ToBin(int value, int len) {
    return (len > 1 ? ToBin(value >> 1, len - 1) : null) + "01"[value & 1];
}
It uses recursion: the first part (before the +) calls itself to create the binary representation of the number except for the last digit, and the second part takes care of the last digit.
Example:
Console.WriteLine(ToBin(42, 8));
Output:
00101010
int toBase = 2;
string binary = Convert.ToString(21, toBase); // "10101"
To have the binary value in (at least) a specified number of digits, padded with zeroes:
string bin = Convert.ToString(1234, 2).PadLeft(16, '0');
The Convert.ToString does the conversion to a binary string.
The PadLeft adds zeroes to fill it up to 16 digits.
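For example (the output is shown as a comment):

Console.WriteLine(Convert.ToString(1234, 2).PadLeft(16, '0')); // 0000010011010010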
This is my answer:
static bool[] Dec2Bin(int value)
{
    if (value == 0) return new[] { false };
    var n = (int)(Math.Log(value) / Math.Log(2));
    var a = new bool[n + 1];
    for (var i = n; i >= 0; i--)
    {
        n = (int)Math.Pow(2, i);
        if (n > value) continue;
        a[i] = true;
        value -= n;
    }
    Array.Reverse(a);
    return a;
}
It uses Pow instead of modulo and divide, so I think it's a faster way.
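Since this returns a bool[] rather than a string, here's a small usage sketch (my addition; needs using System.Linq) to print it:

// The array is most-significant-bit first after the Reverse call.
var bits = Dec2Bin(21);
Console.WriteLine(new string(bits.Select(b => b ? '1' : '0').ToArray())); // prints 10101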
The number is bigger than int & long but can be accommodated in Decimal. However, the normal ToString or Convert methods don't work on Decimal. How do I get the hex representation of such a number?
I believe this will produce the right results where it returns anything, but may reject valid integers. I dare say that can be worked around with a bit of effort though... (Oh, and it will also fail for negative numbers at the moment.)
static string ConvertToHex(decimal d)
{
    int[] bits = decimal.GetBits(d);
    if (bits[3] != 0) // Sign and exponent
    {
        throw new ArgumentException();
    }
    return string.Format("{0:x8}{1:x8}{2:x8}",
        (uint)bits[2], (uint)bits[1], (uint)bits[0]);
}
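For instance, a quick check with a small value (my addition); the three 8-digit groups are the high, middle and low 32 bits of the 96-bit integer part:

Console.WriteLine(ConvertToHex(255m)); // 0000000000000000000000ff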
Do it manually!
http://www.permadi.com/tutorial/numDecToHex/
I've got to agree with James - do it manually - but don't use base-16. Use base 2^32, and print 8 hex digits at a time.
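A sketch of that suggestion (my addition; ToHexBase2Pow32 is a made-up name, and it assumes a non-negative whole number), essentially the same chunking as the code further down but with 2^32-sized chunks:

// Peel off 32 bits at a time with decimal arithmetic; each chunk prints as 8 hex digits.
static string ToHexBase2Pow32(decimal d)
{
    const decimal chunk = 4294967296m; // 2^32
    var sb = new System.Text.StringBuilder();
    while (d > 0)
    {
        uint fragment = (uint)(d % chunk);
        sb.Insert(0, fragment.ToString("x8"));
        d -= fragment;   // d is now an exact multiple of 2^32...
        d /= chunk;      // ...so this division is exact
    }
    return sb.Length == 0 ? "0" : sb.ToString();
}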
I guess one option would be to keep taking chunks off it, and converting individual chunks? A bit of mod/division etc, converting individual fragments...
So: what hex value do you expect?
Here are two approaches... one uses the binary structure of decimal; one does it manually. In reality, you might want to have a test: if bits[3] is zero, do it the quick way; otherwise do it manually.
decimal d = 588063595292424954445828M;
int[] bits = decimal.GetBits(d);
if (bits[3] != 0) throw new InvalidOperationException("Only +ve integers supported!");
string s = Convert.ToString(bits[2], 16).PadLeft(8, '0')   // high
         + Convert.ToString(bits[1], 16).PadLeft(8, '0')   // middle
         + Convert.ToString(bits[0], 16).PadLeft(8, '0');  // low
Console.WriteLine(s);
/* or Jon's much tidier: string.Format("{0:x8}{1:x8}{2:x8}",
   (uint)bits[2], (uint)bits[1], (uint)bits[0]); */

const decimal chunk = (decimal)(1 << 16);
StringBuilder sb = new StringBuilder();
while (d > 0)
{
    int fragment = (int)(d % chunk);
    sb.Insert(0, Convert.ToString(fragment, 16).PadLeft(4, '0'));
    d -= fragment;
    d /= chunk;
}
Console.WriteLine(sb);