What I am looking for is something like PHPs decbin function in C#. That function converts decimals to its representation as a string.
For example, when using decbin(21) it returns 10101 as result.
I found this function which basically does what I want, but maybe there is a better / faster way?
var result = Convert.ToString(number, 2);
– Almost the only use for the (otherwise useless) Convert class.
Most ways will be better and faster than the function that you found. It's not a very good example on how to do the conversion.
The built in method Convert.ToString(num, base) is an obvious choice, but you can easily write a replacement if you need it to work differently.
This is a simple method where you can specify the length of the binary number:
public static string ToBin(int value, int len) {
return (len > 1 ? ToBin(value >> 1, len - 1) : null) + "01"[value & 1];
}
It uses recursion, the first part (before the +) calls itself to create the binary representation of the number except for the last digit, and the second part takes care of the last digit.
Example:
Console.WriteLine(ToBin(42, 8));
Output:
00101010
int toBase = 2;
string binary = Convert.ToString(21, toBase); // "10101"
To have the binary value in (at least) a specified number of digits, padded with zeroes:
string bin = Convert.ToString(1234, 2).PadLeft(16, '0');
The Convert.ToString does the conversion to a binary string.
The PadLeft adds zeroes to fill it up to 16 digits.
This is my answer:
static bool[] Dec2Bin(int value)
{
if (value == 0) return new[] { false };
var n = (int)(Math.Log(value) / Math.Log(2));
var a = new bool[n + 1];
for (var i = n; i >= 0; i--)
{
n = (int)Math.Pow(2, i);
if (n > value) continue;
a[i] = true;
value -= n;
}
Array.Reverse(a);
return a;
}
Using Pow instead of modulo and divide so i think it's faster way.
Related
I'm trying to implement SQL Server Vardecimal decompression. Values stored as 3 digits decimals per every 10 bits. But during implementation I found strange behavior of math. Here is simple test I made
private SqlDecimal Test() {
SqlDecimal mantissa = 0;
SqlDecimal sign = -1;
byte exponent = 0x20;
int numDigits = 0;
// -999999999999999999999999999999999.99999
for (int i = 0; i < 13; i++) {
int temp = 999;
//equal to mantissa = mantissa * 1000 + temp;
numDigits += 3;
int pwr = exponent - (numDigits - 1);
mantissa += temp * (SqlDecimal)Math.Pow(10, pwr);
}
return sign * mantissa;
}
First 2 passes are fine, I have
999000000000000000000000000000000
999999000000000000000000000000000
but third have
999999998999999999999980020000000
Is it some bug in C# SqlDecimal math or am I doing something wrong?
This is an issue with how you're constructing the value to add here:
mantissa += temp * (SqlDecimal)Math.Pow(10, pwr);
The problem starts when pwr is 24. You can see this very clearly here:
Console.WriteLine((SqlDecimal) Math.Pow(10, 24));
The output on my box is:
999999999999999980000000
Now I don't know exactly where that's coming from - but it's simplest to remove the floating point arithmetic entirely. While it may not be efficient, this is a simple way of avoiding the problem:
static SqlDecimal PowerOfTen(int power)
{
// Note: only works for non-negative power values at the moment!
// (To handle negative input, divide by 10 on each iteration instead.)
SqlDecimal result = 1;
for (int i = 0; i < power; i++)
{
result = result * 10;
}
return result;
}
If you then change the line to:
mantissa += temp * PowerOfTen(pwr);
then you'll get the results you expect - at least while pwr is greater than zero. It should be easy to fix PowerOfTen to handle negative values as well though.
Update
Amending the below method to just work with Parse and ToString should improve performance for larger numbers (which would be the general use case for these types):
public static SqlDecimal ToSqlDecimal(this BigInteger bigint)
{
return SqlDecimal.Parse(bigint.ToString());
}
This trick also works for the double returned by the original Math.Pow call; so you could just do:
SqlDecimal.Parse(string.Format("{0:0}",Math.Pow(10,24)))
Original Answer
Obviously #JonSkeet's answer's best, as it only involves 24 iterations, vs potentially thousands in my attempt. However, here's an alternate solution, which may help out in other scenarios where you need to convert large integers (i.e. System.Numeric.BigInteger) to SqlDecimal / where performance is less of a concern.
Fiddle Example
//using System.Data.SqlTypes;
//using System.Numerics; //also needs an assembly reference to System.Numerics.dll
public static class BigIntegerExtensions
{
public static SqlDecimal ToSqlDecimal(this BigInteger bigint)
{
SqlDecimal result = 0;
var longMax = (SqlDecimal)long.MaxValue; //cache the converted value to minimise conversions
var longMin = (SqlDecimal)long.MinValue;
while (bigint > long.MaxValue)
{
result += longMax;
bigint -= long.MaxValue;
}
while (bigint < long.MinValue)
{
result += longMin;
bigint -= long.MinValue;
}
return result + (SqlDecimal)(long)bigint;
}
}
For your above use case, you could use this like so (uses the BigInteger.Pow method):
mantissa += temp * BigInteger.Pow(10, pwr).ToSqlDecimal();
I am using BigInteger.Parse(some string) but it takes forever and I'm not even sure if it finishes.
However, I can convert the huge string to a byte array and jam the byte array into a BigInteger constructor in very little time but it munges the original number stored in the string because of the endian issue with BigInteger and byte arrays.
Is there a way to convert the string to a byte array and put the byte array into the BigInteger object while preserving the original number stored in ASCII in the string?
String s = "12345"; // Some huge string, millions of digits.
BigInteger bi = new BigInteger(Encoding.ASCII.GetBytes(s); // very fast but the 12345 is lost
// OR...
BigInteger bi = BigInteger.Parse(s); // Takes forever therefore unuseable.
The byte[] representation of BigInteger has little to do with the ASCII characters. Much like the byte representation of an int has little to do with the ASCII representation of it.
To parse the number, each character must be converted to the digit value, and added to the previously parsed value multiplied by 10. That is probably why it's taking so long, and any version you write will probably not perform better. It has to do something like:
var nr=0;
foreach(var c in "123") nr=nr*10+(c-'0');
Edit
While it is not possible to perform the conversion by just converting to a byte array, the library implementation is slower then it has to be (at least for simple scenarios that do not need internationalization for example). Using the trick suggested by Rudy Velthuis in the comments and not taking into account hex formats or internationalization, I was able to produce a version which for 303104 characters runs ~5 times faster (from 18.2s to 3.75s. For 1 milion digits the fast method takes 47s, long, but it is a huge number):
public static class Helper
{
static BigInteger[] factors = Enumerable.Range(0, 19).Select(i=> BigInteger.Pow(10, i)).ToArray();
public static BigInteger ParseFast(string str)
{
var result = new BigInteger(0);
var n = str.Length;
var hasSgn = str[0] == '-';
int j;
for (var i = hasSgn ? 1 : 0; i < n; i += j - i)
{
long gr = 0;
for (j = i; j < i + 18 && j < n; j++)
{
gr = gr * 10 + (str[j] - '0');
}
result = result * factors[j-i]+ gr;
}
if (hasSgn)
{
result = BigInteger.MinusOne * result;
}
return result;
}
}
In my code I need to convert string representation of integers to long and double values.
String representation is a byte array (byte[]). For example, for a number 12345 string representation is { 49, 50, 51, 52, 53 }
Currently, I use following obvious code for conversion to long (and almost the same code for conversion to double)
private long bytesToIntValue()
{
string s = System.Text.Encoding.GetEncoding("Latin1").GetString(bytes);
return long.Parse(s, CultureInfo.InvariantCulture);
}
This code works as expected, but in my case I want something better. It's because currently I must convert bytes to string first.
In my case, bytesToIntValue() gets called about 12 million times and about 25% of all memory allocations are made in this method.
Sure, I want to optimize this part. I want to perform conversions without intermediate string (+ speed, - allocation).
What would you recommend? How can I perform conversions without intermediate strings? Is there a faster method to perform conversions?
EDIT:
Byte arrays I am dealing with are always contain ASCII-encoded data. Numbers can be negative. For double values exponential format is allowed. Hexadecimal integers are not allowed.
How can I perform conversions without intermediate strings?
Well you can easily convert each byte to a char. For example - untested:
private static long ConvertAsciiBytesToInt32(byte[] bytes)
{
long value = 0;
foreach (byte b in bytes)
{
value *= 10L;
char c = b; // Implicit conversion; effectively ISO-8859-1
if (c < '0' || c > '9')
{
throw new ArgumentException("Bytes contains non-digit: " + c);
}
value += (c - '0');
}
return value;
}
Note that this really does assume it's ASCII (or compatible) - if your byte array is actually UTF-16 (for example) then it will definitely do the wrong thing.
Also note that this doesn't perform any sort of length validation or overflow checking... and it doesn't cope with negative numbers. You could add all of these if you want, but we don't know enough about your requirements to know if it's worth adding the complexity.
I'm not sure that there is a easy way to do that,
Please note that it won't work with other encodings, The test shown on my computer that this is only 3 times faster (I don't think it worth it).
The code + test :
class MainClass
{
public static void Main(string[] args)
{
string str = "12341234";
byte[] buffer = Encoding.ASCII.GetBytes(str);
Stopwatch sw = Stopwatch.StartNew();
for(int i = 0; i < 1000000 ;i ++)
{
long val = BufferToLong.GetValue(buffer);
}
Console.WriteLine (sw.ElapsedMilliseconds);
sw.Restart();
for (int i = 0 ; i < 1000000 ; i++)
{
string valStr = Encoding.ASCII.GetString(buffer);
long val = long.Parse(valStr);
}
Console.WriteLine (sw.ElapsedMilliseconds);
}
}
static class BufferToLong
{
public static long GetValue(Byte[] buffer) {
long number = 0;
foreach (byte currentByte in buffer) {
char currentChar = (char)currentByte;
int currentDigit = currentChar - '0';
number *= 10 ;
number += currentDigit;
}
return number;
}
}
In the end, I created C# version of strol function. This function comes with CRT and source code of CRT comes with Visual Studio.
The resulting method is almost the same as code provided by #Jon Skeet in his answer but also contains some checks for overflow.
In my case all the changes proved to be very useful in terms of speed and memory.
I have a binary number 1011011, how can I loop through all these binary digits one after the other ?
I know how to do this for decimal integers by using modulo and division.
int n = 0x5b; // 1011011
Really you should just do this, hexadecimal in general is much better representation:
printf("%x", n); // this prints "5b"
To get it in binary, (with emphasis on easy understanding) try something like this:
printf("%s", "0b"); // common prefix to denote that binary follows
bool leading = true; // we're processing leading zeroes
// starting with the most significant bit to the least
for (int i = sizeof(n) * CHAR_BIT - 1; i >= 0; --i) {
int bit = (n >> i) & 1;
leading |= bit; // if the bit is 1, we are no longer reading leading zeroes
if (!leading)
printf("%d", bit);
}
if (leading) // all zero, so just print 0
printf("0");
// at this point, for n = 0x5b, we'll have printed 0b1011011
You can use modulo and division by 2 exactly like you would in base 10. You can also use binary operators, but if you already know how to do that in base 10, it would be easier if you just used division and modulo
Expanding on Frédéric and Gabi's answers, all you need to do is realise that the rules in base 2 are no different to in base 10 - you just need to do your division and modulus with a divisor 2 instead of 10.
The next step is simply to use number >> 1 instead of number / 2 and number & 0x1 instead of number % 2 to improve performance. Mind you, with modern optimising compilers there's probably no difference...
Use an AND with increasing powers of two...
In C, at least, you can do something like:
while (val != 0)
{
printf("%d", val&0x1);
val = val>>1;
}
To expand on #Marco's answer with an example:
uint value = 0x82fa9281;
for (int i = 0; i < 32; i++)
{
bool set = (value & 0x1) != 0;
value >>= 1;
Console.WriteLine("Bit set: {0}", set);
}
What this does is test the last bit, and then shift everything one bit.
If you're already starting with a string, you could just iterate through each of the characters in the string:
var values = "1011011".Reverse().ToCharArray();
for(var index = 0; index < values.Length; index++) {
var isSet = (Boolean)Int32.Parse(values[index]); // Boolean.Parse only works on "true"/"false", not 0/1
// do whatever
}
byte input = Convert.ToByte("1011011", 2);
BitArray arr = new BitArray(new[] { input });
foreach (bool value in arr)
{
// ...
}
You can simply loop through every bit. The following C like pseudocode allows you to set the bit number you want to check. (You might also want to google endianness)
for()
{
bitnumber = <your bit>
printf("%d",(val & 1<<bitnumber)?1:0);
}
The code basically writes 1 if the bit it set or 0 if not. We shift the value 1 (which in binary is 1 ;) ) the number of bits set in bitnumber and then we AND it with the value in val to see if it matches up. Simple as that!
So if bitnumber is 3 we simply do this
00000100 ( The value 1 is shifted 3 left for example)
AND
10110110 (We check it with whatever you're value is)
=
00000100 = True! - Both values have bit 3 set!
The number is bigger than int & long but can be accomodated in Decimal. But the normal ToString or Convert methods don't work on Decimal.
I believe this will produce the right results where it returns anything, but may reject valid integers. I dare say that can be worked around with a bit of effort though... (Oh, and it will also fail for negative numbers at the moment.)
static string ConvertToHex(decimal d)
{
int[] bits = decimal.GetBits(d);
if (bits[3] != 0) // Sign and exponent
{
throw new ArgumentException();
}
return string.Format("{0:x8}{1:x8}{2:x8}",
(uint)bits[2], (uint)bits[1], (uint)bits[0]);
}
Do it manually!
http://www.permadi.com/tutorial/numDecToHex/
I've got to agree with James - do it manually - but don't use base-16. Use base 2^32, and print 8 hex digits at a time.
I guess one option would be to keep taking chunks off it, and converting individual chunks? A bit of mod/division etc, converting individual fragments...
So: what hex value do you expect?
Here's two approaches... one uses the binary structure of decimal; one does it manually. In reality, you might want to have a test: if bits[3] is zero, do it the quick way, otherwise do it manually.
decimal d = 588063595292424954445828M;
int[] bits = decimal.GetBits(d);
if (bits[3] != 0) throw new InvalidOperationException("Only +ve integers supported!");
string s = Convert.ToString(bits[2], 16).PadLeft(8,'0') // high
+ Convert.ToString(bits[1], 16).PadLeft(8, '0') // middle
+ Convert.ToString(bits[0], 16).PadLeft(8, '0'); // low
Console.WriteLine(s);
/* or Jon's much tidier: string.Format("{0:x8}{1:x8}{2:x8}",
(uint)bits[2], (uint)bits[1], (uint)bits[0]); */
const decimal chunk = (decimal)(1 << 16);
StringBuilder sb = new StringBuilder();
while (d > 0)
{
int fragment = (int) (d % chunk);
sb.Insert(0, Convert.ToString(fragment, 16).PadLeft(4, '0'));
d -= fragment;
d /= chunk;
}
Console.WriteLine(sb);