I'm trying to transfer a C# function to C++ but ran into a lovely problem I've never seen nor needed before. Everything has transferred quite nicely except for a bit-shift line.
C# - Works without problems.
long val = 2791804260201463808;
int Cap = (int)val; //-608501760
val = (long)((ulong)val >> 32);
return val; // this returns 650017582
Now transfer to C++
C++ - Compiler warning: "warning C4293: '>>': shift count negative or too big, undefined behavior"
long val = 2791804260201463808;
int Cap = (int)val; //-608501760
val = (long)((ulong)val >> 32);
return val; // this returns -608501760 - no change, as if the bit shift was skipped
How can I transfer this? I'm having a problem seeing out of my box.
I've tried different variable types with no luck.
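For reference, the root cause is that long is only 32 bits under Microsoft's C++ compiler, so the literal doesn't fit and val >> 32 shifts by the full width of the type (hence C4293). Below is a minimal sketch of a portable fix using the fixed-width types from <cstdint>; the function name is invented for the example, and the commented values are the ones from the C# snippet above.

#include <cstdint>

int64_t get_upper_half()
{
    int64_t val = 2791804260201463808LL;  // too big for a 32-bit long
    int32_t Cap = (int32_t)val;           // -608501760, matching the C# cast
    val = (int64_t)((uint64_t)val >> 32); // 64-bit shift, well defined
    return val;                           // 650017582
}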
Related
I need to extract some bit ranges from a 16-byte value, e.g.:
bit 0 = first thing
next 54 bits = second thing
next 52 bits = third thing
last 21 bits = fourth thing
.NET doesn't have a UInt128 structure. Well, it has the BigInteger class, but I'm not sure that's right for the job; maybe it is?
I have found a third-party library that can read bits from a stream, but when trying to convert them back to UInt64s using the BitConverter, it fails: 54 bits isn't long enough for a UInt64, but it's too long for a UInt32.
My immediate thought was that bit shifting was the way to do this, but now I'm not so sure how to proceed, seeing as I can't think of a good way of handling the original 16 bytes.
Any suggestions or comments would be appreciated.
Here's some untested code. I'm sure that there are bugs in it (whenever I write code like this, I get shifts, masks, etc. wrong). However, it should be enough to get you started. If you get this working and there are only a few problems, let me know in the comments and I'll fix things. If you can't get it to work, let me know as well, and I'll delete the answer. If it requires a major rewrite, post your working code as an answer and let me know.
The other thing to worry about with this (since you mentioned that this comes from a file) is endian-ness. Not all computer architectures represent values in the same way. I'll leave any byte swizzling (if needed) to you.
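If it turns out you do need it, here is what that swizzling might look like. This is only a sketch: ReadHalf is a made-up helper name, and BinaryPrimitives requires .NET Core 2.1 / .NET Standard 2.1 or later.

using System;
using System.Buffers.Binary;

static class HalfReader
{
    //Read one 64-bit half out of the 16-byte buffer, reversing byte order
    //only when the file's layout doesn't match the current machine.
    public static ulong ReadHalf(byte[] bytes, int offset, bool fileIsBigEndian)
    {
        ulong half = BitConverter.ToUInt64(bytes, offset);
        if (fileIsBigEndian == BitConverter.IsLittleEndian)
            half = BinaryPrimitives.ReverseEndianness(half);
        return half;
    }
}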
First, structs in C++ are basically the same as classes (though people think they are different). In C#, they are very different. A struct in C# is a Value Type. When you do value-type assignment, the compiler makes a copy of the value of the struct, rather than just making a copy of a reference to the object (like it does with classes). Value types have an implicit default constructor that initializes all members to their default (zero or null) values.
Marking the struct with [StructLayout(LayoutKind.Sequential)] tells the compiler to lay out the members in the specified order (the compiler doesn't have to, normally). This allows you to pass a reference to one of these (via P/Invoke) to a C program if you want to.
So, my struct starts off this way:
[StructLayout(LayoutKind.Sequential)]
public struct Struct128
{
//not using auto-properties with private setters on purpose.
//This should look like a single 128-bit value (in part, because of LayoutKind.Sequential)
private ulong _bottom64bits;
private ulong _top64bits;
}
Now I'm going to add members to that struct. Since you are getting the 128 bits from a file, don't try to read the data into a single 128-bit structure (if you can figure out how (look up serialization), you can, but...). Instead, read 64 bits at a time and use a constructor like this one:
public Struct128(ulong bottom64, ulong top64)
{
_top64bits = top64;
_bottom64bits = bottom64;
}
If you need to write the data in one of these back into the file, go get it 64 bits at a time using read-only properties like this:
//read access to the raw storage
public ulong Top64 => _top64bits;
public ulong Bottom64 => _bottom64bits;
Now we need to get and set the various bit-ish values out of our structure. Getting (and setting) the first thing is easy:
public bool FirstThing
{
get => (_bottom64bits & 0x01) == 1;
set
{
//set or clear the 0 bit
if (value)
{
_bottom64bits |= 1ul;
}
else
{
_bottom64bits &= (~1ul);
}
}
}
Getting/setting the second and fourth things is very similar. In both cases, to get the value, you mask away all but the important bits and then shift the result. To set the value, you take the property value, shift it to the right place, zero out the bits in the appropriate (top or bottom) value stored in the structure, and OR in the new bits (that you set up by shifting).
//bits 1 through 54
private const ulong SecondThingMask = 0b111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1110;
public ulong SecondThing
{
get => (_bottom64bits & SecondThingMask) >> 1;
set
{
var shifted = (value << 1) & SecondThingMask;
_bottom64bits = (_bottom64bits & (~SecondThingMask)) | shifted;
}
}
and
//top 21 bits
private const ulong FourthThingMask = 0b1111_1111_1111_1111_1111_1000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000;
//to shift the top 21 bits down to the bottom 21 bits, need to shift 64-21
private const int FourthThingShift = 64 - 21;
public uint FourthThing
{
get => (uint)((_top64bits & FourthThingMask) >> FourthThingShift);
set
{
var shifted = ((ulong)value << FourthThingShift) & FourthThingMask;
_top64bits = (_top64bits & (~FourthThingMask)) | shifted;
}
}
It's the third thing that is tricky. To get the value, you need to mask the correct bits out of both the top and bottom values, shift them to the right positions and return the ORed result.
To set the value, you need to take the property value, split it into upper and lower portions and then do the same kind of magic ORing that was done for the second and fourth things:
//the third thing is the hard part.
//The bottom 55 bits of the _bottom64bits are dedicated to the 1st and 2nd things, so the next 9 are the bottom 9 of the 3rd thing
//The other 52-9 (=43) bits come-from/go-to the _top64bits
//top 9 bits
private const ulong ThirdThingBottomMask = 0b1111_1111_1000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000;
//bottom 43 bits
private const ulong ThirdThingTopMask = 0b111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111;
private const int ThirdThingBottomShift = 64 - 9;
//bottom 9 bits
private const ulong ThirdThingBottomSetMask = 0b1_1111_1111;
//the top 43 bits of the 52-bit value (all but the bottom 9)
private const ulong ThirdThingTopSetMask = 0b1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1110_0000_0000;
//52 bits total
private const ulong ThirdThingOverallMask = 0b1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111;
public ulong ThirdThing
{
get
{
var bottom = (_bottom64bits & ThirdThingBottomMask) >> ThirdThingBottomShift;
var top = (_top64bits & ThirdThingTopMask) << 9;
return top | bottom;
}
set
{
var masked = value & ThirdThingOverallMask;
var bottom = (masked & ThirdThingBottomSetMask) << ThirdThingBottomShift;
//clear the top 9 bits of the bottom half (where the 3rd thing is stored), then OR in the new bits
_bottom64bits = (_bottom64bits & (~ThirdThingBottomMask)) | bottom;
var top = (masked & ThirdThingTopSetMask) >> 9;
//clear the bottom 43 bits of the top half, then OR in the new bits
_top64bits = (_top64bits & (~ThirdThingTopMask)) | top;
}
}
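Putting it together, usage might look like this, assuming all of the pieces above are collected into Struct128 (the sample values are arbitrary):

//Hypothetical usage; the two halves would normally come from the file.
var s = new Struct128(bottom64: 0ul, top64: 0ul);
s.FirstThing = true;
s.SecondThing = 12345;   //54-bit field
s.ThirdThing = 67890;    //52-bit field, straddles both halves
s.FourthThing = 54321;   //21-bit field
Console.WriteLine($"{s.Top64:X16} {s.Bottom64:X16}");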
I hope this is useful. Let me know.
I have a SQL Server table that has a column in it that is defined as Binary(7).
It is updated with data from a Cobol program that has Comp-3 data (packed decimal).
I wrote a C# program to take a number and create the Comp-3 value. I have it available to SQL Server via CLR Integration. I'm able to access it like a stored procedure.
My problem is, I need to take the value from this program and save it in the binary column. When I select a row of data that is already in there, I am seeing a value like the following:
0x00012F0000000F
The value shown is COBOL comp-3 (packed decimal) data, stored in the SQL table. Remember, this field is defined as Binary(7). There are two values concatenated and stored here. Unsigned value 12, and unsigned value 0.
I need to concatenate 0x00012F (3 bytes) and 0x0000000F (4 bytes) together and write it to the column.
My question is two part.
1) I am able to return a string representation of the Comp-3 value from my program. But, I'm not sure if this is the format I need to return to make this work. What format should I return to SQL, so it can be used correctly?
2) What do I need to do to convert this to make it work?
I hope I was clear enough. It's a lot to digest...Thanks!
I figured it out!
I needed to change the output to byte[], and reference it coming out of the program in SQL as varbinary.
This is the code, if anyone else in the future needs it. I hope this helps others that need to create Comp-3 (packed decimal) in SQL. I'll outline the steps to use it below.
Below is the source for the C# program. Compile it as a dll.
using System;
using System.Collections.Generic;
using System.Data;
using Microsoft.SqlServer.Server;
using System.Data.SqlTypes;
namespace Numeric2Comp3
{
//PackedDecimal conversions
public class PackedDecimal
{
[Microsoft.SqlServer.Server.SqlProcedure]
public static void ToComp3(string numberin, out byte[] hexarray, out string hexvalue)
{
long value;
bool result = Int64.TryParse(numberin, out value);
if (!result)
{
hexarray = null;
hexvalue = null;
return;
}
Stack<byte> comp3 = new Stack<byte>(10);
byte currentByte;
if (value < 0)
{
currentByte = 0x0d; //signed -
value = -value;
}
else if (numberin.Trim().StartsWith("+"))
{
currentByte = 0x0c; //signed +
}
else
{
currentByte = 0x0f; //unsigned
}
bool byteComplete = false;
while (value != 0)
{
if (byteComplete)
currentByte = (byte)(value % 10);
else
currentByte |= (byte)((value % 10) << 4);
value /= 10;
byteComplete = !byteComplete;
if (byteComplete)
comp3.Push(currentByte);
}
if (!byteComplete)
comp3.Push(currentByte);
hexarray = comp3.ToArray();
hexvalue = bytesToHex(comp3.ToArray());
}
private static string bytesToHex(byte[] buf)
{
const string HexChars = "0123456789ABCDEF";
var sb = new System.Text.StringBuilder(buf.Length * 2);
for (int i = 0; i < buf.Length; i++)
{
sb.Append(HexChars[(buf[i] >> 4) & 0x0F]); //high nibble
sb.Append(HexChars[buf[i] & 0x0F]); //low nibble
}
return sb.ToString();
}
}
}
Save the dll somewhere in a folder on the SQL Server machine. I used 'C:\NTA\Libraries\Numeric2Comp3.dll'.
Next, you'll need to enable CLR Integration on SQL Server. Read about it on Microsoft's website here: Introduction to SQL Server CLR Integration. Open SQL Server Management Studio and execute the following to enable CLR Integration:
sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
sp_configure 'clr enabled', 1;
GO
RECONFIGURE;
GO
Once that is done, execute the following in Management Studio:
CREATE ASSEMBLY Numeric2Comp3 from 'C:\NTA\Libraries\Numeric2Comp3.dll' WITH PERMISSION_SET = SAFE
You can execute the following to remove the assembly, if you need to for any reason:
drop assembly Numeric2Comp3
Next, in Management studio, execute the following to create the stored procedure to reference the dll:
CREATE PROCEDURE Numeric2Comp3
@numberin nchar(27), @hexarray varbinary(27) OUTPUT, @hexstring nchar(27) OUTPUT
AS
EXTERNAL NAME Numeric2Comp3.[Numeric2Comp3.PackedDecimal].ToComp3
If everything above runs successfully, you're done!
Here is some SQL to test it out:
DECLARE @in nchar(27), @hexstring nchar(27), @hexarray varbinary(27)
set @in = '20120123'
EXEC Numeric2Comp3 @in, @hexarray out, @hexstring out
select len(@hexarray), @hexarray
select len(@hexstring), @hexstring
This will return the following values:
(No column name) (No column name)
5 0x020120123F
(No column name) (No column name)
10 020120123F
In my case, what I need is the value coming out of @hexarray. This will be written to the Binary column in my table.
I hope this helps others that may need it!
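For round-trip testing, a decoder going the other way might look like this. It's an untested sketch that assumes well-formed packed-decimal input (it won't distinguish the 0x0C and 0x0F positive signs):

//Unpack Comp-3 bytes back into a long (sketch; assumes valid input).
private static long FromComp3(byte[] comp3)
{
    long value = 0;
    for (int i = 0; i < comp3.Length; i++)
    {
        value = value * 10 + ((comp3[i] >> 4) & 0x0F); //high nibble digit
        if (i < comp3.Length - 1)
            value = value * 10 + (comp3[i] & 0x0F);    //low nibble digit
    }
    //the low nibble of the last byte is the sign: 0x0D means negative
    return (comp3[comp3.Length - 1] & 0x0F) == 0x0D ? -value : value;
}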
If you have Comp-3 stored in a binary field as a hex string, well, I wonder if the process that created this is working as it should.
Be that as it may, the best solution would be to cast them in the select; the cast syntax is simple, but I don't know if a Comp-3 cast is available.
Here are examples on MSDN.
So let's work with the string. To transform the string, you use this:
string in2 = "020120123C";
long iOut = Convert.ToInt64(in2.Substring(0, in2.Length - 1))
* (in2.Substring(in2.Length - 1, 1)=="D"? -1 : 1 ) ;
It treats the last character as the sign, with 'D' being the one negative sign. Both 'F' and 'C' would be positive.
Will you also need to write the data back?
I am curious: what string representation comes out for fractional numbers like 123.45?
(I'll leave the original answer below for reference. :)
Here are a few lines of code to show how you can work with bits and bytes.
The operations to use are:
shift the data n bits right or left: << n or >> n
masking/clearing unwanted high bits: e.g. set all to 0 except the last 4 bits: & 0xF
combining values bitwise: |
If you have a string representation like the one you have shown, then the out3 and out4 bytes would be the result. The other conversions are just examples of how to process bits; you can't possibly have decimals as binaries or binaries that look like decimals. Maybe you get integers - then out7 and out8 would be the results.
To combine two bytes into one integer look at the last calculation!
// 3 possible inputs:
long input = 0x00012F0000071F;
long input2 = 3143;
string inputS = "0x00012F0000071F";
// take binary input as such
byte out1 = (byte)((input >> 4) & 0xFFFFFF );
byte out2 = (byte)(input >> 36);
// take string as decimals
byte out3 = Convert.ToByte(inputS.Substring(5, 2));
byte out4 = Convert.ToByte(inputS.Substring(13, 2));
// take binary as decimal
byte out5 = (byte)(10 * ((input >> 40) & 0xF) + (byte)((input >> 36) & 0xF));
byte out6 = (byte)(10 * ((input >> 8) & 0xF) + (byte)((input >> 4) & 0xF));
// take integer and pick out 3rd and last byte
byte out7 = (byte)(input2 >> 8);
byte out8 = (byte)(input2 & 0xFF);
// combine two bytes to one integer
int byte1and2 = (byte)(12) << 8 | (byte)(71) ;
Console.WriteLine(out1.ToString());
Console.WriteLine(out2.ToString());
Console.WriteLine(out3.ToString());
Console.WriteLine(out4.ToString());
Console.WriteLine(out5.ToString());
Console.WriteLine(out6.ToString());
Console.WriteLine(out7.ToString());
Console.WriteLine(out8.ToString());
Console.WriteLine(byte1and2.ToString());
Currently I'm working on a solution for a prime-number calculator/checker. The algorithm is already working and very efficient (0.359 seconds for the first 9012330 primes). Here is a part of the upper region where everything is declared:
const uint anz = 50000000;
uint a = 3, b = 4, c = 3, d = 13, e = 12, f = 13, g = 28, h = 32;
bool[,] prim = new bool[8, anz / 10];
uint max = 3 * (uint)(anz / (Math.Log(anz) - 1.08366));
uint[] p = new uint[max];
Now I wanted to go to the next level and use ulongs instead of uints to cover a larger area (you can see that already), which is where I ran into my problem: the bool array.
As everybody should know, a bool has the size of a byte, which takes a lot of memory when creating the array... So I'm searching for a more resource-friendly way to do that.
My first idea was a bit array -> not byte! <- to store the bools, but I haven't figured out how to do that yet. So if someone ever did something like this, I would appreciate any kind of tips and solutions. Thanks in advance :)
You can use the BitArray collection:
http://msdn.microsoft.com/en-us/library/system.collections.bitarray(v=vs.110).aspx
MSDN Description:
Manages a compact array of bit values, which are represented as Booleans, where true indicates that the bit is on (1) and false indicates the bit is off (0).
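As a quick illustration, here is a minimal Sieve of Eratosthenes sketch built on that class (the bound of 100 is arbitrary):

using System;
using System.Collections;

class SieveDemo
{
    static void Main()
    {
        const int limit = 100;                   //arbitrary bound for the demo
        var composite = new BitArray(limit + 1); //one bit per number, all false
        for (int p = 2; p * p <= limit; p++)
        {
            if (composite[p]) continue;
            for (int m = p * p; m <= limit; m += p)
                composite[m] = true;             //mark multiples as composite
        }
        for (int n = 2; n <= limit; n++)
            if (!composite[n]) Console.Write(n + " ");
    }
}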
You can (and should) use well-tested and well-known libraries.
But if you're looking to learn something (as seems to be the case), you can do it yourself.
Another reason you may want to use a custom bit array is to use the hard drive to store the array, which comes in handy when calculating primes. To do this you'd need to further split addr, for example lowest 3 bits for the mask, next 28 bits for 256MB of in-memory storage, and from there on - a file name for a buffer file.
Yet another reason for a custom bit array is to compress the memory use when specifically searching for primes. After all, more than half of your bits will be 'false' because the numbers corresponding to them are even, so in fact you can both speed up your calculation AND reduce memory requirements if you don't store the even bits at all; you can do that by changing the way addr is interpreted, as in the sketch below. Furthermore, you can also exclude numbers divisible by 3 (only 2 out of every 6 numbers have a chance of being prime), reducing memory requirements by roughly two-thirds compared to a plain bit array.
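A minimal sketch of the odds-only idea (the mapping "bit i stands for the odd number 2*i + 3" is one possible choice; 2 has to be handled separately):

//Odds-only bit array: bit i stands for the odd number 2*i + 3, so each
//byte covers 16 candidate numbers instead of 8.
public class OddBitArray
{
    private readonly byte[] _buffer;
    public OddBitArray(long maxNumber) { _buffer = new byte[maxNumber / 16 + 1]; }

    private static long Index(long n) { return (n - 3) >> 1; } //n must be odd, n >= 3

    public bool this[long n]
    {
        get
        {
            long i = Index(n);
            return (_buffer[i >> 3] & (1 << (int)(i & 7))) != 0;
        }
        set
        {
            long i = Index(n);
            byte mask = (byte)(1 << (int)(i & 7));
            if (value) _buffer[i >> 3] |= mask;
            else _buffer[i >> 3] &= (byte)~mask;
        }
    }
}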
Notice the use of shift and logical operators to make the code a bit more efficient.
byte mask = (byte)(1 << (int)(addr & 7)); for example can be written as
byte mask = (byte)(1 << (int)(addr % 8));
and addr >> 3 can be written as addr / 8
Testing shift/logical operators vs division shows 2.6s vs 4.8s in favor of shift/logical for 200000000 operations.
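A rough way to reproduce that comparison (an unscientific sketch; absolute numbers will vary by machine and JIT):

using System;
using System.Diagnostics;

class ShiftVsDivide
{
    static void Main()
    {
        const long n = 200000000;
        long sink = 0;
        var sw = Stopwatch.StartNew();
        for (long i = 0; i < n; i++) sink += i >> 3;  //shift version
        Console.WriteLine("shift:  " + sw.Elapsed);
        sw.Restart();
        for (long i = 0; i < n; i++) sink += i / 8;   //division version
        Console.WriteLine("divide: " + sw.Elapsed + " (sink=" + sink + ")");
    }
}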
Here's the code:
void Main()
{
var barr = new BitArray(10);
barr[4] = true;
Console.WriteLine("Is it "+barr[4]);
Console.WriteLine("Is it Not "+barr[5]);
}
public class BitArray{
private readonly byte[] _buffer;
public bool this[long addr]{
get{
byte mask = (byte)(1 << (int)(addr & 7));
byte val = _buffer[(int)(addr >> 3)];
bool bit = (val & mask) == mask;
return bit;
}
set{
byte mask = (byte)(1 << (int)(addr & 7));
int offs = (int)(addr >> 3);
if (value)
_buffer[offs] = (byte)(_buffer[offs] | mask); //set the bit
else
_buffer[offs] = (byte)(_buffer[offs] & ~mask); //clear the bit (an OR-only setter could never clear one)
}
}
public BitArray(long size){
_buffer = new byte[size/8 + 1]; // define a byte buffer sized to hold 8 bools per byte. The spare +1 is to avoid dealing with rounding.
}
}
I'm attempting to convert a pseudo-rand function from C++ to C#, but it doesn't seem to return the correct values. It's important that I use a consistent sequence for encryption, so I can't just use a random number.
This is the function in C++:
int get_pseudo_rand()
{
return( ((_last_rand = _last_rand * 214013L
+ 2531011L) >> 16) & 0x7fff );
}
And this is my C# alternative:
int get_pseudo_rand()
{
return (((_last_rand = (_last_rand * 214013 + 2531011) >> 16) & 0x7fff));
}
I removed the Ls, since C#'s int data type is 4 bytes like a C++ long, whereas a C# long is 8 bytes.
The first time the function is run from the seed, the answer is consistent with the C++ version, but then it begins to diverge.
Any ideas?
You have parenthesized the two statements in a different way that changes their meaning. The C++ code updates _last_rand and then right-shifts the result, the C# code performs the right-shift before updating _last_rand. I've lined the statements up below to make the difference more obvious.
C++:
return (((_last_rand = _last_rand * 214013L + 2531011L) >> 16) & 0x7fff);
C#:
return (((_last_rand = (_last_rand * 214013 + 2531011 ) >> 16) & 0x7fff));
The problem is that you have parenthesized differently, which leads to storing different values in _last_rand... with your code, _last_rand stores 28818 after the first run... with the C++ code, it stores 1888663550, which is the value BEFORE >> and before &. Thus it starts diverging from the second run on...
To achieve the same behaviour as in C++, use this in C#:
return (((_last_rand = _last_rand * 214013 + 2531011) >> 16) & 0x7fff);
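Wrapped in a small class for reuse, the corrected version might look like this. It's only a sketch: the class and constructor are invented here, and the unchecked block just documents that the multiply is meant to overflow (C# defaults to unchecked anyway).

public class PseudoRand
{
    private int _last_rand;
    public PseudoRand(int seed) { _last_rand = seed; }

    public int get_pseudo_rand()
    {
        unchecked
        {
            //update _last_rand first, then shift and mask the stored result
            return ((_last_rand = _last_rand * 214013 + 2531011) >> 16) & 0x7fff;
        }
    }
}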
Is there a function in c# that takes two 32 bit integers (int) and returns a single 64 bit one (long)?
Sounds like there should be a simple way to do this, but I couldn't find a solution.
Try the following
public long MakeLong(int left, int right) {
//implicit conversion of left to a long
long res = left;
//shift the bits, creating empty space on the right
// ex: 0x000000000000CFFF becomes 0x0000CFFF00000000
res = (res << 32);
//combine the bits on the right with the previous value
// ex: 0x0000CFFF00000000 | 0x000000000000ABCD becomes 0x0000CFFF0000ABCD
res = res | (long)(uint)right; //cast to uint first to prevent sign extension
//return the combined result
return res;
}
Just for clarity: while the accepted answer does appear to work correctly, all of the one-liners presented do not appear to produce accurate results.
Here is a one liner that does work:
long correct = (long)left << 32 | (long)(uint)right;
Here is some code so you can test it for yourself:
long original = 1979205471486323557L;
int left = (int)(original >> 32);
int right = (int)(original & 0xffffffffL);
long correct = (long)left << 32 | (long)(uint)right;
long incorrect1 = (long)(((long)left << 32) | (long)right);
long incorrect2 = ((Int64)left << 32 | right);
long incorrect3 = (long)(left * uint.MaxValue) + right;
long incorrect4 = (long)(left * 0x100000000) + right;
Console.WriteLine(original == correct);
Console.WriteLine(original == incorrect1);
Console.WriteLine(original == incorrect2);
Console.WriteLine(original == incorrect3);
Console.WriteLine(original == incorrect4);
Try
(long)(((long)i1 << 32) | (long)i2)
This shifts the first int left by 32 bits (the length of an int), then ORs in the second int, so you end up with the two ints concatenated together in a long.
Be careful with the sign bit. Here is a fast ulong solution; note that it is not portable between little-endian and big-endian machines:
var a = 123;
var b = -123;
unsafe
{
ulong result = *(uint*)&a;
result <<= 32;
result |= *(uint*)&b;
}
This should do the trick
((Int64) a << 32 | b)
Where a and b are Int32. Although you might want to check what happens with the highest bits. Or just put it inside an "unchecked {...}" block.
You have to be careful with bit twiddling like this, though, because you'll have issues on little-endian/big-endian machines (e.g. Mono platforms aren't always little-endian). Plus you have to deal with sign extension. Mathematically the following is the same, but it deals with sign extension and is platform agnostic.
return (long)( high * uint.MaxValue ) + low;
When jitted at runtime it will result in performance similar to the bit twiddling. That's one of the nice things about JIT-compiled languages.
There is a problem when i2 < 0 - the high 32 bits will all be set (0xFFFFFFFF... in binary, due to sign extension) - thecoop was wrong.
Better would be something like (Int64)(((UInt64)i1 << 32) | (UInt32)i2)
Or simply the C++ way:
public static unsafe UInt64 MakeLong(UInt32 low, UInt32 high)
{
UInt64 retVal;
UInt32* ptr = (UInt32*)&retVal;
*ptr++ = low;
*ptr = high;
return retVal;
}
Or with just an unsafe block instead of an unsafe method:
UInt64 retVal;
unsafe
{
UInt32* ptr = (UInt32*)&retVal;
*ptr++ = low;
*ptr = high;
}
But the best solution I found is here ;-)
[StructLayout(LayoutKind.Explicit)] with [FieldOffset(...)]
https://stackoverflow.com/questions/12898591
(even w/o unsafe)
Anyway, FieldOffset works per field, so you have to specify the position of each half separately, and remember that negative numbers are stored as two's complement, so e.g. low < 0 with high > 0 will not make sense - for example low = -1, high = 0 will probably give an Int64 of 4294967295.
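For completeness, a minimal sketch of that approach (the struct and field names are invented, and the offsets assume a little-endian machine, with the low half at offset 0):

using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit)]
struct LongUnion
{
    [FieldOffset(0)] public long Value;
    [FieldOffset(0)] public uint Low;  //overlaps the bottom 4 bytes of Value
    [FieldOffset(4)] public uint High; //overlaps the top 4 bytes of Value
}

class UnionDemo
{
    static void Main()
    {
        //0xFFFFFFFF is the bit pattern of a low half of -1
        var u = new LongUnion { Low = 0xFFFFFFFF, High = 0 };
        Console.WriteLine(u.Value); //4294967295, matching the example above
    }
}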