Packing record length in 2 bytes - C#

I want to create an ASCII string that will have a number of fields. For example:
string s = f1 + "|" + f2 + "|" + f3;
f1, f2, f3 are fields and "|" (pipe) is the delimiter. I want to avoid the delimiter and instead keep the field lengths at the beginning, like:
string s = f1.Length.ToString("00") + f2.Length.ToString("00") + f3.Length.ToString("00") + f1 + f2 + f3;
Each length is packed in 2 chars, so the maximum length is 99 in this case. I was wondering if I could instead pack the length of each field into 2 bytes by extracting the bytes of a short. That would give me a range of 0-65535 using only 2 bytes. E.g.
short length = 20005;
byte b1 = (byte)length;
byte b2 = (byte)(length >> 8);
// Save bytes b1 and b2
// Read bytes b1 and b2
short length = 0;
length = b2;
length = (short)(length << 8);
length = (short)(length | b1);
// Now length is 20005
What do you think about the above code? Is this a good way to keep the record lengths?

I cannot quite see what you are trying to achieve. A short (aka Int16) is 2 bytes, yes, so you can happily use it. But building it into a string does not make sense.
short sh = 20005; // 2 bytes
I believe you mean being able to output the short to a stream. For this there are ways:
BinaryWriter.Write(sh) which writes 2 bytes straight to the stream
BitConverter.GetBytes(sh) which gives you bytes of a short
Reading back, you can use the corresponding BinaryReader.ReadInt16 or BitConverter.ToInt16.
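Putting those together, a minimal round-trip sketch (the MemoryStream is just for illustration; in practice it would be your file or network stream; needs System, System.IO and System.Text):
short length = 20005;
using (var ms = new MemoryStream())
{
    using (var writer = new BinaryWriter(ms, Encoding.ASCII, leaveOpen: true))
    {
        writer.Write(length);                 // writes exactly 2 bytes (little-endian)
    }

    ms.Position = 0;
    using (var reader = new BinaryReader(ms))
    {
        short readBack = reader.ReadInt16();  // reads the same 2 bytes back
        Console.WriteLine(readBack);          // 20005
    }
}

// BitConverter gives you the raw bytes instead of writing to a stream:
byte[] bytes = BitConverter.GetBytes(length);  // { 0x25, 0x4E } on a little-endian machine
short again = BitConverter.ToInt16(bytes, 0);  // 20005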

If you want ASCII, i.e. "00" as characters, then just:
byte[] bytes = Encoding.ASCII.GetBytes(length.ToString("00"));
or you could optimise it if you want.
But IMO, if you are storing 0-99, 1 byte is plenty:
byte b = (byte)length;
If you want the range 0-65535, then just:
bytes[0] = (byte)length;
bytes[1] = (byte)(length >> 8);
or swap index 0 and 1 for endianness.
But if you are using the full range (of either a single or a double byte), then it isn't ASCII, nor really a string. Anything that tries to read it as a string might fail.

Whether it's a good idea depends on the details of what it's for, but it's not likely to be good.
If you do this then you're no longer creating an "ASCII string". Those were your words, but maybe you don't really care whether it's ASCII.
You will sometimes get bytes with a value of 0 in your "string". If you're handling the strings with anything written in C, this is likely to cause trouble. You'll also get all sorts of other characters -- newlines, tabs, commas, etc. -- that may confuse software that's trying to work with your data.
The original plan of separating with (say) | characters will be more compact and easier for humans and software to read. The only obvious downsides are (1) you can't allow field values with a | in them (or else you need some sort of escaping) and (2) parsing will be marginally slower.

If you want to get clever, you could pack the length into 1 byte when the value is <= 127, and use 2 bytes when the value is >= 128. This technique loses you 1 bit per byte that you use, but if you normally have small values and only occasionally have larger ones, it dynamically grows to accommodate the value.
All you need to do is set bit 8 to indicate that the 2nd byte needs to be read as well.
If bit 8 of the current byte is not set, it means you have completed your value.
E.g.
If you have a value of 4, you use a single byte:
|8|7|6|5|4|3|2|1|
|0|0|0|0|0|1|0|0|
If you have a value of 128, you read the 1st byte, see that bit 8 is high, and take its remaining 7 bits; then you do the same with the 2nd byte, shifting its 7 bits left by 7 before combining them:
|BYTE 0 |BYTE 1 |
|8|7|6|5|4|3|2|1|8|7|6|5|4|3|2|1|
|1|0|0|0|0|0|0|0|0|0|0|0|0|0|0|1|
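A minimal sketch of that scheme (a 2-byte variable-length encoding covers values 0-16383, since each byte contributes 7 value bits; the names are illustrative):
static byte[] EncodeVarLength(ushort value)
{
    if (value <= 127)
        return new[] { (byte)value };          // fits in 7 bits, bit 8 stays clear
    return new[]
    {
        (byte)(0x80 | (value & 0x7F)),         // low 7 bits, bit 8 set = "read another byte"
        (byte)(value >> 7)                     // remaining bits
    };
}

static ushort DecodeVarLength(byte[] data)
{
    var value = (ushort)(data[0] & 0x7F);
    if ((data[0] & 0x80) != 0)                 // bit 8 set: combine with the 2nd byte
        value |= (ushort)(data[1] << 7);
    return value;
}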

Related

How do I interpret a WORD or Nibble as a number in C#?

I don't come from a low-level development background, so I'm not sure how to convert the below instruction to an integer...
Basically I have a microprocessor which tells me which I/Os are active or inactive. I send the device an ASCII command and it replies with a WORD describing which of the 15 I/Os are open/closed. Here's the instruction:
Unit Answers "A0001/" for only DIn0 on, "A????/" for All Inputs Active
Awxyz/ - w=High Nibble of MSB in 0 to ? ASCII Character 0001=1, 1111=?, z=Low Nibble of LSB.
At the end of the day I just want to be able to convert it back into a number which will tell me which of the 15 (or 16?) inputs are active.
I have something hooked up to the 15th I/O port, and the reply I get is "A8000", if that helps?
Can someone clear this up for me please?
You can use the BitConverter class to convert an array of bytes to the integer format you need.
If you're getting 16 bits, convert them to a UInt16.
C# does not define the endianness. That is determined by the hardware you are running on. Intel and AMD processors are little-endian. You can learn the endian-ness of your platform using BitConverter.IsLittleEndian. If the computer running .NET and the hardware providing the data do not have the same endian-ness, you would have to swap the two bytes.
byte[] inputFromHardware = { 126, 42 };
ushort value = BitConverter.ToUInt16(inputFromHardware, 0);
which of the 15 (or 16?) inputs are active
If the bits are hardware flags, it is plausible that all 16 are used to mean something. When bits are used to represent a signed number, one of the bits is used to represent positive vs. negative. Since the hardware is providing status bits, none of the bits should be interpreted as a sign.
If you want to know if a specific bit is active, you can use a bit mask along with the & operator. For example, the binary mask
0000 0000 0000 0100
corresponds to the hex number
0x0004
To learn if the third bit from the right is toggled on, use
bool thirdBitFromRightIsOn = (value & 0x0004) != 0;
UPDATE
If the manufacturer says the value 8000 (I assume hex) represents Channel 15 being active, look at it like this:
Your bit mask
1000 0000 0000 0000 (binary)
8 0 0 0 (hex)
Based on that info from the manufacturer, the left-most bit corresponds to Channel 15.
You can use that mask like:
bool channel15IsOn = (value & 0x8000) != 0;
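Since the reply actually arrives as ASCII hex characters (e.g. "A8000/"), in practice you would parse the four hex digits first and then apply the masks; a minimal sketch, assuming the framing is one letter, four hex digits, and a trailing slash:
string reply = "A8000/";
ushort status = Convert.ToUInt16(reply.Substring(1, 4), 16);  // 0x8000

bool channel15 = (status & 0x8000) != 0;   // true  - the 15th input is active
bool channel2  = (status & 0x0004) != 0;   // false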
A second choice (but for nibbles only) is to use the Math library like this:
string x = "A0001/";
int y = 0;
for (int i = 3; i >= 0; i--)
{
    if (x[4 - i] == '1')
        y += (int)Math.Pow(2, i);
}
By using the indexer to treat the string as an array of characters and checking each bit character against '1', you can hardcode the conversion from bits to an integer.
It is a novice solution, but I think it's a proper one too.

Visual FoxPro (VFP) CTOBIN and BINTOC Functions - Equivalent In .Net

We are rewriting some applications previously developed in Visual FoxPro and redeveloping them in .NET (using C#).
Here is our scenario:
Our application uses smartcards. We read in data from a smartcard which has a name and a number. The name comes back OK in readable text, but the number, in this case '900', comes back as a 2-byte character representation (131 & 132) and looks like this: ƒ„
Those 2 special characters can be seen in the extended ASCII table. As you can see, the 2 bytes are 131 and 132, and their appearance can vary since there is no single standard extended ASCII table (as far as I can tell from reading some of the posts on here).
So the smartcard was previously written to using the BINTOC function in VFP, and therefore 900 was written to the card as ƒ„. Within FoxPro, those 2 special characters can be converted back into integer form using CTOBIN, another built-in FoxPro function.
So (finally getting to the point): so far we have been unable to convert those 2 special characters back to an int (900), and we are wondering if it is possible in .NET to read the character representation of an integer back into an actual integer.
Or is there a way to rewrite the logic of those 2 VFP functions in C#?
UPDATE:
After some fiddling we realised that to get 900 into 2 bytes we need to convert 900 into a 16-bit binary value, then convert that 16-bit binary value back to decimal.
So, as above, we are receiving 131 and 132, whose binary values are 10000011 (decimal 131) and 10000100 (decimal 132).
When we concatenate these 2 values into '1000001110000100' it gives the decimal value 33668; however, if we remove the leading 1 and convert '000001110000100' to decimal, it gives the correct value of 900.
Not too sure why this is though...
Any help would be appreciated.
It looks like VFP is storing your value as a signed 16-bit (short) integer. It seems to have a strange changeover point for the negative numbers, but it adds 128 to 8-bit numbers and 32768 to 16-bit numbers.
So converting your 16-bit number from the string should be as easy as reading it as a 16-bit integer and then taking 32768 away from it. If you have to do this manually, multiply the first byte by 256 and add the second byte to get the stored value, then take 32768 away from that to get your value.
Examples:
131 * 256 = 33536
33536 + 132 = 33668
33668 - 32768 = 900
You could try using the C# conversions as per http://msdn.microsoft.com/en-us/library/ms131059.aspx and http://msdn.microsoft.com/en-us/library/tw38dw27.aspx to do at least some of the work for you, but if not, it shouldn't be too hard to code the above manually.
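A minimal sketch of that manual conversion, assuming the two bytes arrive in the order shown in the question:
byte b1 = 131;
byte b2 = 132;

int stored = b1 * 256 + b2;    // 33668
int value  = stored - 32768;   // 900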
It's a few years late, but here's a working example.
// requires: using System.Linq;
public ulong CharToBin(byte[] s)
{
    if (s == null || s.Length < 1 || s.Length > 8)
        return 0ul;

    var v = s.Select(c => (ulong)c).ToArray();
    var result = 0ul;
    var multiplier = 1ul;

    for (var i = 0; i < v.Length; i++)
    {
        if (i > 0)
            multiplier *= 256ul;        // each further byte counts for the next power of 256
        result += v[i] * multiplier;    // bytes are read in reversed (little-endian) order
    }

    return result;
}
This is a VFP 8 and earlier equivalent for CTOBIN, which covers your scenario. You should be able to write your own BINTOC based on the code above. VFP 9 added support for multiple options like non-reversed binary data, currency and double data types, and signed values. This sample only covers reversed unsigned binary like older VFP supported.
Some notes:
- The code supports 1-, 2-, 4-, and 8-byte values, which covers all unsigned numeric values up to System.UInt64.
- Before casting the result down to your expected numeric type, you should verify the ceiling. For example, if you need an Int32, then check the result against Int32.MaxValue before you perform the cast.
- The sample avoids the complexity of string encoding by accepting a byte array. You would need to understand which encoding was used to read the string, then apply that same encoding to get the byte array before calling this function. In the VFP world, this is frequently Encoding.ASCII, but it depends on the application.
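For the other direction, a minimal BINTOC-style sketch under the same assumptions (reversed, unsigned bytes; the method name and the byteCount parameter are illustrative):
public byte[] BinToChar(ulong value, int byteCount)
{
    // emit the value as byteCount bytes, least significant byte first,
    // mirroring the order CharToBin reads them back in
    var bytes = new byte[byteCount];
    for (var i = 0; i < byteCount; i++)
    {
        bytes[i] = (byte)(value & 0xFF);
        value >>= 8;
    }
    return bytes;
}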

Sending HEX values over a packet in C#

I currently have the following:
N.Sockets.UdpClient UD;
UD = new N.Sockets.UdpClient();
UD.Connect("xxx.xxx.xxx.xxx", 5255);
UD.Send( data, data.Length );
How would I send data in hex? I cannot just save it straight into a byte array.
Hex is just an encoding. It's simply a way of representing a number. The computer works with bits and bytes only -- it has no notion of "hex".
So any number, whether represented in hex or decimal or binary, can be encoded into a series of bytes:
var data = new byte[] { 0xFF };
And any hex string can be converted into a number (using, e.g., Convert.ToInt32(hex, 16) or int.Parse with NumberStyles.HexNumber).
Things get more interesting when a number exceeds one byte: Then there has to be an agreement of how many bytes will be used to represent the number, and the order they should be in.
In C#, ints are 4 bytes. Internally, depending on the endianness of the CPU, the most significant byte (highest-valued digits) might be stored first (big-endian) or last (little-endian). Typically, big-endian is used as the standard for communication over the network (remember the sender and receiver might have CPUs with different endianness). But since you are sending the raw bytes manually, I'll assume you are also reading the raw bytes manually on the other end; if that's the case, you are of course free to use any arbitrary format you like, provided that the client can understand that format unambiguously.
To encode an int in big-endian order, you can do:
int num = unchecked((int)0xdeadbeef); // 0xdeadbeef doesn't fit in a signed int without unchecked
var unum = (uint)num; // Convert to uint for correct >> with negative numbers
var data = new[] {
(byte)(unum >> 24),
(byte)(unum >> 16),
(byte)(unum >> 8),
(byte)(unum)
};
Be aware that some packets might never reach the client (this is the main practical difference between TCP and UDP), possibly leading to misinterpretation of the bytes. You should take steps to improve the robustness of your message-sending (e.g. by adding a checksum, and ignoring values whose checksums are invalid or missing).
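On the receiving end, a minimal sketch of decoding the same 4-byte big-endian layout (assuming data holds the four received bytes in order):
int num = unchecked((int)(
    (uint)data[0] << 24 |
    (uint)data[1] << 16 |
    (uint)data[2] << 8  |
    (uint)data[3]));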

Bit/byte conversion

How many bits is a .NET string that's 10 characters in length? (.NET strings are UTF-16, right?)
On 32-bit systems:
4 bytes = Type pointer (Every object has one of these)
4 bytes = Lock (One of these too!)
4 bytes = Length (Need the length)
2 * Length bytes = Data (And the chars themselves)
=======================
12 + 2*Length bytes
=======================
96 + 16*Length bits
So 10 chars would = 256 bits = 32 bytes
I am not sure if the Lock grows to 64-bit on 64-bit systems. I kinda hope not, but you never know. The 64-bit structure overhead is therefore anywhere from 16-20 bytes (as opposed to the 12 bytes on 32-bit).
Every char in the string is two bytes in size, so if you are just converting the chars directly and not using any particular encoding, the answer is string.Length * 2 * 8.
Otherwise the result depends on the encoding; you can write:
int numbits = System.Text.Encoding.UTF8.GetByteCount(str) * 8;    // returns 80 for a 10-char ASCII string
or
int numbits = System.Text.Encoding.Unicode.GetByteCount(str) * 8; // returns 160
If you are talking pure UTF-16, then:
10 characters = 20 bytes = 160 bits
This really needs context in order to be answered properly.
It all comes down to how you define "character" and how you store the data.
For example, if you define character as a single letter from the user's point of view, it can be more than 2 bytes: the character Å can be two Unicode code points (U+0041 U+030A, Latin Capital A + Combining Ring Above), so it will require two .NET chars, or 4 bytes in UTF-16.
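A small sketch of that point (the combining-character form of Å from above):
string a = "A\u030A";  // 'A' + Combining Ring Above, displays as Å

Console.WriteLine(a.Length);                                      // 2 .NET chars
Console.WriteLine(System.Text.Encoding.Unicode.GetByteCount(a));  // 4 bytes in UTF-16
Console.WriteLine(new System.Globalization.StringInfo(a).LengthInTextElements); // 1 character as the user sees it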
Even if you are talking about 10 .NET Char elements, if it's in memory you have some object overhead (which was already mentioned) and a bit of alignment overhead (on a 32-bit system everything has to be aligned to a 4-byte boundary; in 64-bit the rules are more complicated), so you may have some empty bytes at the end.
If you are talking about databases or files, then each database and file system has its own overhead.

ASCII values in hexadecimal notation

I am trying to parse some output data from a PBX and I have found something that I can't really figure out.
In the documentation it says the following
Information for type of call and feature. Eight character for ’status information 3’ with following ASCII values in hexadecimal notation.
1. Character
Bit7 Incoming call
Bit6 Outgoing call
Bit5 Internal call
Bit4 CN call
2. Character
Bit3 Transferred call (transferring party inside)
Bit2 CN-transferred call (transferring party outside)
Bit1
Bit0
Any ideas how to interpret this? I have no raw data at the moment to match against, but I still need to figure it out.
Probably you'll receive two characters (hex digits: 0-9, A-F). The first digit represents the hex value of the most significant 4 bits, the next digit the least significant 4 bits.
Example:
You will probably receive something like the string "7C" as the hex representation of the bitmap 01111100.
Eight character for ’status information 3’ with following ASCII values in hexadecimal notation.
I think this means the following.
You will get 8 bytes - one byte per line, I guess.
It is just the wrong term. They mean two hex digits per byte but call them characters.
So it is just a byte with bit flags - or, more precisely, an array of eight such bytes.
Bit 7: incoming
Bit 6: outgoing
Bit 5: internal
Bit 4: CN
Bit 3: transferred
Bit 2: CN transferred
Bit 1: unused?
Bit 0: unused?
You could map this to an enum.
[Flags]
public enum CallInformation : byte
{
    Incoming = 128,
    Outgoing = 64,
    Internal = 32,
    CN = 16,
    Transferred = 8,
    CNTransferred = 4,
    Undefined = 0
}
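A minimal usage sketch, assuming one status byte has already been assembled from a pair of hex digits (the "C4" value is just an example):
var status = (CallInformation)Convert.ToByte("C4", 16);   // 0xC4 = 11000100

bool incoming      = status.HasFlag(CallInformation.Incoming);        // true (0x80)
bool outgoing      = status.HasFlag(CallInformation.Outgoing);        // true (0x40)
bool cnTransferred = status.HasFlag(CallInformation.CNTransferred);   // true (0x04)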
Very hard without data. I'd guess that you will get two bytes (two ASCII characters), and need to pick them apart at the bit level.
For instance, if the first character is 'A', you will need to look up its character code (65, or hex 0x41), and then look at the bits. Of course the bits are the same regardless of decimal or hex, but it's easier to do by hand in hex. 0x41 has bit 6 and bit 0 set, so that would be an "outgoing call". Bit 0 seems undocumented.
I'm not sure why it looks as if that would require two characters; it's only eight bits documented.
