ASCII values in hexadecimal notation - C#

I am trying to parse some output data from a PBX and I have found something that I can't really figure out.
In the documentation it says the following
Information for type of call and feature. Eight character for ’status information 3’ with following ASCII values in hexadecimal notation.
1. Character
Bit7 Incoming call
Bit6 Outgoing call
Bit5 Internal call
Bit4 CN call
2. Character
Bit3 Transferred call (transferring party inside)
Bit2 CN-transferred call (transferring party outside)
Bit1
Bit0
Any ideas how to interpret this? I have no raw data at the moment to match against, but I still need to figure it out.

Probably you'll receive two characters (hex digits: 0-9, A-F). The first digit represents the hex value for the most significant 4 bits, the next digit for the least significant 4 bits.
Example:
You will probably receive something like the string "7C" as hex representation of the bitmap: 01111100.
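If that's how the data arrives, a minimal sketch for picking the flags out of the two hex characters could look like this (the "7C" value is hypothetical):
string statusChars = "7C"; // hypothetical raw value from the PBX
byte status = Convert.ToByte(statusChars, 16); // 0x7C == 0b0111_1100
bool incoming = (status & 0x80) != 0; // Bit7
bool outgoing = (status & 0x40) != 0; // Bit6
bool internalCall = (status & 0x20) != 0; // Bit5
bool cnCall = (status & 0x10) != 0; // Bit4
bool transferred = (status & 0x08) != 0; // Bit3
bool cnTransferred = (status & 0x04) != 0; // Bit2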

Eight character for ’status information 3’ with following ASCII values in hexadecimal notation.
I think this means the following.
You will get 8 bytes - one byte per line, I guess.
It is just the wrong term. They mean two hex digits per byte but call them characters.
So it is just a byte with bit flags - or more precisely an array of eight such bytes.
Bit
7 incoming
6 outgoing
5 internal
4 CN
3 transferred
2 CN transferred
1 unused?
0 unused?
You could map this to an enum.
[Flags]
public enum CallInformation : byte
{
    Incoming = 128,
    Outgoing = 64,
    Internal = 32,
    CN = 16,
    Transferred = 8,
    CNTransferred = 4,
    Undefined = 0
}
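Assuming the two characters really are the hex digits of one status byte, a small usage sketch of that enum (the "7C" input is made up) could be:
byte raw = Convert.ToByte("7C", 16); // two hex characters -> one byte
var info = (CallInformation)raw;
bool isIncoming = info.HasFlag(CallInformation.Incoming); // false for 0x7C
Console.WriteLine(info); // prints the combination of flags set in 0x7C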

Very hard without data. I'd guess that you will get two bytes (two ASCII characters), and need to pick them apart at the bit level.
For instance, if the first character is 'A', you will need to look up its character code (65, or hex 0x41), and then look at the bits. Of course the bits are the same regardless of decimal or hex, but it's easier to do by hand in hex. 0x41 has bit 6 and bit 0 set, so that would be an "outgoing call"; bit 0 seems undocumented.
I'm not sure why it looks as if that would require two characters; it's only eight bits documented.

Related

Maximum UTF-8 string size given UTF-16 size

What is the formula for determining the maximum number of UTF-8 bytes required to encode a given number of UTF-16 code units (i.e. the value of String.Length in C# / .NET)?
I see 3 possibilities:
# of UTF-16 code units x 2
# of UTF-16 code units x 3
# of UTF-16 code units x 4
A UTF-16 code point is represented by either 1 or 2 code units, so we just need to consider the worst case scenario of a string filled with one or the other. If a UTF-16 string is composed entirely of 2 code unit code points, then we know the UTF-8 representation will be at most the same size, since the code points take up a maximum of 4 bytes in both representations, thus worst case is option (1) above.
So the interesting case to consider, which I don't know the answer to, is the maximum number of bytes that a single code unit UTF-16 code point can require in UTF-8 representation.
If all single code unit UTF-16 code points can be represented with 3 UTF-8 bytes, which my gut tells me makes the most sense, then option (2) will be the worst case scenario. If there are any that require 4 bytes then option (3) will be the answer.
Does anyone have insight into which is correct? I'm really hoping for (1) or (2) as (3) is going to make things a lot harder :/
UPDATE
From what I can gather, UTF-16 encodes all characters in the BMP in a single code unit, and all other planes are encoded in 2 code units.
It seems that UTF-8 can encode the entire BMP within 3 bytes and uses 4 bytes for encoding the other planes.
Thus it seems to me that option (2) above is the correct answer, and this should work:
string str = "Some string";
int maxUtf8EncodedSize = str.Length * 3;
Does that seem like it checks out?
The worst case for a single UTF-16 word is U+FFFF, which in UTF-16 is encoded just as-is (0xFFFF). In UTF-8 it is encoded to ef bf bf (three bytes).
The worst case for two UTF-16 words (a "surrogate pair") is U+10FFFF, which in UTF-16 is encoded as 0xDBFF 0xDFFF. In UTF-8 it is encoded to f4 8f bf bf (four bytes).
Therefore the worst case is a load of U+FFFF's which will convert a UTF-16 string of length 2N bytes to a UTF-8 string of length 3N bytes.
So yes, you are correct. I don't think you need to consider stuff like glyphs because that sort of thing is done after decoding from UTF8/16 to code points.
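A quick sketch with Encoding.UTF8 appears to confirm this (the sample strings are made up; GetMaxByteCount may return slightly more than Length * 3 because it also accounts for a leftover surrogate):
using System.Text;

string bmpWorstCase = new string('\uFFFF', 4); // 4 UTF-16 code units, all BMP
string astralCase = "\U0010FFFF\U0010FFFF"; // 4 UTF-16 code units (2 surrogate pairs)

Console.WriteLine(Encoding.UTF8.GetByteCount(bmpWorstCase)); // 12 = Length * 3
Console.WriteLine(Encoding.UTF8.GetByteCount(astralCase)); // 8 = Length * 2
Console.WriteLine(Encoding.UTF8.GetMaxByteCount(4)); // built-in worst-case bound, >= 12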
Properly formed UTF-8 can be up to 4 bytes per Unicode codepoint.
UTF-16-encoded characters can be up to 2 16-bit sequences per Unicode codepoint.
Characters outside the basic multilingual plane (including emoji and languages that were added to more recent versions of Unicode) are represented in up to 21 bits, which in the UTF-8 format results in 4 byte sequences, which turn out to also take up 4 bytes in UTF-16.
However, there are some environments that do things weirdly. Since UTF-16 characters outside the basic multilingual plane take up two 16-bit code units (they're detectable because they're always values in the range U+D800 to U+DFFF), some mistaken UTF-8 implementations, usually referred to as CESU-8, encode each of those code units as its own 3-byte UTF-8 sequence, for a total of six bytes per UTF-32 codepoint. (I believe some early Oracle DB implementations did this, and I'm sure they weren't the only ones.)
There's one more minor wrench in things, which is that some glyphs are classified as combining characters, and multiple UTF-16 (or UTF-32) sequences are used when determining what gets displayed on the screen, but I don't think that applies in your case.
Based on your edit, it looks like you're trying to estimate the maximum length of a .NET encoding conversion. String.Length measures the total number of Chars, which is a count of UTF-16 code units. As a worst-case estimate, therefore, I believe you can safely use count(Char) * 3, because the non-BMP characters will be count(Char) * 2, yielding 4 bytes as UTF-8.
If you want to get the total number of UTF-32 codepoints represented, you should be able to do something like
var maximumUtf8Bytes = new System.Globalization.StringInfo(myString).LengthInTextElements * 4;
(My C# is a bit rusty as I haven't used a .Net environment much in the last few years, but I think that does the trick).
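As a small illustration of the difference between Length (UTF-16 code units) and text elements, using a single non-BMP character (the string is just an example):
using System.Globalization;

string s = "a\U0001F600"; // 'a' plus one emoji outside the BMP
Console.WriteLine(s.Length); // 3 UTF-16 code units
Console.WriteLine(new StringInfo(s).LengthInTextElements); // 2 text elements
Console.WriteLine(s.Length * 3); // 9, a safe UTF-8 upper bound here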

Converting int to hex but not string

Now I know that converting a int to hex is simple but I have an issue here.
I have an int that I want to convert it to hex and then add another hex to it.
The simple solution is int.ToString("X"), but after my int is turned to hex it is also turned into a string, so I can't add anything to it until it is turned back into an int again.
So my question is: is there a way to turn an int to hex without it being turned into a string as well? I mean a quick way such as int.ToString("X") but without the int being turned into a string.
I mean a quick way such as int.ToString("X") but without the int being
turned into a string.
No.
Look at it this way. What is the difference between these?
var i = 10;
var i = 0xA;
As values, they are exactly the same. As representations, the first one is decimal notation and the second one is hexadecimal notation. The X you used is the hexadecimal format specifier, which generates the hexadecimal notation of that numeric value.
Be aware, you can parse this hexadecimal notation string to integer anytime you want.
C# convert integer to hex and back again
There is no need to convert. The number ten is ten whether you write it in binary or hex; the representations differ depending on which base you write them in, but the value is the same. So just add another integer to your integer, and convert the final result to a hex string when you need it.
Take an example. Assume you have
int x = 10 + 10; // answer is 20 or 0x14 in Hex.
Now, if you added
int x = 0x0A + 0x0A; // x == 0x14
The result would still be 0x14. See?
Decimal 10 and 0x0A have the same value; they are just written in different bases.
A hexadecimal string, though, is a different beast.
In the above case that could be "0x14".
For the computer this would be stored as: '0', 'x', '1', '4' - four separate characters (or bytes representing these characters in some encoding). In the case of an integer, it is stored as a single integer (encoded in binary form).
I guess you are missing the point of what hex and int are. They both represent numbers: 1, 2, 3, 4, etc. Decimal and hexadecimal are just two ways of looking at numbers, but in the end they are the same numbers. For example, 5 + 5 is 10 as an int and A in hex, but it is the same number; only the view of it differs.
Hex is just a way to represent a number. The same statement is true for the decimal and binary number systems, although (with the exception of some custom-made number types, BigNums etc.) everything will be stored as binary as long as it's an integer (by integer I mean not a floating point number). What you really want to do is probably to perform calculations on integers and then print them as hex, which has already been described in this topic: C# convert integer to hex and back again
The short answer: no, and there is no need.
The integer one hundred and seventy-nine (179) is B3 in hex, 179 in base-10, 10110011 in base-2 and 20122 in base-3. The base of the number doesn't change its value. B3, 179, 10110011, and 20122 are all the same number, just represented differently. So it doesn't matter what base the numbers are in, as long as you do your mathematical operations on the values themselves.
So in your case with hex numbers, their representations can contain characters such as 'A', 'B', 'C', and so on. When you get a value in hex whose representation contains a letter, it has to be a string, because letters are not ints. To do what you want, it is best to convert both numbers to regular ints, do the math, and convert to hex afterwards. The reason is that if you want to add (or do whatever operation on) values while they look like hex, you would need to change the behavior of the desired operator on strings, which is a hassle.
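Putting those answers together, a minimal sketch (the literal values are just examples) would be to parse and format at the edges and keep the arithmetic in int:
int fromHex = Convert.ToInt32("1A", 16); // parse a hex string -> 26
int sum = fromHex + 0x0A; // plain integer arithmetic -> 36
string backToHex = sum.ToString("X"); // "24"
Console.WriteLine($"{sum} = 0x{backToHex}");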

How can I convert a byte into a string of binary digits in C#?

I am trying to convert a byte into a string of binary digits - not encoded, just as it is, i.e. if the byte = 00110101 then the string would be "00110101".
I have searched high and low, and everything I find is either relating to getting the ASCII or UTF or whatever value of the byte, or converting a character into a byte, neither of which is what I want. Just doing ToString() gives me the int value.
Maybe I'm missing something obvious, and I understand this is a fairly rare case. It must be possible without some crazy loop which iterates through, surely?
(I'm sending the string over bluetoothLE to a rotating shop display cabinet to program it)
edit: here's some code:
DateTime updateTime = DateTime.Now;
byte dow = (byte)updateTime.DayOfWeek;
Debug.WriteLine(dow.ToString());
If I break and inspect 'dow', it shows as '3' (it's wednesday), not 00000011 as I would have expected. I just tried BitConverter as suggested below, but that still returns '3'.
You want to use Convert.ToString() but specify a base, in this case because it's binary, base 2.
However, you'll also need to pad to the number of bits, because it will cut off 0 digits, so 00000001 would end up as 1.
Try this:
Convert.ToString(theByte, 2).PadLeft(8, '0');
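Applied to the DayOfWeek snippet from the question (the output comment assumes a Wednesday), that would be something like:
DateTime updateTime = DateTime.Now;
byte dow = (byte)updateTime.DayOfWeek; // 3 on a Wednesday
string bits = Convert.ToString(dow, 2).PadLeft(8, '0');
Console.WriteLine(bits); // "00000011"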

How do I interpret a WORD or Nibble as a number in C#?

I don't come from a low-level development background, so I'm not sure how to convert the below instruction to an integer...
Basically I have a microprocessor which tells me which IO's are active or inactive. I send the device an ASCII command and it replies with a WORD about which of the 15 I/O's are open/closed... here's the instruction:
Unit Answers "A0001/" for only DIn0 on, "A????/" for All Inputs Active
Awxyz/ - w=High Nibble of MSB in 0 to ? ASCII Character 0001=1, 1111=?, z=Low Nibble of LSB.
At the end of the day I just want to be able to convert it back into a number which will tell me which of the 15 (or 16?) inputs are active.
I have something hooked up to the 15th I/O port, and the reply I get is "A8000", if that helps?
Can someone clear this up for me please?
You can use the BitConverter class to convert an array of bytes to the integer format you need.
If you're getting 16 bits, convert them to a UInt16.
C# does not define the endianness. That is determined by the hardware you are running on. Intel and AMD processors are little-endian. You can learn the endian-ness of your platform using BitConverter.IsLittleEndian. If the computer running .NET and the hardware providing the data do not have the same endian-ness, you would have to swap the two bytes.
byte[] inputFromHardware = { 126, 42 };
ushort value = BitConverter.ToUInt16( inputFromHardware, 0 ); // start reading at index 0
which of the 15 (or 16?) inputs are active
If the bits are hardware flags, it is plausible that all 16 are used to mean something. When bits are used to represent a signed number, one of the bits is used to represent positive vs. negative. Since the hardware is providing status bits, none of the bits should be interpreted as a sign.
If you want to know if a specific bit is active, you can use a bit mask along with the & operator. For example, the binary mask
0000 0000 0000 0100
corresponds to the hex number
0x0004
To learn if the third bit from the right is toggled on, use
bool thirdBitFromRightIsOn = (value & 0x0004) != 0;
UPDATE
If the manufacturer says the value 8000 (I assume hex) represents Channel 15 being active, look at it like this:
Your bit mask
1000 0000 0000 0000 (binary)
8 0 0 0 (hex)
Based in that info from the manufacturer, the left-most bit corresponds to Channel 15.
You can use that mask like:
bool channel15IsOn = (value & 0x8000) != 0;
A second choice (but for Nibbles only) is to use the Math library like this:
string x = "A0001/";
int y = 0;
for (int i = 3; i >= 0; i--)
{
    if (x[4 - i] == '1')
        y += (int)Math.Pow(2, i);
}
Using the ability to treat a string as an array of characters by just using the brackets, you can hard-code the conversion from bits to an integer.
It is a novice solution, but I think it's a proper one too.
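Assuming the unit really answers with 'A', four hex digits and a trailing '/', a combined sketch of the two approaches could look like this:
string reply = "A8000/"; // example reply from the question
ushort inputs = Convert.ToUInt16(reply.Substring(1, 4), 16); // 0x8000
for (int channel = 0; channel < 16; channel++)
{
    bool isActive = (inputs & (1 << channel)) != 0; // test one bit per channel
    if (isActive)
        Console.WriteLine($"DIn{channel} is active"); // prints DIn15 for "A8000/"
}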

If byte is 8 bit integer then how can we set it to 255?

The byte keyword denotes an integral
type that stores values as indicated
in the following table. It's an unsigned 8-bit integer.
If it's only 8 bits then how can we assign it to equal 255?
byte myByte = 255;
I thought 8 bits was the same thing as just one character?
There are 256 different configurations of bits in a byte:
0000 0000
0000 0001
0000 0010
...
1111 1111
So you can assign a byte any value in the 0-255 range.
Characters are described (in a basic sense) by a numeric representation that fits inside an 8-bit structure. If you look at a table of ASCII codes, you'll see that each character maps to a number.
The maximum value an n-bit sequence can represent is given by the formula 2^n - 1 (as partially described above by @Marc Gravell). So an 8-bit structure can hold 256 values including 0 (also note TCP/IP addresses are 4 separate sequences of 8-bit structures). If this were a signed integer, the first bit would be a flag for the sign and the remaining 7 would indicate the value; it would still hold 256 values, but the maximum and minimum would be determined by the 7 trailing bits (so 2^7 - 1 = 127).
When you get into Unicode characters and "high ASCII" characters, the encoding requires more than an 8-bit structure. So in your example, if you were to assign a byte a value of 76, a lookup table could be consulted to derive the ASCII character 'L'.
11111111 (8 on bits) is 255: 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1
Perhaps you're confusing this with 256, which is 2^8?
8 bits (unsigned) is 0 thru 255, or (2^8)-1.
It sounds like you are confusing integer vs text representations of data.
I thought 8 bits was the same thing as
just one character?
I think you're confusing the number 255 with the string "255."
Think about it this way: if computers stored numbers internally using characters, how would it store those characters? Using bits, right?
So in this hypothetical scenario, a computer would use bits to represent characters which it then in turn used to represent numbers. Aside from being horrendous from an efficiency standpoint, this is just redundant. Bits can represent numbers directly.
255 = 2^8 − 1 = FF[hex] = 11111111[bin]
The range of values for an unsigned 8-bit integer is 0 to 255, so this is perfectly valid.
8 bits is not the same as one character in C#. In C# a character is 16 bits. And even if a character were 8 bits, it would have no relevance to the main question.
I think you're confusing character encoding with the actual integral value stored in the variable.
An 8-bit value can have 256 configurations, as answered by Arkain.
Optionally, in ASCII (or an extended ASCII code page), each of those configurations represents a different character.
So, basically, it depends how you interpret the value: as an integer value or as a character.
ASCII Table
Wikipedia on ASCII
Sure, a bit late to answer, but for those who get this in a google search, here we go...
Like others have said, a character is definitely different to an integer. Whether it's 8-bits or not is irrelevant, but I can help by simply stating how each one works:
for an 8-bit integer, a value range between 0 and 255 is possible (or -128..127 if it's signed, in which case the high bit decides the sign)
for an 8-bit character, it will most likely be an ASCII character, which is usually referenced by an index written as a hexadecimal value, e.g. FF or 0A. Because computers back in the day were only 8-bit, the result was a 16x16 table, i.e. 256 possible characters in the extended ASCII character set.
Either way, if the byte is 8 bits long, then either an ASCII code or an 8-bit integer will fit in the variable. I would recommend using a different, more dedicated data type for simplicity, though (e.g. char for ASCII or raw data, int for integers of any bit length, usually 32-bit).
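A tiny sketch of the two interpretations of the same 8-bit storage (the values are arbitrary examples):
byte b = 255; // maximum value of an unsigned 8-bit integer
Console.WriteLine(b); // 255 (the numeric interpretation)
byte letter = 76; // same storage, different interpretation
Console.WriteLine((char)letter); // 'L' (interpreting the value as a character code)
Console.WriteLine((int)'L'); // 76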
