Generate a byte sequence with a fixed size from a string - C#

I want to create a byte sequence with a fixed length out of a string which has a variable length. What is the best way to achieve this? The bytes should be as different from each other as possible.
The code is for my own research, nothing for production use.
This was my first approach to generating the bytes:
static byte[] GenerateBytes(string password, Int32 strength)
{
    Byte[] result = new byte[strength];
    Byte[] pwBytes = Encoding.ASCII.GetBytes(password);
    Int32 prime = GetLowerPrime(pwBytes.Length);
    // Offset count to avoid values
    Int32 count = prime;
    Int32 sum = 0;
    for (int i = 0; i < result.Length; i++) {
        sum += (result[i] = pwBytes[(count++ % pwBytes.Length)]);
    }
    count += prime;
    Int32 pcount = prime;
    for (int i = 0; i < result.Length * 7; i++) {
        result[(i % result.Length)] ^= (Byte)(pwBytes[(count++ % pwBytes.Length)] ^ ((pcount += pwBytes[(count % pwBytes.Length)]) % 255));
    }
    return result;
}
I generated some samples of 256 / 128 / 64 bytes and counted the unique byte values in each:
Password "Short": 170 103 60
Password "LongerX": 173 101 55
Password "Really Long": 169 100 57
Password "Unbelivable Safe!0ยง$": 162 101 56
Password "MCV": 119 113 61
Password "AAA": 50 51 50
Password "BBB": 67 67 52
Password "AAAAAA": 48 48 48
I tried changing the prime selector a bit; this improves the generation for short keys but partly hurts it for long ones. I also tracked some statistics of the generated bytes: each byte value is used between 9 and 30 times.
What do you think about the results? How can I improve the generation of the bytes?

You seem to be reinventing the wheel. If you need to make a key from a password, use a hash function or, better, one of the standard password-based key derivation functions. Search for PBKDF2.
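For illustration, here is a minimal sketch of that route using the built-in Rfc2898DeriveBytes class (PBKDF2); the salt size and iteration count below are placeholder choices, not a recommendation, and you would normally store the salt alongside the derived bytes:

using System.Security.Cryptography;

static byte[] DeriveBytes(string password, int strength)
{
    byte[] salt = new byte[16];                 // illustrative salt size
    using (var rng = RandomNumberGenerator.Create())
        rng.GetBytes(salt);
    // PBKDF2: fixed-length, well-distributed output for any password length
    using (var kdf = new Rfc2898DeriveBytes(password, salt, 10000))
        return kdf.GetBytes(strength);
}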

Well, if you really want to roll your own solution that has no real practical use other than theoretical interest (because this sounds like a homework question), just start off with a one-time pad of random bytes and XOR the password with the first few bytes; that should give you reasonably high entropy for short passwords.
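A rough sketch of that idea (the pad here is generated on the fly, which is my own assumption; you would have to keep it around for the result to be reproducible):

using System.Security.Cryptography;
using System.Text;

static byte[] PadAndXor(string password, int strength)
{
    byte[] pad = new byte[strength];
    using (var rng = RandomNumberGenerator.Create())
        rng.GetBytes(pad);                      // one-time pad of random bytes
    byte[] pw = Encoding.ASCII.GetBytes(password);
    for (int i = 0; i < pw.Length && i < strength; i++)
        pad[i] ^= pw[i];                        // XOR the password into the first bytes
    return pad;
}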

Related

C# Decoding Active Directory SID: third digit being decoded incorrectly. All others correct

I retrieve an objectsid value from Active Directory. It comes as a base64 encoded string.
I have the following function (below) to decode this into an actual SID text string.
The following function works just fine in all cases except a few.
When examined in the Active Directory Attribute Editor, a SID of "S-1-5-21-3477787073-812361429-1014394821" is being returned instead as "S-1-4-21-3477787073-812361429-1014394821". The third digit is off by one.
Likewise, a SID of "S-1-5-32-573" is being returned as "S-1-2-32-573", and again, the third digit is short by 3.
I've debugged it enough to figure out that it is the length of the byte[] array that is different, and I think that is the cause of this error.
When the byte array is 28 bytes long, it decodes correctly. If it is shorter than 28 bytes, the third SID digit is off directly relative to how much shorter the byte[] array is.
I think there is something about the for() loop condition i < bytes[7], even though bytes[7] seems to always be 5 in all cases (at least the ones that I have breakpointed).
Or possibly the BitConverter's second parameter (8 + (i * 4)). If bytes[7] is always 5, then (5 * 4) + 8 = 28. And when the byte array is 28 bytes long, the return value is always correct. It is when the byte[] array is less than 28 bytes that the return value is wrong.
I did not write this function (found on another stack overflow question answer), so I do not really understand what it is doing beyond what I've described here. I did add the code comments.
public static string Base64DecodeSID(string data) {
    if (data == "") {
        return "";
    }
    byte[] bytes = Convert.FromBase64String(data);
    try {
        string sid = "S-" + bytes[0].ToString();              // Add SID revision.
        if (bytes[6] != 0 || bytes[5] != 0) {                 // SID authority
            sid += ("-" +
                String.Format(                                // Left 0-pad hex
                    "0x{0:2x}{1:2x}{2:2x}{3:2x}{4:2x}{5:2x}", // values to 2 places.
                    bytes[1], bytes[2], bytes[3], bytes[4], bytes[5], bytes[6]
                )
            );
        } else {
            sid += ("-" + (bytes[1] + (bytes[2] << 8) + (bytes[3] << 16) + (bytes[4] << 24)).ToString());
        }
        for (int i = 0; i < bytes[7]; i++) {                  // Sub-Authority...
            sid += ("-" + BitConverter.ToUInt32(bytes, 8 + (i * 4)).ToString());
        }
        return sid;
    }
(I excluded the catch() for brevity)
Example
Base64: "AQQAAAAAAAUVAAAAwdFKz9WmazDFb3Y8"
Decoded byte[] array:
[00] 1
[01] 4
[02] 0
[03] 0
[04] 0
[05] 0
[06] 0
[07] 5
[08] 21
[09] 0
[10] 0
[11] 0
[12] 193
[13] 209
[14] 74
[15] 207
[16] 213
[17] 166
[18] 107
[19] 48
[20] 197
[21] 111
[22] 118
[23] 60
Expected return value: "S-1-5-21-3477787073-812361429-1014394821"
Actual return value: "S-1-4-21-3477787073-812361429-1014394821"
Any idea how to fix this? Do I need to pad the array out to 28, or even 32, bytes?
As noted in the comments, the SecurityIdentifier class makes this easy:
public static string Base64DecodeSID(string data) {
    var bytes = Convert.FromBase64String(data);
    var sid = new SecurityIdentifier(bytes, 0);
    return sid.Value;
}
You may want to add some error checks in there in case it's passed a bad value.
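For instance, a sketch of what such checks could look like (which exceptions you catch, and what you return for bad input, is up to you):

public static string Base64DecodeSID(string data) {
    if (string.IsNullOrEmpty(data)) {
        return "";
    }
    try {
        var bytes = Convert.FromBase64String(data);
        return new SecurityIdentifier(bytes, 0).Value;
    } catch (FormatException) {       // not valid base64
        return "";
    } catch (ArgumentException) {     // not a valid binary SID
        return "";
    }
}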
Calling Base64DecodeSID("AQQAAAAAAAUVAAAAwdFKz9WmazDFb3Y8") returns S-1-5-21-3477787073-812361429-1014394821

Binary of a number

Is there a simple way to convert decimal/ASCII 6-bit numbers from 1 to 100 to a binary representation?
To be more specific, I'm interested in 6-bit binary ASCII. So I made this to get an Int32.
For example, "u" maps to 61 instead of 117 as in standard decimal ASCII.
Then this 61 needs to become "111101" instead of the traditional "01110101"; after the 48 + 8 arithmetic it's not important, as it is now a normal binary number that just uses 6 bits.
foreach (char c in partToDecode)
{
    var sum = c - 48;
    if (sum > 40)
    {
        sum = sum - 8;
    }
}
I found this, but I don't have a clue how to transpose it to C#:
void binary(unsigned n) {
    unsigned i;
    // Reverse loop
    for (i = 1 << 31; i > 0; i >>= 1)
        printf("%u", !!(n & i));
}
. . .
binary(65);
You can try Convert.ToString, e.g.
int source = 61;
// "111101"
string result = Convert.ToString(source, 2).PadLeft(6, '0');
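Putting the question's 6-bit mapping together with Convert.ToString, a small sketch (partToDecode and the 48/8 offsets come from the question; treating the adjusted value as the 6-bit code is an assumption on my part):

string partToDecode = "u";
foreach (char c in partToDecode)
{
    int sum = c - 48;                 // offset from the question
    if (sum > 40)
    {
        sum = sum - 8;                // 'u' (117) -> 61
    }
    // 61 -> "111101"
    Console.WriteLine(Convert.ToString(sum, 2).PadLeft(6, '0'));
}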

Porting code containing unsigned char pointer in C to C#

I have this code in C that I need to port to C#:
void CryptoBuffer(unsigned char *Buffer, unsigned short length)
{
    unsigned short i;
    for(i=0; i < length; i++)
    {
        *Buffer ^= 0xAA;
        *Buffer++ += 0xC9;
    }
}
I tried this:
public void CryptoBuffer(byte[] buffer, int length)
{
    for(int i = 0; i < length; i++)
    {
        buffer[i] ^= 0xAA;
        buffer[i] += 0xC9;
    }
}
But the outcome doesn't match the one expected.
According to the example, this:
A5 03 18 01...
should become this:
A5 6F 93 8B...
It also says the first byte is not encrypted, so that's why A5 stays the same.
EDIT for clarification: The specification just says you should skip the first byte, it doesn't go into details, so I'm guessing you just pass the sequence from position 1 until the last position to skip the first byte.
But my outcome with that C# port is:
A5 72 7B 74...
Is this port correct or am I missing something?
EDIT 2: For further clarification, this is a closed protocol, so I can't go into details, that's why I provided just enough information to help me port the code, that C code was the one that was given to me, and that's what the specification said it would do.
The real problem was that the "0xAA" was wrong in the specification, that's why the output wasn't the expected one. The C# code provided here and by the accepted answer are correct after all.
Let's break it down, shall we, one step at a time.
void CryptoBuffer(unsigned char *Buffer, unsigned short length)
{
    unsigned short i;
    for(i=0; i < length; i++)
    {
        *Buffer ^= 0xAA;
        *Buffer++ += 0xC9;
    }
}
Regardless of some other remarks, this is how you normally do these things in C/C++. There's nothing fancy about this code, and it isn't overly complicated, but I think it is good to break it down to show you what happens.
Things to note:
unsigned char is basically the same as byte in C#
unsigned short length has a value between 0 and 65535; int will do the trick in C#
Buffer has a post-increment
The byte assignment (+= 0xC9) will overflow. If it overflows, it's truncated to 8 bits in this case (see the small snippet below)
The buffer is passed as a pointer by value, so the pointer in the calling method stays the same even though the local copy is advanced.
This is just basic C code, no C++. It's quite safe to assume people don't use operator overloading here.
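To see the overflow note in action with the values from the worked example further down (0x03 ^ 0xAA = 0xA9, and adding 0xC9 wraps around):

byte b = 0x03;
b ^= 0xAA;                            // 0xA9
b += 0xC9;                            // 0x172 truncated to 8 bits => 0x72 (unchecked by default)
Console.WriteLine(b.ToString("X2"));  // prints "72"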
The only "difficult" thing here is the Buffer++. Details can be read in the book "Exceptional C++" from Sutter, but a small example explains this as well. And fortunately we have a perfect example at our disposal. A literal translation of the above code is:
void CryptoBuffer(unsigned char *Buffer, unsigned short length)
{
    unsigned short i;
    for(i=0; i < length; i++)
    {
        *Buffer ^= 0xAA;
        unsigned char *tmp = Buffer;
        *tmp += 0xC9;
        Buffer = tmp + 1;
    }
}
In this case the temporary variable can be eliminated trivially, which leads us to:
void CryptoBuffer(unsigned char *Buffer, unsigned short length)
{
    unsigned short i;
    for(i=0; i < length; i++)
    {
        *Buffer ^= 0xAA;
        *Buffer += 0xC9;
        ++Buffer;
    }
}
Changing this code to C# now is pretty easy:
private void CryptoBuffer(byte[] Buffer, int length)
{
    for (int i=0; i<length; ++i)
    {
        Buffer[i] = (byte)((Buffer[i] ^ 0xAA) + 0xC9);
    }
}
This is basically the same as your ported code. This means that somewhere down the road something else went wrong... So let's hack the CryptoBuffer, shall we? :-)
If we assume that the first byte isn't used (as you stated) and that the '0xAA' and/or the '0xC9' are wrong, we can simply try all combinations:
static void Main(string[] args)
{
    byte[] orig = new byte[] { 0x03, 0x18, 0x01 };
    byte[] target = new byte[] { 0x6F, 0x93, 0x8b };
    for (int i = 0; i < 256; ++i)
    {
        for (int j = 0; j < 256; ++j)
        {
            bool okay = true;
            for (int k = 0; okay && k < 3; ++k)
            {
                byte tmp = (byte)((orig[k] ^ i) + j);
                if (tmp != target[k]) { okay = false; break; }
            }
            if (okay)
            {
                Console.WriteLine("Solution for i={0} and j={1}", i, j);
            }
        }
    }
    Console.ReadLine();
}
There we go: oops, there are no solutions. That means the CryptoBuffer is not doing what you think it's doing, or part of the C code is missing here. For example, do they really pass 'Buffer' to the CryptoBuffer method, or did they change the pointer beforehand?
Concluding, I think the only good answer here is that critical information for solving this question is missing.
The example you were provided with is inconsistent with the code in the C sample, and the C and C# code produce identical results.
The porting looks right; can you explain why 03 should become 6F? The fact that the result seems to be off the "expected" value by 03 is a bit suspicious to me.
The port looks right.
What I would do in this situation is to take out a piece of paper and a pen, write out the bytes in binary, do the XOR, and then the addition. Now compare this to the C and C# codes.
In C#, you are overflowing the byte so it gets truncated to 0x72. Here's the math for converting the 0x03 in both binary and hex:
  00000011  0x03
^ 10101010  0xAA
= 10101001  0xA9
+ 11001001  0xC9
= 101110010 0x172
With the original C method, suppose first that the sequence is encrypted and decrypted symmetrically, i.e. that calling CryptoBuffer twice restores the input:
initially invoke on a5 03 18 01 ...
a5 03 18 01 ... => d8 72 7b 74 ...
then on d8 72 7b 74 ...
d8 72 7b 74 ... => 3b a1 9a a7 ...
initially invoke on a5 6f 93 8b ...
a5 6f 93 8b ... => d8 8e 02 ea ...
then on d8 8e 02 ea ...
d8 8e 02 ea ... => 3b ed 71 09 ...
Clearly that assumption does not hold.
Of course, you might have an asymmetric decrypt method; but first we would need to show that either a5 03 18 01 ... => a5 6f 93 8b ... or the reverse direction is achievable with some magic number. The code for a brute-force analysis is at the end of this post.
I made the magic number a variable for testing. The reproducibility analysis shows that the original sequence is reproduced every 256 invocations as the magic number is varied, so up to this point a symmetric scheme would still be possible.
However, the feasibility analysis, which tests all 256*256 = 65536 cases in both directions, original => expected and expected => original, finds that none of them works.
So there is no way to transform the given sequence into the expected result with this routine.
Thus we can only conclude that the C code and your C# code behave identically; the expected result is simply not achievable, because the assumption behind it is broken.
Code for the analysis
public void CryptoBuffer(byte[] buffer, ushort magicShort) {
    var magicBytes = BitConverter.GetBytes(magicShort);
    var count = buffer.Length;
    for(var i = 0; i < count; i++) {
        buffer[i] ^= magicBytes[1];
        buffer[i] += magicBytes[0];
    }
}

int Analyze(
    Action<byte[], ushort> subject,
    byte[] expected, byte[] original,
    ushort? magicShort = default(ushort?)
) {
    Func<byte[], String> LaHeX =    // narrowing bytes to a hex string
        arg => arg.Select(x => String.Format("{0:x2}\x20", x)).Aggregate(String.Concat);
    var temporal = (byte[])original.Clone();
    var found = 0;
    for(var i = (int)ushort.MaxValue; i >= 0; --i) {    // int counter so the loop terminates
        if(found > 255) {
            Console.WriteLine(": found more than 256 matches; ");
            Console.WriteLine(": analysis stopped ");
            Console.WriteLine();
            break;
        }
        subject(temporal, magicShort ?? (ushort)i);
        if(expected.SequenceEqual(temporal)) {
            ++found;
            Console.WriteLine("i={0:x2}; temporal={1}", i, LaHeX(temporal));
        }
        if(expected != original)
            temporal = (byte[])original.Clone();
    }
    return found;
}

void PerformTest() {
    var original = new byte[] { 0xa5, 0x03, 0x18, 0x01 };
    var expected = new byte[] { 0xa5, 0x6f, 0x93, 0x8b };
    Console.WriteLine("--- reproducibility analysis --- ");
    Console.WriteLine("found: {0}", Analyze(CryptoBuffer, original, original, 0xaac9));
    Console.WriteLine();
    Console.WriteLine("--- feasibility analysis --- ");
    Console.WriteLine("found: {0}", Analyze(CryptoBuffer, expected, original));
    Console.WriteLine();
    // swap original and expected
    var temporal = original;
    original = expected;
    expected = temporal;
    Console.WriteLine("--- reproducibility analysis --- ");
    Console.WriteLine("found: {0}", Analyze(CryptoBuffer, original, original, 0xaac9));
    Console.WriteLine();
    Console.WriteLine("--- feasibility analysis --- ");
    Console.WriteLine("found: {0}", Analyze(CryptoBuffer, expected, original));
    Console.WriteLine();
}
Here's a demonstration
http://codepad.org/UrX0okgu
It shows that the original code, given an input of A5 03 18 01, produces D8 72 7B 01; so:
the rule that the first byte is not decoded can be correct only if the buffer is sent starting from the 2nd byte (show us the call);
the output does not match (are you missing other calls?).
So your translation is correct, but your expectations of what the original code does are not.

Little endian to integer

I am getting this string
8802000030000000C602000033000000000000800000008000000000000000001800000000000
and this is what I am expecting to convert from the string:
88020000 long in little endian => 648
30000000 long in little endian => 48
C6020000 long in little endian => 710
33000000 long in little endian => 51
The left side is the value I am getting from the string and the right side is the value I am expecting. The right-side values might be wrong, but is there any way I can get the right-side value from the left?
I went through several threads here like
How to convert an int to a little endian byte array?
C# Big-endian ulong from 4 bytes
I tried quite a few different functions, but nothing gives me values that are anywhere near what I am expecting.
Update:
I am reading a text file as below. Most of the data is in plain text format, but all of a sudden I get a bunch of GRAPHICS info, and I am not sure how to handle it.
RECORD=28
cVisible=1
dwUser=0
nUID=23
c_status=1
c_data_validated=255
c_harmonic=0
c_dlg_verified=0
c_lock_sizing=0
l_last_dlg_updated=0
s_comment=
s_hlinks=
dwColor=33554432
memUsr0=
memUsr1=
memUsr2=
memUsr3=
swg_bUser=0
swg_dConnKVA=L0
swg_dDemdKVA=L0
swg_dCodeKVA=L0
swg_dDsgnKVA=L0
swg_dConnFLA=L0
swg_dDemdFLA=L0
swg_dCodeFLA=L0
swg_dDsgnFLA=L0
swg_dDiversity=L4607182418800017408
cStandard=0
guidDB={901CB951-AC37-49AD-8ED6-3753E3B86757}
l_user_selc_rating=0
r_user_selc_SCkA=
a_conn1=21
a_conn2=11
a_conn3=7
l_ct_ratio_1=x44960000
l_ct_ratio_2=x40a00000
l_set_ct_ratio_1=
l_set_ct_ratio_2=
c_ct_conn=0
ENDREC
GRAPHICS0=8802000030000000C602000033000000000000800000008000000000000000001800000000000
EOF
Depending on how you want to parse up the input string, you could do something like this:
string input = "8802000030000000C6020000330000000000008000000080000000000000000018000000";
for (int i = 0; i < input.Length; i += 8)
{
    string subInput = input.Substring(i, 8);
    byte[] bytes = new byte[4];
    for (int j = 0; j < 4; ++j)
    {
        string toParse = subInput.Substring(j * 2, 2);
        bytes[j] = byte.Parse(toParse, NumberStyles.HexNumber);
    }
    uint num = BitConverter.ToUInt32(bytes, 0);
    Console.WriteLine(subInput + " --> " + num);
}
88020000 --> 648
30000000 --> 48
C6020000 --> 710
33000000 --> 51
00000080 --> 2147483648
00000080 --> 2147483648
00000000 --> 0
00000000 --> 0
18000000 --> 24
Do you really literally mean that that's a string? What it looks like is this: You have a bunch of 32-bit words, each represented by 8 hex digits. Each one is presented in little-endian order, low byte first. You need to interpret each of those as an integer. So, e.g., 88020000 is 88 02 00 00, which is to say 0x00000288.
If you can clarify exactly what it is you've got -- a string, an array of some kind of numeric type, or what -- then it'll be easier to advise you further.
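If it is indeed a hex string, another way to handle one 8-digit group (a sketch, assuming the same grouping as in the answer above; NumberStyles comes from System.Globalization) is to parse it as a big-endian number and then swap the byte order:

// "88020000" read as big-endian is 0x88020000; byte-swapped it is 0x00000288 = 648
uint be = uint.Parse("88020000", NumberStyles.HexNumber);
uint le = (be >> 24) | ((be >> 8) & 0x0000FF00) | ((be << 8) & 0x00FF0000) | (be << 24);
Console.WriteLine(le);                // 648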

How to convert int to char[] without generating garbage in C#

Doubtless this seems like a strange request, given the availability of ToString() and Convert.ToString(), but I need to convert an unsigned integer (i.e. UInt32) to its string representation, but I need to store the answer into a char[].
The reason is that I am working with character arrays for efficiency, and as the target char[] is initialised as a member to char[10] (to hold the string representation of UInt32.MaxValue) on object creation, it should be theoretically possible to do the conversion without generating any garbage (by which I mean without generating any temporary objects in the managed heap.)
Can anyone see a neat way to achieve this?
(I'm working in Framework 3.5SP1 in case that is any way relevant.)
Further to my comment above, I wondered if log10 was too slow, so I wrote a version that doesn't use it.
For four digit numbers this version is about 35% quicker, falling to about 16% quicker for ten digit numbers.
One disadvantage is that it requires space for the full ten digits in the buffer.
I don't swear it doesn't have any bugs!
public static int ToCharArray2(uint value, char[] buffer, int bufferIndex)
{
    const int maxLength = 10;
    if (value == 0)
    {
        buffer[bufferIndex] = '0';
        return 1;
    }
    int startIndex = bufferIndex + maxLength - 1;
    int index = startIndex;
    do
    {
        buffer[index] = (char)('0' + value % 10);
        value /= 10;
        --index;
    }
    while (value != 0);
    int length = startIndex - index;
    if (bufferIndex != index + 1)
    {
        while (index != startIndex)
        {
            ++index;
            buffer[bufferIndex] = buffer[index];
            ++bufferIndex;
        }
    }
    return length;
}
Update
I should add, I'm using a Pentium 4. More recent processors may calculate transcendental functions faster.
Conclusion
I realised yesterday that I'd made a schoolboy error and run the benchmarks on a debug build. So I ran them again but it didn't actually make much difference. The first column shows the number of digits in the number being converted. The remaining columns show the times in milliseconds to convert 500,000 numbers.
Results for uint:
luc1 arx henk1 luc3 henk2 luc2
1 715 217 966 242 837 244
2 877 420 1056 541 996 447
3 1059 608 1169 835 1040 610
4 1184 795 1282 1116 1162 801
5 1403 969 1405 1396 1279 978
6 1572 1149 1519 1674 1399 1170
7 1740 1335 1648 1952 1518 1352
8 1922 1675 1868 2233 1750 1545
9 2087 1791 2005 2511 1893 1720
10 2263 2103 2139 2797 2012 1985
Results for ulong:
luc1 arx henk1 luc3 henk2 luc2
1 802 280 998 390 856 317
2 912 516 1102 729 954 574
3 1066 746 1243 1060 1056 818
4 1300 1141 1362 1425 1170 1210
5 1557 1363 1503 1742 1306 1436
6 1801 1603 1612 2233 1413 1672
7 2269 1814 1723 2526 1530 1861
8 2208 2142 1920 2886 1634 2149
9 2360 2376 2063 3211 1775 2339
10 2615 2622 2213 3639 2011 2697
11 3048 2996 2513 4199 2244 3011
12 3413 3607 2507 4853 2326 3666
13 3848 3988 2663 5618 2478 4005
14 4298 4525 2748 6302 2558 4637
15 4813 5008 2974 7005 2712 5065
16 5161 5654 3350 7986 2994 5864
17 5997 6155 3241 8329 2999 5968
18 6490 6280 3296 8847 3127 6372
19 6440 6720 3557 9514 3386 6788
20 7045 6616 3790 10135 3703 7268
luc1: Lucero's first function
arx: my function
henk1: Henk's function
luc3: Lucero's third function
henk2: Henk's function without the copy to the char array; i.e. just test the performance of ToString().
luc2: Lucero's second function
The peculiar order is the order they were created in.
I also ran the test without henk1 and henk2 so there would be no garbage collection. The times for the other three functions were nearly identical. Once the benchmark had gone past three digits the memory use was stable: so GC was happening during Henk's functions and didn't have a detrimental effect on the other functions.
Conclusion: just call ToString()
The following code does it, with the following caveat: it does not respect the culture settings, but always outputs normal decimal digits.
public static int ToCharArray(uint value, char[] buffer, int bufferIndex) {
    if (value == 0) {
        buffer[bufferIndex] = '0';
        return 1;
    }
    int len = (int)Math.Floor(Math.Log10(value)) + 1;   // digit count; Ceiling would miss 1, 10, 100, ...
    for (int i = len - 1; i >= 0; i--) {
        buffer[bufferIndex + i] = (char)('0' + (value % 10));
        value /= 10;
    }
    return len;
}
The returned value is how much of the char[] has been used.
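Usage could then look like this (a sketch; Console.Out.Write(char[], int, int) is used so no temporary string is created):

char[] buffer = new char[10];         // enough for uint.MaxValue
int len = ToCharArray(1234u, buffer, 0);
Console.Out.Write(buffer, 0, len);    // prints 1234 without allocating a string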
Edit (for arx): the following version avoids the floating-point math and swaps the buffer in-place:
public static int ToCharArray(uint value, char[] buffer, int bufferIndex) {
    if (value == 0) {
        buffer[bufferIndex] = '0';
        return 1;
    }
    int bufferEndIndex = bufferIndex;
    while (value > 0) {
        buffer[bufferEndIndex++] = (char)('0' + (value % 10));
        value /= 10;
    }
    int len = bufferEndIndex - bufferIndex;
    while (--bufferEndIndex > bufferIndex) {
        char ch = buffer[bufferEndIndex];
        buffer[bufferEndIndex] = buffer[bufferIndex];
        buffer[bufferIndex++] = ch;
    }
    return len;
}
And here yet another variation which computes the number of digits in a small loop:
public static int ToCharArray(uint value, char[] buffer, int bufferIndex) {
    if (value == 0) {
        buffer[bufferIndex] = '0';
        return 1;
    }
    int len = 1;
    for (uint rem = value / 10; rem > 0; rem /= 10) {
        len++;
    }
    for (int i = len - 1; i >= 0; i--) {
        buffer[bufferIndex + i] = (char)('0' + (value % 10));
        value /= 10;
    }
    return len;
}
I leave the benchmarking to whoever wants to do it... ;)
I'm coming a little late to the party, but I guess you probably cannot get faster and less memory-demanding results than by simply reinterpreting the memory:
[System.Security.SecuritySafeCritical]
public static unsafe char[] GetChars(int value, char[] chars)
{
    //TODO: if needed to use across machines then
    // this should also use BitConverter.IsLittleEndian to detect little/big endian
    // and order bytes appropriately
    fixed (char* numPtr = chars)
        *(int*)numPtr = value;
    return chars;
}

[System.Security.SecuritySafeCritical]
public static unsafe int ToInt32(char[] value)
{
    //TODO: if needed to use across machines then
    // this should also use BitConverter.IsLittleEndian to detect little/big endian
    // and order bytes appropriately
    fixed (char* numPtr = value)
        return *(int*)numPtr;
}
This is just a demonstration of an idea - you'd obviously need to add a check for the char array size and make sure you have the proper byte ordering. You can peek into the reflected helper methods of BitConverter for those checks.
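A quick round trip shows what this approach actually does: the chars receive the raw 32-bit pattern rather than decimal digits (a sketch; the code above must be compiled with /unsafe):

char[] chars = new char[2];           // 2 chars = 4 bytes, enough for an int
GetChars(12345, chars);               // chars now hold the raw bit pattern, not "12345"
Console.WriteLine(ToInt32(chars));    // prints 12345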
