Is a substring of a CSPRN also a CSPRN? - c#

I want to generate a 4 character long CSPRN (Cryptographically Secure Pseudo Random Number) string. I know I can create an 8 character one by creating a 5 byte long random array and encoding it as base32:
string CSPRN = "";
System.Security.Cryptography.RandomNumberGenerator rng = new System.Security.Cryptography.RNGCryptoServiceProvider();
byte[] tokenData = new byte[5];
rng.GetBytes(tokenData);
CSPRN = Base32.ToBase32String(tokenData); //should produce a string 5bytes*1.6charsperbyte = 8 chars long.
If I now take a substring of the first 4 characters of CSPRN, is it still a CSPRN?
My best guess is that it is, but I am wondering whether there are any "gotchas" in taking a substring rather than generating a smaller number.

Yes, it is. First, let's look at your first security claim. Base 32 is a 5 bit encoding, so if you generate a multiple of 5 bits, say 5 bytes (40 bits), the input maps onto exactly 8 characters with no padding bits, and each character of the string contains its full 5 bits of entropy.
Now, the amount of entropy per character doesn't suddenly drop if you take it out of the string, of course. So if you take 4 characters, each should still contain the full entropy, giving you simply 4 characters full of entropy (20 bits) within the base 32 alphabet.
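To make the argument concrete, here is a minimal sketch. The question's Base32 helper is not part of the .NET base class library, so this example assumes a small hand-written encoder using the RFC 4648 alphabet; the names ShortToken, ToBase32 and NewToken are made up for illustration.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ShortToken
{
    // RFC 4648 base32 alphabet: 32 symbols, so each character carries exactly 5 bits.
    private const string Alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

    // Encodes input whose bit length is a multiple of 5 (e.g. 5 bytes = 40 bits = 8 chars).
    // Leftover bits of other input lengths are dropped; this sketch does no padding.
    public static string ToBase32(byte[] data)
    {
        var sb = new StringBuilder();
        int buffer = 0, bitsInBuffer = 0;
        foreach (byte b in data)
        {
            buffer = (buffer << 8) | b;
            bitsInBuffer += 8;
            while (bitsInBuffer >= 5)
            {
                bitsInBuffer -= 5;
                sb.Append(Alphabet[(buffer >> bitsInBuffer) & 31]);
            }
            buffer &= (1 << bitsInBuffer) - 1; // keep only the bits not yet emitted
        }
        return sb.ToString();
    }

    // Returns the first `length` characters (at most 8) of a fresh 8-character token.
    public static string NewToken(int length)
    {
        byte[] bytes = new byte[5];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(bytes);
        }
        return ToBase32(bytes).Substring(0, length);
    }
}
```

ShortToken.NewToken(4) returns the first 4 characters, which depend only on the first 20 random bits. Keep in mind that 20 bits is only about a million possible values, whatever the quality of the generator.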

Related

Creating URL ShortCode in C#

I am using this article to create a short code for a URL.
I've been working on this for a while and the pseudo code is just not making any sense to me. He states in "loop1" that I'm supposed to look from the first 4 bytes to the 4th 4 bytes, and then cast the bytes to an integer, then convert that to bits. I end up with 32 bits for each 4 bytes, but he's using 5 bytes in the "loop3" which isn't divisible by 32. I am not understanding what he's trying to say.
Then I noticed that he closes "loop2" at the bottom after you've written the short code to the database. That's not making any sense to me because I would be writing the same short code to the database over and over again.
Then I have "loop1" which is going to loop into infinity, again I'm not seeing why I would need to update the database to infinity.
I have tried to follow his example and ran it through the debugger line-by-line, but it's not making sense.
Here is the code I have so far, according to what I've been able to understand:
private void button1_Click(object sender, EventArgs e)
{
string codeMap = "abcdefghijklmnopqrstuvwxyz012345"; // 32 bytes
// Compute MD5 Hash
MD5 md5 = MD5.Create();
byte[] inputBytes = Encoding.ASCII.GetBytes(txtURL.Text);
byte[] hash = md5.ComputeHash(inputBytes);
// Loop from the first 4 bytes to the 4th 4 bytes
byte[] FourBytes = new byte[4];
for (int i = 0; i <= 3; i++)
{
FourBytes[i] = hash[i];
//int CastedBytes = FourBytes[i];
BitArray binary = new BitArray(FourBytes);
int CastedBytes = 0;
for(int ii = 0; i <=5; i++)
{
CastedBytes = CastedBytes + ii;
}
}
Can someone help me figure out what I'm doing wrong, so I can get this program working? I just need to convert URLs into short 6-digit unique codes.
Thanks.
Your MD5 hash is 128 bits. The idea is to derive a 6-character code from those 128 bits; since 6 characters of a 32-symbol alphabet hold only 30 bits, most of the hash is necessarily discarded.
The codeMap contains 32 characters
string codeMap = "abcdefghijklmnopqrstuvwxyz012345"
Note that 2^5 is also 32. The third loop is using 5 bits of the hash at a time, and converting those 5 bits to a character in the codeMap. For example, for the bit pattern
00001 00011 00100
  b     d     e
The algorithm uses 6 sets of 5 bits, so 30 bits in total. The remaining 2 bits of each 32-bit integer are "wasted".
Note though that the 128 bit MD5 is being taken 4 bytes at a time, and those 4 bytes are converted to an integer. That is one approach to consuming the bits of the MD5, but certainly not the only one. It involves bit masking and bit shifting.
You may find it more straightforward to use a BitArray for the implementation. While this is probably slightly less efficient, it will not likely matter. If you go that path, initialize the BitArray with the bits of your MD5 hash, and then just take 5 bits at a time, converting them to a number in the range 0..31 to use as an index into codeMap.
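As a rough illustration of the 5-bits-at-a-time idea, here is a minimal sketch using masking and shifting on the first 4 bytes of the hash. It is not the article's exact algorithm: the names ShortCode, FromBits and FromUrl are invented for this example, and taking the lowest 5 bits first is an arbitrary choice.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ShortCode
{
    // 32 symbols, so one symbol per 5 bits.
    private const string CodeMap = "abcdefghijklmnopqrstuvwxyz012345";

    // Maps the low 30 bits of value to 6 characters, lowest 5 bits first.
    public static string FromBits(uint value)
    {
        var chars = new char[6];
        for (int i = 0; i < 6; i++)
        {
            chars[i] = CodeMap[(int)(value & 0x1F)]; // index in the range 0..31
            value >>= 5;
        }
        return new string(chars);
    }

    // Hashes the URL with MD5 and encodes the first 4 bytes of the hash.
    public static string FromUrl(string url)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(url));
            return FromBits(BitConverter.ToUInt32(hash, 0));
        }
    }
}
```

With the bit pattern from the example above, FromBits(1 | (3 << 5) | (4 << 10)) gives "bdeaaa". The other three 4-byte groups of the hash could serve as alternative codes when the first one is already taken.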
This bit from the article is misleading
6 characters of short code can used to map 32^6 (1,073,741,824) URLs so it is unlikely to be used up in the near future
Due to the possibility of hash collisions, the system can manage far fewer than 1 billion URLs without a significant risk of the same short URL being assigned to two long URLs. See the Birthday Problem for more.
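To put a number on that, here is a small sketch of the usual birthday-problem approximation, p ≈ 1 - exp(-n(n-1)/(2N)). It assumes codes are spread uniformly over the N = 32^6 possibilities; the class and method names are illustrative only.

```csharp
using System;

public static class Birthday
{
    // Approximate probability that at least two of n items collide
    // when each is assigned uniformly at random to one of `space` slots.
    public static double CollisionProbability(double n, double space)
    {
        return 1.0 - Math.Exp(-n * (n - 1.0) / (2.0 * space));
    }
}
```

Under that assumption, Birthday.CollisionProbability(38582, Math.Pow(32, 6)) comes out at roughly 0.5, so a collision is more likely than not after fewer than 40,000 URLs.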
Unless you are expecting to have a hugely popular URL shortener, just use base 16 or base 64 off of a database auto increment column.
With 6 characters, base 16 would provide about 16 million unique URLs (16^6 = 16,777,216). Base 64 would provide 2^36 (about 68.7 billion).
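A minimal sketch of the auto-increment approach, assuming the database hands back a non-negative integer ID. The 64-symbol URL-safe alphabet and the IdEncoder name are choices made for this example, not something the answer specifies.

```csharp
using System;
using System.Text;

public static class IdEncoder
{
    // 10 digits + 26 upper case + 26 lower case + 2 URL-safe symbols = 64.
    private const string Alphabet =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-_";

    // Converts a non-negative ID to its base 64 representation.
    public static string Encode(long id)
    {
        if (id < 0) throw new ArgumentOutOfRangeException(nameof(id));
        if (id == 0) return Alphabet[0].ToString();
        var sb = new StringBuilder();
        while (id > 0)
        {
            sb.Insert(0, Alphabet[(int)(id % 64)]); // most significant digit ends up first
            id /= 64;
        }
        return sb.ToString();
    }

    // Converts a short code back to the database ID.
    public static long Decode(string code)
    {
        long id = 0;
        foreach (char c in code)
        {
            int digit = Alphabet.IndexOf(c);
            if (digit < 0) throw new FormatException("Unexpected character: " + c);
            id = id * 64 + digit;
        }
        return id;
    }
}
```

Because every ID maps to a distinct code, there are no collisions to handle. The trade-off is that the codes are sequential and therefore guessable.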

Generating a unique 15 digit Pin code from a 10 digit number

I want to create pin codes and serial numbers for scratch papers. I have already generated unique 10 digit numbers; now I want to turn each 10 digit number into a 16 digit number (with a check digit at the end). The function that does this should be reversible, so that by looking at a 16 digit number I can check whether it is valid or not (if it was not generated by me, it should not be valid).
This is how I have generated the 10 digit unique random codes:
Guid PinGuid;
byte[] Arr;
UInt32 PINnum = 0;
while (PINnum.ToString().Length != 10)
{
PinGuid = Guid.NewGuid();
Arr = PinGuid.ToByteArray();
PINnum = BitConverter.ToUInt32(Arr, 0);
}
return PINnum.ToString();
I would be grateful if you can give me a hint on how to do it .
First off, I would avoid GUIDs, since some prefixes are reserved for special applications. This means that these areas of the GUID may not be allocated uniformly on creation, so you may not get exactly 10 digits of randomness like you plan.
Also, since your loop waits for the GUID to produce a number of the right size, you could do it more efficiently.
10 digits = 10**10 possible values
log_2(10) = approx 3.322
So you need approx 33.2 bits (34 whole bits) for a 10 digit number. Since you want your number to be exactly 10 digits, you can either pad numbers less than 10^10 with leading zeroes, or you can generate only numbers between 10^9 and 10^10 - 1.
If you take the latter case you need 9*10^9 numbers in your space, giving you all numbers from 1 followed by nine zeroes up to 9 followed by nine 9s.
Then you would like to convert this space of numbers into a larger space: expand it by 5 more digits and add one more digit as a check digit.
Pick a check digit function as anything you like. You could simply sum (mod 10) the original 10 digits, or choose something more complicated.
Presumably you do not want people to be able to generate valid instances. So if you are really serious about your security, you should modify any suggestions you get from the net before deploying them.
I would do something along these lines:
1. Generate a uniform 10 digit number with no leading zeroes by
randomTenDigits = 10**9 + rand(9*10**9)
2. Using an encryption scheme (like AES 256, or even RSA or El-Gamal, since their slower speed will not be so important when the input is small), encrypt this 10 digit number using a secret key that only you and others you trust are aware of. Perhaps you can concatenate the 10 digit number 10 times, then concatenate that result with some other secret that you choose, and then finally encrypt this expanded secret of which the 10 digit number is a part.
3. Take some choice 5 digits (around 17 bits) of the resulting ciphertext, and append these to your 10 digit number.
4. Generate 1 check digit by whatever method you desire.
As you will note the real security of this scheme is not from a check digit, it is from the secret key you can use to authenticate the 16 digit number. The test you will use to authenticate it is: does the given 10 digit number when concatenated with other secrets I have, encrypt, using a secret key only I know, to the given 5 digit number presented with it.
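The scheme can be sketched roughly as follows. Note one substitution: instead of encrypting and picking digits out of the ciphertext, this example uses HMAC-SHA256 as the keyed function, which serves the same authenticating purpose. The class name, method names and the sum-mod-10 check digit are all assumptions made for illustration.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class PinCode
{
    // Uniform 10 digit number with no leading zero: 10^9 + rand(9 * 10^9).
    public static string NewTenDigits()
    {
        const ulong range = 9000000000UL;
        ulong limit = ulong.MaxValue - (ulong.MaxValue % range); // a multiple of range
        byte[] buf = new byte[8];
        using (var rng = RandomNumberGenerator.Create())
        {
            ulong r;
            do
            {
                rng.GetBytes(buf);
                r = BitConverter.ToUInt64(buf, 0);
            } while (r >= limit); // reject the tail to avoid modulo bias
            return (1000000000UL + r % range).ToString();
        }
    }

    // Derives 5 decimal digits from a keyed hash of the 10 digit number.
    // Reducing mod 100000 has a tiny bias, and ToUInt32 depends on machine byte order.
    private static string Tag(string tenDigits, byte[] key)
    {
        using (var hmac = new HMACSHA256(key))
        {
            byte[] mac = hmac.ComputeHash(Encoding.ASCII.GetBytes(tenDigits));
            return (BitConverter.ToUInt32(mac, 0) % 100000).ToString("D5");
        }
    }

    // Simple check digit: sum of all digits mod 10.
    private static char CheckDigit(string digits)
    {
        return (char)('0' + digits.Sum(c => c - '0') % 10);
    }

    // 10 digits + 5 tag digits + 1 check digit = 16 digits.
    public static string Create(string tenDigits, byte[] key)
    {
        string body = tenDigits + Tag(tenDigits, key);
        return body + CheckDigit(body);
    }

    // Recomputes the code from its first 10 digits and compares.
    public static bool IsValid(string code, byte[] key)
    {
        if (code == null || code.Length != 16 || !code.All(c => c >= '0' && c <= '9'))
            return false;
        return code == Create(code.Substring(0, 10), key);
    }
}
```

PinCode.IsValid(PinCode.Create(PinCode.NewTenDigits(), key), key) returns true for the key that created the code. Without the key, an attacker is left guessing the 5 tag digits.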
The difficulty for an attacker of forging one of your numbers depends on the difficulty of
discovering your secret keys and other info,
discovering which method of encryption you use,
discovering which part of the resulting ciphertext you emit for the 5 digit secret, or
simply brute forcing the 5 digits to discover the correct pairing.
Since 5 digits is not a big space to search, I would suggest instead generating larger numbers. 10 or 16 digits is not really a huge space to search. So instead of digits I would use upper and lower case letters plus digits plus space and full stop, to give you 64 letters in your alphabet. Then if you used 16 of them you would get around 96 bits of security.
However if numbers are non-negotiable and the size of 10 digits for your base space is also non-negotiable, doing it this way is probably the most secure. You may be able to set up your system to deter people from brute forcing it, though you should consider what if someone acquires a piece of your hardware through a vendor. I believe it is easier to design security in rather than design in a mechanism for detecting people trying to brute force query your system.
However if serious dough is on the line ( like millions ) the security you employ should really be first class. Equivalent to the kind of security you would employ to protect a pin number to a million dollar bank account. The more secure you are the longer you can carry on your biz with credibility and trust.
So along these lines I would suggest increasing the size of your secrets to make it infeasible for someone to simply try all combinations and forge a valid one, and in particular thinking about how to design your system to make it difficult to break for people with lots of skills and motivation (money). You really can't be too careful.
I would keep it simple. Put PINnum.ToString() into a buffer. Insert a filler digit at five positions. The first four could be random garbage and the last could be a check digit, or you could make each filler a check digit for its section. Here is an example.
string buf = PINnum.ToString();
int chkdgit = ComputeCheckDigit(buf); // placeholder name: your own function to create the check digit
Random rnd = new Random();
int i = rnd.Next(1001, 9999);
string fillbuf = i.ToString();
// "" + forces string concatenation; adding two chars directly would add their numeric codes
return "" + buf[0] + buf[1] + fillbuf[0] + buf[2] .... chkdgit.ToString();
It's a rather simple approach, but if your security needs aren't at level 1, it might suffice.

Date Time Encoding

Any ideas or implementations floating about for encoding the current date including the milliseconds into the shortest possible string length?
e.g I want 31/10/2011 10:41:45 in the shortest string possible (ideally 5 characters) - obviously decodable.
If it is impossible to get down to 5 characters, then the year is optional.
edit: it doesn't actually need to be decodable. It just needs to be a unique string.
A time_t is 31 bits. Add 10 bits for up to 1000 milliseconds: that's 41 bits. You want 5 characters: that's 8 bits for each of the first 4 characters plus 9 bits for the last one (4 × 8 + 9 = 41).
Using Chinese ideograms, you should easily be able to find a range of 256 consecutive chars for each of the 1st 4 chars and a range of 512 for the last one.
Needless to say, your encoded date will look... Chinese! But it should do the trick ;-)
BTW, you don't have to stick to Chinese. You might even want to choose a different Unicode 256 chars range for each character. Of course, you'll want to find sequences of 256/512 printable chars.
Now let's say we skip the year. We're down to 86400 x 366 seconds per year = 31622400 seconds. Including millisecs : 31622400000. That's 35 bits. Great: We're down at 7 bits per character. Easy! :-)
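Here is a minimal sketch of the year-less variant: the milliseconds elapsed since the start of the year fit in 35 bits, packed 7 bits per character. The starting code point U+4E00 (the beginning of the CJK Unified Ideographs block) and the CompactTime name are assumptions for this example; any run of 128 printable characters would do.

```csharp
using System;

public static class CompactTime
{
    // Assumed base of a run of 128 consecutive printable characters.
    private const int BaseChar = 0x4E00;

    // Packs the milliseconds since January 1st of t's year into 5 characters.
    public static string Encode(DateTime t)
    {
        long ms = (t - new DateTime(t.Year, 1, 1)).Ticks / TimeSpan.TicksPerMillisecond;
        var chars = new char[5];
        for (int i = 4; i >= 0; i--)
        {
            chars[i] = (char)(BaseChar + (int)(ms & 0x7F)); // 7 bits per character
            ms >>= 7;
        }
        return new string(chars);
    }

    // Reverses Encode; the year is not stored and must be supplied by the caller.
    public static DateTime Decode(string s, int year)
    {
        long ms = 0;
        foreach (char c in s)
        {
            ms = (ms << 7) + (c - BaseChar);
        }
        return new DateTime(year, 1, 1).AddTicks(ms * TimeSpan.TicksPerMillisecond);
    }
}
```

Encoding new DateTime(2011, 10, 31, 10, 41, 45) yields a 5 character string that decodes back to the same instant when the year 2011 is supplied.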
You can use the Ticks:
var ticks = System.DateTime.Now.Ticks;
This is a 64-bit number. You get the time back by calling:
var timeBack = new System.DateTime(ticks);
Of course this is 8 bytes, but I don't think you can get it more compact (easily).
No can do: the total ms in a year (365 days) is 31,536,000,000 (=365*24*60*60*1000). You need 34.87628063 bits of information to store that value (log2 31,536,000,000). You probably meant "printable characters", BUT you would need 7 bits/character to store 35 bits in 5 characters. As an example, base64 carries 6 bits/character of information, so you would need 6 characters. Ascii85 would be a little better, but you would still need around 5.5 characters, so 6 characters.
Clearly if you meant 5 BYTES, everything changes. You can store 34.84 years (in ms) in that space.
And if you meant 5 C# PRINTABLE AND UNPRINTABLE CHARACTERS (each C# character is 16 bits), then it's even better. 10 bytes! DateTime in C# is only 8 bytes and it uses ticks (they are a VERY VERY VERY small part of a second)!
BUT if you meant 5 C# PRINTABLE CHARACTERS, then use Serge's response. It's very good and shows us that the world is a big place (and shows us why good questions are so important: they let us see the world in new ways).
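The character counts in this answer follow from one formula: the number of characters needed is the ceiling of log(values) / log(alphabet size). A small sketch, with an illustrative class and method name:

```csharp
using System;

public static class Capacity
{
    // Smallest number of characters from an alphabet of the given size
    // that can represent `values` distinct values.
    public static int CharsNeeded(double values, int alphabetSize)
    {
        return (int)Math.Ceiling(Math.Log(values) / Math.Log(alphabetSize));
    }
}
```

For the 31,536,000,000 milliseconds in a year, this gives 6 characters for base64 (alphabet size 64), 6 for Ascii85 (85) and 5 for a 128-symbol alphabet.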
You can use ASCII characters to represent the numbers and drop the formatting, for example:
31/10/2011 10:41:45
*/*/** *:*:*
*******
That's 7, you can drop 2 if you don't want to include the full year. Obviously the * are actual characters relating to a number, A could be 1 etc, or even use the proper ASCII codes.

Bit/byte conversion

How many bits is a .NET string that's 10 characters in length? (.NET strings are UTF-16, right?)
On 32-bit systems:
4 bytes = Type pointer (Every object has one of these)
4 bytes = Lock (One of these too!)
4 bytes = Length (Need the length)
2 * Length bytes = Data (And the chars themselves)
=======================
12 + 2*Length bytes
=======================
96 + 16*Length bits
So 10 chars would = 256 bits = 32 bytes
I am not sure if the Lock grows to 64-bit on 64-bit systems. I kinda hope not, but you never know. The 64-bit structure overhead is therefore anywhere from 16-20 bytes (as opposed to the 12 bytes on 32-bit).
Every char in the string is two bytes in size, so if you are just converting the chars directly and not using any particular encoding, the answer is string.Length * 2 * 8
otherwise the result depends on the encoding, you can write:
int numbits = System.Text.Encoding.UTF8.GetByteCount(str)*8; //returns 80 for 10 ASCII characters
or
int numbits = System.Text.Encoding.Unicode.GetByteCount(str)*8; //returns 160
If you are talking pure UTF-16 then:
10 characters = 20 bytes = 160 bits
This really needs a context in order to be answered properly.
It all comes down to how you define a character and how you store the data.
For example, if you define a character as a single letter from the user's point of view, it can be more than 2 bytes. For example, this character: Å is two Unicode code points (U+0041 U+030A, Latin Capital A + Combining Ring Above), so it will require two .NET chars, or 4 bytes in UTF-16.
Now, even if you are talking about 10 .NET Char elements, then if it's in memory you have some object overhead (that was already mentioned) and a bit of alignment overhead (on a 32-bit system everything has to be aligned to a 4 byte boundary; on 64-bit the rules are more complicated), so you may have some empty bytes at the end.
If you are talking about databases or files, then each database and file system has its own overhead.
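The difference between chars, bytes and user-perceived characters can be checked with a short sketch like the one below. The StringSize helper class is made up for this example; the calls it wraps are standard .NET APIs. It only measures encoded size and ignores the object and alignment overhead discussed above.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class StringSize
{
    // Bits needed for the string's UTF-16 code units.
    public static int Utf16Bits(string s)
    {
        return Encoding.Unicode.GetByteCount(s) * 8;
    }

    // Bits needed when the string is encoded as UTF-8.
    public static int Utf8Bits(string s)
    {
        return Encoding.UTF8.GetByteCount(s) * 8;
    }

    // Number of characters as a user would count them (text elements).
    public static int UserPerceivedChars(string s)
    {
        return new StringInfo(s).LengthInTextElements;
    }
}
```

For "A\u030A" (A followed by a combining ring), Length is 2 and Utf16Bits gives 32, while UserPerceivedChars gives 1.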

ASCII values in hexadecimal notation

I am trying to parse some output data from a PBX and I have found something that I can't really figure out.
In the documentation it says the following
Information for type of call and feature. Eight character for ’status information 3’ with following ASCII values in hexadecimal notation.
1. Character
Bit7 Incoming call
Bit6 Outgoing call
Bit5 Internal call
Bit4 CN call
2. Character
Bit3 Transferred call (transferring party inside)
Bit2 CN-transferred call (transferring party outside)
Bit1
Bit0
Any ideas how to interpret this? I have no raw data at the time to match against but I still need to figure it out.
You'll probably receive two characters (hex digits: 0-9, A-F). The first digit represents the hex value of the most significant 4 bits, the next digit that of the least significant 4 bits.
Example:
You will probably receive something like the string "7C" as hex representation of the bitmap: 01111100.
Eight character for ’status information 3’ with following ASCII values in hexadecimal notation.
I think this means the following.
You will get 8 bytes - one byte per line, I guess.
It is just the wrong term. They mean two hex digits per byte but call them characters.
So it is just a byte with bit flags - or more precisely an array of eight such bytes.
Bit
7 incoming
6 outgoing
5 internal
4 CN
3 transferred
2 CN transferred
1 unused?
0 unused?
You could map this to an enum.
[Flags]
public enum CallInformation : Byte
{
Incoming = 128,
Outgoing = 64,
Internal = 32,
CN = 16,
Transferred = 8,
CNTransferred = 4,
Undefined = 0
}
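If the guess is right that each status byte arrives as two hex digits, decoding could look like the sketch below. The CallStatus and StatusParser names are invented here, and the bit assignments simply follow the documentation excerpt in the question; the real wire format is not confirmed.

```csharp
using System;

[Flags]
public enum CallStatus : byte
{
    None = 0,
    CnTransferred = 1 << 2, // Bit2
    Transferred = 1 << 3,   // Bit3
    CnCall = 1 << 4,        // Bit4
    Internal = 1 << 5,      // Bit5
    Outgoing = 1 << 6,      // Bit6
    Incoming = 1 << 7       // Bit7
}

public static class StatusParser
{
    // Parses two hex digits such as "7C" into a set of flags.
    public static CallStatus Parse(string twoHexDigits)
    {
        return (CallStatus)Convert.ToByte(twoHexDigits, 16);
    }
}
```

For the earlier example, StatusParser.Parse("7C") has bits 6 through 2 set (Outgoing, Internal, CnCall, Transferred and CnTransferred) and Incoming clear, which can be checked with HasFlag.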
Very hard without data. I'd guess that you will get two bytes (two ASCII characters), and need to pick them apart at the bit level.
For instance, if the first character is 'A', you will need to look up its character code (65, or hex 0x41), and then look at the bits. Of course the bits are the same regardless of decimal or hex, but it's easier to do by hand in hex. 0x41 is binary 0100 0001, so bit 6 and bit 0 are set; that would be an "outgoing call". Bit 0 seems undocumented.
I'm not sure why it looks as if that would require two characters; only eight bits are documented.
