I'm implementing a URL shortening feature in my application in order to provide my users shorter alternative URLs that can be used in Twitter. The point is to be independent from the shortening services that offer this same service and include it as a feature of my web app.
What's the best way to create a unique random sequence of about 6 characters? I plan to use it as the index for the items in my database that hold the alternative URLs.
Edited:
This feature will be used in a job board website, where every new job ad will get a custom URL with the title plus the shorter one to be used in Twitter. That said, the total number of unique 6 char combinations will be more than enough for a long time.
Do you really need 'random', or would 'unique' be sufficient?
Unique is extremely simple - just insert the URL into a database, and convert the sequential id for that record to a base-n number which is represented by your chosen characterset.
For example, if you want to only use [A-Z] in your sequence, you convert the id of the record to a base-26 number, where A=1, B=2, ... Z=26. The algorithm is a repeated div 26 / mod 26, where the remainder gives the current character and the quotient is used to calculate the next character.
Then when retrieving URL, you perform the inverse function, which is to convert the base-26 number back to decimal. Perform SELECT URL WHERE ID = decimal, and you're done!
EDIT:
private string alphabet = "abcdefghijklmnopqrstuvwxyz";
// or whatever you want. Include more characters
// for more combinations and shorter URLs
public string Encode(int databaseId)
{
    if (databaseId < 1)
        throw new ArgumentOutOfRangeException("databaseId");
    string encodedValue = String.Empty;
    while (databaseId > 0)
    {
        // Digits run from 1 to alphabet.Length (a=1 ... z=26), so shift to
        // zero-based before dividing.
        int remainder;
        databaseId = Math.DivRem(databaseId - 1, alphabet.Length, out remainder);
        // The remainder selects the character; the quotient carries on.
        encodedValue = alphabet[remainder] + encodedValue;
    }
    return encodedValue;
}
public int Decode(string code)
{
    int returnValue = 0;
    foreach (char thisCharacter in code)
    {
        // Shift the running total one position and add this character's value.
        returnValue = returnValue * alphabet.Length + alphabet.IndexOf(thisCharacter) + 1;
    }
    return returnValue;
}
The simplest way to make unique sequences is to do this sequentially, i.e. aaaaaa, aaaaab, aaaaac, ... These aren't necessarily the prettiest, but they guarantee uniqueness for the first 19,770,609,664 sequences (52^6, provided you use a-z and A-Z as the unique characters). If you need more URLs than that, you'd need to add a seventh char.
They aren't random sequences, though. If you make random ones, just pick a random char out of the 52, six times. You'll need to check your existing DB for "used" sequences, though, as you'll be more likely to get collisions.
I would use an autonumber system, and create an algorithm to generate the keys. ie 1 = a, 2 = b, 27 = aa etc.
You can use the database autonumber to guarantee that your URL is unique, and you can calculate the URL either in a sproc in the DB or in your business layer.
Additionally, you can index on the incrementing number, which is cheap: databases are optimised for using such numbers as primary/foreign keys, as opposed to a variable-length random string.
The usefulness of a random generator is limited to preventing users from plugging random URLs in to find things they shouldn't have a link to. If this is not your goal then sequential IDs should work just fine. If you just don't want to give users the impression that they are using "infant" technology (when they see that their job ad is #000001), why not start the sequence at some arbitrary value?
When you state "total number of unique 6 char combinations will be more than enough for a long time" for your random generation, have you factored the birthday paradox into your calculations? This is generally the bane of any attempt to create random IDs within a range that is only an order of magnitude or so larger than the number of IDs you expect to need.
To create truly random IDs, you would need to create a loop that generates a new random value, checks to see if that value has already been used, and then repeats the loop if needed. The birthday paradox means that you quickly get to the point where many of the values generated are already in use (despite only a fraction of the total range being consumed), which causes the program to get slower and slower over time until it is taking thousands of attempts (and database lookups) to generate each ID.
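To put rough numbers on this, here is a minimal sketch using the standard birthday-problem approximation p ≈ 1 - exp(-n(n-1)/(2N)), where N is the number of possible IDs and n is the number generated so far. The class and method names are made up for illustration, and the usage note assumes a 26-letter alphabet for the 6-character codes.

```csharp
using System;

public static class BirthdayEstimate
{
    // Approximate probability that at least two of n randomly chosen IDs
    // collide when there are keySpace equally likely IDs.
    public static double CollisionProbability(double n, double keySpace)
    {
        return 1.0 - Math.Exp(-n * (n - 1.0) / (2.0 * keySpace));
    }
}
```

Under those assumptions, BirthdayEstimate.CollisionProbability(10000, Math.Pow(26, 6)) comes out at roughly 15%: a collision somewhere among the first 10,000 random IDs is already fairly likely, even though only a tiny fraction of the 308,915,776 possible codes has been used.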
I would suggest you go with the idea of encoding sequential IDs. To avoid the problem of users being able to simply increment/decrement the value in the URL to "explore", you can use a combination of bit shifting and an alternately ordered list of letters (instead of 1=a, 2=b use 1=t, 2=j, etc).
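A minimal sketch of the alternate-alphabet half of that idea (the bit shifting is left out). The shuffled alphabet, the class name and the method names are all assumptions for illustration; any fixed permutation of your characters would do, as long as it never changes once URLs have been issued. This only obscures the sequence and is not a security measure.

```csharp
using System;
using System.Text;

public static class ObscuredId
{
    // Hypothetical fixed permutation of a-z (1 = t, 2 = j, ...).
    private const string Alphabet = "tjqzkxbvawmcgpyhnfdruoesli";

    public static string Encode(long id)
    {
        if (id < 1) throw new ArgumentOutOfRangeException(nameof(id));
        var sb = new StringBuilder();
        while (id > 0)
        {
            id--; // digits are 1-based: 1 maps to the first character
            sb.Insert(0, Alphabet[(int)(id % Alphabet.Length)]);
            id /= Alphabet.Length;
        }
        return sb.ToString();
    }

    public static long Decode(string code)
    {
        long value = 0;
        foreach (char c in code)
        {
            int digit = Alphabet.IndexOf(c);
            if (digit < 0) throw new FormatException("Unexpected character: " + c);
            value = value * Alphabet.Length + digit + 1;
        }
        return value;
    }
}
```

With this alphabet, ObscuredId.Encode(1) returns "t" and ObscuredId.Encode(2) returns "j", and Decode reverses Encode for any positive id.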
Thinking about this more here is an idea.
You can start with a key table, incrementing chars AAAAAA - ZZZZZZ.
Then do a random select from that table each time you insert a new URL, and delete from the available keys.
Thoughts?
For random select try this link
Select a random row with MySQL:
SELECT column FROM table
ORDER BY RAND()
LIMIT 1
Select a random row with PostgreSQL:
SELECT column FROM table
ORDER BY RANDOM()
LIMIT 1
Select a random row with Microsoft SQL Server:
SELECT TOP 1 column FROM table
ORDER BY NEWID()
Select a random row with IBM DB2
SELECT column, RAND() as IDX
FROM table
ORDER BY IDX FETCH FIRST 1 ROWS ONLY
Thanks Tim
Select a random record with Oracle:
SELECT column FROM
( SELECT column FROM table
ORDER BY dbms_random.value )
WHERE rownum = 1
I'd say hash it!
http://www.codinghorror.com/blog/archives/000935.html
Instead of keeping a table of all possible values, just keep a table of the values you've used. Use the random function to generate 6 random values, 1 to 26, make the string from that and save it in an array or table. If it already exists, you can (a) generate another string, or (b) move through the table to the next available (missing) 6-letter string and use that value. (b) will be more efficient as the table fills.
Following the idea of Reed Copsey's answer, I present the following code:
class IDGetter
{
    private StringID ID = new StringID();
    // Returns the current ID, then advances the counter like an odometer
    // (char1 is the fastest-changing position).
    public string GetCurrentID()
    {
        if (ID.char6 > 51)
            throw new Exception("the maximum number of id's has been reached");
        string retStr = ToIDChar(ID.char1) + ToIDChar(ID.char2) + ToIDChar(ID.char3) + ToIDChar(ID.char4) + ToIDChar(ID.char5) + ToIDChar(ID.char6);
        // Increment, carrying into the next position on overflow.
        ID.char1++;
        if (ID.char1 > 51) { ID.char1 = 0; ID.char2++; }
        if (ID.char2 > 51) { ID.char2 = 0; ID.char3++; }
        if (ID.char3 > 51) { ID.char3 = 0; ID.char4++; }
        if (ID.char4 > 51) { ID.char4 = 0; ID.char5++; }
        if (ID.char5 > 51) { ID.char5 = 0; ID.char6++; }
        return retStr;
    }
    public void SetCurrentID(StringID id) //for setting the current ID from storage or resetting it or something
    {
        this.ID = id;
    }
    private const string alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private static string ToIDChar(int number)
    {
        if (number > 51 || number < 0)
        {
            throw new ArgumentException("the number passed in (" + number + ") must be between the range 0-51");
        }
        return alphabet[number].ToString();
    }
}
public struct StringID
{
    // int fields default to 0
    public int char1;
    public int char2;
    public int char3;
    public int char4;
    public int char5;
    public int char6;
}
You might want to come up with a method of storing the current ID but that ought to work.
I used this to do something very similar. I was not too worried about its speed, as it was going to be a rarely used event and table. But it's possible to then lengthen the string as needed.
/// Generates a string and checks for existence
/// <returns>Non-existent string as ID</returns>
public static string GetRandomNumbers(int numChars, string Type)
{
    string result = string.Empty;
    bool isUnique = false;
    while (!isUnique)
    {
        //Build the string
        result = MakeID(numChars);
        //Check if unused: the ID is unique only when it does NOT already exist
        isUnique = !GetValueExists(result, Type);
    }
    return result;
}
// Shared instance: creating a new Random per call can repeat seeds (and IDs)
private static readonly Random rnd = new Random();
/// Builds the string
public static string MakeID(int numChars)
{
    string random = string.Empty;
    string[] chars = { "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z" };
    for (int i = 0; i < numChars; i++)
    {
        random += chars[rnd.Next(0, chars.Length)];
    }
    return random;
}
/// Checks database tables based on type for existence, if exists then retry
/// <returns>true or false</returns>
private static bool GetValueExists(string value, string Type)
{
    bool result = false;
    string sql = "";
    if (Type == "URL")
    {
        // EXISTS needs a row-returning query; SELECT COUNT(1) always returns a row
        sql = string.Format(@"IF EXISTS (SELECT 1 FROM myTable WHERE uniqueString = '{0}')
BEGIN
SELECT 1
END
ELSE
BEGIN
SELECT 0
END ", value);
    }
    //query the DB to see if it's in use
    result = //ExecuteSQL
    return result;
}
I tried to sort Guids generated by UuidCreateSequential, but the results are not correct. Am I missing something? Here is the code:
private class NativeMethods
{
[DllImport("rpcrt4.dll", SetLastError = true)]
public static extern int UuidCreateSequential(out Guid guid);
}
public static Guid CreateSequentialGuid()
{
const int RPC_S_OK = 0;
Guid guid;
int result = NativeMethods.UuidCreateSequential(out guid);
if (result == RPC_S_OK)
return guid;
else throw new Exception("could not generate unique sequential guid");
}
static void TestSortedSequentialGuid(int length)
{
Guid []guids = new Guid[length];
int[] ids = new int[length];
for (int i = 0; i < length; i++)
{
guids[i] = CreateSequentialGuid();
ids[i] = i;
Thread.Sleep(60000);
}
Array.Sort(guids, ids);
for (int i = 0; i < length - 1; i++)
{
if (ids[i] > ids[i + 1])
{
Console.WriteLine("sorting using guids failed!");
return;
}
}
Console.WriteLine("sorting using guids succeeded!");
}
EDIT1:
Just to make my question clear: why is the Guid struct not sortable using the default comparer?
EDIT 2:
Also here are some sequential guids I've generated, seems they are not sorted ascending as presented by the hex string
"53cd98f2504a11e682838cdcd43024a7",
"7178df9d504a11e682838cdcd43024a7",
"800b5b69504a11e682838cdcd43024a7",
"9796eb73504a11e682838cdcd43024a7",
"c14c5778504a11e682838cdcd43024a7",
"c14c5779504a11e682838cdcd43024a7",
"d2324e9f504a11e682838cdcd43024a7",
"d2324ea0504a11e682838cdcd43024a7",
"da3d4460504a11e682838cdcd43024a7",
"e149ff28504a11e682838cdcd43024a7",
"f2309d56504a11e682838cdcd43024a7",
"f2309d57504a11e682838cdcd43024a7",
"fa901efd504a11e682838cdcd43024a7",
"fa901efe504a11e682838cdcd43024a7",
"036340af504b11e682838cdcd43024a7",
"11768c0b504b11e682838cdcd43024a7",
"2f57689d504b11e682838cdcd43024a7"
First off, let's re-state the observation: when creating sequential GUIDs with a huge time delay -- 60 billion nanoseconds -- between creations, the resulting GUIDs are not sequential.
am I missing something?
You know every fact you need to know to figure out what is going on. You're just not putting them together.
You have a service that provides numbers which are both sequential and unique across all computers in the universe. Think for a moment about how that is possible. It's not a magic box; someone had to write that code.
Imagine if you didn't have to do it using computers, but instead had to do it by hand. You advertise a service: you provide sequential globally unique numbers to anyone who asks at any time.
Now, suppose I ask you for three such numbers and you hand out 20, 21, and 22. Then sixty years later I ask you for three more and surprise, you give me 13510985, 13510986 and 13510987. "Wait just a minute here", I say, "I wanted six sequential numbers, but you gave me three sequential numbers and then three more. What gives?"
Well, what do you suppose happened in that intervening 60 years? Remember, you provide this service to anyone who asks, at any time. Under what circumstances could you give me 23, 24 and 25? Only if no one else asked within that 60 years.
Now is it clear why your program is behaving exactly as it ought to?
In practice, the sequential GUID generator uses the current time as part of its strategy to enforce the globally unique property. Current time and current location is a reasonable starting point for creating a unique number, since presumably there is only one computer on your desk at any one time.
Now, I caution you that this is only a starting point; suppose you have twenty virtual machines all in the same real machine and all trying to generate sequential GUIDs at the same time? In these scenarios collisions become much more likely. You can probably think of techniques you might use to mitigate collisions in these scenarios.
After researching, I found that I can't sort the Guids using the default sort, or even using the default string representation from guid.ToString, as the byte order is different.
To sort the Guids generated by UuidCreateSequential I need to convert them either to BigInteger or to my own string representation (i.e. a 32-character hex string), putting the bytes in most-significant to least-significant order, as follows:
static void TestSortedSequentialGuid(int length)
{
Guid []guids = new Guid[length];
int[] ids = new int[length];
for (int i = 0; i < length; i++)
{
guids[i] = CreateSequentialGuid();
ids[i] = i;
// this simulates the delay between guids creation
// yes the guids will not be sequential as it interrupts generator
// (as it used the time internally)
// but still the guids should be in increasing order and hence they are
// sortable and that was the goal of the question
Thread.Sleep(60000);
}
var sortedGuidStrings = guids.Select(x =>
{
var bytes = x.ToByteArray();
//reverse high bytes that represents the sequential part (time)
string high = BitConverter.ToString(bytes.Take(10).Reverse().ToArray());
//set last 6 bytes are just the node (MAC address) take it as it is.
return high + BitConverter.ToString(bytes.Skip(10).ToArray());
}).ToArray();
// sort ids using the generated sortedGuidStrings
Array.Sort(sortedGuidStrings, ids);
for (int i = 0; i < length - 1; i++)
{
if (ids[i] > ids[i + 1])
{
Console.WriteLine("sorting using sortedGuidStrings failed!");
return;
}
}
Console.WriteLine("sorting using sortedGuidStrings succeeded!");
}
Hopefully I understood your question correctly. It seems you are trying to sort the HEX representation of your Guids. That really means that you are sorting them alphabetically and not numerically.
Guids will be indexed by their byte value in the database. Here is a console app to prove that your Guids are numerically sequential:
using System;
using System.Linq;
using System.Numerics;
class Program
{
static void Main(string[] args)
{
//These are the sequential guids you provided.
Guid[] guids = new[]
{
"53cd98f2504a11e682838cdcd43024a7",
"7178df9d504a11e682838cdcd43024a7",
"800b5b69504a11e682838cdcd43024a7",
"9796eb73504a11e682838cdcd43024a7",
"c14c5778504a11e682838cdcd43024a7",
"c14c5779504a11e682838cdcd43024a7",
"d2324e9f504a11e682838cdcd43024a7",
"d2324ea0504a11e682838cdcd43024a7",
"da3d4460504a11e682838cdcd43024a7",
"e149ff28504a11e682838cdcd43024a7",
"f2309d56504a11e682838cdcd43024a7",
"f2309d57504a11e682838cdcd43024a7",
"fa901efd504a11e682838cdcd43024a7",
"fa901efe504a11e682838cdcd43024a7",
"036340af504b11e682838cdcd43024a7",
"11768c0b504b11e682838cdcd43024a7",
"2f57689d504b11e682838cdcd43024a7"
}.Select(l => Guid.Parse(l)).ToArray();
//Convert to BigIntegers to get their numeric value from the Guids bytes then sort them.
BigInteger[] values = guids.Select(l => new BigInteger(l.ToByteArray())).OrderBy(l => l).ToArray();
for (int i = 0; i < guids.Length; i++)
{
//Convert back to a guid.
Guid sortedGuid = new Guid(values[i].ToByteArray());
//Compare the guids. The guids array should be sequential.
if(!sortedGuid.Equals(guids[i]))
throw new Exception("Not sequential!");
}
Console.WriteLine("All good!");
Console.ReadKey();
}
}
I need to generate random strings for my automated tests. I use the following clumsy custom class to do it.
public class UniqueIdGenerator
{
static public string GenerateUniqueId(int idLength) // generates uniqueID for all fields (0-175 characters)
{
string uniqueID = "";
string initialID = Guid.NewGuid().ToString().Remove(35);
if (idLength <= 35)
{
uniqueID = Guid.NewGuid().ToString().Remove(idLength);
}
if (idLength > 35 & idLength <= 70)
{
uniqueID = initialID + Guid.NewGuid().ToString().Remove(idLength - 35);
}
if (idLength > 70 & idLength <= 105)
{
uniqueID = initialID + initialID + Guid.NewGuid().ToString().Remove(idLength - 70);
}
if (idLength > 105 & idLength <= 140)
{
uniqueID = initialID + initialID + initialID + Guid.NewGuid().ToString().Remove(idLength - 105);
}
if (idLength > 140 & idLength <= 175)
{
uniqueID = initialID + initialID + initialID + initialID + Guid.NewGuid().ToString().Remove(idLength - 140);
}
return uniqueID;
}
}
How can I simplify it to use for any natural number?
This is a very bad idea and you should not do this.
First off, you say that you need a random string, but guids are guaranteed to be unique, not random. Guids can be generated randomly, or they can be generated based on the current time, or they can be generated by assigning a range of guids to a particular machine and then generating them sequentially. There are any number of ways that you can generate unique identifiers, and only some of them are random. Guids make no guarantee of being random, so if you need randomness, you are using the wrong tool.
Second, you can't just take parts out of a guid and expect it to still be unique, any more than you can find the part of the airplane that "does the flying" and expect it to fly without the other parts of the airplane attached to it.
If you need a globally unique identifier then make a guid. It might not be random, but it will be unique. It does not need to be any longer, so do not add anything to it. It must not be shorter, otherwise it is no longer guaranteed to be unique.
If you need a random string then get a source of randomness -- either pseudo-random or crypto strength, as required -- and generate random characters from the alphabet of your choice, and string them together. You can predict its likelihood of being unique by performing a "birthday problem" collision analysis on it. An inexact but close-enough computation is: take the number of possible characters in the alphabet. Raise that to the power of half the length of the string. That is the number of strings you have to generate before it becomes likely that you've generated at least one string twice.
For example, if you have an alphabet of 0123456789ABCDEF, then that's 16 characters. If the string is 12 letters long, then compute 16^6. That's about 16 million, so you're likely to have generated the same string twice somewhere within the first 16 million strings.
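As a small sketch of that rule of thumb (the class and method names are hypothetical, and the result is only the rough estimate described above, not an exact probability):

```csharp
using System;

public static class CollisionRuleOfThumb
{
    // Rough number of random strings you can generate before a repeat
    // becomes likely: alphabetSize raised to half the string length.
    public static double LikelyCollisionThreshold(int alphabetSize, int length)
    {
        return Math.Pow(alphabetSize, length / 2.0);
    }
}
```

For the hex example, CollisionRuleOfThumb.LikelyCollisionThreshold(16, 12) gives 16,777,216.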
If you want a random string of arbitrary length that is hex characters, you could use:
public static string GetRandomString(int length)
{
var rng = new RNGCryptoServiceProvider();
var padded = (int)Math.Ceiling(length / 2.0m);
var bytes = new byte[padded];
rng.GetBytes(bytes);
return bytes.Aggregate(new StringBuilder(), (f, s) => f.AppendFormat("{0:X2}", s)).ToString(0, length);
}
Try the following:
var builder = new StringBuilder();
while(builder.Length < idLength)
{
builder.Append(Guid.NewGuid().ToString());
}
return builder.ToString(0, idLength);
If all you require is a method capable of generating garbage for testing purposes, then perhaps the following would be more suitable:
private static Random Random = new Random();
public static string GenerateRandomString(int length, string characterSet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
{
    var builder = new StringBuilder();
    while(builder.Length < length)
    {
        // C# exposes the characters through the string indexer
        builder.Append(characterSet[Random.Next(characterSet.Length)]);
    }
    return builder.ToString();
}
As an aside, when randomising values for tests, I prefer to still give my properties descriptive names as I find that the fail messages are generally far more helpful. I use an extension method for this purpose:
private static Random Random = new Random();
public static string Randomise(this string value)
{
return value + Random.Next();
}
Which is then usable in my tests like so:
var customer = new Customer { Id = "CustomerId".Randomise() };
This may not be applicable in this case, since a string of a given length is required, however it does save a bit of time if/when a test fails.
Note : A GUID with a user-defined length is no longer a GUID.
That said, if you want to generate a string that resembles a GUID of a given length, try something like the following (untested) code.
string myMethod(int length)
{
StringBuilder myGuidLikeString = new StringBuilder();
while(myGuidLikeString.Length < length)
{
myGuidLikeString.Append(Guid.NewGuid().ToString());
}
return myGuidLikeString.ToString(0,length);
}
Oops - rich.okelly beat me to it with an almost identical method.
I have a situation whereby I need to create tens of thousands of unique numbers. However, these numbers must be 9 digits long and cannot contain any 0s. My current approach is to generate 9 digits (1-9) and concatenate them, adding the number to the list if it is not already in it. E.g.
public void generateIdentifiers(int quantity)
{
uniqueIdentifiers = new List<string>(quantity);
while (this.uniqueIdentifiers.Count < quantity)
{
string id = string.Empty;
id += random.Next(1,10);
id += random.Next(1,10);
id += random.Next(1,10);
id += " ";
id += random.Next(1,10);
id += random.Next(1,10);
id += random.Next(1,10);
id += " ";
id += random.Next(1,10);
id += random.Next(1,10);
id += random.Next(1,10);
if (!this.uniqueIdentifiers.Contains(id))
{
this.uniqueIdentifiers.Add(id);
}
}
}
However at about 400,000 the process really slows down as more and more of the generated numbers are duplicates. I am looking for a more efficient way to perform this process, any help would be really appreciated.
Edit: - I'm generating these - http://www.nhs.uk/NHSEngland/thenhs/records/Pages/thenhsnumber.aspx
As others have mentioned, use a HashSet<T> instead of a List<T>.
Furthermore, using StringBuilder instead of simple string operations will gain you another 25%. If you can use numbers instead of strings, you win, because it only takes a third or fourth of the time.
var quantity = 400000;
var uniqueIdentifiers = new HashSet<int>();
while (uniqueIdentifiers.Count < quantity)
{
int i=0;
i = i*10 + random.Next(1,10);
i = i*10 + random.Next(1,10);
i = i*10 + random.Next(1,10);
i = i*10 + random.Next(1,10);
i = i*10 + random.Next(1,10);
i = i*10 + random.Next(1,10);
i = i*10 + random.Next(1,10);
i = i*10 + random.Next(1,10);
i = i*10 + random.Next(1,10);
uniqueIdentifiers.Add(i);
}
It takes about 270 ms on my machine for 400,000 numbers and about 700 ms for 1,000,000, and that is without any parallelism.
Because it uses a HashSet<T> instead of a List<T>, this algorithm runs in O(n), i.e. the duration grows linearly. 10,000,000 values therefore take about 7 seconds.
This suggestion may or may not be popular.... it depends on people's perspective. Because you haven't been too specific about what you need them for, how often, or the exact number, I will suggest a brute force approach.
I would generate a hundred thousand numbers - shouldn't take very long at all, maybe a few seconds? Then use Parallel LINQ to do a Distinct() on them to eliminate duplicates. Then use another PLINQ query to run a regex against the remainder to eliminate any with zeroes in them. Then take the top x thousand. (PLINQ is brilliant for ripping through large tasks like this). If needed, rinse and repeat until you have enough for your needs.
On a decent machine it will just about take you longer to write this simple function than it will take to run it. I would also query why you have 400K entries to test when you state you actually need "tens of thousands"?
The trick here is that you only need tens of thousands of unique numbers. Theoretically there are 387,420,489 (9^9) possibilities, but why care when you need so many fewer?
Once you realize that you can cut down on the combinations that much then creating enough unique numbers is easy:
long[] numbers = { 1, 3, 5, 7 }; //note that we just take a few digits, enough to create the number of combinations we might need
// nine generators, one per digit of the 9-digit number
var list = (from i0 in numbers
from i1 in numbers
from i2 in numbers
from i3 in numbers
from i4 in numbers
from i5 in numbers
from i6 in numbers
from i7 in numbers
from i8 in numbers
select i0 + i1 * 10 + i2 * 100 + i3 * 1000 + i4 * 10000 + i5 * 100000 + i6 * 1000000 + i7 * 10000000 + i8 * 100000000).ToList();
This snippet creates a list of 262,144 (4^9) valid unique nine-digit numbers pretty much instantly.
Try avoiding checks making sure that you always pick up a unique number:
static char[] base9 = "123456789".ToCharArray();
static string ConvertToBase9(int value) {
int num = 9;
char[] result = new char[9];
for (int i = 8; i >= 0; --i) {
result[i] = base9[value % num];
value = value / num;
}
return new string(result);
}
public static void generateIdentifiers(int quantity) {
var uniqueIdentifiers = new List<string>(quantity);
// we have 387420489 (9^9) possible numbers of 9 digits in base 9.
// if we choose a number that is prime to that we can easily get always
// unique numbers
Random random = new Random();
int inc = 386000000;
int seed = random.Next(0, 387420489);
while (uniqueIdentifiers.Count < quantity) {
uniqueIdentifiers.Add(ConvertToBase9(seed));
seed += inc;
seed %= 387420489;
}
}
I'll try to explain the idea behind with small numbers...
Suppose you have at most 7 possible combinations. We choose a number that is prime to 7, e.g. 3, and a random starting number, e.g. 4.
At each round, we add 3 to our current number, and then we take the result modulo 7, so we get this sequence:
4 -> (4 + 3) % 7 = 0
0 -> (0 + 3) % 7 = 3
3 -> (3 + 3) % 7 = 6
6 -> (6 + 3) % 7 = 2
In this way, we generate all the values from 0 to 6 in a non-consecutive way. In my example we do the same, but we have 9^9 possible combinations, and as a number coprime to that I chose 386000000 (you just have to avoid multiples of 3).
Then, I pick up the number in the sequence and I convert it to base 9.
I hope this is clear :)
I tested it on my machine, and generating 400k unique values took ~ 1 second.
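The small-number walk-through can be checked with a short sketch. The class and method names are hypothetical, and it assumes that step is smaller than modulus and shares no common factor with it.

```csharp
using System;
using System.Collections.Generic;

public static class CoprimeStepper
{
    // Repeatedly adds step modulo modulus. When step and modulus are
    // coprime, the modulus values produced are all distinct, so every
    // value in [0, modulus) appears exactly once.
    public static List<int> Sequence(int start, int step, int modulus)
    {
        var result = new List<int>(modulus);
        int current = start % modulus;
        for (int i = 0; i < modulus; i++)
        {
            current = (current + step) % modulus;
            result.Add(current);
        }
        return result;
    }
}
```

CoprimeStepper.Sequence(4, 3, 7) yields 0, 3, 6, 2, 5, 1, 4: all seven values, each exactly once, in a non-consecutive order.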
Maybe this will be faster:
//we can generate a first number which, in base 9, lies between 88888888 and 888888888
//we can't start from zero because that would cause a great number of 1 digits at the beginning
int randNumber = random.Next((int)Math.Pow(9, 8) - 1, (int)Math.Pow(9, 9));
//now we change our number to base 9, but we add 1 to each digit of the number
StringBuilder builder = new StringBuilder();
for (int i=(int)Math.Pow(9,8); i>0;i= i/9)
{
builder.Append(randNumber / i +1);
randNumber = randNumber % i;
}
id = builder.ToString();
Looking at the solutions already posted, mine seems fairly basic. But it works, and generates 1 million values in approximately 1s (10 million in 11s).
public static void generateIdentifiers(int quantity)
{
HashSet<int> uniqueIdentifiers = new HashSet<int>();
while (uniqueIdentifiers.Count < quantity)
{
int value = random.Next(111111111, 1000000000); // upper bound is exclusive
if (!value.ToString().Contains('0') && !uniqueIdentifiers.Contains(value))
uniqueIdentifiers.Add(value);
}
}
Use a string array or a StringBuilder when working with string additions.
Moreover, your code is not efficient because, after generating many IDs, your list may already hold a newly generated ID, so the while loop will run more often than you need.
Use for loops and generate your IDs from the loop without randomizing. If random IDs are required, again use for loops, generate more than you need over some generation interval, and then select as many as you need from that list at random.
Use the code below to have a static list and fill it when your program starts. I will add a second piece of code to generate a random ID list later. [I'm a little busy]
public static Random RANDOM = new Random();
public static List<int> randomNumbers = new List<int>();
public static List<string> randomStrings = new List<string>();
private void fillRandomNumbers()
{
int i = 100;
while (i < 1000)
{
if (i.ToString().Contains('0') == false)
{
randomNumbers.Add(i);
}
i++; // advance, otherwise the loop never terminates
}
}
I think the first thing would be to use StringBuilder instead of concatenation - you'll be pleasantly surprised.
Another thing - use a more efficient data structure, for example HashSet<> or Hashtable.
If you could drop the quite odd requirement not to have zeros - then you could of course use just one random operation, and then format your resulting number the way you want.
I think @slugster is broadly right - although you could run two parallel processes, one to generate numbers, the other to verify them and add them to the list of accepted numbers when verified. Once you have enough, signal the original process to stop.
Combine this with other suggestions - using more efficient and appropriate data structures - and you should have something that works acceptably.
However the question of why you need such numbers is also significant - this requirement seems like one that should be analysed.
Something like this?
public List<string> generateIdentifiers2(int quantity)
{
var uniqueIdentifiers = new List<string>(quantity);
while (uniqueIdentifiers.Count < quantity)
{
var sb = new StringBuilder();
// three groups of three digits, to give 9 digits in total
sb.Append(random.Next(111, 1000));
sb.Append(" ");
sb.Append(random.Next(111, 1000));
sb.Append(" ");
sb.Append(random.Next(111, 1000));
var id = sb.ToString();
id = new string(id.ToList().ConvertAll(x => x == '0' ? char.Parse(random.Next(1, 10).ToString()) : x).ToArray());
if (!uniqueIdentifiers.Contains(id))
{
uniqueIdentifiers.Add(id);
}
}
return uniqueIdentifiers;
}
I am going to a directory picking up some files and then adding them to a Dictionary.
The first time through the loop the key needs to be A, the second time B, etc. After 26/Z the number represents different characters, and from 33 it starts at lowercase a, up to 49 which is lowercase q.
Without having a massive if statement to say if i == 1 then Key is 'A' etc., how can I keep this code tidy?
Sounds like you just need to keep an index of where you've got to, then some mapping function:
int index = 0;
foreach (...)
{
...
string key = MapIndexToKey(index);
dictionary[key] = value;
index++;
}
...
// Keys as per comments
private static readonly List<string> Keys =
"ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopq"
.Select(x => x.ToString())
.ToList();
// This doesn't really need to be a separate method at the moment, but
// it means it's flexible for future expansion.
private static string MapIndexToKey(int index)
{
return Keys[index];
}
EDIT: I've updated the MapIndexToKey method to make it simpler. It's not clear why you want a string key if you only ever use a single character though...
Another edit: I believe you could actually just use:
string key = ((char) (index + 'A')).ToString();
instead of having the mapping function at all, given your requirements, as the characters are contiguous in Unicode order from 'A'...
Keep incrementing from 101 to 132, ignoring the missing sequence, and convert them to characters. http://www.asciitable.com/
Use the remainder (divide by 132) to identify the second loop.
This gives you the opportunity to map letters to specific numbers, perhaps not alphabet ordered.
var letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
.Select((chr, index) => new {character = chr, index = index + 1 });
foreach(var letter in letters)
{
int index = letter.index;
char chr = letter.character;
// do something
}
How about:
for(int i=0; i<26; ++i)
{
dict[(char)('A'+ (i % 26))] = GetValueFor(i);
}
Does anyone know of a method to convert words like "first", "tenth" and "one hundredth" to their numeric equivalent?
Samples:
"first" -> 1,
"second" -> 2,
"tenth" -> 10,
"hundredth" -> 100
Any algorithm will suffice but I'm writing this in C#.
EDIT
It ain't pretty and only works with one word at a time but it suits my purposes. Maybe someone can improve it but I'm out of time.
public static int GetNumberFromOrdinalString(string inputString)
{
string[] ordinalNumberWords = { "", "first", "second", "third", "fourth", "fifth", "sixth", "seventh", "eighth", "ninth", "tenth", "eleventh", "twelfth", "thirteenth", "fourteenth", "fifteenth", "sixteenth", "seventeenth", "eighteenth", "nineteenth", "twentieth" };
string[] ordinalNumberWordsTens = { "", "tenth", "twentieth", "thirtieth", "fortieth", "fiftieth", "sixtieth", "seventieth", "eightieth", "ninetieth" };
string[] ordinalNumberWordsExtended = {"hundredth", "thousandth", "millionth", "billionth" };
if (string.IsNullOrEmpty(inputString) || inputString.Length < 5 || inputString.Contains(" ")) return 0;
if (ordinalNumberWords.Contains(inputString) || ordinalNumberWordsTens.Contains(inputString))
{
var outputMultiplier = ordinalNumberWords.Contains(inputString) ? 1 : 10;
var arrayToCheck = ordinalNumberWords.Contains(inputString) ? ordinalNumberWords : ordinalNumberWordsTens;
// Use the loop counter to get our output integer.
for (int x = 0; x < arrayToCheck.Count(); x++)
{
if (arrayToCheck[x] == inputString)
{
return x * outputMultiplier;
}
}
}
// Check if the number is one of our extended numbers and return the appropriate value.
if (ordinalNumberWordsExtended.Contains(inputString))
{
return inputString == ordinalNumberWordsExtended[0] ? 100 : inputString == ordinalNumberWordsExtended[1] ? 1000 : inputString == ordinalNumberWordsExtended[2] ? 1000000 : 1000000000;
}
return 0;
}
I've never given this much thought beyond I know the word "and" is supposed to be the transition from whole numbers to decimals. Like
One Hundred Ninety-Nine Dollars and Ten Cents
not
One Hundred and Ninety-Nine Dollars.
Anyways any potential solution would have to parse the input string, raise any exceptions or otherwise return the value.
But first you'd have to know "the rules" This seems to be very arbitrary and based on tradition but this gentleman seems as good a place as any to start:
Ask Dr. Math
I think you might end up having to map strings to values up to the maximum range you expect and then parse the string in order and place values as such. Since there's very little regular naming convention across and within order of magnitude, I don't think there's an elegant or easy way to parse a word to get its numeric value. Luckily, depending on the format, you probably only have to map every order of magnitude. For example, if you only expect numbers 0-100 and they are inputted as "ninety-nine" then you only need to map 0-9, then 10-100 in steps of 10.
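A minimal sketch of that mapping approach for cardinal words, assuming input such as "ninety-nine" or "forty two". The class name, the word table and the handling of "hundred" are illustrative assumptions; the teens are mapped explicitly because they do not follow the tens-plus-units pattern, and the sketch does not handle "and" or anything above the hundreds.

```csharp
using System;
using System.Collections.Generic;

public static class NumberWords
{
    private static readonly Dictionary<string, int> Values =
        new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase)
    {
        {"zero", 0}, {"one", 1}, {"two", 2}, {"three", 3}, {"four", 4},
        {"five", 5}, {"six", 6}, {"seven", 7}, {"eight", 8}, {"nine", 9},
        {"ten", 10}, {"eleven", 11}, {"twelve", 12}, {"thirteen", 13},
        {"fourteen", 14}, {"fifteen", 15}, {"sixteen", 16},
        {"seventeen", 17}, {"eighteen", 18}, {"nineteen", 19},
        {"twenty", 20}, {"thirty", 30}, {"forty", 40}, {"fifty", 50},
        {"sixty", 60}, {"seventy", 70}, {"eighty", 80}, {"ninety", 90},
        {"hundred", 100}
    };

    // Parses the words in order, adding each mapped value to the total.
    public static int Parse(string text)
    {
        int total = 0;
        var words = text.Split(new[] { ' ', '-' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string word in words)
        {
            int value;
            if (!Values.TryGetValue(word, out value))
                throw new FormatException("Unknown number word: " + word);
            if (value == 100)
                total = Math.Max(total, 1) * 100; // "hundred" multiplies what came before
            else
                total += value;
        }
        return total;
    }
}
```

For instance, NumberWords.Parse("ninety-nine") returns 99 and NumberWords.Parse("one hundred") returns 100; an unrecognised word raises a FormatException rather than being silently ignored.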