I need to create a Dictionary that expresses a mapping between each char in an alphabet and another char in that alphabet, where both the key and value are unique -- like a very simple cipher that expresses how to code/decode a message. There can be no duplicate keys or values.
Does anyone see what is wrong with this code? It is still producing duplicate values in the mapping despite the fact that on each iteration the pool of available characters decreases for each value already used.
string source_alphabet = _alphabet; //ie "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
string target_alphabet = _alphabet;
Dictionary<char, char> _map = new Dictionary<char, char>();
for (int i = 0; i < source_alphabet.Length; i++)
{
    int random = _random.Next(target_alphabet.Length - 1); //select a random index
    char _output = target_alphabet[random]; //get the char at the random index
    _map.Add(source_alphabet[i], _output); //add to the dictionary
    target_alphabet = target_alphabet.Replace(_output.ToString(), string.Empty);
    // remove the char we just added from the remaining alphabet
}
Thanks.
I would consider performing a simple Fisher-Yates shuffle over one or both copies of the alphabet; you can then simply iterate over the output and put together your mapping.
Pseudocode:
Shuffle(sequence1)
Shuffle(sequence2)
for index 0 to length - 1
    dictionary add sequence1[index], sequence2[index]
If you select a random value independently each time, there is a high probability of a collision, and therefore of a non-unique value being selected. The usual answer is to shuffle, then select in order.
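The shuffle-then-pair idea above can be sketched in C# roughly as follows. This is only an illustration: the CipherMap class and its Shuffle and Build methods are hypothetical names, only the target side is shuffled, and the alphabet is assumed to contain no repeated characters.

```csharp
using System;
using System.Collections.Generic;

public static class CipherMap
{
    // Fisher-Yates shuffle: walk from the end of the array, swapping each
    // element with a randomly chosen element at or before its position.
    public static char[] Shuffle(string alphabet, Random rng)
    {
        char[] chars = alphabet.ToCharArray();
        for (int i = chars.Length - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1); // upper bound is exclusive, so 0 <= j <= i
            char tmp = chars[i];
            chars[i] = chars[j];
            chars[j] = tmp;
        }
        return chars;
    }

    // Pairs each source character with the character at the same position
    // in a shuffled copy. Keys and values are both unique as long as the
    // alphabet has no repeated characters (an assumption of this sketch).
    public static Dictionary<char, char> Build(string alphabet, Random rng)
    {
        char[] shuffled = Shuffle(alphabet, rng);
        var map = new Dictionary<char, char>();
        for (int i = 0; i < alphabet.Length; i++)
        {
            map.Add(alphabet[i], shuffled[i]);
        }
        return map;
    }
}
```

With the question's fields, CipherMap.Build(_alphabet, _random) would produce the mapping. A character may still map to itself; if that is not acceptable, the shuffle would need an extra constraint.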
"a quick fix" though not optimal would be (if mapping A to A is NOT allowed)
int random = _random.Next(target_alphabet.Length); // the upper bound of Next is exclusive
while (source_alphabet[i] == target_alphabet[random])
{
    random = _random.Next(target_alphabet.Length);
}
Note that this loops forever if the only remaining character is the same as the source character.
If mapping A to A is allowed, ignore the above change, but at least change the last line to:
target_alphabet = target_alphabet.Remove(random, 1);
I guess you could add a second "for" loop over target_alphabet inside the existing "for" loop, with a small "if" condition that checks that the characters are not the same: continue the inner loop if they are the same, or break if they are not.
This works:
for (int i = 0; i < source_alphabet.Length; i++)
{
    // select a random index; the upper bound of Next is exclusive, so pass Length, not Length - 1
    int random = _random.Next(target_alphabet.Length);
    char _output = target_alphabet[random]; //get the char at the random index
    _map.Add(source_alphabet[i], _output); //add to the dictionary
    // remove the char we just added from the remaining alphabet
    target_alphabet = target_alphabet.Remove(random, 1);
}
I want to fill a small array with unique values from a bigger array. How do I check the uniqueness?
Here's what I have:
int[] numbers45 = new int[45];
for (int i = 0; i <= 44; i++) // I create a big array
{
    numbers45[i] = i + 1;
}
Random r = new Random();
int[] draw5 = new int[5]; //new small array
Console.WriteLine("The 5 draws are:");
for (int i = 1; i <= 4; i++)
{
    draw5[i] = numbers45[r.Next(numbers45.Length)]; //I fill the small array with values from the big one. BUT the values might not be unique.
    Console.WriteLine(draw5[i]);
}
There are multiple ways to do what you are asking.
First off, though, I would recommend using one of the classes that wrap the array type and add some extra functionality you could use (in this case a List would probably be a perfect structure to use)!
One way to handle this is to check whether the value is already in the draw5 collection, for example with the List<T>.Contains(T) method, and if it is, try another value.
Personally, though, I would probably shuffle the first array with the LINQ OrderBy method, using a random number as the sort key, like:
numbers45.OrderBy(o => random.Next());
That way the numbers are already in random order and unique when they are added to the second list.
And a side note: remember that array indexes start at 0. In your second loop you start at 1 and go to 4, so you never set a value at the first index and only draw four numbers.
You could just run for (int i=0;i<5;i++) to get it right.
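The Contains-and-retry approach described above could look roughly like this. It is a sketch under assumptions: the Lottery class and DrawUnique method are hypothetical names, and the pool is assumed to hold at least as many distinct values as you want to draw.

```csharp
using System;
using System.Collections.Generic;

public static class Lottery
{
    // Draws `count` distinct values from `pool`, retrying whenever a value
    // has already been drawn. Assumes `pool` contains at least `count`
    // distinct values; with fewer, the loop would never finish.
    public static List<int> DrawUnique(int[] pool, int count, Random rng)
    {
        if (count > pool.Length)
        {
            throw new ArgumentException("Cannot draw more values than the pool holds.");
        }

        var draws = new List<int>();
        while (draws.Count < count)
        {
            int candidate = pool[rng.Next(pool.Length)];
            if (!draws.Contains(candidate))
            {
                draws.Add(candidate);
            }
        }
        return draws;
    }
}
```

For the question's data this would be called as Lottery.DrawUnique(numbers45, 5, r). Retrying is cheap when drawing 5 out of 45, but it gets slower as the number of draws approaches the pool size, which is where shuffling first becomes the better choice.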
Inspired by Jite's answer, I changed it to use a Guid to randomize the numbers:
var randomized = numbers45.OrderBy(o => Guid.NewGuid().ToString());
Then you could take the draws by:
var draws = randomized.Take(5).ToArray();
// Assumes `random` is a System.Random and `array` is the source array of 45 values.
HashSet<int> hs = new HashSet<int>();
while (hs.Count < 5)
{
    // HashSet<T>.Add ignores values that are already in the set
    hs.Add(array[random.Next(45)]);
}
I have a list of strings which have been read in from a dictionary file (sorted into alphabetical order). I want to create an index of the last position of each starting letter, so if there were 1000 words beginning with A it would be recorded as position 999 (because arrays start from 0). 1000 words beginning with B would mean the end position of B is 1999 and so on. These position values will be stored in an int array.
The only way I can think to do this is loop through the whole list, and have lots of else if statements to look at the first letter of the word. Not the most elegant solution.
Does anyone know of a simple way to do this, rather than 26 if statements?
Edit: The purpose of this is to generate random words. If I wanted a word beginning with B I would generate a random number between 1000 and 1999 and get the word from that position in the list.
Well, you could create a dictionary using LINQ:
// Note: assumes no empty words
Dictionary<char, int> lastEntries = words
    .Select((value, index) => new { value, index })
    .GroupBy(pair => pair.value[0])
    .ToDictionary(g => g.Key, g => g.Max(p => p.index));
Or more usefully, keep the first and last indexes:
Dictionary<char, Tuple<int, int>> entryMinMax = words
    .Select((value, index) => new { value, index })
    .GroupBy(pair => pair.value[0])
    .ToDictionary(g => g.Key,
                  g => Tuple.Create(g.Min(p => p.index), g.Max(p => p.index)));
Alternatively, if the point is to effectively group the words by first letter, just do that, using a lookup:
ILookup<char, string> lookup = words.ToLookup(word => word[0]);
Then you can use:
char first = 'B'; // Or whatever
Random rng = new Random(); // But don't create a new one each time...
var range = lookup[first];
var count = range.Count();
if (count == 0)
{
    // No words starting with that letter! Handle this case before indexing.
}
int index = rng.Next(count);
var randomWord = range.ElementAt(index);
I would handle this a different way, and it doesn't require it to be ordered.
public List<string> GetAllStringsStartingWith(char startsWith, List<string> allWords)
{
    List<string> letterSpecificWords = allWords.FindAll(word => word.ToLower()[0].Equals(startsWith));
    return letterSpecificWords;
}
From here you now have a list containing only the words that start with the given letter (for example "a"). You can pass whatever letter you need as startsWith, and it will always find all the words beginning with that letter.
Notes:
word.ToLower() is used to make sure the word is compared in lowercase. Since the letter you're looking for is a variable, you'll want to lowercase that as well.
You still need to handle the random integer, but you now have an accurate count (letterSpecificWords.Count) to use.
This assumes no empty entries in the list.
word.ToLower()[0] gets the first character.
This might be a case of an XY problem. Why do you need the index of the last occurrence of each letter? Chances are, this isn't really what you want to do.
To answer your question anyway, for each letter, you could use the FindLastIndex method.
int index = myList.FindLastIndex(i => i.ToLower()[0] == 'a');
I like Jon Skeet's method better though since you don't have to loop through each letter.
You could loop through the list with a for loop and compare the first letter of the current element to the first letter of the next element. If the letter is the same continue the loop, if it is different then store the index of the next element and then continue the loop.
To retrieve the last matching word:
words.LastOrDefault(i => char.ToLower(i[0]) == 'a');
To get its index:
words.FindLastIndex(i => char.ToLower(i[0]) == 'a');
You can do this in a single pass:
public Dictionary<char, int> GetCharIndex(IList<string> words)
{
    if (words == null || !words.Any()) throw new ArgumentException("words can't be null or empty");
    Dictionary<char, int> charIndex = new Dictionary<char, int>();
    char prevLetter = words[0][0];
    for (int i = 1; i < words.Count; i++)
    {
        char letter = words[i][0];
        if (letter != prevLetter) //change of first letter of the word -> add previous letter to dictionary
        {
            charIndex.Add(prevLetter, i - 1);
            prevLetter = letter;
        }
    }
    charIndex.Add(words[words.Count - 1][0], words.Count - 1); //special case for last word
    return charIndex;
}
I had an interview for a Jr. developer position a few days ago, and they asked:
"If you had an array of letters "a" and "b" how would you write a method to count how many instances of those letters are in the array?"
I said that you would have a for loop with an if else statement that would increment 1 of 2 counter variables. After that, though, they asked how I would solve that same problem, if the array could contain any letter of the alphabet. I said that I would go about it the same way, with a long IF statement, or a switch statement. In hindsight, that doesn't seem so efficient; is there an easier way to go about doing this?
You could declare an array of size 256 (the number of possible character codes), zero it, and simply increment the element that corresponds to the character code you read.
For example, if you read an 'a', the corresponding ASCII code is 97, so you increment array[97]. You can reduce the amount of memory by subtracting 97 from the code (if you know the input is going to be letters only). You also need to decide what to do with capital letters (do you consider them different or not?); if you count them together with lowercase, you need to subtract 65 from capitals instead.
So at the end code would look like this:
int counts[26] = {0}; // one counter per letter a - z
char a = get_next_char();
if (is_capital(a)) {
    counts[a - 65]++;
}
else
{
    counts[a - 97]++;
}
This code assumes that 'A' counts the same as 'a'.
If that's not the case, you need a different translation in the ifs, but you can probably figure out the idea now. This saves a lot of comparisons compared with your approach.
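The same counting-array idea could be sketched in C# like this. The LetterCounter class is a hypothetical name, and the sketch assumes that upper and lower case are counted together and that anything outside A-Z and a-z is skipped.

```csharp
using System;

public static class LetterCounter
{
    // Returns 26 counters: index 0 for 'a'/'A' through index 25 for 'z'/'Z'.
    // Characters outside A-Z and a-z are ignored (an assumption of this sketch).
    public static int[] Count(char[] letters)
    {
        int[] counts = new int[26];
        foreach (char c in letters)
        {
            if (c >= 'a' && c <= 'z')
            {
                counts[c - 'a']++; // same as subtracting 97
            }
            else if (c >= 'A' && c <= 'Z')
            {
                counts[c - 'A']++; // same as subtracting 65
            }
        }
        return counts;
    }
}
```

After int[] counts = LetterCounter.Count(letters), the number of b's is counts['b' - 'a']. Writing 'a' and 'A' instead of 97 and 65 keeps the intent visible without changing the arithmetic.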
Depending on whether the objective is CPU efficiency, memory efficiency, or developer efficiency, you could just do:
foreach(var grp in theString.GroupBy(c => c)) {
    Console.WriteLine("{0}: {1}", grp.Key, grp.Count());
}
Not awesome efficiency, but fine for virtually all non-pathological scenarios. In real scenarios, due to Unicode, I'd probably use a dictionary as a counter - Unicode is too big to pre-allocate an array.
Dictionary<char, int> counts = new Dictionary<char, int>();
foreach(char c in theString) {
    int count;
    if(!counts.TryGetValue(c, out count)) count = 0;
    counts[c] = count + 1;
}
foreach(var pair in counts) {
    Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
}
You can create a Dictionary<string, int>, then iterate through the array, check whether each element exists as a key in the dictionary, and either increment its value or add it with a count of 1.
Dictionary<string, int> counter = new Dictionary<string, int>();
foreach(var item in items)
{
    if(counter.ContainsKey(item))
    {
        counter[item] = counter[item] + 1;
    }
    else
    {
        counter[item] = 1; // first occurrence of this element
    }
}
Here is a good example that may answer your question:
http://www.dotnetperls.com/array-find
string[] array1 = { "cat", "dog", "carrot", "bird" };
//
// Find first element starting with substring.
//
string value1 = Array.Find(array1,
    element => element.StartsWith("car", StringComparison.Ordinal));
//
// Find first element of three characters length.
//
string value2 = Array.Find(array1,
    element => element.Length == 3);
//
// Find all elements not greater than four letters long.
//
string[] array2 = Array.FindAll(array1,
    element => element.Length <= 4);
Console.WriteLine(value1);
Console.WriteLine(value2);
Console.WriteLine(string.Join(",", array2));
I am going to a directory picking up some files and then adding them to a Dictionary.
The first time in the loop the key needs to be A, the second time B, etc. After 26/Z the number represents different characters, and from 33 it starts at lowercase a, up to 49, which is lowercase q.
Without having a massive if statement to say if i == 1 then the key is 'A' and so on, how can I keep this code tidy?
Sounds like you just need to keep an index of where you've got to, then some mapping function:
int index = 0;
foreach (...)
{
    ...
    string key = MapIndexToKey(index);
    dictionary[key] = value;
    index++;
}
...
// Keys as per comments
private static readonly List<string> Keys =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopq"
    .Select(x => x.ToString())
    .ToList();
// This doesn't really need to be a separate method at the moment, but
// it means it's flexible for future expansion.
private static string MapIndexToKey(int index)
{
    return Keys[index];
}
EDIT: I've updated the MapIndexToKey method to make it simpler. It's not clear why you want a string key if you only ever use a single character though...
Another edit: I believe you could actually just use:
string key = ((char) (index + 'A')).ToString();
instead of having the mapping function at all, given your requirements, as the characters are contiguous in Unicode order from 'A'...
Keep incrementing from 101 to 132, ignoring the missing sequence, and convert the numbers to characters. http://www.asciitable.com/
Use the remainder (divide by 132) to identify the second loop.
This gives you the opportunity to map letters to specific numbers, perhaps not alphabet ordered.
var letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    .Select((chr, index) => new {character = chr, index = index + 1 });
foreach(var letter in letters)
{
    int index = letter.index;
    char chr = letter.character;
    // do something
}
How about:
for(int i=0; i<26; ++i)
{
    dict[(char)('A'+ (i % 26))] = GetValueFor(i);
}
I faced the following problem: generate N unique alphanumeric strings from a restricted alphabet. Here's my solution in C#:
string Alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
Random generator = new Random();
const int ToGenerate = 10000;
const int CharactersCount = 4;
ArrayList generatedStrings = new ArrayList();
while( generatedStrings.Count < ToGenerate ) {
    string newString = "Prefix";
    for( int i = 0; i < CharactersCount; i++ ) {
        int index = generator.Next( Alphabet.Length );
        char character = Alphabet[index];
        newString += character;
    }
    if( !generatedStrings.Contains( newString ) ) {
        generatedStrings.Add( newString );
    }
}
for( int i = 0; i < generatedStrings.Count; i++ ) {
    System.Console.Out.WriteLine( generatedStrings[i] );
}
It generates 10K strings starting with "Prefix" and otherwise consisting of capital letters and numbers. The output looks good.
Now I see the following problem. The produced strings are for a scenario where they should be unlikely to be predicted by anyone. In my program the seed is time-dependent. Once someone knows the seed value he can run the same code and get the exact same strings. If he knows any two strings he can easily figure out my algorithm (since it is really naive) and attempt to brute-force the seed value - just enumerate all possible seed values until he sees the two known strings in the output.
Is there some simple change that could be done to my code to make the described attack less possible?
Well, how would he know the seed? Unless he knew the exact time you ran the code, that is very hard to do. But if you need something stronger, you can also create cryptographically strong random numbers via System.Security.Cryptography.RandomNumberGenerator.Create - something like:
var rng = System.Security.Cryptography.RandomNumberGenerator.Create();
byte[] buffer = new byte[4];
char[] chars = new char[CharactersCount];
for(int i = 0 ; i < chars.Length ; i++)
{
    rng.GetBytes(buffer);
    int nxt = BitConverter.ToInt32(buffer, 0);
    int index = nxt % Alphabet.Length;
    if(index < 0) index += Alphabet.Length; // % can be negative for a negative nxt
    chars[i] = Alphabet[index];
}
string s = new string(chars);
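Putting the cryptographic generator together with the uniqueness requirement from the question might look like the sketch below. The CodeGenerator class and Generate method are hypothetical names. It assumes a runtime that provides the static RandomNumberGenerator.GetInt32 method (.NET Core 3.0 or later), which picks an index in range directly and so should also avoid the slight bias of the % approach. A HashSet replaces the ArrayList so the duplicate check does not scan the whole list.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

public static class CodeGenerator
{
    private const string Alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    // Generates `toGenerate` distinct strings of the form prefix + random characters.
    // Assumes RandomNumberGenerator.GetInt32 is available (.NET Core 3.0 or later).
    public static HashSet<string> Generate(int toGenerate, int charactersCount, string prefix)
    {
        // There are only Alphabet.Length^charactersCount possible strings;
        // asking for more than that would loop forever.
        double capacity = Math.Pow(Alphabet.Length, charactersCount);
        if (toGenerate > capacity)
        {
            throw new ArgumentException("Not enough distinct strings for this length.");
        }

        var result = new HashSet<string>();
        char[] chars = new char[charactersCount];
        while (result.Count < toGenerate)
        {
            for (int i = 0; i < chars.Length; i++)
            {
                // GetInt32(n) returns a value in [0, n)
                chars[i] = Alphabet[RandomNumberGenerator.GetInt32(Alphabet.Length)];
            }
            result.Add(prefix + new string(chars)); // duplicates are ignored by the set
        }
        return result;
    }
}
```

CodeGenerator.Generate(10000, 4, "Prefix") would correspond to the question's settings. There is no seed to brute-force here, but with only 36^4 possible suffixes an attacker can still guess valid strings at a fixed rate, so longer strings may be worth considering.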
Well, it depends what you consider "simple".
You can "solve" your problem by using a "true" source of random numbers. You can try the free ones (random.org, fourmilab hotbits, etc), or buy one, depending on the sort of operation you're running.
An alternative (and perhaps a better one) is to not generate in advance, and instead generate on demand. But this may be a significant change to your business process/model.