How to check if letters are in a string? - C#

It's quite a hard question to ask, but I will try.
I have my 4 letters: m u g o. I also have some free string word(s).
Let's say: og ogg muogss. I am looking for a sensible method to check whether I can construct the word(s) using only my letters. Please note that once we have used g, we won't be able to use it again.
og - possible, because we need only **g** and **o**
ogg - not possible: we took **o** and **g**, and need a second **g**
muogss - not possible: we took all four letters, and still need **s**
So my tactic is to put my letters into a char array, remove them one by one, and check how many are left to build the word(s). But is it possible to do this somehow in a few lines? I do not know, maybe with regex?

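Your method is only a few lines...
public static bool CanBeMadeFrom(string word, string letters)
{
    // Select is lazy, so each IndexOf call sees the letters left after earlier removals.
    foreach (var i in word.Select(c => letters.IndexOf(c)))
    {
        if (i == -1) return false;      // this letter is not available (any more)
        letters = letters.Remove(i, 1); // use the letter up
    }
    return true;
}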

Here's a simple approach:
For your source word, create an array of size 26 and use it to count how many times each letter appears.
Do the same for each word in your dictionary.
Then compare the two.
If every letter occurs in the dictionary word no more often than in the source word, then the dictionary word can be made from the source word's letters. If not, it cannot.
C#-style sketch (assumes every word contains only the letters A to Z):
using System.Collections.Generic;

static class SubWords
{
    /** Converts characters to a 0 to 25 code representing alphabet position.
        This is specific to the English language and would need to be modified if used
        for other languages. */
    static int CharToLetter(char c) {
        return char.ToUpperInvariant(c) - 'A';
    }

    /** Given a source word and an array of other words to check, returns all
        words from the array which can be made from the letters of the source word. */
    public static List<string> CheckSubWords(string source, string[] dictionary) {
        var output = new List<string>();
        // Stores how many of each letter are in the source word.
        int[] sourceCount = new int[26]; // new int arrays are zero-initialized
        foreach (char c in source) {
            sourceCount[CharToLetter(c)]++;
        }
        foreach (string s in dictionary) {
            // Stores how many of each letter are in the dictionary word.
            int[] dictCount = new int[26];
            foreach (char c in s) {
                dictCount[CharToLetter(c)]++;
            }
            // Then we check that there exist no letters which appear more in the
            // dictionary word than the source word.
            bool isSubWord = true;
            for (int i = 0; i < 26; i++) {
                if (dictCount[i] > sourceCount[i]) {
                    isSubWord = false;
                    break;
                }
            }
            // If they're all less than or equal to, then we add it to the output.
            if (isSubWord) {
                output.Add(s);
            }
        }
        return output;
    }
}
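The 26-slot array above only works for the letters A to Z. If the words may contain other characters, the same counting idea can be sketched with a dictionary keyed by character. This is only an illustration: the names LetterBudget, Count and CanBuild are made up here and appear in none of the answers.

```csharp
using System.Collections.Generic;

public static class LetterBudget
{
    // Counts how many times each character occurs in the text.
    private static Dictionary<char, int> Count(string text)
    {
        var counts = new Dictionary<char, int>();
        foreach (char c in text)
        {
            counts.TryGetValue(c, out int n); // n stays 0 when c is not present yet
            counts[c] = n + 1;
        }
        return counts;
    }

    // True when every character of word is available often enough in letters.
    public static bool CanBuild(string word, string letters)
    {
        Dictionary<char, int> available = Count(letters);
        foreach (KeyValuePair<char, int> needed in Count(word))
        {
            available.TryGetValue(needed.Key, out int have);
            if (needed.Value > have) return false;
        }
        return true;
    }
}
```

With the letters from the question, LetterBudget.CanBuild("og", "mugo") is true, while "ogg" and "muogss" both give false.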

If your definition of words is any arbitrary permutation of the available characters, then why do you need a regex? Just make sure you use each character only once. Regex doesn't know what a "correct word" is, and it's better for your algorithm to avoid invalid characters in the first place than to use them AND then use a regex to make sure you didn't.

Related

Index out of range error in C#, and IsLetter() also selecting spaces instead of only letters

Console.WriteLine("Enter the plain text: ");
string ptxt=Console.ReadLine();
ptxt= ptxt.ToUpper();
char[] ptxtarr = ptxt.ToCharArray();
for(i=0; i<=ptxt.Length;i++)
{
if(char.IsLetter(ptxtarr[i]))
{
nptxtarr.Add(ptxtarr[j]);
j++;
}
}
I am getting an error at ptxtarr[i] for the last value of the array for the string "hello world".
Also, char.IsLetter() should select only letters, but it also selects the space. Why?
Why do you use j in the if clause?
It should be like this:
Console.WriteLine("Enter the plain text: ");
string ptxt = Console.ReadLine();
ptxt = ptxt.ToUpper();
char[] ptxtarr = ptxt.ToCharArray();
for (i = 0; i < ptxt.Length; i++)   // < rather than <=, so i never runs past the last element
{
    if (char.IsLetter(ptxtarr[i]))
    {
        nptxtarr.Add(ptxtarr[i]);
    }
}
Using that j variable is troublesome: every time a space is skipped, j falls one further behind i, so the spaces get copied and the result loses as many trailing characters as there are spaces.
For example, with the string 'hello world' your code returns 'HELLO WORL'. When your code encounters the space, it does not increment j, so when the next letter is found the character it adds is the space, which is the character right after the last one you added.
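The effect of the lagging j index can be reproduced in isolation. The sketch below is only an illustration: IndexBugDemo, Buggy and Fixed are made-up names, and both loops use i < text.Length so that the out-of-range error does not mask the behaviour being shown.

```csharp
using System.Collections.Generic;

public static class IndexBugDemo
{
    // Mirrors the question's loop: tests position i but copies position j.
    public static string Buggy(string text)
    {
        var result = new List<char>();
        int j = 0;
        for (int i = 0; i < text.Length; i++)
        {
            if (char.IsLetter(text[i]))
            {
                result.Add(text[j]); // wrong index: j lags behind i after a space
                j++;
            }
        }
        return new string(result.ToArray());
    }

    // Copies the same character that was tested.
    public static string Fixed(string text)
    {
        var result = new List<char>();
        for (int i = 0; i < text.Length; i++)
        {
            if (char.IsLetter(text[i]))
            {
                result.Add(text[i]);
            }
        }
        return new string(result.ToArray());
    }
}
```

For the input "HELLO WORLD", Buggy returns "HELLO WORL" (the space is copied and the final D is lost), while Fixed returns "HELLOWORLD".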
A few points.
Arrays in C# use zero-based indexing.
i<=ptxt.Length makes the loop body look for element n+1 in an array of length n.
There are multiple ways to fix this problem.
for(i=0; i<ptxt.Length;i++)
{
if(char.IsLetter(ptxtarr[i]))
{
// logic here
}
}
or
foreach (char c in ptxtarr)
{
if(char.IsLetter(c))
{
// logic here
}
}
Alternatively, we can also use LINQ.
if (ptxtarr.All(char.IsLetter)) // Verifies every character is a letter.
{
    // logic here.
    nptxtarr = ptxtarr;
}
A char[] uses zero-based indexing. That means that if the array has three elements, the indices are 0, 1 and 2, and the length of the array is 3, which is the number of elements in the array.
To solve your problem, do the following:
Change your loop condition like this:
for(i=0; i<ptxt.Length;i++)
{
//Do your stuff here
}
Or use foreach instead:
foreach (char c in ptxtarr)
{
if(char.IsLetter(c))
{
nptxtarr.Add(c);
}
}
Oh please stop this nonsense.
The reason it appears that Char.IsLetter() thinks that spaces are letters is that your index is not your incrementor. You increment i in your for loop declaration, but you're using a separate variable, j, when you select from your character array. Since you only increment j when you hit a letter, but i increments on its own, your loop correctly skips the space at position i, but the lagging j still points at that space when the next letter is found, so the space ends up in your result array.
For future reference, include all relevant code. We're clearly missing the declarations for j and nptxtarr.
This will accomplish what you're looking to do without the unsightly loops or bugs:
Console.WriteLine("Enter the plain text: ");
char[] letters = Console.ReadLine()
.ToUpper()
.Where(c => Char.IsLetter(c))
.ToArray();
That will produce an array of upper case letters from one line of user input. What you do with it is up to you.
Why this works:
string is a series of Unicode characters (look at the tooltip). It's enumerable and implements the IEnumerable<char> interface. This takes the input from the console, converts it to upper case, enumerates through the characters that make up the string which qualify as Char.IsLetter and spits out an array of those characters.

Searching words in a dictionary C#

I have a text file which contains a list of about 150000 words. I loaded the words into a dictionary and word lookup works fine for me. Now I want to search the dictionary to see if the dictionary contains a word starting from a particular alphabet.
I am using
foreach(KeyValuePair pair in dict ){
}
for this purpose. This seems to work fine for smaller word lists, but it doesn't work for the 150000-word list. Could anyone help, please?
void WordAvailable (char sLetter)
{
notAvail = true;
int count = 0;
do {
foreach (KeyValuePair<string,string> pair in dict) {
randWord = pair.Value;
count++;
if (randWord [0] == sLetter && !ListTest.usedWordsList.Contains (randWord)) {
notAvail = false;
startingLetter = char.ToString (sLetter);
break;
}
if (count >= dict.Count) {
ChooseRandomAlpha ();
sLetter = alpha;
count = 0;
}
}
} while(notAvail);
}
Now I want to search the dictionary to see if the dictionary contains a word starting from a particular alphabet.
That sounds like you want a SortedSet rather than a Dictionary. You can use GetViewBetween to find all the entries in the set that lie between two bounds. The lower bound would probably be "the string you're starting with" and for the upper bound, you could either work out "the last possible string starting with those characters" or use an exclusive upper bound by manually ignoring anything that doesn't start with your prefix, and "incrementing" the last character of your prefix. So for example, to find all words beginning with "tim" you can use GetViewBetween("tim", "tin") and ignore tin if it's in the dictionary.
Note that ordering can be an "interesting" exercise when it comes to multiple cultures etc. If this is just an academic exercise and you'll only have ASCII characters, you might want to lower case each word as you add it to the set, and then use an ordinal comparison. If you do need a culture-sensitive ordering, you could make that case-insensitive easily... but working out the upper bound for the prefix will be trickier.
Example of using GetViewBetween:
using System;
using System.Collections.Generic;

class Test
{
    static void Main()
    {
        var words = new SortedSet<string>(StringComparer.Ordinal)
        {
            "cat", "dog", "banana", "laptop", "mug",
            "coffee", "microphone", "water", "stairs", "phone"
        };
        foreach (var word in words.GetViewBetween("d", "n"))
        {
            Console.WriteLine(word);
        }
    }
}
Output:
dog
laptop
microphone
mug
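The prefix search described earlier ("tim" up to "tin") can be sketched in the same way. This is a minimal illustration under stated assumptions: the set uses an ordinal comparer, the prefix is non-empty, and its last character is not char.MaxValue. PrefixLookup and WordsWithPrefix are made-up names.

```csharp
using System;
using System.Collections.Generic;

public static class PrefixLookup
{
    // Returns every word in the set that starts with prefix.
    public static List<string> WordsWithPrefix(SortedSet<string> words, string prefix)
    {
        // "tim" -> "tin": increment the last character to get the upper bound.
        char last = prefix[prefix.Length - 1];
        string upper = prefix.Substring(0, prefix.Length - 1) + (char)(last + 1);

        var result = new List<string>();
        foreach (string word in words.GetViewBetween(prefix, upper))
        {
            // GetViewBetween includes both bounds, so skip the upper bound itself.
            if (word.StartsWith(prefix, StringComparison.Ordinal))
            {
                result.Add(word);
            }
        }
        return result;
    }
}
```

For a set containing tiger, tim, timber, time, tin and tip, a lookup for "tim" yields tim, timber and time, and "tin" is filtered out even though it falls inside the view.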
An alternative would be to build your own trie implementation (or find an existing one) but I don't know of one in the BCL.
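For completeness, a hand-rolled trie does not need much code. The sketch below is only one possible shape, not an existing library type: the Trie class and its Add, Contains and HasPrefix members are assumptions made for this illustration.

```csharp
using System.Collections.Generic;

public class Trie
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public bool IsWord; // true when a stored word ends at this node
    }

    private readonly Node root = new Node();

    // Stores a word, creating one node per character as needed.
    public void Add(string word)
    {
        Node node = root;
        foreach (char c in word)
        {
            if (!node.Children.TryGetValue(c, out Node next))
            {
                next = new Node();
                node.Children[c] = next;
            }
            node = next;
        }
        node.IsWord = true;
    }

    // Walks the trie along text; returns null when the path does not exist.
    private Node Find(string text)
    {
        Node node = root;
        foreach (char c in text)
        {
            if (!node.Children.TryGetValue(c, out node)) return null;
        }
        return node;
    }

    // True when the exact word was added.
    public bool Contains(string word)
    {
        Node node = Find(word);
        return node != null && node.IsWord;
    }

    // True when at least one stored word starts with prefix.
    public bool HasPrefix(string prefix)
    {
        return Find(prefix) != null;
    }
}
```

Checking whether any word starts with a given letter then costs one dictionary lookup per character of the prefix, independent of how many words are stored.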

The best way to build 4-letter words from a string, with limits?

So I'm using the code below to figure out all the words that could be spelled from the alphabet variable. The problem is that I build this alphabet variable each time I call this, based on the board of random letters in front of the user. What I see, of course, is "aaab", for example...
What I'm after is for the code to use each letter only as many times as it appears in the alphabet var, so that it can't produce something like "aaab" but just "ab".
I understand that this code, which I found in another thread, is made to build combinations of the letters into 4-letter words, or arrangements.
I'm wondering if there's a simple way, using SelectMany or Select, to not add a letter if it's already been used. Keep in mind there could be multiple "a"s in the alphabet var to begin with, so if there are 2 A's in there, it should still be able to produce AAB, just not AAAB. I am a newbie. I know that I could go through my own list and add letters together based on how many times they actually exist in the alphabet string. I'm just wondering if there's a way to interrupt i or x and not add to q if it's already been used...
Sorry if this is confusing... thank you :)
// I found this in another thread and seemed to work great and fast.
var alphabet = "abcd";
var q = alphabet.Select(x => x.ToString());
int size = 4;
for (int i = 0; i < size - 1; i++)
q = q.SelectMany(x => alphabet, (x, y) => x + y);
foreach (var item in q)
( DO STUFF)
To reach your goal, you must find a way to mark letters in your alphabet which are already used and avoid using these letters a second time.
To do so you need a data structure which can store more than the letters alone, so a list of letters (or a string) is not sufficient.
Try to build a list of objects of a class like this one:
class UsedLetter
{
    public char letter;
    public bool used;
}
Then you can mark each letter as used after you have drawn it from the list.
Improvement
You may also store your alphabet as a list of characters:
List<char> alphabet;
and remove each letter from the alphabet after it's drawn.
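The remove-after-drawing idea can be sketched as a small recursive generator. This is only an illustration under the assumption that duplicate results should be collapsed; LimitedWords, Build and Extend are made-up names.

```csharp
using System;
using System.Collections.Generic;

public static class LimitedWords
{
    // Returns every distinct arrangement of `size` letters drawn from alphabet,
    // using each letter at most as often as it appears there.
    public static SortedSet<string> Build(string alphabet, int size)
    {
        var results = new SortedSet<string>(StringComparer.Ordinal);
        Extend("", new List<char>(alphabet), size, results);
        return results;
    }

    private static void Extend(string prefix, List<char> remaining, int size, SortedSet<string> results)
    {
        if (prefix.Length == size)
        {
            results.Add(prefix); // the set silently drops duplicates
            return;
        }
        for (int i = 0; i < remaining.Count; i++)
        {
            char drawn = remaining[i];
            remaining.RemoveAt(i);      // the letter is used up on this branch
            Extend(prefix + drawn, remaining, size, results);
            remaining.Insert(i, drawn); // put it back for the next branch
        }
    }
}
```

With the alphabet "aab" and size 2 this produces aa, ab and ba, but never bb, because there is only one b to draw.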
Here's how I have achieved what I think you're after:
using System;
using System.Collections.Generic;
using System.Linq;
namespace WordPerms
{
class Program
{
Stack<char> chars = new Stack<char>();
List<string> words = new List<string>();
static void Main(string[] args)
{
Program p = new Program();
p.GetChar("abad");
foreach (string word in p.words)
{
Console.WriteLine(word);
}
}
// This is called recursively to build the list of words.
private void GetChar(string alpha)
{
string beta;
for (int i = 0; i < alpha.Length; i++)
{
chars.Push(alpha[i]);
beta = alpha.Remove(i, 1);
GetChar(beta);
}
char[] charArray = chars.Reverse().ToArray();
words.Add(new string(charArray));
if (chars.Count() >= 1)
{
chars.Pop();
}
}
}
}
Hope that helps, Greg.

Constantly Incrementing String

So, what I'm trying to do is something like this: (example)
a,b,c,d.. etc. aa,ab,ac.. etc. ba,bb,bc, etc.
So, this can essentially be explained as generally increasing and just printing all possible variations, starting at a. So far, I've been able to do it with one letter, starting out like this:
for (int i = 97; i <= 122; i++)
{
item = (char)i
}
But, I'm unable to eventually add the second letter, third letter, and so forth. Is anyone able to provide input? Thanks.
Since there hasn't been a solution so far that would literally "increment a string", here is one that does:
static string Increment(string s) {
    if (s.All(c => c == 'z')) {
        return new string('a', s.Length + 1);
    }
    var res = s.ToCharArray();
    var pos = res.Length - 1;
    do {
        if (res[pos] != 'z') {
            res[pos]++;
            break;
        }
        res[pos--] = 'a';
    } while (true);
    return new string(res);
}
The idea is simple: pretend that letters are your digits, and do an increment the way they teach in an elementary school. Start from the rightmost "digit", and increment it. If you hit a nine (which is 'z' in our system), move on to the prior digit; otherwise, you are done incrementing.
The obvious special case is when the "number" is composed entirely of nines. This is when your "counter" needs to roll to the next size up, and add a "digit". This special condition is checked at the beginning of the method: if the string is composed of N letters 'z', a string of N+1 letter 'a's is returned.
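To produce the whole sequence from the question, the Increment method can be called in a loop. A minimal sketch: StringCounter and Sequence are names made up here, and Increment is repeated only so that the block compiles on its own.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class StringCounter
{
    // Same algorithm as the Increment method in the answer.
    public static string Increment(string s)
    {
        if (s.All(c => c == 'z'))
        {
            return new string('a', s.Length + 1);
        }
        char[] res = s.ToCharArray();
        int pos = res.Length - 1;
        while (res[pos] == 'z')
        {
            res[pos--] = 'a'; // roll this position over and carry to the left
        }
        res[pos]++;
        return new string(res);
    }

    // Yields "a", "b", ..., "z", "aa", "ab", ... until count items were produced.
    public static IEnumerable<string> Sequence(int count)
    {
        string current = "a";
        for (int i = 0; i < count; i++)
        {
            yield return current;
            current = Increment(current);
        }
    }
}
```

StringCounter.Sequence(28) ends with z, aa, ab, which matches the order asked for in the question.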
Each iteration of your for loop completely overwrites what is in "item": the for loop just assigns one character, (char)i, at a time.
If item is a string, use something like this:
item = "";
for (int i = 97; i <= 122; i++)
{
item += (char)i;
}
Something to the effect of:
public string IncrementString(string value)
{
    if (string.IsNullOrEmpty(value)) return "a";
    char last = value[value.Length - 1];
    // After 'z' there is no next letter, so start a new position instead.
    if (last == 'z')
        return value + "a";
    // Otherwise replace the last character with its successor.
    return value.Substring(0, value.Length - 1) + (char)(last + 1);
}
A char converts implicitly to int in C#, so no separate byte conversion is needed; casting the sum back with (char) gives the next letter.
StringBuilder is a better choice when building large numbers of strings, so you may want to consider using that instead of string concatenation.
Another way to look at this is that you want to count in base 26. The computer is very good at counting and since it always has to convert from base 2 (binary), which is the way it stores values, to base 10 (decimal--the number system you and I generally think in), converting to different number bases is also very easy.
There's a general base converter here https://stackoverflow.com/a/3265796/351385 which converts an array of bytes to an arbitrary base. Once you have a good understanding of number bases and can understand that code, it's a simple matter to create a base 26 counter that counts in binary, but converts to base 26 for display.
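The counting idea can also be sketched directly, without the general converter: keep an ordinary integer counter and convert it to letters only for display. Strictly speaking this is "bijective" base 26, because the sequence has no zero digit (1 is "a", 26 is "z", 27 is "aa"). LetterCounter and ToLetters are made-up names for this illustration.

```csharp
using System;
using System.Text;

public static class LetterCounter
{
    // Converts 1 -> "a", 2 -> "b", ..., 26 -> "z", 27 -> "aa", 28 -> "ab", ...
    public static string ToLetters(int n)
    {
        if (n < 1) throw new ArgumentOutOfRangeException(nameof(n));
        var sb = new StringBuilder();
        while (n > 0)
        {
            n--;                                // shift to 0..25, since there is no zero digit
            sb.Insert(0, (char)('a' + n % 26)); // lowest-order letter goes on the right
            n /= 26;
        }
        return sb.ToString();
    }
}
```

Printing ToLetters(i) for i = 1, 2, 3, ... then gives a, b, ..., z, aa, ab, and so on, while the counter itself stays a plain int.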

What is the simplest way to refine a list's contents (words) based on character frequency and position? (C#)

I'm writing a console-environment Hangman game for my introductory programming class. The player chooses the word length and number of guesses they would like. 'Easy mode' is simple enough... generate a random number to use as the list's index and check that the chosen word is the right length. However, 'hard mode' requires the list to be refined as the game progresses, choosing the largest list of possibilities given the letters guessed.
I should note, we are not using the C# List class but instead, creating array-based structs:
struct ListType
{
public type[] items;
public int count;
}
//defined as:
ListType myList = new ListType();
myList.items = new type[max value];
myList.count = 0;
Anyway, here's an example of the way 'hard mode' should go:
Word List:
hole
airplane
lame
photos
cart
mole
(player chooses word length of 4)
Word List (refined):
hole
lame
cart
mole
(player guesses "l", then "e")
Word List (refined):
hole
mole
"Lame" is omitted because more words have the "...le..." pattern. The technique that makes sense to me (but isn't working the way I'd like) is storing each word's pattern to an array (ie: "mole" and "hole" = 0011, and "lame" = 1001), and counting up the duplicates to determine the larger list.
Is this the way I should be doing it? I'm new to programming and have just under a year's worth of experience, so I guess answer as such.
Thanks!!
There are a few ways of approaching this. A simple way would be to keep a list of all candidate words and, for each word, calculate the number of matching sequences as well as the longest matching sequence. This way you can sort on both the longest sequence and the number of sequences when the longest sequence alone is not a good enough measure. I hope it becomes obvious how to modify this code in order to sort only on the longest sequence.
First I set up a test case like this:
// mimic the scenario given in the question
string[] wordList = new string[] { "hole", "airplane", "lame", "photos", "cart", "mole" };
int wordLength = 4;
List<char> requiredCharacters = new List<char>{ 'l', 'e'};
After that I filter the wordList and calculate the best matches, which I finally group together to produce the desired result:
// filter all words that dont match the required length
var candidateWords = wordList.Where(x => x.Length == wordLength);
// define a result set holding all the words and all their matches
Dictionary<string, List<int>> refinedWordSet = new Dictionary<string, List<int>>();
foreach (string word in candidateWords)
{
List<int> matches = new List<int>() { 0 };
int currentMatchCount = 0;
foreach (char character in word)
{
if (requiredCharacters.Contains(character))
{
currentMatchCount++;
}
else
{
// if there were previous matches
if (currentMatchCount > 0)
{
// save the current match
matches.Add(currentMatchCount);
currentMatchCount = 0;
}
}
}
// if there was a match at the end
if (currentMatchCount > 0)
{
// save the last match
matches.Add(currentMatchCount);
}
refinedWordSet.Add(word, matches);
}
// group by a combination of the highest match and the total number of matches
var groupedRefinedWords = from entry in refinedWordSet
                          group entry.Key by new { Max = entry.Value.Max(), Total = entry.Value.Sum() } into grouped
                          select grouped;
foreach (var entry in groupedRefinedWords)
{
    Console.WriteLine("Word list with best match: {0} and total match {1}: {2}",
        entry.Key.Max,
        entry.Key.Total,
        entry.Aggregate("", (result, nextWord) => result + nextWord + ", "));
}
Console.ReadLine();
Pay attention to the comments in the code
So you look through the array for strings that match the guess pattern.
In the specific case of "le" you could simply use String.IndexOf(). If you require a more complex pattern, say "*le?" (where * and ? follow the DOS-like wildcard convention), you could employ a dynamically constructed regex pattern (easy, but performance-heavy if used in a near-realtime system), or character scanning (read each char and match it against your pattern), which is more difficult and harder to maintain, but performs better for a small number of elements in a near-realtime system.
As this is homework, I wouldn't worry about performance profiling at all right now.
Also, that struct looks mighty goofy. There are certainly better constructs for this type of thing, like a List<string>, which has a .Count property, or just a string[], which has .Length.
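The pattern idea from the question ("mole" and "hole" = 0011, "lame" = 1001) can be sketched with a grouping step. This is only an illustration and assumes LINQ and the generic collections are allowed, which the class assignment may not permit. HangmanRefiner, Pattern and Refine are made-up names, and like the question's scheme the pattern records only where guessed letters sit, not which letter is where.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class HangmanRefiner
{
    // Marks guessed letters with '1' and everything else with '0', e.g. "mole" -> "0011".
    public static string Pattern(string word, ICollection<char> guessed)
    {
        return new string(word.Select(c => guessed.Contains(c) ? '1' : '0').ToArray());
    }

    // Keeps only the words that share the most common pattern.
    // Assumes words is not empty; ties go to the pattern that was seen first.
    public static List<string> Refine(IEnumerable<string> words, ICollection<char> guessed)
    {
        return words
            .GroupBy(w => Pattern(w, guessed))
            .OrderByDescending(g => g.Count())
            .First()
            .ToList();
    }
}
```

For the four-letter list hole, lame, cart, mole and the guesses l and e, Refine returns hole and mole, because the pattern 0011 has two members while 1001 and 0000 have one each.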
