I'm currently fiddling with a little project: a Countdown-type game (after the TV show).
Currently, the program lets the user pick a vowel or consonant up to a limit of 9 letters, and then asks them to input the longest word they can think of using these 9 letters.
I have a large text file acting as a dictionary, which I search using the user's input string to check whether the word they entered is valid. My problem is that I then want to search my dictionary for the longest word that can be made from the nine letters, but I just can't seem to find a way to implement it.
So far I've tried putting every word into an array and searching through each element to check whether it contains the letters, but this won't cover me if the longest word that can be made out of the 9 letters is an 8-letter word. Any ideas?
Currently I have this (it is under the submit button on the form; sorry for not providing code earlier or mentioning that it's a Windows Forms application):
StreamReader textFile = new StreamReader("C:/Eclipse/Personal Projects/Local_Projects/Projects/CountDown/WindowsFormsApplication1/wordlist.txt");
int counter1 = 0;
String letterlist = (txtLetter1.Text + txtLetter2.Text + txtLetter3.Text + txtLetter4.Text + txtLetter5.Text + txtLetter6.Text + txtLetter7.Text + txtLetter8.Text + txtLetter9.Text); // stores the letters into a string
char[] letters = letterlist.ToCharArray(); // reads the letters into a char array
string[] line = File.ReadAllLines("C:/Eclipse/Personal Projects/Local_Projects/Projects/CountDown/WindowsFormsApplication1/wordlist.txt"); // reads every line of the word file into a string array (there is one word per line and there are 144k words; I assume this will be a big performance hit, but I've never done anything like this before so I'm not sure)
line.Any(x => line.Contains(x)); // just playing with LINQ; I've no idea what I'm doing though, as I've never used it before
for (int i = 0; i < line.Length; i++) // loops over every word in the string array
// if (line.Contains(letters)) // checks if a word contains the letters in the char array (this is where it gets hazy if I went this way: I'd planned on only using words with a length > 4, adding any words found to another text file, and then either finding the longest word in that text file or keeping a running longest word, i.e. while looping I find a word with 7 letters, so this is now the longest word; I then go to the next word and it has 8 of our letters, so I now set the longest word to this)
counter1++;
if (counter1 > 4)
txtLongest.Text += line + Environment.NewLine;
Mike's code:
using System;
using System.Collections.Generic;
using System.Linq;
class Program
{
static void Main(string[] args) {
var letters = args[0];
var wordList = new List<string> { "abcbca", "bca", "def" }; // dictionary
var results = from string word in wordList // takes each word in the dictionary in turn
where IsValidAnswer(word, letters) // calls the IsValidAnswer method
orderby word.Length descending // sorts the words so the longest come first
select word; // selects the word
foreach (var result in results) {
Console.WriteLine(result); // outputs the word
}
}
private static bool IsValidAnswer(string word, string letters) {
foreach (var letter in word) {
if (letters.IndexOf(letter) == -1) { // checks whether the letter is still available among the remaining letters
return false;
}
letters = letters.Remove(letters.IndexOf(letter), 1);
}
return true;
}
}
Here's an answer I knocked together in a couple of minutes which should do what you want. As others have said, this problem is complex and so the algorithm is going to be slow. The LINQ query evaluates each string in the dictionary, checking whether the supplied letters can be used to produce said word.
using System;
using System.Collections.Generic;
using System.Linq;
class Program
{
static void Main(string[] args) {
var letters = args[0];
var wordList = new List<string> { "abcbca", "bca", "def" };
var results = from string word in wordList
where IsValidAnswer(word, letters)
orderby word.Length descending
select word;
foreach (var result in results) {
Console.WriteLine(result);
}
}
private static bool IsValidAnswer(string word, string letters) {
foreach (var letter in word) {
if (letters.IndexOf(letter) == -1) {
return false;
}
letters = letters.Remove(letters.IndexOf(letter), 1);
}
return true;
}
}
So where are you getting stuck? Start with the slow brute-force method and just find all the words that contain all the characters. Then order the words by length to get the longest. If you don't want to return a word that is shorter than the number of characters being sought (which I guess is only an issue if there are duplicate characters???), then add a test and eliminate that case.
I've had some more thoughts about this. I think the way to do it efficiently is by preprocessing the dictionary, ordering the letters in each word in alphabetical order and ordering the words in the list alphabetically too (you'll probably have to use some sort of multimap structure to store the original word and the sorted word).
Once you've done that you can much more efficiently find the words that can be generated from your pool of letters. I'll come back and flesh out an algorithm for doing this later, if someone else doesn't beat me to it.
Step 1: Construct a trie structure in which each word is stored under its letters sorted alphabetically.
Example: EACH is sorted to ACEH and is stored as A->C->E->H->(EACH, ACHE, ..) in the trie (ACHE is an anagram of EACH).
Step 2: Sort the input letters and find the longest word corresponding to that set of letters in the trie.
Have you tried implementing something like this? It would be great to see the code you have tried.
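The two steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names TrieNode, AnagramTrie, Build and Longest are made up for this sketch, the word list is a tiny stand-in for the real dictionary, and all words and letters are assumed to be lower case. Each word is stored under its alphabetically sorted letters, and the search walks the sorted input letters, at each step either using a letter or skipping it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical node type: one child per letter, plus the words whose
// sorted letters end exactly at this node (i.e. a group of anagrams).
class TrieNode
{
    public Dictionary<char, TrieNode> Children = new Dictionary<char, TrieNode>();
    public List<string> Words = new List<string>();
}

static class AnagramTrie
{
    // Step 1: insert every word under its alphabetically sorted letters.
    public static TrieNode Build(IEnumerable<string> words)
    {
        var root = new TrieNode();
        foreach (var word in words)
        {
            var node = root;
            foreach (var c in word.OrderBy(ch => ch))
            {
                if (!node.Children.TryGetValue(c, out var child))
                {
                    child = new TrieNode();
                    node.Children[c] = child;
                }
                node = child;
            }
            node.Words.Add(word);
        }
        return root;
    }

    // Step 2: sort the pool of letters and look for the deepest node
    // reachable by using some of them in order.
    public static string Longest(TrieNode root, string letters)
    {
        var sorted = letters.OrderBy(ch => ch).ToArray();
        string best = "";
        Search(root, sorted, 0, ref best);
        return best;
    }

    static void Search(TrieNode node, char[] sorted, int start, ref string best)
    {
        // All words stored at a node have the same length, so checking one is enough.
        if (node.Words.Count > 0 && node.Words[0].Length > best.Length)
            best = node.Words[0];

        for (int i = start; i < sorted.Length; i++)
        {
            // Trying the same letter twice at the same depth would repeat work.
            if (i > start && sorted[i] == sorted[i - 1]) continue;
            if (node.Children.TryGetValue(sorted[i], out var child))
                Search(child, sorted, i + 1, ref best);
        }
    }

    static void Main()
    {
        var trie = Build(new[] { "each", "ache", "cab", "be" });
        Console.WriteLine(Longest(trie, "hcaexyzqb")); // prints "each"
    }
}
```

The trie only needs to be built once when the program starts; after that, each round of nine letters is a short walk rather than a scan of the whole word list.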
string[] strArray = {"ABCDEFG", "HIJKLMNOP"};
string findThisString = "JKL";
int strNumber;
int strIndex = 0;
for (strNumber = 0; strNumber < strArray.Length; strNumber++)
{
strIndex = strArray[strNumber].IndexOf(findThisString);
if (strIndex >= 0)
break;
}
System.Console.WriteLine("String number: {0}\nString index: {1}",
strNumber, strIndex);
This should do the job:
private static void Main()
{
char[] picked_char = {'r', 'a', 'j'};
string[] dictionary = new[] {"rajan", "rajm", "rajnujaman", "rahim", "ranjan"};
var words = dictionary.Where(word => picked_char.All(word.Contains)).OrderByDescending(word => word.Length);
foreach (string needed_words in words)
{
Console.WriteLine(needed_words);
}
}
Output :
rajnujaman
ranjan
rajan
rajm
string testSentence = "this is a test sentence and I WANT TO SEE HOW IT WILL LOOK LIKE hoping this part is big";
int firstLetter = testSentence.IndexOf("this");
int length = "this is a test sentence and".Length;
string upperSentence = testSentence.Substring(firstLetter, length).ToUpper();
int secondLetter = testSentence.IndexOf(" I");
int length2 = " I WANT TO SEE HOW IT WILL LOOK LIKE".Length;
string lowerSentence = testSentence.Substring(secondLetter, length2).ToLower();
int thirdSentence = testSentence.IndexOf(" hoping");
int length1 = " hoping this part is big".Length;
string get = testSentence.Substring(thirdSentence, length1).ToUpper();
Console.WriteLine(upperSentence + lowerSentence + get);
Can somebody please tell me how to convert just one word in the middle of a sentence to all upper-case or all lower-case letters? For example, make the word "LOOK" lower case. Does the ".Length" call have to be used, or is there a different way than literally typing the word or the part of the sentence that I want to convert to upper or lower case?
The problem I have is that I cannot isolate just one word and change its case, because then the rest of the string after that word is also converted to lower/upper case.
In line with @Franck's advice, think about the problem as a human. You have three things:
a sentence
words that you want to be big
words that you want to be small
The sentence can really be broken down into a collection of words (ignoring punctuation). You want to go through the collection of words and change some of them to be uppercase and some of them to be lowercase - all the rest of the words you want to leave as they are.
Here is some code that does this:
using System;
using System.Linq;
public class Program
{
public static void Main()
{
// modify these as you please
string testSentence = "this is a test sentence and I WANT TO SEE HOW IT WILL LOOK LIKE hoping this part is big";
string[] wordsToUpperCase = new string[] { "hoping", "test" };
string[] wordsToLowerCase = new string[] { "look", "is" };
// start processing
// split the sentence into words
string[] wordsInSentence = testSentence.Split(' ');
// a List to hold the reworked words
var outputWords = new System.Collections.Generic.List<string>();
// process the words
foreach (string currentWord in wordsInSentence)
{
// check the wordsToUpperCase array for the current word
bool shouldUpperCaseThisWord = wordsToUpperCase.Any(stringToTest => stringToTest.Equals(currentWord, StringComparison.CurrentCultureIgnoreCase));
// check the wordsToLowerCase array for the current word
bool shouldLowerCaseThisWord = wordsToLowerCase.Any(stringToTest => stringToTest.Equals(currentWord, StringComparison.CurrentCultureIgnoreCase));
// add the current word to the output list
if (shouldUpperCaseThisWord)
outputWords.Add(currentWord.ToUpper());
else if (shouldLowerCaseThisWord)
outputWords.Add(currentWord.ToLower());
else
outputWords.Add(currentWord);
}
string finalOutput = String.Join(" ", outputWords);
Console.WriteLine(finalOutput);
}
}
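As an alternative to splitting the sentence, a single word can also be targeted with a regular expression and a match evaluator. This is a minimal sketch, not part of the answer above; the names WordCaseDemo and LowerWord are made up for illustration. The `\b` anchors restrict the match to whole words, so the rest of the sentence keeps its case.

```csharp
using System;
using System.Text.RegularExpressions;

static class WordCaseDemo
{
    // Lower-cases every whole-word occurrence of `word`,
    // matching it regardless of its current case.
    public static string LowerWord(string sentence, string word)
    {
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        return Regex.Replace(sentence, pattern,
            m => m.Value.ToLowerInvariant(), RegexOptions.IgnoreCase);
    }

    static void Main()
    {
        Console.WriteLine(LowerWord("HOW IT WILL LOOK LIKE hoping", "look"));
        // prints "HOW IT WILL look LIKE hoping"
    }
}
```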
Okay, so I'm creating a hangman game and everything functions so far, including what I'm TRYING to do in the question.
But it feels like there is a much more efficient method of obtaining the char that is also easier to manipulate the index.
protected static void alphabetSelector(string activeWordAlphabet)
{
char[] activeWord = activeWordAlphabet.ToCharArray();
string activeWordString = new string(activeWord);
Console.WriteLine("If you'd like to guess a letter, enter the letter. \n" +
"If you'd like to guess the word, please type in the word. --- testing answer{0}",
activeWordString);
//Console.WriteLine("For Testing Purposes ONLY");
String chosenLetter = Console.ReadLine();
//Char[] letterFinder = Array.FindAll(activeWord, s => s.Equals(chosenLetter));
//string activeWordString = new string(activeWord);
foreach (char letter in activeWord)
{
if(activeWordString.Contains(chosenLetter))
{
Console.WriteLine("{0}", activeWordString);
Console.ReadLine();
}
else
{
Console.WriteLine("errrr...wrong!");
Console.ReadLine();
}
}
}
I have broken up the code in some areas to prevent the reader from having to scroll sideways. If this is bothersome, please let me know and I'll leave it in the future.
So this code will successfully print out the 'word' whenever I select the correct letter from the random word (I have the console print the actual word so that I can test it successfully each time). It will also print 'wrong' when I choose a letter NOT in the string.
But I feel like I should be able to use the
Array.FindAll(activeWord, ...)
functionality or some other way. But every time I try and reorder the arguments, it gives me all kinds of different errors and tells me to redo my arguments.
So, if you can look at this and find an easier method of searching the actual array for the user-selected 'letter', please help!! Even if it's not using the Array.FindAll method!!
Edit
Okay, it seems like there's some confusion with what I've done and why I've done it.
I'm ONLY printing the word inside that 'if' statement to test and make sure that the foreach{if{}} will actually work to find the char inside the string. But I ultimately need to be able to provide a placeholder for a char that is successfully found, as well as being able to 'cross out' the letter (from the alphabet list not shown here).
It's hangman - surely you guys know what I'm needing it to do. It has to keep track of which letters are left in the word, which letters have been chosen, as well as which letters are left in the entire alphabet.
I'm a 4-day old newb when it comes to programming, so please. . . I'm only doing what I know to do and when I get errors, I comment things out and write more until I find something that works.
Take a look at this demo I put together for you: https://dotnetfiddle.net/eP9TQM
I'd suggest creating a second string for the display string. Use a StringBuilder, and you can replace the characters in it at specific indices while creating the fewest string objects in the process.
string word = "your word or phrase here";
//Initialize a new StringBuilder that will display the word with placeholders.
StringBuilder display = new StringBuilder(word.Length); //You know the display word is the same length as the original word
display.Append('-', word.Length); //Fill it with placeholders.
So now you have your phrase/word, and a string builder full of characters that need to be discovered.
Go ahead and convert the display StringBuilder to a string that you can check on each pass to see if it equals your word:
var displayString = display.ToString();
//Loop until the display string is equal to the word
while (!displayString.Equals(word))
{
//Inside here your logic will follow.
}
So you are basically looping until the person answers here. You could of course go back and add logic to limit the number of attempts, or whatever you desire as an alternate exit strategy.
Inside this logic, you will check if they guessed a letter or a word based on how many characters they entered.
If they guessed a word, the logic is simple. Check if the guessed word is the same as the hidden word. If it is, then you break the loop and they are done. Otherwise, guessing loops back around.
If they guessed a letter, the logic is pretty straightforward, but more involved.
First get the character they guessed, just because it may be easier to work with this way.
char guess = input[0];
Now, look over the word for instances of that character:
//Look for instances of the character in the word.
for (int i = 0; i < word.Length; ++i)
{
//If the current index in the word matches their guess, then update the display.
if (char.ToUpperInvariant(word[i]) == char.ToUpperInvariant(guess))
display[i] = word[i];
}
The comments above should explain the idea here.
Update your displayString at the bottom of the loop so that it will check against the hidden word again:
displayString = display.ToString();
That's really all you need to do here. No fancy Linq needed.
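Putting the fragments above together, the whole loop might look roughly like this. It is a sketch under assumptions rather than the code from the linked demo: the hidden word "hangman", the class name HangmanDemo and the helper Reveal are all made up for illustration.

```csharp
using System;
using System.Text;

static class HangmanDemo
{
    // Copies every occurrence of `guess` from `word` into `display`.
    // Returns true if at least one position was revealed.
    public static bool Reveal(string word, StringBuilder display, char guess)
    {
        bool found = false;
        for (int i = 0; i < word.Length; ++i)
        {
            if (char.ToUpperInvariant(word[i]) == char.ToUpperInvariant(guess))
            {
                display[i] = word[i];
                found = true;
            }
        }
        return found;
    }

    static void Main()
    {
        string word = "hangman"; // assumed hidden word
        var display = new StringBuilder(word.Length);
        display.Append('-', word.Length); // fill with placeholders

        // Loop until the display string is equal to the word.
        while (display.ToString() != word)
        {
            Console.WriteLine(display);
            Console.Write("Guess a letter or the whole word: ");
            string input = Console.ReadLine();
            if (input == null) return;        // input stream closed
            if (input.Length == 0) continue;  // nothing typed, ask again

            if (input.Length > 1)
            {
                // More than one character is treated as a guess at the whole word.
                if (string.Equals(input, word, StringComparison.OrdinalIgnoreCase)) break;
                Console.WriteLine("That is not the word.");
            }
            else if (!Reveal(word, display, input[0]))
            {
                Console.WriteLine("No '{0}' in the word.", input[0]);
            }
        }
        Console.WriteLine("Solved: {0}", word);
    }
}
```

Keeping the reveal step in its own method makes it easy to add a counter for wrong guesses later, since the return value already says whether the guess hit.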
OK, your code is really confusing, even with your edit.
First, why do you need these two lines, since activeWordAlphabet is already a string?
char[] activeWord = activeWordAlphabet.ToCharArray();
string activeWordString = new string(activeWord);
Then you do your foreach.
For the word "FooBar", if the player types 'F', you will print
FooBar
FooBar
FooBar
FooBar
FooBar
FooBar
How does this help you with anything?
I think you have to review your algorithm. The string type already has the method you need:
int chosenLetterPosition = activeWordAlphabet.IndexOf(chosenLetter, alreadyFoundPosition);
alreadyFoundPosition is an int giving the index from which the method starts searching for the letter.
IndexOf() returns -1 if the letter is not found; otherwise it returns the zero-based index of the match.
You can save this position along with the letter in a dictionary, and use it (plus one) as your new 'alreadyFoundPosition' if the chosenLetter is already in the dictionary.
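To illustrate the IndexOf idea, here is a small sketch that collects every position of a letter by restarting the search just past the previous hit. The names LetterPositions and FindAll are made up for this example.

```csharp
using System;
using System.Collections.Generic;

static class LetterPositions
{
    // Returns the zero-based index of every occurrence of `letter` in `word`.
    public static List<int> FindAll(string word, char letter)
    {
        var positions = new List<int>();
        int index = word.IndexOf(letter);
        while (index != -1)
        {
            positions.Add(index);
            // Resume the search one past the position just found.
            index = word.IndexOf(letter, index + 1);
        }
        return positions;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", FindAll("FooBar", 'o'))); // prints "1, 2"
    }
}
```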
This is my answer. Because I don't have a lot of tasks today :)
class Letter
{
public bool ischosen { get; set; }
public char value { get; set; }
}
class LetterList
{
public LetterList(string word)
{
_lst = new List<Letter>();
word.ToList().ForEach(x => _lst.Add(new Letter() { value = x }));
}
public bool FindLetter(char letter)
{
var search = _lst.Where(x => x.value == letter).ToList();
search.ForEach(x=>x.ischosen=true);
return search.Count > 0;
}
public string NotChosen()
{
var res = "";
_lst.Where(x => !x.ischosen).ToList().ForEach(x => { res += x.value; });
return res;
}
List<Letter> _lst;
}
How to use
var abc = new LetterList("abcdefghijklmnopqrstuvwxyz");
var answer = new LetterList("myanswer");
Console.WriteLine("This my question. Why? write your answer please");
char x = Console.ReadLine()[0];
if (answer.FindLetter(x))
{
Console.WriteLine("you are right!");
}
else
{
Console.WriteLine("fail");
}
abc.FindLetter(x);
Console.WriteLine("not chosen abc:{0} answer:{1}", abc.NotChosen(), answer.NotChosen());
At least we used to play this game like that when I was a child.
I've been working with this for a few days and I am trying to complete a console application where we are prompted to type in a string of our own and the output makes a list of every unique character and puts a count after it. At the end of the results, a count is displayed showing how many unique characters were found. Everything is translated to lowercase regardless of whether it is uppercase or not. The key is to use collections. Here is what I have so far. My output shows two space characters in the results despite the fact that I used an if statement to catch them. Can anyone point out a concept that I have overlooked?
using System;
using System.Text.RegularExpressions;
using System.Collections.Generic;
public class LetterCountTest
{
public static void Main(string[] args)
{
// create sorted dictionary based on user input
SortedDictionary<string, int> dictionary = CollectLetters();
// display sorted dictionary content
DisplayDictionary(dictionary);
} // end Main
// create sorted dictionary from user input
private static SortedDictionary<string, int> CollectLetters()
{
// create a new sorted dictionary
SortedDictionary<string, int> dictionary =
new SortedDictionary<string, int>();
Console.WriteLine("Enter a string: "); // prompt for user input
string input = Console.ReadLine(); // get input
// split every individual letter for the count
string[] letters = Regex.Split(input, "");
// processing input letters
foreach (var letter in letters)
{
string letterKey = letter.ToLower(); // get letter in lowercase
if (letterKey != " ") // statement to exclude whitespace count
{
// if the dictionary contains the letter
if (dictionary.ContainsKey(letterKey))
{
++dictionary[letterKey];
} // end if
else
// add new letter with a count of 1 to the dictionary
dictionary.Add(letterKey, 1);
}
} // end foreach
return dictionary;
} // end method CollectLetters
// display dictionary content
private static void DisplayDictionary<K, V>(
SortedDictionary<K, V> dictionary)
{
Console.WriteLine("\nSorted dictionary contains:\n{0,-12}{1,-12}",
"Key:", "Value:");
// generate output for each key in the sorted dictionary
// by iterating through the Keys property with a foreach statement
foreach (K key in dictionary.Keys)
Console.WriteLine("{0,-12}{1,-12}", key, dictionary[key]);
Console.WriteLine("\nsize: {0}", dictionary.Count);
Console.ReadLine();
} // end method DisplayDictionary
} // end class LetterCountTest
My output states that I am using every letter in the alphabet but also has a whitespace above the 'a' and two instances of it. I don't know where this is coming from, but my guess is that it's counting null characters or carriage returns. The string that I use is...
The quick brown fox jumps over the lazy dog
Aside from counting every letter once, it counts the e three times, the h two times, the o four times, the r two times, the t two times, and the u two times.
The problem you're having lies in Regex.Split()'s behavior. Taking an excerpt from the MSDN page on the method...
If a match is found at the beginning or the end of the input string, an empty string is included at the beginning or the end of the returned array.
This is exactly what is happening when you call Regex.Split(input, "");. To combat this, you can remove the regex matching from your code, change all instances of your dictionaries to a SortedDictionary<char, int>, and use a foreach loop to iterate through each character of your input string instead.
foreach (char letter in input.ToLower())
{
if (!Char.IsWhiteSpace(letter))
{
//Do your stuff
}
}
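Filled in, the suggestion above might look like the following sketch. The class and method names (LetterCounter, Count) are made up for illustration; the test sentence is the one from the question.

```csharp
using System;
using System.Collections.Generic;

static class LetterCounter
{
    // Counts each non-whitespace character, ignoring case.
    public static SortedDictionary<char, int> Count(string input)
    {
        var counts = new SortedDictionary<char, int>();
        foreach (char letter in input.ToLower())
        {
            if (char.IsWhiteSpace(letter)) continue;
            counts.TryGetValue(letter, out int current); // current stays 0 if the key is absent
            counts[letter] = current + 1;
        }
        return counts;
    }

    static void Main()
    {
        var counts = Count("The quick brown fox jumps over the lazy dog");
        foreach (var pair in counts)
            Console.WriteLine("{0,-12}{1,-12}", pair.Key, pair.Value);
        Console.WriteLine("size: {0}", counts.Count); // prints "size: 26"
    }
}
```

Because the keys are chars rather than strings, an empty entry cannot occur at all, so no extra check for it is needed.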
Since a string is already a sequence of characters, you don't need Regex.Split; just use:
foreach (var letter in input)
Then change your loop and if statement to:
foreach (var letter in input)
{
if (!char.IsWhiteSpace(letter))
{
var letterKey = letter.ToString();
if (dictionary.ContainsKey(letterKey))
{
++dictionary[letterKey];
}
else
dictionary.Add(letterKey, 1);
}
}
Then it should exclude both the empty strings and the white-space. I suspect you get some empty strings from Regex.Split, and since you only check for white-space, you get unexpected results. If you work with chars, you don't need to worry about empty strings.
I have a large string, where there can be specific words (text followed by a single colon, like "test:") occurring more than once. For example, like this:
word:
TEST:
word:
TEST:
TEST: // random text
"word" occurs twice and "TEST" occurs thrice, but the amount can be variable. Also, these words don't have to be in the same order and there can be more text in the same line as the word (as shown in the last example of "TEST"). What I need to do is append the occurrence number to each word, for example the output string needs to be this:
word_ONE:
TEST_ONE:
word_TWO:
TEST_TWO:
TEST_THREE: // random text
The RegEx for getting these words which I've written is ^\b[A-Za-z0-9_]{4,}\b:. However, I don't know how to accomplish the above in a fast way. Any ideas?
Regex is perfect for this job - using Replace with a match evaluator:
This example is not tested nor compiled:
public class Fix
{
public static String Execute(string largeText)
{
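// verbatim string so \w is not treated as a C# escape; Multiline lets ^ match at the start of every line
return Regex.Replace(largeText, @"^(\w{4,}):", new Fix().Evaluator, RegexOptions.Multiline);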
}
private Dictionary<String, int> counters = new Dictionary<String, int>();
private static String[] numbers = {"ONE", "TWO", "THREE",...};
public String Evaluator(Match m)
{
String word = m.Groups[1].Value;
int count;
if (!counters.TryGetValue(word, out count))
count = 0;
count++;
counters[word] = count;
return word + "_" + numbers[count-1] + ":";
}
}
This should return what you requested when calling:
result = Fix.Execute(largeText);
I think you can do this with Regex.Replace(string, string, MatchEvaluator) and a dictionary.
static Dictionary<string, int> wordCount = new Dictionary<string, int>();
static string AppendIndex(Match m)
{
// the match includes the trailing colon, so strip it before counting
string matchedString = m.Value.TrimEnd(':');
if (wordCount.ContainsKey(matchedString))
wordCount[matchedString] = wordCount[matchedString] + 1;
else
wordCount.Add(matchedString, 1);
return matchedString + "_" + wordCount[matchedString] + ":"; // in the format: word_1:, word_2:
}
string inputText = "....";
string regexText = @"";
static void Main()
{
string text = "....";
// RegexOptions.Multiline lets ^ match at the start of every line
string result = Regex.Replace(text, @"^\b[A-Za-z0-9_]{4,}\b:",
new MatchEvaluator(AppendIndex), RegexOptions.Multiline);
}
see this:
http://msdn.microsoft.com/en-US/library/cft8645c(v=VS.80).aspx
If I understand you correctly, regex is not necessary here.
You can split your large string by the ':' character. Maybe you also need to read line by line (split by '\n'). After that you just create a dictionary (IDictionary<string, int>), which counts the occurrences of certain words. Every time you find word x, you increase the counter in the dictionary.
EDIT
Read your file line by line OR split the string by '\n'
Check if your delimiter is present. Either by splitting by ':' OR using regex.
Get the first item from the split array OR the first match of your regex.
Use a dictionary to count your occurrences.
if (dictionary.ContainsKey(key)) dictionary[key]++;
else dictionary.Add(key, 1);
If you need words instead of numbers, then create another dictionary for these, so that it maps 1 to "ONE", 2 to "TWO", and so on. Maybe there is another solution for that.
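The steps listed above could be sketched as follows. This is one possible reading of them, not a definitive implementation: the names OccurrenceLabeler, Apply and Names are made up, the number-to-word table only goes up to five (larger counts fall back to digits), and a small regex is used only to detect the label at the start of a line.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class OccurrenceLabeler
{
    // Hypothetical lookup table; extend it if more occurrences are expected.
    static readonly string[] Names = { "ONE", "TWO", "THREE", "FOUR", "FIVE" };

    // A word of at least four characters followed by a colon, at the start of a line.
    static readonly Regex Label = new Regex(@"^(\w{4,}):");

    public static string Apply(string text)
    {
        var counts = new Dictionary<string, int>();
        string[] lines = text.Split('\n');
        for (int i = 0; i < lines.Length; i++)
        {
            Match m = Label.Match(lines[i]);
            if (!m.Success) continue;

            string word = m.Groups[1].Value;
            counts.TryGetValue(word, out int seen); // 0 if the word is new
            counts[word] = ++seen;

            string suffix = seen <= Names.Length ? Names[seen - 1] : seen.ToString();
            // Keep the colon and anything after it on the line unchanged.
            lines[i] = word + "_" + suffix + lines[i].Substring(word.Length);
        }
        return string.Join("\n", lines);
    }

    static void Main()
    {
        Console.WriteLine(Apply("word:\nTEST:\nword:\nTEST:\nTEST: // random text"));
    }
}
```

With the sample input from the question, the last line becomes "TEST_THREE: // random text", since the text after the colon is copied through untouched.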
Look at this example (I know it's not perfect and not so nice).
Let's set aside the exact argument for the Split function; I think this can help:
static void Main(string[] args)
{
string a = "word:word:test:-1+234=567:test:test:";
string[] tks = a.Split(':');
Regex re = new Regex(@"^\b[A-Za-z0-9_]{4,}\b");
var res = from x in tks
where re.Matches(x).Count > 0
select x + DecodeNO(tks.Count(y=>y.Equals(x)));
foreach (var item in res)
{
Console.WriteLine(item);
}
Console.ReadLine();
}
private static string DecodeNO(int n)
{
switch (n)
{
case 1:
return "_one";
case 2:
return "_two";
case 3:
return "_three";
}
return "";
}
I recently made a little application to read in a text file of lyrics, then use a Dictionary to calculate how many times each word occurs. However, for some reason I'm finding instances in the output where the same word occurs multiple times with a tally of 1, instead of being added onto the original tally of the word. The code I'm using is as follows:
StreamReader input = new StreamReader(path);
String[] contents = input.ReadToEnd()
.ToLower()
.Replace(",","")
.Replace("(","")
.Replace(")", "")
.Replace(".","")
.Split(' ');
input.Close();
var dict = new Dictionary<string, int>();
foreach (String word in contents)
{
if (dict.ContainsKey(word))
{
dict[word]++;
}else{
dict[word] = 1;
}
}
var ordered = from k in dict.Keys
orderby dict[k] descending
select k;
using (StreamWriter output = new StreamWriter("output.txt"))
{
foreach (String k in ordered)
{
output.WriteLine(String.Format("{0}: {1}", k, dict[k]));
}
output.Close();
timer.Stop();
}
The text file I'm inputting is here: http://pastebin.com/xZBHkjGt (it's the lyrics of the top 15 rap songs, if you're curious)
The output can be found here: http://pastebin.com/DftANNkE
A quick Ctrl-F shows that "girl" occurs at least 13 different times in the output. As far as I can tell, it is the exact same word, unless there's some sort of difference in ASCII values. Yes, there are some instances in there with odd characters in place of an apostrophe, but I'll worry about those later. My priority is figuring out why the exact same word is being counted 13 different times as different words. Why is this happening, and how do I fix it? Any help is much appreciated!
Another way is to split on non words.
var lyrics = "I fly with the stars in the skies I am no longer tryin' to survive I believe that life is a prize But to live doesn't mean your alive Don't worry bout me and who I fire I get what I desire, It's my empire And yes I call the shots".ToLower();
var contents = Regex.Split(lyrics, @"[^\w'+]");
Also here's an alternative (and probably more obscure) loop
int value;
foreach (var word in contents)
{
dict[word] = dict.TryGetValue(word, out value) ? ++value : 1;
}
dict.Remove("");
If you notice, the repeat occurrences appear on a line following a word which apparently doesn't have a count.
You're not stripping out newlines, so em\r\ngirl is being treated as a different word.
String[] contents = input.ReadToEnd()
.ToLower()
.Replace(",", "")
.Replace("(", "")
.Replace(")", "")
.Replace(".", "")
.Split("\r\n ".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
Works better.
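For reference, the fix can be tried without the lyrics file using a sketch like this one. The names WordTally and Count are made up, and the sample text is invented to include a line break. Unlike the original code, which removes punctuation everywhere with Replace, this version only trims it from the ends of each word.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class WordTally
{
    // Splits on spaces, tabs and both newline characters, then counts each word.
    public static Dictionary<string, int> Count(string text)
    {
        char[] separators = { ' ', '\r', '\n', '\t' };
        char[] punctuation = { ',', '.', '(', ')' };
        var counts = new Dictionary<string, int>();

        foreach (string raw in text.ToLower()
                     .Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            string word = raw.Trim(punctuation);
            if (word.Length == 0) continue; // the token was punctuation only
            counts.TryGetValue(word, out int current);
            counts[word] = current + 1;
        }
        return counts;
    }

    static void Main()
    {
        var counts = Count("my girl,\r\ngirl (girl) said girl.");
        foreach (var pair in counts.OrderByDescending(p => p.Value))
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
    }
}
```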
Add Trim to each word:
foreach (String word in contents.Select(w => w.Trim()))