This question already has answers here:
Comparing list of strings with an available dictionary/thesaurus
(2 answers)
Closed 7 years ago.
I'm using C# to write a program that generates lines of text over and over. The user enters a set of numbers, 1-26, in whatever order, and the program matches each number to a letter.
The point is to have it go through every order of the alphabet until it eventually generates an actual word. For example, someone could enter 7-2-15-26-3, and it would eventually read that set of numbers as "hello".
I got the program to work and to print every outcome to a txt file, but because there are so many different possible outcomes, it is almost impossible to find an actual word in the file without going through every single line.
One of my tests had only 11 letters to choose from; it took a few minutes to finish, and the txt file was so big that it would not open.
So my question is, does anyone know of a library or spell check that I could use to check if each string is an actual word? If I could check it each time, I could have it only print the outcomes that are words. I would have it check against preset words, but I won't always know what the outcome will be so I need to check against everything.
I have searched online but haven't found much. Again, I'm using C#. Thank you for any help.
Edit: Sorry about asking a question that had already been answered, I didn't see the other question before. I'll try the NHunspell and see how that works.
Try NHunspell; it's a free .NET version of the popular Hunspell spell checker.
E.g., given a Hunspell instance named hunspell (created from an .aff/.dic dictionary file pair):
Check spelling:
bool correct = hunspell.Spell("Recommendation");
Get suggestions:
List<string> suggestions = hunspell.Suggest("Recommendatio");
More C# code samples
I suggest that you incorporate an English dictionary into your application so that you have something to check against.
Every time a new word is generated, the program checks it against the dictionary (for example with a regex match) and returns null if no word matches.
Hope this helps.
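A minimal sketch of the dictionary idea above, assuming a plain word list with one word per line. It swaps the regex match for a HashSet lookup, which is a different technique from the one the answer describes; the class name WordFilter and the file name words.txt are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: keeps only generated strings that appear in a word list.
public static class WordFilter
{
    // Builds a case-insensitive set from one-word-per-line input.
    public static HashSet<string> LoadWords(IEnumerable<string> lines)
    {
        var words = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (var line in lines)
        {
            var word = line.Trim();
            if (word.Length > 0) words.Add(word);
        }
        return words;
    }

    // Lazily yields only the candidates found in the set, so the full
    // list of generated strings never has to be written to disk.
    public static IEnumerable<string> OnlyRealWords(
        IEnumerable<string> candidates, HashSet<string> words)
    {
        return candidates.Where(c => words.Contains(c));
    }
}
```

Typical use would be WordFilter.LoadWords(System.IO.File.ReadLines("words.txt")) once at startup, then passing each batch of generated strings through OnlyRealWords before printing.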
Related
This question already has answers here:
How to get all the unique n-long combinations of a set of duplicatable elements?
(5 answers)
Closed 3 years ago.
I am trying to save every combination from AAAAAAAA to ZZZZZZZZ to a text file. So far, after many errors, I have got almost nowhere. I could post my code if needed, but it doesn't work or get anywhere near the wanted outcome.
So I was wondering how to do this in C#. My current method is beyond repair, so I will have to start over.
As the output I would like something along the lines of
AAAAAAAA, AAAAAAAB, AAAAAAAC ... ZZZZZZZX, ZZZZZZZY, ZZZZZZZZ
Thanks in advance for any help.
This is a basic combinatorics question:
You want to write a string of 8 characters.
Each character can be a letter between A-Z (26 options), therefore there are 26^8 combinations: 26*26*...*26 (eight factors).
That is 208,827,064,576 combinations.
Each combination is 10 bytes (8 for the string, then \r\n), which is a total of 2,088,270,645,760 bytes, or 1944.85 GiB.
Are you sure you want to write it to a file?
This will take about 1.9 TiB (roughly 2.1 TB). That's a huge text file to start with, probably impractical.
Secondly, the way to do this simply is to have 8 nested loops, each running through A to Z, then concatenate the string inside the inner loop, appending to the data store each time.
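As an alternative to writing out eight nested loops by hand, the same sequence can be produced by treating the string as a base-26 counter. This is a sketch, not the answerer's code; the class name LetterStrings is hypothetical, and the length is a parameter so that a small value can be tried before committing to 8.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: enumerates AA..A through ZZ..Z in order.
public static class LetterStrings
{
    public static IEnumerable<string> All(int length)
    {
        if (length < 1) throw new ArgumentOutOfRangeException(nameof(length));

        var buffer = new char[length];
        for (int i = 0; i < length; i++) buffer[i] = 'A';

        while (true)
        {
            yield return new string(buffer);

            // Increment like an odometer: roll trailing 'Z's back to 'A',
            // then bump the first position that is not 'Z'.
            int pos = length - 1;
            while (pos >= 0 && buffer[pos] == 'Z')
            {
                buffer[pos] = 'A';
                pos--;
            }
            if (pos < 0) yield break; // every position was 'Z': done
            buffer[pos]++;
        }
    }
}
```

Because the strings are yielded one at a time, they can be streamed to a StreamWriter (or filtered) without holding the whole set in memory; the total output size computed above still applies if everything is written.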
This question already has answers here:
Counting the occurrences of every duplicate words in a string using dictionary in c# [closed]
(3 answers)
Closed 6 years ago.
I am building something where the user enters any URL and the page's text is retrieved.
The text will then be parsed and the words will be counted.
I am currently reading this article from Microsoft:
https://msdn.microsoft.com/en-us/library/bb546166.aspx
I can now get the text, and I am trying to think of an efficient way to count every word.
The article's example requires a search term, but I need to count every word, not one specific word.
Here is what i am thinking:
get the text and convert it to a string
split it on delimiters and store the pieces in an array
loop through the array and count the occurrences of each word
Would this be efficient?
Using Linq
If you have a small amount of data, you can just split on spaces and group the results:
var theString = MethodToGetStringFromUrl(urlString);
var wordCount = theString
    .Split(' ')
    .GroupBy(a => a)
    .Select(a => new { word = a.Key, Count = a.Count() });
See the fiddle for a working copy.
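The grouping above splits on a single space and treats "The" and "the" as different words. The sketch below is one possible variant of the split-and-count plan from the question, using a Dictionary instead of GroupBy; the class name WordCounter and the particular separator list are assumptions, and real web text would likely need more separators.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: counts words in one pass, ignoring case.
public static class WordCounter
{
    public static Dictionary<string, int> Count(string text)
    {
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);

        // Assumed separator set: whitespace plus common punctuation.
        var separators = new[] { ' ', '\t', '\r', '\n', '.', ',', ';', ':', '!', '?', '"', '(', ')' };

        foreach (var word in text.Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            int n;
            counts.TryGetValue(word, out n); // n stays 0 for a new word
            counts[word] = n + 1;
        }
        return counts;
    }
}
```

Each word costs one hash lookup and one update, so the count is linear in the length of the text.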
Some Experiments and Results
I experimented in .NET Fiddle a little, and using regexes actually decreased performance and increased memory use; see here for what I am talking about.
Other alternative
Because the text comes from a URL request, it might be more performant to search the response stream directly, before converting it to a string.
Don't optimize unless you need to
Why do you need a performant way to do this count? Have you run into an actual problem, or do you just expect to? A good rule of thumb is not to optimize prematurely; for more information, check out this good question on the topic: When is optimisation premature?
This question already has answers here:
Read last line of text file
(6 answers)
Closed 8 years ago.
Scenario is the following:
A (weather) service dumps sensor data into a log file/text file.
The new readings are appended to the bottom of a given (existing) file
New data is added at regular intervals (interval may or may not be known)
I need to parse the new information/line and send it off to another service.
I don't want to read the whole file every time, unless I have to.
EDIT: Sorry for the bad wording. "unless I have to" should be understood as if there is no other way around. I have seen the post/answer referenced and it seems a little extensive.
Framework is 4.5.x.
Thank you.
To get the last line of a text file you can use this (it needs using System.IO and System.Linq):
File.ReadLines(myFileName).Last();
This is the simplest method, but it is inefficient because it still reads through the whole file. You can write your own parser as shown here
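One way such a parser could look is to seek to the end of the file and scan backwards for the previous line break. This is a sketch under stated assumptions, not the linked answer's code: the class name LogTail is hypothetical, the file is assumed to be UTF-8 with \n or \r\n line endings, and FileShare.ReadWrite is used on the assumption that the service keeps the log open for writing.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: reads only the tail of the file.
public static class LogTail
{
    // Returns the last line, ignoring trailing line breaks,
    // or null if the file has no content.
    public static string ReadLastLine(string path)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            // Step back over any trailing \r or \n bytes.
            long end = stream.Length;
            while (end > 0)
            {
                stream.Position = end - 1;
                int b = stream.ReadByte();
                if (b != '\n' && b != '\r') break;
                end--;
            }
            if (end == 0) return null;

            // Walk backwards to the byte after the previous '\n'.
            long start = end;
            while (start > 0)
            {
                stream.Position = start - 1;
                if (stream.ReadByte() == '\n') break;
                start--;
            }

            // Read just the bytes of the last line and decode them.
            var bytes = new byte[end - start];
            stream.Position = start;
            int read = 0;
            while (read < bytes.Length)
            {
                int n = stream.Read(bytes, read, bytes.Length - read);
                if (n == 0) break;
                read += n;
            }
            return Encoding.UTF8.GetString(bytes, 0, read);
        }
    }
}
```

If several readings can arrive between checks, remembering the file length from the previous check and reading from that offset onward may fit better than fetching only the last line.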
This question already has answers here:
How to validate phone numbers using regex
(43 answers)
Closed 9 years ago.
My question was marked as a duplicate so I've made a couple edits. As I said, I was able to find many similar questions when I searched but none were quite what I needed. I am not validating a string where the only thing present will be the phone number (this seems to be what most of the other questions are addressing). Rather, I am attempting to pull out all phone numbers (which will then be manually checked by the user) from a larger block of text. The problem I am having is that my regular expression is matching zip codes with extensions (ex: 45202-4787), and I am not sure how to alter my regex to avoid that. If this truly is a duplicate question then I apologize for not being able to find the existing one that deals with my issue.
My specifications for phone number format are:
1) -, ., and space as delimiters (and in any combination)
2) area code may appear with or without parentheses
A few examples:
(xxx) xxx-xxxx
(xxx) xxx.xxxx
xxx-xxx-xxxx
xxx xxx-xxxx
xxxxxxxxxxx
I am using Anirudh's regex from the comments:
(\(?\d{3}\)?)?[. -]?\d{3}[. -]?\d{4}
Again, my problem is that this regex matches zip codes with extensions (ex: 45202-4787).
I would be grateful for any help, as I'm very new to using regular expressions. Thanks!
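This should do it (the area code is required here; the ^ and $ anchors make it match a whole string, so remove them when searching inside a larger block of text):
^(\([0-9]{3}\)|[0-9]{3})[ .-]?[0-9]{3}[ .-]?[0-9]{4}$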
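For pulling numbers out of a larger block of text, one option is to drop the anchors and guard both ends with lookarounds so that a match cannot start after a digit or hyphen or run into another digit. The sketch below is an assumption-laden variant, not the answerer's pattern: the class name PhoneFinder is hypothetical, and the area code is treated as required, which is what keeps ZIP+4 codes such as 45202-4787 from matching.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: extracts candidate phone numbers for manual review.
public static class PhoneFinder
{
    // (?<![0-9-])  not preceded by a digit or hyphen
    // area code    with or without parentheses (required)
    // [ .-]?       optional space, dot, or hyphen delimiter
    // (?![0-9])    not followed by another digit
    private static readonly Regex Pattern = new Regex(
        @"(?<![0-9-])(?:\([0-9]{3}\)|[0-9]{3})[ .-]?[0-9]{3}[ .-]?[0-9]{4}(?![0-9])");

    public static List<string> FindAll(string text)
    {
        var results = new List<string>();
        foreach (Match m in Pattern.Matches(text))
        {
            results.Add(m.Value);
        }
        return results;
    }
}
```

Since the results are checked by hand anyway, a pattern that errs slightly toward extra candidates may be acceptable; the lookarounds only remove the obvious false positives inside longer digit runs.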
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Parsing CSV files in C#
I have a C# application that parses a pipe delimited file. It uses the Regex.Split method:
Regex.Split(line, @"(?<!(?<!\\)*\\)\|")
However, a data file recently came across with a pipe included in one of the data fields. The field in question used quoted identifiers, so when you open the file in Excel it opens correctly.
For example I have a file that looks like:
Field1|Field2|"Field 3 has a | inside the quotes"|Field4
When I use the above regex it parses to:
Field1
Field2
Field 3 has a
inside the quotes
Field4
when I would like
Field1
Field2
Field 3 has a | inside the quotes
Field4
I've done a fair amount of research and can't seem to get the Regex.Split to split the file on pipes but respect the quoted identifiers. Any help is greatly appreciated!
Here is a quick expression I've thrown together that seems to do the trick:
"([^"]+)"|([^\|]+)
Your expression seems to be doing something with backslashes as well, so you might need to extend this expression to cover any other needs you have. I've ignored them in my answer because they were not explained in the question, so I cannot provide a solution without knowing why they are there; they may not need to be there at all.
Also, my expression ignores empty fields (i.e. 1||2|3 would come out as 1, 2 and 3 only). I don't know whether this is what you need; if it isn't, let me know and I can change the expression to cater for that too.
Hope this helps anyway.
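If empty fields need to be kept, a small character-by-character splitter is one alternative to a regex. This is a sketch with a hypothetical class name (PipeLine); it assumes a doubled quote ("") inside a quoted field means a literal quote, and it does not handle the backslash escaping that the original Regex.Split pattern appears to support.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: splits on pipes outside double quotes.
public static class PipeLine
{
    public static List<string> SplitFields(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (c == '"')
            {
                if (inQuotes && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // assumed: "" is a literal quote
                    i++;
                }
                else
                {
                    inQuotes = !inQuotes; // quotes themselves are dropped
                }
            }
            else if (c == '|' && !inQuotes)
            {
                fields.Add(current.ToString()); // end of field, may be empty
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }
        fields.Add(current.ToString()); // last field
        return fields;
    }
}
```

With the sample line from the question this yields four fields, the third being the quoted text with its pipe intact, and 1||2|3 yields four fields with an empty second one.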