That's what I've written so far:
string omgwut;
omgwut = textBox1.Text;
omgwut = omgwut.Replace(" ", "snd\\space.wav");
omgwut = omgwut.Replace("a", "snd\\a.wav");
Now, the problem is that this code would turn
"snd\space.wav"
into
"snd\spsnd\a.wavce.wsnd\a.wavv"
in line four. Not what I'd want! Now I know I'm not good at C#, so that's why I'm asking.
Solutions would be great! Thanks!
You'll still need to write the getSoundForChar() function, but this should do what you're asking. I'm not sure, though, that what you're asking will do what you want, i.e., play the sound for the associated character. You might be better off putting them in a List<string> for that.
StringBuilder builder = new StringBuilder();
foreach (char c in textBox1.Text)
{
string sound = getSoundForChar( c );
builder.Append( sound );
}
string omgwut = builder.ToString();
Here's a start:
public string getSoundForChar( char c )
{
    string sound = null;
    // Compare the incoming character, not the (still null) result string.
    if (c == ' ')
    {
        sound = "snd\\space.wav";
    }
    // ... handle other special characters with further else-if branches
    else
    {
        sound = string.Format( "snd\\{0}.wav", c );
    }
    return sound;
}
The problem is that you are doing multiple passes of the data. Try just stepping through the characters of the string in a loop and replacing each 'from' character by its 'to' string. That way you're not going back over the string and re-doing those characters already replaced.
Also, create a separate output string or array, instead of modifying the original. Ideally use a StringBuilder, and append the new string (or the original character if not replacing this character) to it.
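A minimal sketch of the single-pass idea described above. The names `SoundPaths`, `ToSoundFiles` and the `Special` table are hypothetical; only the `snd\<name>.wav` layout comes from the question. It returns a list rather than one concatenated string, on the assumption that the files will be played one after another.

```csharp
using System.Collections.Generic;

public static class SoundPaths
{
    // Hypothetical table for characters whose file name differs from the character itself.
    private static readonly Dictionary<char, string> Special = new Dictionary<char, string>
    {
        { ' ', "space" },
    };

    // Visits each input character exactly once, so text that has already
    // been replaced is never scanned again.
    public static List<string> ToSoundFiles(string text)
    {
        var files = new List<string>(text.Length);
        foreach (char c in text)
        {
            string name;
            if (!Special.TryGetValue(c, out name))
            {
                name = c.ToString();
            }
            files.Add("snd\\" + name + ".wav");
        }
        return files;
    }
}
```

Calling `SoundPaths.ToSoundFiles(textBox1.Text)` gives one path per character; `string.Concat(...)` over the result reproduces the single string from the question if that is really what is needed.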
I do not know of a way to simultaneously replace different characters in C#.
You could loop over all characters and build a result string from that (use a stringbuilder if the input string can be long). For each character, you append its replacement to the result string(builder).
But what are you trying to do? I cannot think of a useful application of appending file paths without any separator.
How to get the whole text from a document concatenated into a single string. I'm trying to split text by period: string[] words = s.Split('.'); I want to take this text from a text document. But if my text document contains empty lines between strings, for example:
pat said, “i’ll keep this ring.”
she displayed the silver and jade wedding ring which, in another time track,
she and joe had picked out; this
much of the alternate world she had elected to retain. he wondered what - if any - legal basis she had kept in addition. none, he hoped; wisely, however, he said nothing. better not even to ask.
result looks like this:
1. pat said ill keep this ring
2. she displayed the silver and jade wedding ring which in another time track
3. she and joe had picked out this
4. much of the alternate world she had elected to retain
5. he wondered what if any legal basis she had kept in addition
6. none he hoped wisely however he said nothing
7. better not even to ask
but desired correct output should be like this:
1. pat said ill keep this ring
2. she displayed the silver and jade wedding ring which in another time track she and joe had picked out this much of the alternate world she had elected to retain
3. he wondered what if any legal basis she had kept in addition
4. none he hoped wisely however he said nothing
5. better not even to ask
So to do this first I need to process text file content to get whole text as single string, like this:
pat said, “i’ll keep this ring.” she displayed the silver and jade wedding ring which, in another time track, she and joe had picked out; this much of the alternate world she had elected to retain. he wondered what - if any - legal basis she had kept in addition. none, he hoped; wisely, however, he said nothing. better not even to ask.
I can't do this the same way as I would with list content, for example: string concat = String.Join(" ", text.ToArray());
I'm not sure how to concatenate the text from a text document into a single string.
I think this is what you want:
var fileLocation = @"c:\myfile.txt";
var stringFromFile = File.ReadAllText(fileLocation);
//replace Environment.NewLine with any new line character your file uses
var withoutNewLines = stringFromFile.Replace(Environment.NewLine, "");
//modify to remove any unwanted character
var withoutUglyCharacters = Regex.Replace(withoutNewLines, "[“’”,;-]", "");
var withoutTwoSpaces = withoutUglyCharacters.Replace("  ", " ");
var result = withoutTwoSpaces.Split('.').Where(i => i != "").Select(i => i.TrimStart()).ToList();
So first you read all text from your file, then you remove all unwanted characters and then split by . and return non empty items
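One detail worth checking: replacing line breaks with an empty string glues the last word of one line to the first word of the next ("track,she"). A small sketch that treats every run of whitespace, blank lines included, as a single space before splitting. `SentenceSplitter` is a hypothetical name, and punctuation stripping is left out on purpose; it can be applied before the split exactly as in the answer above.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class SentenceSplitter
{
    // Splits raw text into sentences. Line breaks are treated as spaces so
    // that words on adjacent lines do not run together.
    public static string[] Split(string text)
    {
        // Collapse every run of whitespace (including \r\n and blank lines) into one space.
        string flat = Regex.Replace(text, @"\s+", " ");

        return flat.Split('.')
                   .Select(s => s.Trim())
                   .Where(s => s.Length > 0)
                   .ToArray();
    }
}
```

Usage would be `SentenceSplitter.Split(File.ReadAllText(fileLocation))`.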
Have you tried replacing double new-lines before splitting using a period?
static string[] GetSentences(string filePath) {
if (!File.Exists(filePath))
throw new FileNotFoundException($"Could not find file { filePath }!");
var lines = string.Join("", File.ReadLines(filePath).Where(line => !string.IsNullOrEmpty(line) && !string.IsNullOrWhiteSpace(line)));
var sentences = Regex.Split(lines, @"\.[\s]{1,}?");
return sentences;
}
I haven't tested this, but it should work.
Explanation:
if (!File.Exists(filePath))
throw new FileNotFoundException($"Could not find file { filePath }!");
Throws an exception if the file could not be found. It is advisable to surround the method call with a try/catch.
var lines = string.Join("", File.ReadLines(filePath).Where(line => !string.IsNullOrEmpty(line) && !string.IsNullOrWhiteSpace(line)));
Creates a string, and ignores any lines which are purely whitespace or empty.
var sentences = Regex.Split(lines, @"\.[\s]{1,}?");
Creates a string array, where the string is split at every period and whitespace following the period.
E.g:
The string "I came. I saw. I conquered" would become
I came
I saw
I conquered
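The example above can be checked with a few lines. This is only a sketch: `SplitDemo` is a hypothetical name, and the pattern is written as `\.\s+`, which for this input behaves the same as the pattern in the answer.

```csharp
using System.Text.RegularExpressions;

public static class SplitDemo
{
    public static string[] Sentences(string text)
    {
        // A literal period followed by one or more whitespace characters.
        return Regex.Split(text, @"\.\s+");
    }
}
```

`SplitDemo.Sentences("I came. I saw. I conquered")` returns the three pieces listed above. Note that a final period with nothing after it is not followed by whitespace, so it would stay attached to the last sentence.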
Update:
Here's the method as a one-liner, if that's your style?
static string[] SplitSentences(string filePath) => File.Exists(filePath) ? Regex.Split(string.Join("", File.ReadLines(filePath).Where(line => !string.IsNullOrEmpty(line) && !string.IsNullOrWhiteSpace(line))), @"\.[\s]{1,}?") : null;
I would suggest iterating through all the characters and checking whether each one is in the range 'a' <= char <= 'z' or is a space (char == ' '). If it matches the condition, add it to the newly created string; otherwise check whether it is the '.' character, and if it is, end your line and start another one:
List<string> lines = new List<string>();
string line = string.Empty;
foreach(char c in str)
{
if((char.ToLower(c) >= 'a' && char.ToLower(c) <= 'z') || c == 0x20)
line += c;
else if(c == '.')
{
lines.Add(line.Trim());
line = string.Empty;
}
}
Or if you prefer "one-liner"s :
IEnumerable<string> lines = new string(str.Select(c => (char)(((char.ToLower(c) >= 'a' && char.ToLower(c) <= 'z') || c == 0x20) ? c : c == '.' ? '\n' : '\0')).ToArray()).Split('\n').Select(s => s.Trim());
I may be wrong about this, but I would think that you may not want to alter the string if you are only splitting it. For example, part of the string contains curly single/double quotes (“). Removing them may not be desired, which raises another question: reading a text file that contains such quotes (as your example data shows) like below:
var stringFromFile = File.ReadAllText(fileLocation);
will not display those characters properly in a text box or the console when the file is not saved as UTF-8, because the default encoding used by the ReadAllText method is UTF-8. For example, the quotes will display as diamonds (replacement characters) in a text box on a form and as a question mark (?) in the console. To keep the quotes and have them display properly, you can pass the OS's current ANSI encoding as a second parameter to the ReadAllText method, like below:
string stringFromFile = File.ReadAllText(fileLocation, Encoding.Default);
Below is code using a simple Split call to split the string on periods (.). Hope this helps.
private void button1_Click(object sender, EventArgs e) {
string fileLocation = @"C:\YourPath\YourFile.txt";
string stringFromFile = File.ReadAllText(fileLocation, Encoding.Default);
string bigString = stringFromFile.Replace(Environment.NewLine, "");
string[] result = bigString.Split('.');
int count = 1;
foreach (string s in result) {
if (s != "") {
textBox1.Text += count + ". " + s.Trim() + Environment.NewLine;
Console.WriteLine(count + ". " + s.Trim());
count++;
}
else {
// period at the end of the string
}
}
}
I'm currently trying to strip a string of data that may contain the hyphen symbol.
E.g. Basic logic:
string stringin = "test - 9894"; OR Data could be == "test";
if (string contains a hyphen "-"){
Strip stringin;
output would be "test" deleting from the hyphen.
}
Console.WriteLine(stringin);
The current C# code I'm trying to get to work is shown below:
string Details = "hsh4a - 8989";
var regexItem = new Regex("^[^-]*-?[^-]*$");
string stringin;
stringin = Details.ToString();
if (regexItem.IsMatch(stringin)) {
stringin = stringin.Substring(0, stringin.IndexOf("-") - 1); //Strip from the ending chars and - once - is hit.
}
Details = stringin;
Console.WriteLine(Details);
But it throws an error when the string does not contain any hyphens.
How about just doing this?
stringin.Split('-')[0].Trim();
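You could even specify the maximum number of substrings using an overload of Split. A count of 2 stops splitting after the first hyphen (a count of 1 would return the whole string unsplit):
stringin.Split(new[] { '-' }, 2)[0].Trim();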
Your regex is asking for "zero or one repetition of -", which means that it matches even if your input does NOT contain a hyphen. Thereafter you do this
stringin.Substring(0, stringin.IndexOf("-") - 1)
This throws an ArgumentOutOfRangeException: there is no hyphen to find, so IndexOf returns -1 and the requested length becomes -2.
Make a simple change to your regex and it works with or without - ask for "one or more hyphens":
var regexItem = new Regex("^[^-]*-+[^-]*$");
here -------------------------^
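To see the difference between the two patterns side by side, here is a small sketch. `HyphenCheck`, `Optional` and `Required` are hypothetical names; the patterns themselves are the ones quoted above.

```csharp
using System.Text.RegularExpressions;

public static class HyphenCheck
{
    // Pattern from the question: the hyphen is optional ("-?"),
    // so input without any hyphen still matches.
    public static readonly Regex Optional = new Regex("^[^-]*-?[^-]*$");

    // Corrected pattern: at least one hyphen is required ("-+").
    public static readonly Regex Required = new Regex("^[^-]*-+[^-]*$");
}
```

`HyphenCheck.Optional.IsMatch("test")` is true, which is why the Substring call is reached with no hyphen present, while `HyphenCheck.Required.IsMatch("test")` is false and `HyphenCheck.Required.IsMatch("test - 9894")` is true.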
It seems that you want the (sub)string starting from the dash ('-') if the original one contains '-', or the original string if it doesn't have a dash.
If that's your case:
String Details = "hsh4a - 8989";
Details = Details.Substring(Details.IndexOf('-') + 1);
I wouldn't use regex for this case if I were you; it makes the solution much more complex than it needs to be.
For a string that I am sure will have no more than a couple of dashes, I would use this code, because it is a one-liner and very simple:
string str= entryString.Split(new [] {'-'}, StringSplitOptions.RemoveEmptyEntries)[0];
If you know that a string might contain a large number of dashes, this approach is not recommended: it creates a large number of separate strings although you are looking for just the first one. In that case the solution would look something like this:
int firstDashIndex = entryString.IndexOf("-");
string str = firstDashIndex > -1? entryString.Substring(0, firstDashIndex) : entryString;
You don't need a regex for this. A simple IndexOf call will give you the index of the hyphen, then you can clean it up from there.
This is also a great place to start writing unit tests as well. They are very good for stuff like this.
Here's what the code could look like :
string inputString = "ho-something";
string outPutString = inputString;
var hyphenIndex = inputString.IndexOf('-');
if (hyphenIndex > -1)
{
outPutString = inputString.Substring(0, hyphenIndex);
}
return outPutString;
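Wrapping the snippet in a method makes it easy to unit test, as suggested above. A sketch only: `HyphenStripper` and `BeforeHyphen` are hypothetical names, and the trailing `Trim()` is an added assumption so that "test - 9894" yields "test" rather than "test ".

```csharp
public static class HyphenStripper
{
    // Returns the text before the first hyphen, trimmed; returns the
    // (trimmed) input unchanged when it contains no hyphen.
    public static string BeforeHyphen(string input)
    {
        int hyphenIndex = input.IndexOf('-');
        string head = hyphenIndex > -1 ? input.Substring(0, hyphenIndex) : input;
        return head.Trim();
    }
}
```

Cases worth covering in tests: a string with a hyphen ("test - 9894"), one without ("test"), and one that starts with a hyphen ("-x", which yields an empty string).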
Okay, so I'm creating a hangman game and everything functions so far, including what I'm TRYING to do in the question.
But it feels like there should be a much more efficient way of obtaining the char, one that also makes the index easier to manipulate.
protected static void alphabetSelector(string activeWordAlphabet)
{
char[] activeWord = activeWordAlphabet.ToCharArray();
string activeWordString = new string(activeWord);
Console.WriteLine("If you'd like to guess a letter, enter the letter. \n" +
"If you'd like to guess the word, please type in the word. --- testing answer{0}",
activeWordString);
//Console.WriteLine("For Testing Purposes ONLY");
String chosenLetter = Console.ReadLine();
//Char[] letterFinder = Array.FindAll(activeWord, s => s.Equals(chosenLetter));
//string activeWordString = new string(activeWord);
foreach (char letter in activeWord)
{
if(activeWordString.Contains(chosenLetter))
{
Console.WriteLine("{0}", activeWordString);
Console.ReadLine();
}
else
{
Console.WriteLine("errrr...wrong!");
Console.ReadLine();
}
}
}
I have broken up the code in some areas to prevent the reader from having to scroll sideways. If this is bothersome, please let me know and I'll leave it in the future.
So this code will successfully print out the 'word' whenever I select the correct letter from the random word (I have the console print the actual word so that I can test it successfully each time). It will also print 'wrong' when I choose a letter NOT in the string.
But I feel like I should be able to use the
Array.FindAll(activeWord, ...)
functionality or some other way. But every time I try and reorder the arguments, it gives me all kinds of different errors and tells me to redo my arguments.
So, if you can look at this and find an easier method of searching the actual array for the user-selected 'letter', please help!! Even if it's not using the Array.FindAll method!!
Edit
Okay, it seems like there's some confusion with what I've done and why I've done it.
I'm ONLY printing the word inside that 'if' statement to test and make sure that the foreach{if{}} will actually work to find the char inside the string. But I ultimately need to be able to provide a placeholder for a char that is successfully found, as well as being able to 'cross out' the letter (from the alphabet list not shown here).
It's hangman - surely you guys know what I'm needing it to do. It has to keep track of which letters are left in the word, which letters have been chosen, as well as which letters are left in the entire alphabet.
I'm a 4-day old newb when it comes to programming, so please. . . I'm only doing what I know to do and when I get errors, I comment things out and write more until I find something that works.
Take a look at this demo I put together for you: https://dotnetfiddle.net/eP9TQM
I'd suggest creating a second string for the display string. Use a StringBuilder, and you can replace the characters in it at specific indices while creating the fewest string objects in the process.
string word = "your word or phrase here";
//Initialize a new StringBuilder that will display the word with placeholders.
StringBuilder display = new StringBuilder(word.Length); //You know the display word is the same length as the original word
display.Append('-', word.Length); //Fill it with placeholders.
So now you have your phrase/word, and a string builder full of characters that need to be discovered.
Go ahead and convert the display StringBuilder to a string that you can check on each pass to see if it equals your word:
var displayString = display.ToString();
//Loop until the display string is equal to the word
while (!displayString.Equals(word))
{
//Inside here your logic will follow.
}
So you are basically looping until the person answers here. You could of course go back and add logic to limit the number of attempts, or whatever you desire as an alternate exit strategy.
Inside this logic, you will check if they guessed a letter or a word based on how many characters they entered.
If they guessed a word, the logic is simple. Check if the guessed word is the same as the hidden word. If it is, then you break the loop and they are done. Otherwise, guessing loops back around.
If they guessed a letter, the logic is pretty straightforward, but more involved.
First get the character they guessed, just because it may be easier to work with this way.
char guess = input[0];
Now, look over the word for instances of that character:
//Look for instances of the character in the word.
for (int i = 0; i < word.Length; ++i)
{
//If the current index in the word matches their guess, then update the display.
if (char.ToUpperInvariant(word[i]) == char.ToUpperInvariant(guess))
display[i] = word[i];
}
The comments above should explain the idea here.
Update your displayString at the bottom of the loop so that it will check against the hidden word again:
displayString = display.ToString();
That's really all you need to do here. No fancy Linq needed.
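Pulled together, the letter-guess step might look like the sketch below. `Hangman`, `NewDisplay` and `Reveal` are hypothetical names; returning the hit count is an added convenience so the caller can tell a wrong guess from a right one.

```csharp
using System.Text;

public static class Hangman
{
    // Builds the initial display: one placeholder per character of the word.
    public static StringBuilder NewDisplay(string word)
    {
        return new StringBuilder(word.Length).Append('-', word.Length);
    }

    // Reveals every position where the guess matches (case-insensitive)
    // and returns how many positions were matched.
    public static int Reveal(string word, StringBuilder display, char guess)
    {
        int hits = 0;
        for (int i = 0; i < word.Length; ++i)
        {
            if (char.ToUpperInvariant(word[i]) == char.ToUpperInvariant(guess))
            {
                display[i] = word[i];
                hits++;
            }
        }
        return hits;
    }
}
```

Inside the `while` loop, a single-character input would call `Hangman.Reveal(word, display, input[0])`; a return value of 0 means the guess was wrong.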
Ok your code is really confusing, even with your edit.
First, why these 2 lines of code since activeWordAlphabet is a string :
char[] activeWord = activeWordAlphabet.ToCharArray();
string activeWordString = new string(activeWord);
Then you do your foreach.
For the word "FooBar", if the player types 'F', you will print
FooBar
FooBar
FooBar
FooBar
FooBar
FooBar
How does this help you in anything?
I think you have to review your algorithm. The string type has the function you need:
int chosenLetterPosition = activeWordString.IndexOf(chosenLetter, alreadyFoundPosition);
alreadyFoundPosition is an int giving the position from which the function starts searching for the letter.
IndexOf() returns -1 if the letter is not found, or the zero-based index of the letter otherwise.
You can save this position with your letter in a dictionary and, if the chosenLetter is already in the dictionary, use the saved position plus one as your new 'alreadyFoundPosition'.
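A short sketch of that idea, collecting every position of a letter by restarting the search just past the previous hit. `LetterSearch` and `PositionsOf` are hypothetical names, and a list is used here in place of the dictionary mentioned above.

```csharp
using System.Collections.Generic;

public static class LetterSearch
{
    // Returns the zero-based index of every occurrence of letter in word.
    public static List<int> PositionsOf(string word, char letter)
    {
        var positions = new List<int>();
        int index = word.IndexOf(letter);
        while (index >= 0)
        {
            positions.Add(index);
            // Resume one past the last hit; a start index equal to
            // word.Length is allowed and simply yields -1.
            index = word.IndexOf(letter, index + 1);
        }
        return positions;
    }
}
```

An empty list means the guess was wrong; otherwise the indices say exactly which placeholders to fill in.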
This is my answer. Because I don't have a lot of tasks today :)
class Letter
{
public bool ischosen { get; set; }
public char value { get; set; }
}
class LetterList
{
public LetterList(string word)
{
_lst = new List<Letter>();
word.ToList().ForEach(x => _lst.Add(new Letter() { value = x }));
}
public bool FindLetter(char letter)
{
var search = _lst.Where(x => x.value == letter).ToList();
search.ForEach(x=>x.ischosen=true);
return search.Count > 0;
}
public string NotChosen()
{
var res = "";
_lst.Where(x => !x.ischosen).ToList().ForEach(x => { res += x.value; });
return res;
}
List<Letter> _lst;
}
How to use
var abc = new LetterList("abcdefghijklmnopqrstuvwxyz");
var answer = new LetterList("myanswer");
Console.WriteLine("This my question. Why? write your answer please");
char x = Console.ReadLine()[0];
if (answer.FindLetter(x))
{
Console.WriteLine("you are right!");
}
else
{
Console.WriteLine("fail");
}
abc.FindLetter(x);
Console.WriteLine("not chosen abc:{0} answer:{1}", abc.NotChosen(), answer.NotChosen());
At least we used to play this game like that when I was a child.
Say I have a string like
MyString1 = "ABABABABAB";
MyString2 = "ABCDABCDABCD";
MyString3 = "ABCAABCAABCAABCA";
MyString4 = "ABABACAC";
MyString5 = "AAAAABBBBB";
and I need to get the following output
Output1 = "5(AB)";
Output2 = "3(ABCD)";
Output3 = "4(ABCA)";
Output4 = "2(AB)2(AC)";
Output5 = "5(A)5(B)";
I have been looking at RLE but I can't figure out how to do the above.
The code I have been using is
public static string Encode(string input)
{
return Regex.Replace(input, @"(.)\1*", delegate(Match m)
{
return string.Concat(m.Value.Length, "(", m.Groups[1].Value, ")");
});
}
This works for Output5 but can I do the other Outputs with Regex or should I be using something like Linq?
The purpose of the code is to display MyString in a simple manner, as MyString can be up to 1000 characters long, generally with a pattern to it.
I am not too worried about speed.
Using RLE with single characters is easy, there never is an overlap between matches. If the number of characters to repeat is variable, you'd have a problem:
AAABAB
Could be:
3(A)BAB
Or
AA2(AB)
You'll have to define what rules you want to apply. Do you want the absolute best compression? Does speed matter?
I doubt Regex can look forward and select "the best" combination of matches - So to answer your question I would say "no".
RLE is of no help here - it's just an extremely simple compression where you repeat a single code-point a given number of times. This was quite useful for e.g. game graphics and transparent images ("next, there's 50 transparent pixels"), but is not going to help you with variable-length code-points.
Instead, have a look at Huffman encoding. Expanding it to work with variable-length codewords is not exactly cheap, but it's a start - and it saves a lot of space, if you can afford having the table there.
But the first thing you have to ask yourself is, what are you optimizing for? Are you trying to get the shortest possible string on output? Are you going for speed? Do you want as few code-words as possible, or do you need to balance the repetitions and code-word counts in some way? In other words, what are you actually trying to do? :))
To illustrate this on your "expected" return values, Output4 results in a longer string than MyString4. So it's not the shortest possible representation. You're not trying for the least amounts of code-words either, because then Output5 would be 1(AAAAABBBBB). Least amount of repetitions is of course silly (it would always be 1(...)). You're not optimizing for low overhead either, because that's again broken in Output4.
And whichever of those you are trying to do, I'm thinking it's not going to be possible with regular expressions - those only work for regular languages, and encoding like this doesn't seem all that regular to me. The decoding does, of course; but I'm not so sure about the encoding.
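That said, if "best" is not required, a greedy left-to-right heuristic can be written with a backreference. The sketch below is an assumption about what would be acceptable, not an optimal encoder: at each position it takes the shortest unit that repeats immediately and counts the copies, and it leaves unrepeated characters as they are. `RepeatEncoder` is a hypothetical name.

```csharp
using System.Text.RegularExpressions;

public static class RepeatEncoder
{
    public static string Encode(string input)
    {
        // (.+?) lazily captures the shortest unit that is immediately repeated;
        // \1+ then consumes as many further copies of it as possible.
        return Regex.Replace(input, @"(.+?)\1+", m =>
        {
            string unit = m.Groups[1].Value;
            int count = m.Value.Length / unit.Length;
            return count + "(" + unit + ")";
        });
    }
}
```

On the five strings from the question this gives the five requested outputs, and "AAABAB" becomes "3(A)BAB", i.e. the first of the two readings above. Backtracking makes this slow on long inputs with no repeats, which may be tolerable at around 1000 characters.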
Here is a Non-Regex way given the data that you provided. I'm not sure of any edge cases, right now, that would trip this code up. If so, I'll update accordingly.
string myString1 = "ABABABABAB";
string myString2 = "ABCDABCDABCD";
string myString3 = "ABCAABCAABCAABCA";
string myString4 = "ABABACAC";
string myString5 = "AAAAABBBBB";
CountGroupOccurrences(myString1, "AB");
CountGroupOccurrences(myString2, "ABCD");
CountGroupOccurrences(myString3, "ABCA");
CountGroupOccurrences(myString4, "AB", "AC");
CountGroupOccurrences(myString5, "A", "B");
CountGroupOccurrences() looks like the following:
private static void CountGroupOccurrences(string str, params string[] patterns)
{
string result = string.Empty;
while (str.Length > 0)
{
foreach (string pattern in patterns)
{
int count = 0;
int index = str.IndexOf(pattern);
while (index > -1)
{
count++;
str = str.Remove(index, pattern.Length);
index = str.IndexOf(pattern);
}
result += string.Format("{0}({1})", count, pattern);
}
}
Console.WriteLine(result);
}
Results:
5(AB)
3(ABCD)
4(ABCA)
2(AB)2(AC)
5(A)5(B)
UPDATE
This worked with Regex
private static void CountGroupOccurrences(string str, params string[] patterns)
{
string result = string.Empty;
foreach (string pattern in patterns)
{
result += string.Format("{0}({1})", Regex.Matches(str, pattern).Count, pattern);
}
Console.WriteLine(result);
}
The code below is designed to take a string in and remove any of a set of arbitrary words that are considered non-essential to a search phrase.
I didn't write the code, but need to incorporate it into something else. It works, and that's good, but it just feels wrong to me. However, I can't seem to get my head outside the box that this method has created to think of another approach.
Maybe I'm just making it more complicated than it needs to be, but I feel like this might be cleaner with a different technique, perhaps by using LINQ.
I would welcome any suggestions; including the suggestion that I'm over thinking it and that the existing code is perfectly clear, concise and performant.
So, here's the code:
private string RemoveNonEssentialWords(string phrase)
{
//This array is being created manually for demo purposes. In production code it's passed in from elsewhere.
string[] nonessentials = {"left", "right", "acute", "chronic", "excessive", "extensive",
"upper", "lower", "complete", "partial", "subacute", "severe",
"moderate", "total", "small", "large", "minor", "multiple", "early",
"major", "bilateral", "progressive"};
int index = -1;
for (int i = 0; i < nonessentials.Length; i++)
{
index = phrase.ToLower().IndexOf(nonessentials[i]);
while (index >= 0)
{
phrase = phrase.Remove(index, nonessentials[i].Length);
phrase = phrase.Trim().Replace("  ", " ");
index = phrase.IndexOf(nonessentials[i]);
}
}
return phrase;
}
Thanks in advance for your help.
Cheers,
Steve
This appears to be an algorithm for removing stop words from a search phrase.
Here's one thought: If this is in fact being used for a search, do you need the resulting phrase to be a perfect representation of the original (with all original whitespace intact), but with stop words removed, or can it be "close enough" so that the results are still effectively the same?
One approach would be to tokenize the phrase (using the approach of your choice - could be a regex, I'll use a simple split) and then reassemble it with the stop words removed. Example:
public static string RemoveStopWords(string phrase, IEnumerable<string> stop)
{
var tokens = Tokenize(phrase);
var filteredTokens = tokens.Where(s => !stop.Contains(s));
return string.Join(" ", filteredTokens.ToArray());
}
public static IEnumerable<string> Tokenize(string phrase)
{
return phrase.Split(' ');
// Or use a regex, such as:
// return Regex.Split(phrase, @"\W+");
}
This won't give you exactly the same result, but I'll bet that it's close enough and it will definitely run a lot more efficiently. Actual search engines use an approach similar to this, since everything is indexed and searched at the word level, not the character level.
I guess your code is not doing what you want it to do anyway. "moderated" would be converted to "d" if I'm right. To get a good solution you have to specify your requirements a bit more detailed. I would probably use Replace or regular expressions.
I would use a regular expression (created inside the function) for this task. I think it would be capable of doing all the processing at once without having to make multiple passes through the string or having to create multiple intermediate strings.
private string RemoveNonEssentialWords(string phrase)
{
return Regex.Replace(phrase, // input
@"\b(" + String.Join("|", nonessentials) + @")\b", // pattern
"", // replacement
RegexOptions.IgnoreCase)
.Replace("  ", " ");
}
The \b at the beginning and end of the pattern makes sure that the match is on a boundary between alphanumeric and non-alphanumeric characters. In other words, it will not match just part of the word, like your sample code does.
Yeah, that smells.
I like little state machines for parsing, they can be self-contained inside a method using lists of delegates, looping through the characters in the input and sending each one through the state functions (which I have return the next state function based on the examined character).
For performance I would flush out whole words to a string builder after I've hit a separating character and checked the word against the list (might use a hash set for that)
I would create a hash table of the words to remove, then parse each word of the phrase and drop it if it is in the hash table. That takes only one pass through the words, and I believe creating the hash table is O(n).
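A sketch of the hash-based approach, assuming whole-word, case-insensitive matching is what is wanted. `StopWords` and `Remove` are hypothetical names, and splitting on single spaces is a simplification that ignores punctuation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StopWords
{
    public static string Remove(string phrase, IEnumerable<string> stopWords)
    {
        // Case-insensitive set: building it is O(n) in the number of stop
        // words, and each lookup is O(1) on average.
        var stop = new HashSet<string>(stopWords, StringComparer.OrdinalIgnoreCase);

        // One pass over the words; empty entries from repeated spaces are dropped.
        var kept = phrase
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(word => !stop.Contains(word));

        return string.Join(" ", kept);
    }
}
```

Because whole words are compared, "moderated" is kept even when "moderate" is in the list.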
How does this look?
foreach (string nonEssent in nonessentials)
{
    // Strings are immutable: Replace returns a new string, so assign it back.
    phrase = phrase.Replace(nonEssent, String.Empty);
}
phrase = phrase.Replace("  ", " ");
If you want to go the Regex route, you could do it like this. If you're going for speed it's worth a try and you can compare/contrast with other methods:
Start by creating a Regex from the array input. Something like:
var regexString = "\\b(" + string.Join("|", nonessentials) + ")\\b";
That will result in something like:
\b(left|right|chronic)\b
Then create a Regex object to do the find/replace:
System.Text.RegularExpressions.Regex regex = new System.Text.RegularExpressions.Regex(regexString, System.Text.RegularExpressions.RegexOptions.IgnoreCase);
Then you can just do a Replace like so:
string fixedPhrase = regex.Replace(phrase, "");
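The steps above, assembled into one method as a sketch. `StopWordRegex` is a hypothetical name; the `Regex.Escape` call and the final whitespace clean-up are additions on the assumption that the word list may contain regex metacharacters and that leftover double spaces are unwanted.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class StopWordRegex
{
    public static string Remove(string phrase, IEnumerable<string> nonessentials)
    {
        // Escape each word so characters such as '.' or '+' are matched literally.
        string pattern = "\\b(" + string.Join("|", nonessentials.Select(w => Regex.Escape(w))) + ")\\b";
        string stripped = Regex.Replace(phrase, pattern, "", RegexOptions.IgnoreCase);

        // Removing a word leaves its surrounding spaces behind; collapse and trim them.
        return Regex.Replace(stripped, @"\s+", " ").Trim();
    }
}
```

If the word list is fixed, constructing the `Regex` object once and reusing it, as shown above, avoids rebuilding the pattern on every call.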