String methods cutting out string parts - c#

I've used a StringBuilder to insert a space before every capital letter in a piece of text. The sentence entered would look like this: "ThisIsASentence." Since it starts with a capital, the StringBuilder would modify the sentence to look like this: " This Is A Sentence."
My problem is that if I have a sentence like "thisIsASentence.", the StringBuilder separates the sentence as normal: " this Is A Sentence."
Both still have a space in front of the first character.
When the sentence runs through this line:
result = result.Substring(1, 1).ToUpper() + result.Substring(2).ToLower();
If the first letter entered was lowercase, it gets cut off and the second letter becomes uppercase.
The line was meant to capitalize the first letter entered and set the rest to lowercase.
Adding a Trim statement before running that line changes nothing in the output.
Here is my overall code right now:
private void btnChange_Click(object sender, EventArgs e)
{
    // New string named sentence, assigning the text input to sentence.
    string sentence;
    sentence = txtSentence.Text;
    // String builder to let us modify string
    StringBuilder sentenceSB = new StringBuilder();
    /*
     * For every character in the string "sentence" if the character is uppercase,
     * add a space before the letter,
     * if it isn't, add nothing.
     */
    foreach (char c in sentence)
    {
        if (char.IsUpper(c))
        {
            sentenceSB.Append(" ");
        }
        sentenceSB.Append(c);
    }
    // Store the edited sentence into the "result" string
    string result = sentenceSB.ToString();
    // Starting at the 2nd spot and going 1, makes the first character capitalized
    // Starting at position 3 and going to end change them to lower case.
    result = result.Substring(1, 1).ToUpper() + result.Substring(2).ToLower();
    // set the label text to equal "result" and set it visible.
    lblChanged.Text = result.ToString();
    lblChanged.Visible = true;
}

When you run the code with "thisIsASentence", result will be "this Is A Sentence" after your foreach loop, because no space is inserted at the beginning.
Your next line then takes the character at index 1 (which is the 'h' in "this"), makes it uppercase, and appends the rest of the string in lowercase, resulting in "His is a sentence".
To fix this, you can do result = result.Trim() after the loop and then start at index 0, making the next line result = result.Substring(0, 1).ToUpper() + result.Substring(1).ToLower();
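Putting the loop and the fix together, a minimal sketch could look like the following. The class and method names (SentenceFormatter, SpaceOutWords) are made up for illustration, and the empty-input guard is an addition that the original code does not have.

```csharp
using System;
using System.Text;

public static class SentenceFormatter
{
    // Inserts a space before each uppercase letter, then leaves only
    // the very first letter of the result capitalized.
    public static string SpaceOutWords(string sentence)
    {
        // Guard (an assumption, not in the original): Substring(0, 1)
        // would throw on an empty string.
        if (string.IsNullOrWhiteSpace(sentence))
        {
            return string.Empty;
        }

        var sb = new StringBuilder();
        foreach (char c in sentence)
        {
            if (char.IsUpper(c))
            {
                sb.Append(' ');
            }
            sb.Append(c);
        }

        // Trim drops the leading space that only appears when the
        // input starts with a capital letter.
        string result = sb.ToString().Trim();
        return result.Substring(0, 1).ToUpper() + result.Substring(1).ToLower();
    }
}
```

With this version both "ThisIsASentence." and "thisIsASentence." should come out as "This is a sentence.", because the index no longer depends on whether a leading space was inserted.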

With result.Substring(1, 1), you are assuming the first letter of the input is always capitalized, so that your loop always adds a space at the beginning of the string. You have already seen that this isn't the case.
So I see basically two options for you:
Wrap that line in an if block that checks for a leading space before replacing;
Capitalize the first letter of your input before the loop, if that's allowed by your spec.

Related

Determine if string is made up of characters from a different string (Scrabble-like program)

I am writing a program in C# that goes through a list of words and determines if they can be made up by a string that a user input. Just like the Scrabble game.
For example, when the user inputs the string "vacation", my program is supposed to go through a list of words that I already have and should return true when it gets to words like "cat". So it doesn't necessarily have to use ALL the letters.
Another example could be the word "overflow", it should return true with words like "over", "flow", "low", "lover". If the input word has repeating characters by N times, the word that matches can also have that letter up to N times but no more.
I currently have something like this:
var desiredChars = "ent";
var word = "element";
bool contains = desiredChars.All(word.Contains);
However, this checks whether the word contains all of the letters. I want to check whether the word is made up ONLY of letters that the user passed, using some or all of them.
If it wasn't for the issue of possible multiple letters (for "overflow", the word "fool" is a match, but "wow" isn't, because there aren't two w characters in the letter set), this LINQ code would work:
string letters = "overflow";
string word = "lover";
bool match = !word.Except(letters).Any(); // unfortunately, not sufficient
So, to handle the multiple letter issue, something like this is needed:
var letterChars = letters.ToList();
bool match = word.All(i => letterChars.Remove(i));
Here, we return true only if all the letters in the word can successfully be removed from the set of letters. Note that you only need to check those words in your dictionary that start with one of the letters in your letter set.
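As a rough sketch, the Remove-based check can be wrapped in a method and applied to a whole word list. The names WordMatcher, CanBuild and FindMatches are hypothetical, not from the question.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class WordMatcher
{
    // True if every character of word can be taken out of letters,
    // using each available letter at most once.
    public static bool CanBuild(string letters, string word)
    {
        // Fresh copy of the pool per word, because Remove mutates it.
        List<char> pool = letters.ToList();
        return word.All(c => pool.Remove(c));
    }

    // Filters a dictionary down to the words that can be built.
    public static List<string> FindMatches(string letters, IEnumerable<string> dictionary)
    {
        return dictionary.Where(w => CanBuild(letters, w)).ToList();
    }
}
```

For the letters "overflow", this should accept "over", "flow", "low", "lover" and "fool" but reject "wow", since the pool holds only one w.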
That worked for your example:
public static bool IsWordPartOfString(string startString, string word)
{
    var tempTable = startString.ToArray();
    foreach (var c in word)
    {
        var index = Array.FindIndex(tempTable, myChar => myChar == c);
        if (index == -1)
        {
            return false;
        }
        tempTable[index] = ' ';
    }
    return true;
}
Steps:
1) Convert startString into an array (tempTable)
2) Iterate over the chars of the tested word
3) If a char is not found in tempTable, return false
4) If a char is found in tempTable, blank it out so it cannot be reused (to prevent the scenario where startString has only one occurrence of a letter but the test word has multiple)
5) If it is possible to iterate through the whole word, it can be constructed from the letters in the initial string, so return true.
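The same steps can also be done by counting letters instead of searching an array for every character. This is an alternative sketch, not the answerer's code; the class name LetterCountMatcher is invented for the example.

```csharp
using System.Collections.Generic;

public static class LetterCountMatcher
{
    public static bool IsWordPartOfString(string startString, string word)
    {
        // Count how many times each letter is available.
        var remaining = new Dictionary<char, int>();
        foreach (char c in startString)
        {
            remaining.TryGetValue(c, out int count); // count is 0 when absent
            remaining[c] = count + 1;
        }

        // Spend one available letter per character of the word.
        foreach (char c in word)
        {
            if (!remaining.TryGetValue(c, out int left) || left == 0)
            {
                return false; // letter missing or already used up
            }
            remaining[c] = left - 1;
        }
        return true;
    }
}
```

The counting version does one dictionary lookup per character, which may matter if the letter set or the word list is large.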

Pig Latin Console

Hi, I'm doing Pig Latin for class. The instructions were: the first consonant is removed from the front of the word and put on the back of the word, followed by the letters "ay". For example, book becomes ookbay, and strength becomes engthstray. I'm having trouble because it doesn't handle the leading consonants correctly.
// button, three, nix, eagle, and troubadour
Console.Write("Enter word you want in Pig Latin: ");
string word1 = Console.ReadLine();
string pig = "";
string vowels = "aeiouAEIOU";
string space = " ";
string extra = ""; //extra letters
int pos = 0; //position
foreach (string word in word1.Split())
{
    if (pos != 0)
    {
        pig = pig + space;
    }
    else
    {
        pos = 1;
    }
    vowels = word.Substring(0,1);
    extra = word.Substring(1, word.Length - 1);
    pig = pig + extra + vowels + "ay";
}
Console.WriteLine(pig.ToString());
For example if I do strength it will come up as trengthsay and not like the example
You've got a number of problems there. First of all, your definition of the problem:
the instructions were first consonant is removed from the front of the word
That is precisely what you have done. strength does become trengths if you move the first consonant. You need to change your definition to all leading consonants up to the first vowel. Also, what do you do in the case of eagle? Does it become eagleay? Your instructions don't specify how to deal with a leading vowel.
This is another problem
vowels = word.Substring(0,1); // This will overwrite your vowels string with the first letter
Don't worry about writing real code just yet; write some pseudo-code to work out your logic first. @Chris's comment about looking for the first vowel is a good one. Your pseudo-code may look something like:
Check if word begins with consonant
{
    If so, look for index of first vowel
    Take substring of word up to first vowel.
    Append it to end
}
Otherwise
{
    Deal with leading vowel
}
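Once the logic is settled, the pseudo-code above might translate into something like this sketch. The PigLatin class and its method names are invented here, and the handling of a leading vowel (just append "ay") is an assumption, since the assignment as quoted does not say what to do in that case.

```csharp
using System;
using System.Linq;

public static class PigLatin
{
    private static readonly char[] Vowels = "aeiouAEIOU".ToCharArray();

    public static string TranslateWord(string word)
    {
        if (string.IsNullOrEmpty(word))
        {
            return word;
        }

        int firstVowel = word.IndexOfAny(Vowels);

        // Assumption: a word that starts with a vowel, or has no vowel
        // at all, is left intact and only gets "ay" appended.
        if (firstVowel <= 0)
        {
            return word + "ay";
        }

        // Move all leading consonants (up to the first vowel) to the end.
        return word.Substring(firstVowel) + word.Substring(0, firstVowel) + "ay";
    }

    public static string Translate(string sentence)
    {
        string[] words = sentence.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" ", words.Select(w => TranslateWord(w)));
    }
}
```

With this sketch, "book" gives "ookbay" and "strength" gives "engthstray", matching the examples in the question.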

C# Replacing Multiple Spaces with 1 space leaving special characters intact

Having a bit of a problem as I have to translate a string into a table. I'd like to remove multiple spaces, but not all of them. So the data in text comes back with lots of spaces in between like so:
SESSIONNAME USERNAME ID STATE TYPE DEVICE\r\n
services 0 Disc \r\n
console 1 Conn \r\n
alinav 2 Disc \r\n
rdp-tcp 65536 Listen \r\n
I would like to keep the \r\n values that define my rows, I want to keep the empty values that are legitimate under the columns, and I want to keep the spaces that define the columns. But I want to remove the extra spaces so that they are not fed into the values.
I've tried:
output = Regex.Replace(output, @"\s{2,}", " ", RegexOptions.Multiline);
output = output.Replace("  ", " ");
But the first one just removes everything (things I need and don't need). And the second one still leaves too many spaces.
Thanks.
You can do two things:
Use a space explicitly in the regular expression; \s matches other whitespace characters (\n, \r, \t, ...) as well, thus:
output = Regex.Replace(output, @" +", " ", RegexOptions.Multiline);
Or apply the second method until convergence:
string s2 = output;
do {
    output = s2;
    s2 = s2.Replace("  ", " ");
} while (output != s2);
In most cases the first method will outperform the second one. This is because the first method replaces each run of two or more spaces in a single pass. Regexes are in general a bit slower than simple string replacement, but if the string contains sequences with many spaces, the first method will be faster.
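As a small self-contained sketch of the first method, the pattern " {2,}" only touches runs of two or more literal spaces, so the \r\n row separators survive. The class name SpaceCollapser is made up for the example.

```csharp
using System.Text.RegularExpressions;

public static class SpaceCollapser
{
    // Collapses every run of two or more spaces into a single space.
    // Only the space character is matched, so \r, \n and \t are left alone.
    public static string CollapseSpaces(string input)
    {
        return Regex.Replace(input, " {2,}", " ");
    }
}
```

Note that collapsing spaces this way also loses the information about which column an empty value belonged to, so it only helps when every column is filled.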
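In your example the data is delimited by position, not by characters; is that correct? If so, you should extract by position; something like:
// Split into rows on the line breaks; Split() without arguments would split on every space too.
// The first row is the header, so skip or special-case it before parsing numbers.
foreach (string s in output.Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries))
{
    var sessionName = s.Substring(0, 18).Trim();
    var userName = s.Substring(18, 19).Trim();
    var id = Int32.Parse(s.Substring(37, 8).Trim());
    var whateverType = s.Substring(45, 12).Trim();
    var device = s.Substring(57, 6).Trim();
}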
Of course you need to do proper error checking, and should probably put the field widths in an array and calculate positions instead of hard-coding them as I have shown.
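A sketch of the array-of-widths idea might look like this. The FixedWidthParser name is invented, and the widths passed in are whatever the real layout uses; nothing here is taken from the actual command output. Short lines are tolerated by returning empty fields instead of throwing.

```csharp
using System;
using System.Collections.Generic;

public static class FixedWidthParser
{
    // Splits output into rows on line breaks, then cuts each row
    // into trimmed fields according to the given column widths.
    public static List<string[]> Parse(string output, int[] widths)
    {
        var rows = new List<string[]>();
        string[] lines = output.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string line in lines)
        {
            var fields = new string[widths.Length];
            int start = 0;
            for (int i = 0; i < widths.Length; i++)
            {
                if (start >= line.Length)
                {
                    // Line ended early: treat the column as empty.
                    fields[i] = string.Empty;
                }
                else
                {
                    int length = Math.Min(widths[i], line.Length - start);
                    fields[i] = line.Substring(start, length).Trim();
                }
                start += widths[i];
            }
            rows.Add(fields);
        }
        return rows;
    }
}
```

Because the fields come back as strings, numeric columns such as the ID can be converted afterwards with int.TryParse, which also makes it easy to skip the header row.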

How to skip `\r \n ` in string

I am working on a simple converter that translates text into another language.
Suppose I have two text boxes: in the first one you enter the word Index and press the convert button.
I replace your text with فہرست, the Urdu equivalent of Index. But I have a problem: if you enter the word Index followed by some spaces or line breaks, the text I get from the text box in C# looks like this: Index \r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n. How can I get rid of this? I always want to get just Index.
Thanks for any answers, and please feel free to comment if you have any questions.
Try using the Trim method if the new lines are only at the end or beginning:
input = input.Trim();
You can use Replace if you want to remove new lines anywhere in the string:
// Replace line breaks with spaces
input = input.Replace("\r\n", " ");
// (Optionally) combine consecutive spaces into one space (probably not the most efficient way, but it should work)
while (input.Contains("  ")) { input = input.Replace("  ", " "); }
If you want to prevent newlines completely, most TextBox controls have a property like MultiLine or similar that, when turned off, prevents entering more than one line.
input = input.Replace(Environment.NewLine, string.Empty).Replace(" ", string.Empty);
Use Replace to remove characters from the 'inside' of the string. Trim removes characters only at the beginning and end of the string.
This should suffice to remove whitespaces as defined by Char.IsWhiteSpace (blanks, newlines etc)
string wordToTranslate = textBox1.Text.Trim();
However, if your textbox contains multiple words, then you should use a different approach:
string[] words = textBox1.Text.Split((char[]) null, StringSplitOptions.RemoveEmptyEntries);
foreach(string wordToTranslate in words)
    ExecTranslation(wordToTranslate);
Using Split with a null char[] as the separator treats every whitespace character as a valid word separator.
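The same Split call can also be used to clean the text in one go, by joining the pieces back together with single spaces. This is a minimal sketch; WhitespaceNormalizer is a made-up name.

```csharp
using System;

public static class WhitespaceNormalizer
{
    // Splits on any whitespace (spaces, tabs, \r, \n), drops the empty
    // pieces, and joins the remaining words with a single space.
    public static string Normalize(string text)
    {
        if (text == null)
        {
            return string.Empty;
        }

        string[] words = text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" ", words);
    }
}
```

For the input from the question, "Index " followed by several \r\n pairs, this should return just "Index".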
Add all chars you want to ignore to the string:
var cleanChars = text.Where(c => !"\n\r".Contains(c));
string cleanText = new string(cleanChars.ToArray());
That works because string implements IEnumerable<char>.

How to capitalize first letter of each sentence?

I know how to capitalize first letter in each word. But I want to know how to capitalize first letter of each sentence in C#.
This is not necessarily a trivial problem. Sentences can end with a number of different punctuation marks, and those same punctuation marks don't always denote the end of a sentence (abbreviations like Dr. may pose a particular problem because there are potentially many of them).
That being said, you might be able to get a "good enough" solution by using regular expressions to look for words after a sentence-ending punctuation, but you would have to add quite a few special cases. It might be easier to process the string character by character or word by word. You would still have to handle all the same special cases, but it might be easier than trying to build that into a regex.
There are lots of weird rules for grammar and punctuation. Any solution you come up with probably won't be able to take them all into account. Some things to consider:
Sentences can end with different punctuation marks (. ! ?)
Some punctuation marks that end sentences might also be used in the middle of a sentence (e.g. abbreviations such as Dr. Mr. e.g.)
Sentences could contain nested sentences. Quotations could pose a particular problem (e.g. He said, "This is a hard problem! I wonder," he mused, "if it can be solved.")
As a first approximation, you could probably treat any sequence like [a-z]\.[ \n\t] as the end of a sentence.
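A rough sketch of that first approximation, using a regex with a match evaluator, is shown below. The class name and the exact pattern are assumptions for illustration; it also accepts ! and ? as sentence endings, and it will wrongly capitalize after abbreviations such as "e.g. ", as discussed above.

```csharp
using System.Text.RegularExpressions;

public static class SentenceCapitalizer
{
    // Capitalizes a lowercase letter at the start of the text, and any
    // lowercase letter that follows "<letter><.!?><whitespace>".
    public static string Capitalize(string text)
    {
        return Regex.Replace(
            text,
            @"(^\s*|[a-z][.!?]\s+)([a-z])",
            m => m.Groups[1].Value + m.Groups[2].Value.ToUpper());
    }
}
```

Because the pattern requires whitespace after the punctuation mark, strings such as "www.example.com" are left untouched.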
Consider a sentence as a word containing spaces and ending with a period.
There's some VB code on this page which shouldn't be too hard to convert to C#.
However, subsequent posts point out the errors in the algorithm.
This blog has some C# code which claims to work:
It auto capitalises the first letter after every full stop (period), question mark and exclamation mark.
UPDATE 16 Feb 2010: I’ve reworked it so that it doesn’t affect strings such as URL’s and the like
Don't forget sentences with parentheses. Also consider *, if it is used as an indicator for bold text.
http://www.grammarbook.com/punctuation/parens.asp
I needed to do something similar, and this served my purposes. I pass in my "sentences" as an IEnumerable of strings.
// Requires: using System.Globalization; using System.IO;
// Read sentences from text file (each sentence on a separate line)
IEnumerable<string> lines = File.ReadLines(inputPath);
// Call method below
lines = CapitalizeFirstLetterOfString(lines);
private static IEnumerable<string> CapitalizeFirstLetterOfString(IEnumerable<string> inputLines)
{
    // Will output: Lorem lipsum et
    List<string> outputLines = new List<string>();
    TextInfo textInfo = new CultureInfo("en-US", false).TextInfo;
    foreach (string line in inputLines)
    {
        // Lowercase the whole line, then title-case only its first word.
        string lineLowerCase = textInfo.ToLower(line);
        string[] lineSplit = lineLowerCase.Split(' ');
        lineSplit[0] = textInfo.ToTitleCase(lineSplit[0]);
        outputLines.Add(string.Join(" ", lineSplit));
    }
    return outputLines;
}
I know I'm a little late, but just like you, I needed to capitalize the first character of each of my sentences.
I landed here (and on a lot of other pages while I was researching) and found nothing to help me out. So I burned some neurons and made an algorithm myself.
Here is my extension method to capitalize sentences:
public static string CapitalizeSentences(this string Input)
{
    if (String.IsNullOrEmpty(Input))
        return Input;
    if (Input.Length == 1)
        return Input.ToUpper();
    Input = Regex.Replace(Input, @"\s+", " ");
    Input = Input.Trim().ToLower();
    Input = Char.ToUpper(Input[0]) + Input.Substring(1);
    var objDelimiters = new string[] { ". ", "! ", "? " };
    foreach (var objDelimiter in objDelimiters)
    {
        var varDelimiterLength = objDelimiter.Length;
        var varIndexStart = Input.IndexOf(objDelimiter, 0);
        while (varIndexStart > -1)
        {
            Input = Input.Substring(0, varIndexStart + varDelimiterLength) + (Input[varIndexStart + varDelimiterLength]).ToString().ToUpper() + Input.Substring((varIndexStart + varDelimiterLength) + 1);
            varIndexStart = Input.IndexOf(objDelimiter, varIndexStart + 1);
        }
    }
    return Input;
}
Details about the algorithm:
This simple algorithm starts by collapsing every run of whitespace into a single space and lowercasing the text. Then it capitalizes the first character of the string and searches for each delimiter. When it finds one, it capitalizes the very next character.
I made it easy to add, remove or edit the delimiters, so you can change how the code works a lot with a small change.
It doesn't check whether the substrings go past the end of the string, because the delimiters end with a space and the algorithm starts with a Trim(), so every delimiter found in the string is followed by another character.
Important:
You didn't specify exactly what your needs are: whether it's a grammar corrector, just a way to prettify text, etc. So keep in mind that my algorithm fits my needs, which may be different from yours.
*This algorithm was created to format a "Product Description" that isn't normalized (it's almost always entirely uppercase) into a nice format for the user. (To be more specific, I need to show a pretty and "smaller" text to the user, so all-uppercase text is the opposite of what I want.) So it was not created to be grammatically perfect.
*Also, there may be some exceptions where a character will not be uppercased because of bad formatting.
*I chose to include spaces in the delimiters, so "http://www.stackoverflow.com" will not become "http://www.Stackoverflow.Com". On the other hand, sentences like "the box is blue.it's on the floor" will become "The box is blue.it's on the floor", and not "The box is blue.It's on the floor".
*In the case of abbreviations, it will capitalize the next word, but once again that's not a problem because I just need to show a product description (where grammar is not extremely critical). And after abbreviations like Mr. or Dr. the very next word is a name, so it's right for it to be capitalized.
If you, or somebody else, need a more accurate algorithm, I'll be glad to improve it.
Hope I could help somebody!
Alternatively, you can write a class or method that converts any text to title case. Here is an example; you just need to call the method.
public static string ToTitleCase(string strX)
{
    string[] aryWords = strX.Trim().Split(' ');
    List<string> lstLetters = new List<string>();
    List<string> lstWords = new List<string>();
    foreach (string strWord in aryWords)
    {
        int iLCount = 0;
        foreach (char chrLetter in strWord.Trim())
        {
            if (iLCount == 0)
            {
                lstLetters.Add(chrLetter.ToString().ToUpper());
            }
            else
            {
                lstLetters.Add(chrLetter.ToString().ToLower());
            }
            iLCount++;
        }
        lstWords.Add(string.Join("", lstLetters));
        lstLetters.Clear();
    }
    string strNewString = string.Join(" ", lstWords);
    return strNewString;
}
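For comparison, the framework's own TextInfo.ToTitleCase can do much the same job in a couple of lines. This is a sketch with an invented class name; unlike the hand-written method above it does not trim the input, and the text is lowercased first because ToTitleCase leaves all-uppercase words alone (it treats them as acronyms).

```csharp
using System.Globalization;

public static class TitleCaseHelper
{
    public static string ToTitleCase(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return text;
        }

        TextInfo textInfo = CultureInfo.InvariantCulture.TextInfo;
        // Lowercase first so fully uppercase words are converted too.
        return textInfo.ToTitleCase(textInfo.ToLower(text));
    }
}
```

Keep in mind that title case capitalizes every word, which is a different result from capitalizing only the first letter of each sentence.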
