Text to pig latin translator inserts duplicate characters - c#

I'm creating a C# Windows Forms app to convert English text to Pig Latin, but the program appends the first letter (if it's a consonant) five times at the end instead of just once.
I was able to append "way" to the end of the text by using an if statement that checks whether the first letter is a vowel. However, my issue starts when the word does not begin with a vowel.
string[] vowels = new string[5] { "a", "e", "i", "o", "u" };
private void BtnTranslate_Click(object sender, EventArgs e)
{
string TextEnglish = txtEnglish.Text;
for (int i = 0; i < vowels.Length; i++)
{
if (TextEnglish.StartsWith(vowels[i]))
{
TextEnglish = TextEnglish.Insert(TextEnglish.Length, "way");
}
else if(!TextEnglish.StartsWith(vowels[i]))
{
string TextEnglishSubstring = TextEnglish.Substring(0, 1);
TextEnglish = TextEnglish.Insert(TextEnglish.Length, TextEnglishSubstring);
TextEnglish = TextEnglish.Insert(TextEnglish.Length, "ay");
}
//string substringToInsert = TextEnglish.Substring(0, 1);
//TextEnglish = TextEnglish.Insert(TextEnglish.Length, "c");
txtPigLatin.Text = TextEnglish;
}
}

First I might recommend that you create a separate method for returning a pig-latin translation, and then call that method from your button click event. This allows for better code re-use and will result in cleaner code.
The problem is that you're looping over all the items in the vowels array and changing the text for each iteration. Instead what you want to do is simply determine if the word starts with a vowel or not. Again, this could be written in another method (more code encapsulation, which means this can also be re-used elsewhere if needed).
Note that I've written the vowels as a single string containing both cases. The trick here is to see whether that string Contains the first character of the input string:
public static bool BeginsWithAVowel(string input)
{
if (string.IsNullOrWhiteSpace(input)) return false;
var vowels = "AaEeIiOoUu";
return vowels.Contains(input.Substring(0, 1));
}
Now, we can use this method to test our string in our pig latin conversion method:
public static string ConvertToPigLatin(string input)
{
if (string.IsNullOrWhiteSpace(input)) return input;
if (BeginsWithAVowel(input))
{
// Add "way" to the end of the string and return it
return input + "way";
}
// Remove the first character and add it, plus "ay", to the end and return it
return input.Substring(1) + input.Substring(0, 1) + "ay";
}
Now, in the button click event, all we have to do to convert the text is call our method with the original text and then set the Text property to the result:
private void BtnTranslate_Click(object sender, EventArgs e)
{
txtPigLatin.Text = ConvertToPigLatin(txtEnglish.Text);
}
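The methods above treat the whole text box as one word. If the input can be a sentence, one possible extension is to split on spaces and convert each word separately. This is only a sketch: ConvertWord and ConvertSentence are hypothetical names, and punctuation is not handled.

```csharp
using System;
using System.Linq;

public static class PigLatinSentence
{
    // Hypothetical helper: converts a single word with the same rules as above.
    private static string ConvertWord(string word)
    {
        if (word.Length == 0) return word;
        bool vowelFirst = "AaEeIiOoUu".IndexOf(word[0]) >= 0;
        return vowelFirst
            ? word + "way"
            : word.Substring(1) + word[0] + "ay";
    }

    // Splits on single spaces, converts each word, and joins the result again.
    public static string ConvertSentence(string input)
    {
        if (string.IsNullOrWhiteSpace(input)) return input;
        var words = input.Split(' ').Select(w => ConvertWord(w));
        return string.Join(" ", words);
    }

    public static void Main()
    {
        Console.WriteLine(ConvertSentence("hello world")); // ellohay orldway
        Console.WriteLine(ConvertSentence("an egg"));      // anway eggway
    }
}
```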

You're checking the first character against each possible vowel, and it can't possibly be all of them, so you're running the conversion code once for each vowel checked.
Instead, you should check if it's a vowel first, set a flag, then do your conversion logic. There are a few ways to manage this, here's an example:
string TextEnglish = txtEnglish.Text;
bool startsWithVowel = vowels.Any(v => TextEnglish.StartsWith(v));
if(startsWithVowel)
{
// Do vowel logic
}
else
{
// Do consonant logic
}
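With the two branches filled in, the flag approach could look roughly like this. It is a sketch rather than the asker's final code: the class and method names are made up, and the case-insensitive comparison is an added assumption so that "Egg" is treated like "egg".

```csharp
using System;
using System.Linq;

public static class PigLatinSketch
{
    private static readonly string[] Vowels = { "a", "e", "i", "o", "u" };

    public static string Translate(string text)
    {
        if (string.IsNullOrEmpty(text)) return text;

        // One check against the whole vowel list, instead of one conversion per vowel.
        bool startsWithVowel = Vowels.Any(
            v => text.StartsWith(v, StringComparison.OrdinalIgnoreCase));

        if (startsWithVowel)
        {
            // Vowel logic: append "way"
            return text + "way";
        }

        // Consonant logic: move the first letter to the end, then append "ay"
        return text.Substring(1) + text.Substring(0, 1) + "ay";
    }

    public static void Main()
    {
        Console.WriteLine(Translate("apple")); // appleway
        Console.WriteLine(Translate("pig"));   // igpay
    }
}
```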


C# Method to Check if a String Contains Certain Letters

I'm trying to create a method which takes two parameters, "word" and "input". The aim of the method is to print any word where all of its characters can be found in "input" no more than once (this is why the character is removed if a letter is found).
Not all the letters from "input" must be in "word" - eg, for input = "cacten" and word = "ace", word would be printed, but if word = "aced" then it would not.
However, when I run the program it produces unexpected results (words being longer than "input", containing letters not found in "input"), and have coded the solution several ways all with the same outcome. This has stumped me for hours and I cannot work out what's going wrong. Any and all help will be greatly appreciated, thanks. My full code for the method is written below.
static void Program(string input, string word)
{
int letters = 0;
List<string> remaining = new List<string>();
foreach (char item in input)
{
remaining.Add(item.ToString());
}
input = remaining.ToString();
foreach (char letter in word)
{
string c = letter.ToString();
if (input.Contains(c))
{
letters++;
remaining.Remove(c);
input = remaining.ToString();
}
}
if (letters == word.Length)
{
Console.WriteLine(word);
}
}
OK, so just to go through where you are going wrong.
Firstly, when you assign remaining.ToString() to your input variable, what you actually assign is the text System.Collections.Generic.List`1[System.String]. Calling ToString on a List just gives you the name of the list's type. It doesn't join all your characters back up. That's probably the main thing that is causing you issues.
Also, you are forcing everything into string types when a lot of the time you don't need to. Because string already implements IEnumerable<char>, you can get your string as a list of chars by just calling myString.ToList() (with using System.Linq).
So there is no need for this:
foreach (char item in input)
{
remaining.Add(item.ToString());
}
Things like string.Contains can take a char, so again there is no need to convert to string here:
foreach (char letter in word)
{
string c = letter.ToString();
if (input.Contains(c))
{
letters++;
remaining.Remove(c);
input = remaining.ToString();
}
}
You can just use the letter variable of type char and pass that into Contains, and because remaining is now a List<char> you can remove a char from it.
Again, don't reassign remaining.ToString() back into input. Use string.Join like this:
string.Join(string.Empty, remaining);
As someone else has posted, there are probably better ways of doing this, but I hope that what I've put here helps you understand what was going wrong and will help you learn.
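Putting those points together, a corrected version might look like the sketch below. The names LetterCheck and CanBuild are invented for illustration, and it returns a bool rather than printing so that it is easier to test. It relies on List<T>.Remove returning false when the item is not in the list, which removes the need to rebuild the input string at all.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LetterCheck
{
    // Returns true when every letter of word can be taken from input,
    // using each letter of input at most once.
    public static bool CanBuild(string input, string word)
    {
        List<char> remaining = input.ToList();
        foreach (char letter in word)
        {
            // Remove deletes the first occurrence and returns false if there is none.
            if (!remaining.Remove(letter)) return false;
        }
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(CanBuild("cacten", "ace"));  // True
        Console.WriteLine(CanBuild("cacten", "aced")); // False
    }
}
```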
You can also use Regular Expression which was created for such scenarios.
bool IsMatch(string input, string word)
{
var pattern = string.Format("\\b[{0}]+\\b", input);
var r = new Regex(pattern);
return r.IsMatch(word);
}
I created a sample code for you on DotNetFiddle.
You can check what the pattern does at Regex101. It has a pretty "Explanation" and "Quick Reference" panel.
There are a lot of ways to achieve that, here is a suggestion:
static void Main(string[] args)
{
Func("cacten","ace");
Func("cacten", "aced");
Console.ReadLine();
}
static void Func(string input, string word)
{
bool isMatch = true;
foreach (Char s in word)
{
if (!input.Contains(s.ToString()))
{
isMatch = false;
break;
}
}
// success
if (isMatch)
{
Console.WriteLine(word);
}
// no match
else
{
Console.WriteLine("No Match");
}
}
Not really an answer to your question but its always fun to do this sort of thing with Linq:
static void Print(string input, string word)
{
if (word.All(ch => input.Contains(ch) &&
word.GroupBy(c => c)
.All(g => g.Count() <= input.Count(c => c == g.Key))))
Console.WriteLine(word);
}
Functional programming is all about what you want without all the pesky loops, ifs and what nots... Notice that this code does what you'd do in your head without needing to painfully specify step by step how you'd actually do it:
Make sure all characters in word are in input.
Make sure all characters in word are used at most as many times as they are present in input.
Still, getting the basics right is a must, posted this answer as additional info.

replacing text with cleaned word case insensitive c#

I have a list of bad words that, if found in the text string, will be replaced by a cleaned word,
e.g. the bad word "woof" is replaced by "w$$f".
But it currently only works when the word in the list has the same case as the matched word in the sentence.
var badWords = new List<string> { "woof", "meow" };
var text = "I have a cat named meow and a dog name Woof.";
Should become: "I have a cat named m$$w and a dog name W$$f."
public string CensorText(string text)
{
if (string.IsNullOrWhiteSpace(text))
{
return text;
}
foreach (string word in CensoredWords)
{
text = text.Replace(word, WordCleaner(word));
}
return text;
}
private static string WordCleaner(string wordToClean)
{
string firstChar = wordToClean.Substring(0,1);
string lastChar = wordToClean.Substring(wordToClean.Length - 1);
string centerHash = new string('$', wordToClean.Length-2);
return string.Concat(firstChar, centerHash, lastChar);
}
How can I make it case insensitive when looping through the words and cleaning them? The simpler the answer, the better.
Try replacing:
text = text.Replace(word, WordCleaner(word));
with
text = text.Replace(word.ToLower(), WordCleaner(word));
This converts any upper case letter to a lower case one.
Edit
I've realised that I've made the wrong variable into lower case.
Change:
public string CensorText(string text)
{
To:
public string CensorText(string text)
{
text = text.ToLower();
Edit 2
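To retain the original sentence with only the censored words changed, it is much easier to use Regex instead. First, revert your code back to how it was in the question.
Now replace:
text = text.Replace(word, WordCleaner(word));
with:
text = Regex.Replace(text, Regex.Escape(word), m => WordCleaner(m.Value), RegexOptions.IgnoreCase);
Passing a match evaluator rather than a replacement string matters here: "$" has a special meaning in Regex replacement strings, so a replacement of "w$$f" would come out as "w$f". Cleaning m.Value also keeps the original casing of the matched word.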
Here's a simple option you can use.
The benefit is you don't care which of the word is lower case, it'll work for either cases. Note that compare returns an int, hence why we check it's 0 for a match.
string input = "the Woof is on Fire, we don't need no bucket, leT the ...";
string[] bad_words = new string[] {"woof","fire","BucKet", "Let"};
foreach (var word in input.Split(' ')) {
// String.Compare with OrdinalIgnoreCase does what you want:
if (bad_words.Any(b => String.Compare(word, b, StringComparison.OrdinalIgnoreCase) == 0))
Console.Write(WordCleaner(word) + " ");
else
Console.Write(word + " ");
}
Output:
the W$$f is on Fire, we don't need no bucket, l$T the ...
Note that because the input is split on spaces only, a word with a comma right after it keeps that comma as part of the word, which is why "Fire," and "bucket," are not matched here.
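One way around the punctuation problem is to let a regular expression find the words, using \b word boundaries, instead of splitting on spaces. The following is a sketch under a few assumptions: the bad-word list is passed in as a parameter rather than read from a CensoredWords field, and WordCleaner leaves words of two characters or fewer unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class Censor
{
    // Keeps the first and last character and replaces the middle with '$'.
    private static string WordCleaner(string w)
    {
        if (w.Length <= 2) return w;
        return w[0] + new string('$', w.Length - 2) + w[w.Length - 1];
    }

    public static string CensorText(string text, IEnumerable<string> badWords)
    {
        if (string.IsNullOrWhiteSpace(text)) return text;
        foreach (string word in badWords)
        {
            // \b limits matches to whole words; Regex.Escape guards against
            // bad words that contain regex metacharacters.
            string pattern = @"\b" + Regex.Escape(word) + @"\b";
            text = Regex.Replace(text, pattern,
                m => WordCleaner(m.Value), RegexOptions.IgnoreCase);
        }
        return text;
    }

    public static void Main()
    {
        var badWords = new List<string> { "woof", "meow" };
        Console.WriteLine(CensorText(
            "I have a cat named meow and a dog name Woof.", badWords));
        // I have a cat named m$$w and a dog name W$$f.
    }
}
```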

How to check if a word starts with a given character?

I have a list of a Sharepoint items: each item has a title, a description and a type.
I successfully retrieved it, I called it result. I want to first check if there is any item in result which starts with A then B then C, etc. I will have to do the same for each alphabet character and then if I find a word starting with this character I will have to display the character in bold.
I initially display the characters using this function:
private string generateHeaderScripts(char currentChar)
{
string headerScriptHtml = "$(document).ready(function() {" +
"$(\"#myTable" + currentChar.ToString() + "\") " +
".tablesorter({widthFixed: true, widgets: ['zebra']})" +
".tablesorterPager({container: $(\"#pager" + currentChar.ToString() +"\")}); " +
"});";
return headerScriptHtml;
}
How can I check if a word starts with a given character?
To check one value, use:
string word = "Aword";
if (word.StartsWith("A"))
{
// do something
}
You can make a little extension method to pass a list with A, B, and C
public static bool StartsWithAny(this string source, IEnumerable<string> strings)
{
foreach (var valueToCheck in strings)
{
if (source.StartsWith(valueToCheck))
{
return true;
}
}
return false;
}
if (word.StartsWithAny(new List<string>() { "A", "B", "C" }))
{
// do something
}
AND as a bonus, if you want to know what your string starts with, from a list, and do something based on that value:
public static bool StartsWithAny(this string source, IEnumerable<string> strings, out string startsWithValue)
{
startsWithValue = null;
foreach (var valueToCheck in strings)
{
if (source.StartsWith(valueToCheck))
{
startsWithValue = valueToCheck;
return true;
}
}
return false;
}
Usage:
string word = "AWord";
string startsWithValue;
if (word.StartsWithAny(new List<string>() { "A", "B", "C" }, out startsWithValue))
{
switch (startsWithValue)
{
case "A":
// Do Something
break;
// etc.
}
}
You could do something like this to check for a specific character.
public bool StartsWith(string value, string currentChar) {
return value.StartsWith(currentChar, true, null);
}
The StartsWith method has an option to ignore the case. The third parameter is to set the culture. If null, it just uses the current culture. With this method, you can loop through your words, run the check and process the word to highlight that first character as needed.
Assuming the properties you're checking are string types, you can use the String.StartsWith() method.. for example: -
if(item.Title.StartsWith("A"))
{
//do whatever
}
Rinse and repeat
Try the following below. You can do either StartsWith or Substring 0,1 (first letter)
if (Word.Substring(0,1) == "A") {
}
You can simply check the first character:
string word = "AWord";
if (word[0] == 'A')
{
// do something
}
Remember that character comparison is more efficient than string comparison.
To return the first character in a string, use:
Word.Substring(0,1) //where word is a string
You could use regular expressions. They are quite powerful, and a well-designed expression will do most of the work for you.
They can find a number, a letter, a word, and so on, and they are quite expressive and flexible.
They have a really great tutorial on them here:
An example of such an expression would be:
string input = "Some additional string to compare against.";
Match match = Regex.Match(input, @"\ba\w*\b", RegexOptions.IgnoreCase);
That would find the first word that starts with an "a", no matter the case (use Regex.Matches to get all of them). You can even use lambdas and LINQ to make the code flow even better.
Hopefully that helps.
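For the original goal of deciding which alphabet letters to show in bold, it may be simpler to collect the first letters once and then walk from A to Z. This sketch assumes the item titles are available as plain strings; AlphabetIndex and LettersInUse are hypothetical names, and reading the titles from SharePoint is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AlphabetIndex
{
    // Returns the letters A-Z for which at least one title starts with that letter.
    public static List<char> LettersInUse(IEnumerable<string> titles)
    {
        // Collect the upper-cased first character of every non-empty title.
        var firstLetters = new HashSet<char>(
            titles.Where(t => !string.IsNullOrEmpty(t))
                  .Select(t => char.ToUpperInvariant(t[0])));

        var result = new List<char>();
        for (char c = 'A'; c <= 'Z'; c++)
        {
            if (firstLetters.Contains(c)) result.Add(c);
        }
        return result;
    }

    public static void Main()
    {
        var titles = new[] { "apple", "Banana", "cherry", "avocado" };
        Console.WriteLine(string.Join(",", LettersInUse(titles))); // A,B,C
    }
}
```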

Check string for invalid characters? Smartest way?

I would like to check a string for invalid characters. By invalid characters I mean characters that should not be there. Which characters those are varies, but I don't think that is important. What matters is how I should do it, and what the easiest and best (fastest) way is.
Let's say I only want strings that contain 'A-Z', space, '.', '$' and '0-9'.
So a string like "HELLO STaCKOVERFLOW" is invalid, because of the 'a'.
Now how do I do that? I could make a List<char>, put every char that is not allowed in it, and check the string against this list. That may not be a good idea, because there would be a lot of chars. But I could make a list that contains all of the allowed chars instead, right? And then? For every char in the string I have to compare against the List<char>? Is there any smart code for this? And another question: if I add A-Z to the List<char>, I have to add 26 chars manually, but as far as I know these chars are 65-90 in the ASCII table. Can I add them more easily? Any suggestions? Thank you.
You can use a regular expression for this:
Regex r = new Regex("[^A-Z0-9.$ ]");
if (r.IsMatch(SomeString)) {
// validation failed
}
To create a list of characters from A-Z or 0-9 you would use a simple loop:
for (char c = 'A'; c <= 'Z'; c++) {
// c or c.ToString() depending on what you need
}
But you don't need that with the Regex - pretty much every regex engine understands the range syntax (A-Z).
I have only just written such a function, plus an extended version that restricts the first and last characters when needed. The original function merely checks whether the string consists of valid characters only. The extended function takes two additional integers: the number of valid characters at the beginning of the list to skip when checking the first character and the last character, respectively. In practice it simply calls the original function three times. In the example below it ensures that the string begins with a letter and doesn't end with an underscore.
StrChr(String, "_0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ");
StrChrEx(String, "_0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ", 11, 1);
BOOL __cdecl StrChr(CHAR* str, CHAR* chars)
{
for (int s = 0; str[s] != 0; s++)
{
int c = 0;
while (true)
{
if (chars[c] == 0)
{
return false;
}
else if (str[s] == chars[c])
{
break;
}
else
{
c++;
}
}
}
return true;
}
BOOL __cdecl StrChrEx(CHAR* str, CHAR* chars, UINT excl_first, UINT excl_last)
{
char first[2] = {str[0], 0};
char last[2] = {str[strlen(str) - 1], 0};
if (!StrChr(str, chars))
{
return false;
}
if (excl_first != 0)
{
if (!StrChr(first, chars + excl_first))
{
return false;
}
}
if (excl_last != 0)
{
if (!StrChr(last, chars + excl_last))
{
return false;
}
}
return true;
}
If you are using C#, you can do this easily using a List and Contains. It works the same way for single characters (as strings) and for multi-character strings.
var pn = "The String To ChecK";
var badStrings = new List<string>()
{
" ","\t","\n","\r"
};
foreach(var badString in badStrings)
{
if(pn.Contains(badString))
{
//Do something
}
}
If you're not super good with regular expressions, then there is another way to go about this in C#. Here is a block of code I wrote to test a string variable named notifName:
var alphabet = "a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z";
var numbers = "0,1,2,3,4,5,6,7,8,9";
var specialChars = " ,(,),_,[,],!,*,-,.,+,-";
var validChars = (alphabet + "," + alphabet.ToUpper() + "," + numbers + "," + specialChars).Split(',');
for (int i = 0; i < notifName.Length; i++)
{
if (Array.IndexOf(validChars, notifName[i].ToString()) < 0) {
errorFound = $"Invalid character '{notifName[i]}' found in notification name.";
break;
}
}
You can change the characters added to the array as needed. The Array IndexOf method is the key to the whole thing. Of course if you want commas to be valid, then you would need to choose a different split character.
Not enough reps to comment directly, but I recommend the Regex approach. One small caveat: you probably need to anchor both ends of the input string, and you will want at least one character to match. So (with thanks to ThiefMaster), here's my regex to validate user input for a simple arithmetical calculator (plus, minus, multiply, divide):
Regex r = new Regex(@"^[0-9\.\-\+\*\/ ]+$");
I'd go with a regex, but still need to add my 2 cents here, because all the proposed non-regex solutions are O(MN) in the worst case (string is valid) which I find repulsive for religious reasons.
Even more so when LINQ offers a simpler and more efficient solution than nesting loops:
var isInvalid = "The String To Test".Intersect("ALL_INVALID_CHARS").Any();
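When the rule is easier to state as a set of allowed characters, as in the question, the same idea can be sketched with a HashSet<char>, so each character of the input costs one lookup. CharValidator and IsValid are names made up for this example, and the allowed set (A-Z, 0-9, space, '.', '$') is taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CharValidator
{
    private static readonly HashSet<char> Allowed = BuildAllowed();

    // Builds the allowed set with loops instead of typing every letter by hand.
    private static HashSet<char> BuildAllowed()
    {
        var set = new HashSet<char> { ' ', '.', '$' };
        for (char c = 'A'; c <= 'Z'; c++) set.Add(c);
        for (char c = '0'; c <= '9'; c++) set.Add(c);
        return set;
    }

    // True when every character of s is in the allowed set.
    public static bool IsValid(string s)
    {
        return s != null && s.All(c => Allowed.Contains(c));
    }

    public static void Main()
    {
        Console.WriteLine(IsValid("HELLO STACKOVERFLOW")); // True
        Console.WriteLine(IsValid("HELLO STaCKOVERFLOW")); // False
    }
}
```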

How to capitalize the first character of each word, or the first character of a whole string, with C#?

I could write my own algorithm to do it, but I feel there should be the equivalent to ruby's humanize in C#.
I googled it but only found ways to humanize dates.
Examples:
A way to turn "Lorem Lipsum Et" into "Lorem lipsum et"
A way to turn "Lorem lipsum et" into "Lorem Lipsum Et"
As discussed in the comments of @miguel's answer, you can use TextInfo.ToTitleCase which has been available since .NET 1.1. Here is some code corresponding to your example:
string lipsum1 = "Lorem lipsum et";
// Creates a TextInfo based on the "en-US" culture.
TextInfo textInfo = new CultureInfo("en-US",false).TextInfo;
// Changes a string to titlecase.
Console.WriteLine("\"{0}\" to titlecase: {1}",
lipsum1,
textInfo.ToTitleCase( lipsum1 ));
// Will output: "Lorem lipsum et" to titlecase: Lorem Lipsum Et
It will ignore casing things that are all caps such as "LOREM LIPSUM ET" because it is taking care of cases if acronyms are in text so that "IEEE" (Institute of Electrical and Electronics Engineers) won't become "ieee" or "Ieee".
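The all-caps behaviour is easy to see side by side. In this small sketch the invariant culture is used instead of "en-US" purely to keep the example independent of installed cultures, and TitleCaseDemo and ToTitle are illustrative names.

```csharp
using System;
using System.Globalization;

public static class TitleCaseDemo
{
    public static string ToTitle(string s)
    {
        TextInfo textInfo = CultureInfo.InvariantCulture.TextInfo;
        return textInfo.ToTitleCase(s);
    }

    public static void Main()
    {
        string shout = "LOREM LIPSUM ET";

        // All-uppercase words are treated as acronyms and left alone.
        Console.WriteLine(ToTitle(shout));                    // LOREM LIPSUM ET

        // Lower-casing first lets ToTitleCase capitalize each word.
        Console.WriteLine(ToTitle(shout.ToLowerInvariant())); // Lorem Lipsum Et
    }
}
```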
However, if you only want to capitalize the first character, you can use the solution that is over here… or you could just split the string and capitalize the first one in the list:
string lipsum2 = "Lorem Lipsum Et";
string lipsum2lower = textInfo.ToLower(lipsum2);
string[] lipsum2split = lipsum2lower.Split(' ');
bool first = true;
foreach (string s in lipsum2split)
{
if (first)
{
Console.Write("{0} ", textInfo.ToTitleCase(s));
first = false;
}
else
{
Console.Write("{0} ", s);
}
}
// Will output: Lorem lipsum et
There is another elegant solution:
Define the function ToTitleCase in a static class of your project
using System.Globalization;
public static string ToTitleCase(this string title)
{
return CultureInfo.CurrentCulture.TextInfo.ToTitleCase(title.ToLower());
}
And then use it like a string extension anywhere on your project:
"have a good day !".ToTitleCase() // "Have A Good Day !"
Using regular expressions for this looks much cleaner:
string s = "the quick brown fox jumps over the lazy dog";
s = Regex.Replace(s, @"(^\w)|(\s\w)", m => m.Value.ToUpper());
All the examples seem to make the other characters lowered first which isn't what I needed.
customerName = CustomerName <-- Which is what I wanted
this is an example = This Is An Example
public static string ToUpperEveryWord(this string s)
{
// Check for empty string.
if (string.IsNullOrEmpty(s))
{
return string.Empty;
}
var words = s.Split(' ');
var t = "";
foreach (var word in words)
{
t += char.ToUpper(word[0]) + word.Substring(1) + ' ';
}
return t.Trim();
}
If you just want to capitalize the first character, just stick this in a utility method of your own:
return string.IsNullOrEmpty(str)
? str
: char.ToUpperInvariant(str[0]) + str.Substring(1).ToLowerInvariant();
There's also a library method to capitalize the first character of every word:
http://msdn.microsoft.com/en-us/library/system.globalization.textinfo.totitlecase.aspx
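Wrapped into a complete utility method, the first-character version might look like this. It is only a sketch: Capitalizer and CapitalizeFirst are hypothetical names, and the rest of the string is lower-cased as in the snippet above, so drop that call if the remaining casing should be preserved.

```csharp
using System;

public static class Capitalizer
{
    // Upper-cases the first character and lower-cases everything after it.
    public static string CapitalizeFirst(string str)
    {
        if (string.IsNullOrEmpty(str)) return str;
        return char.ToUpperInvariant(str[0]) + str.Substring(1).ToLowerInvariant();
    }

    public static void Main()
    {
        Console.WriteLine(CapitalizeFirst("Lorem Lipsum Et")); // Lorem lipsum et
        Console.WriteLine(CapitalizeFirst("lorem"));           // Lorem
    }
}
```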
The CSS technique is OK, but it only changes the presentation of the string in the browser. A better method is to capitalise the text itself before sending it to the browser.
Most of the above implementations are OK, but none of them address what happens if you have mixed case words that need to be preserved, or if you want to use true Title Case, for example:
"Where to Study PHd Courses in the USA"
or
"IRS Form UB40a"
Also using CultureInfo.CurrentCulture.TextInfo.ToTitleCase(string) preserves upper case words as in
"sports and MLB baseball" which becomes "Sports And MLB Baseball" but if the whole string is put in upper case, then this causes an issue.
So I put together a simple function that allows you to keep the capital and mixed case words and make small words lower case (if they are not at the start and end of the phrase) by including them in a specialCases and lowerCases string arrays:
public static string TitleCase(string value) {
string titleString = ""; // destination string, this will be returned by function
if (!String.IsNullOrEmpty(value)) {
string[] lowerCases = new string[12] { "of", "the", "in", "a", "an", "to", "and", "at", "from", "by", "on", "or"}; // list of lower case words that should only be capitalised at start and end of title
string[] specialCases = new string[7] { "UK", "USA", "IRS", "UCLA", "PHd", "UB40a", "MSc" }; // list of words that need capitalisation preserved at any point in title
string[] words = value.ToLower().Split(' ');
bool wordAdded = false; // flag to confirm whether this word appears in special case list
int counter = 1;
foreach (string s in words) {
// check if word appears in lower case list
foreach (string lcWord in lowerCases) {
if (s.ToLower() == lcWord) {
// if lower case word is the first or last word of the title then it still needs capital so skip this bit.
if (counter == 1 || counter == words.Length) { break; }
titleString += lcWord;
wordAdded = true;
break;
}
}
// check if word appears in special case list
foreach (string scWord in specialCases) {
if (s.ToUpper() == scWord.ToUpper()) {
titleString += scWord;
wordAdded = true;
break;
}
}
if (!wordAdded) { // word does not appear in special cases or lower cases, so capitalise first letter and add to destination string
titleString += char.ToUpper(s[0]) + s.Substring(1).ToLower();
}
wordAdded = false;
if (counter < words.Length) {
titleString += " "; //dont forget to add spaces back in again!
}
counter++;
}
}
return titleString;
}
This is just a quick and simple method - and can probably be improved a bit if you want to spend more time on it.
If you want smaller words like "a" and "of" to be capitalised as well, just remove them from the lowerCases string array. Different organisations have different rules on capitalisation.
You can see an example of this code in action on this site: Egg Donation London - this site automatically creates breadcrumb trails at the top of the pages by parsing the url eg "/services/uk-egg-bank/introduction" - then each folder name in the trail has hyphens replaced with spaces and capitalises the folder name, so uk-egg-bank becomes UK Egg Bank. (preserving the upper case 'UK')
An extension of this code could be to have a lookup table of acronyms and uppercase/lowercase words in a shared text file, database table or web service so that the list of mixed case words can be maintained from one single place and apply to many different applications that rely on the function.
There is no prebuilt solution for proper linguistic capitalization in .NET. What kind of capitalization are you going for? Are you following the Chicago Manual of Style conventions? AMA or MLA? Even plain English sentence capitalization has thousands of special exceptions for words. I can't speak to what Ruby's humanize does, but I imagine it doesn't follow linguistic rules of capitalization and instead does something much simpler.
Internally, we encountered this same issue and had to write a fairly large amount of code just to handle proper (in our little world) casing of article titles, not even accounting for sentence capitalization. And it does indeed get "fuzzy" :)
It really depends on what you need - why are you trying to convert the sentences to proper capitalization (and in what context)?
I have achieved the same using custom extension methods. For the first letter of the first substring, use the method yourString.ToFirstLetterUpper(). For the first letter of every substring, excluding articles and some prepositions, use the method yourString.ToAllFirstLetterInUpper(). Below is a console program:
class Program
{
static void Main(string[] args)
{
Console.WriteLine("this is my string".ToAllFirstLetterInUpper());
Console.WriteLine("uniVersity of lonDon".ToAllFirstLetterInUpper());
}
}
public static class StringExtension
{
public static string ToAllFirstLetterInUpper(this string str)
{
var array = str.Split(" ");
for (int i = 0; i < array.Length; i++)
{
if (array[i] == "" || array[i] == " " || listOfArticles_Prepositions().Contains(array[i])) continue;
array[i] = array[i].ToFirstLetterUpper();
}
return string.Join(" ", array);
}
private static string ToFirstLetterUpper(this string str)
{
return str?.First().ToString().ToUpper() + str?.Substring(1).ToLower();
}
private static string[] listOfArticles_Prepositions()
{
return new[]
{
"in","on","to","of","and","or","for","a","an","is"
};
}
}
OUTPUT
This is My String
University of London
Process finished with exit code 0.
Far as I know, there's not a way to do that without writing (or cribbing) code. C# nets (ha!) you upper, lower and title (what you have) cases:
http://support.microsoft.com/kb/312890/EN-US/
