Distinct() doesn't see uppercase letter changed by ToLower() method - c#

This is my code:
string textToEncode = File.ReadAllText(@"C:\Users\ASUS\Desktop\szyfrowanie2\TextSample.txt");
textToEncode = textToEncode.ToLower();
char[] distinctLetters = textToEncode.Distinct().ToArray();
var count = textToEncode.Distinct().Count();
Console.WriteLine("Letters used in text: \n\n");
for (int i = 0; i < count; i++)
{
if (Equals(distinctLetters[i]," "))
{
Console.Write("<space>");
}
else
{
Console.Write(" " + distinctLetters[i] + " ");
}
}
I want to read a .txt file and convert it all to lowercase with the ToLower() method. But when I then collect the distinct characters from the file and write them to the screen, some of them don't show up. Yet later, when I use
for (int i = 0; i < distinctLetters.Length; i++)
{
Console.Write("Swap " + distinctLetters[i] + " with ");
it shows a letter that was indeed changed to lowercase but wasn't visible on screen in the first for loop. The first word in my TextSample.txt file is "With". The first loop only shows
i t h
But as the second loop starts, it asks
Swap w with
and I have no idea why.
Also, the if statement in the first loop doesn't work; it doesn't detect the space.

I modified your code a bit. Apart from "fixing" the if statement (it was never true because your array contains chars but was compared to a string: " " is a string, ' ' is a char), I included a new loop that highlights and fixes what turned out to be the main problem: the carriage return and newline characters, as Wim ten Brink kindly pointed out in the comments. The string value I used is there only to demonstrate the issue.
string textToEncode = "abc\rdefabdg";
textToEncode = textToEncode.ToLower();
char[] distinctLetters = textToEncode.Distinct().ToArray();
var count = distinctLetters.Length;
Console.WriteLine("Letters used in text (old):");
for (int i = 0; i < count; i++)
{
var letter = distinctLetters[i];
if (Equals(letter, " "))
{
Console.Write("<space>");
}
else
{
Console.Write(distinctLetters[i]);
}
}
Console.WriteLine();
Console.WriteLine("Letters used in text (new):");
for (int i = 0; i < count; i++)
{
var letter = distinctLetters[i];
if (!char.IsLetter(letter))
continue;
Console.Write(distinctLetters[i]);
}
And the output is:
Letters used in text (old):
defg
Letters used in text (new):
abcdefg
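The char-versus-string point can be checked in isolation. This is only a minimal sketch; the class and method names are made up for illustration. The static object.Equals boxes both arguments, and a boxed char is never equal to a string, so the first method always returns false.

```csharp
using System;

public static class CharVersusString
{
    // object.Equals(object, object) boxes the char; a boxed char never equals a string.
    public static bool IsSpaceComparedToString(char c)
    {
        return object.Equals(c, " ");
    }

    // Comparing against a char literal behaves as intended.
    public static bool IsSpaceComparedToChar(char c)
    {
        return c == ' ';
    }
}
```

Calling both methods with ' ' shows the difference: only the char comparison reports a match.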

I've also modified your code a bit:
string textToEncode = File.ReadAllText(@"C:\Users\ASUS\Desktop\szyfrowanie2\TextSample.txt").ToLower();
char[] distinctLetters = textToEncode.Distinct().ToArray();
var count = distinctLetters.Count();
Console.WriteLine("Letters used in text: \n\n");
for (int i = 0; i < count; i++)
{
if (Equals(distinctLetters[i], ' ')) { Console.Write("<space>"); }
else if (Equals(distinctLetters[i], '\r')) { Console.Write("<cr>"); }
else if (Equals(distinctLetters[i], '\n')) { Console.Write("<lf>"); }
else { Console.Write(" " + distinctLetters[i] + " "); }
}
Just a few minor things. I merged the two first lines, changed " " into ' ' so it now compares characters, changed the counting of characters to use distinctLetters instead of executing the same Distinct() command again and I added two conditions to handle the carriage return and line feed. (I always mix them up, btw.)
This now shows the right result, but it should also explain why characters went missing. The reason is simple: your text file contains a carriage return character, which sends the cursor back to the start of the line, so whatever is printed next overwrites the beginning of the line.
So your code actually prints " w  i ..." and then reaches the '\r'. It prints a space, the carriage return moves the cursor back to the beginning of the line, and the trailing space is written over the leading ' '. The newline comes next: its leading space is written over the 'w', the cursor moves to the next line, and another space is printed. Then the rest gets printed...
Simple, isn't it? But by capturing these two special characters with the two extra if statements, it is fixed... :-) The '\r' and '\n' characters are often overlooked in console applications, giving unexpected results when they get printed.
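A more general variant of the same idea, as a rough sketch: instead of listing only '\r' and '\n', map every control character to a visible label before printing. The class name, the method names, and the label format are assumptions made for this example, not part of the original code.

```csharp
using System;
using System.Linq;

public static class DistinctCharReport
{
    // Maps characters that would disturb console output to a visible label.
    public static string Label(char c)
    {
        switch (c)
        {
            case ' ': return "<space>";
            case '\r': return "<cr>";
            case '\n': return "<lf>";
            case '\t': return "<tab>";
            default:
                // Any other control character is shown by its hexadecimal code.
                return char.IsControl(c)
                    ? "<0x" + ((int)c).ToString("X2") + ">"
                    : c.ToString();
        }
    }

    // Lower-cases the text and returns its distinct characters as one printable line.
    public static string Describe(string text)
    {
        return string.Join(" ", text.ToLower().Distinct().Select(c => Label(c)));
    }
}
```

For the input "With\r\n", Describe returns "w i t h <cr> <lf>", so nothing can be overwritten on screen.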

Related

How to capitalize the first character of every sentence

Create an application with a method that accepts a string as an argument and returns a copy of the string with the first character of each sentence capitalized.
This is what I have so far, and I can't seem to get it right:
//Create method to process string.
private string Sentences(string input)
{
//Capitalize first letter of input.
char firstLetter = char.ToUpper(input[0]);
//Combine the capitalize letter with the rest of the input.
input = firstLetter.ToString() + input.Substring(1);
//Create a char array to hold all characters in input.
char[] letters = new char[input.Length];
//Read the characters from input into the array.
for (int i = 0; i < input.Length; i++)
{
letters[i] = input[i];
}
//Loop through array to test for punctuation and capitalize a character 2 index away.
for (int index = 0; index < letters.Length; index++)
{
if(char.IsPunctuation(letters[index]))
{
if (!((index + 2) >= letters.Length))
{
char.ToUpper(letters[index+ 2]);
}
}
}
for(int ind = 0; ind < letters.Length; ind++)
{
input += letters[ind].ToString();
}
return input;
}
You could use LINQ's Aggregate; see the comments in the code and the code output for an explanation of how it works.
This one handles "Bla. blubb" as well, because it skips the whitespace after ".?!" when it looks at the previous character.
using System;
using System.Linq;
internal class Program
{
static string Capitalize(string oldSentence )
{
return
// this will look at oldSentence char for char, we start with a
// new string "" (the accumulator, short acc)
// and inspect each char c of oldSentence
// comment all the Console.Writelines in this function, thats
// just so you see whats done by Aggregate, not needed for it to
// work
oldSentence
.Aggregate("", (acc, c) =>
{
System.Console.WriteLine("Accumulated: " + acc);
System.Console.WriteLine("Cecking: " + c);
// if the accumulator is empty or the last character of
// trimmed acc is a ".?!" we append the
// upper case of c to it
if (acc.Length == 0 || ".?!".Any(p => p == acc.Trim().Last())) // (*)
acc += char.ToUpper(c);
else
acc += c; // else we add it unmodified
System.Console.WriteLine($"After: {acc}\n");
return acc; // this returns the acc for the next iteration/next c
});
}
static void Main(string[] args)
{
Console.SetBufferSize(120, 1000);
var oldSentence = "This is a testSentence. some occurences "
+ "need capitalization! for examlpe here. or here? maybe "
+ "yes, maybe not.";
var newSentence = Capitalize(oldSentence);
Console.WriteLine(new string('*', 80));
Console.WriteLine(newSentence);
Console.ReadLine();
}
}
(*)
".?!".Any(p => p == ... )) means does the string ".?!" contain any character that equals ...
acc.Trim().Last() means: remove whitespaces in front/on end of acc and give me the last character
.Last() and .Any() are also LINQ methods. Most of the LINQ extension methods are listed here: https://msdn.microsoft.com/en-us/library/9eekhta0(v=vs.110).aspx
Output (snipped - its rather longish ;o)
Accumulated:
Cecking: T
After: T
Accumulated: T
Cecking: h
After: Th
Accumulated: Th
Cecking: i
After: Thi
Accumulated: Thi
Cecking: s
After: This
Accumulated: This
Cecking:
After: This
Accumulated: This
Cecking: i
After: This i
Accumulated: This i
Cecking: s
After: This is
<snipp - .. you get the idea how Aggregate works ...>
Accumulated: This is a testSentence.
Cecking: s
After: This is a testSentence. S
<snipp>
Accumulated: This is a testSentence. Some occurences need capitalization!
Cecking: f
After: This is a testSentence. Some occurences need capitalization! F
<snipp>
********************************************************************************
This is a testSentence. Some occurences need capitalization! For examlpe here. Or here? Maybe yes, maybe not.
I would suggest using a regex to find the first letter after each sentence-ending punctuation mark, uppercasing the match, and putting it back into the original sentence. I don't have a .NET IDE at hand to try this right now, so treat it as an untested sketch (it needs using System.Text.RegularExpressions;):
string sentence = "your. sentence. goes. here";
// Match a lowercase letter at the start of the text or after '.', '!' or '?' plus optional whitespace.
Regex rx = new Regex(@"(^|[.!?]\s*)\p{Ll}");
// ToUpper() leaves the punctuation and whitespace inside the match unchanged.
string result = rx.Replace(sentence, match => match.Value.ToUpper());
You have two tasks:
1) Split text into sentences
2) Capitalize the first char in the sentences
Task one can be very complex, e.g. because there are a lot of crazy languages out there. But since this is homework, I assume you can go ahead and simply split by well-known separators.
Task two is just about basic string operations. You select the first char, make it uppercase and add the missing part of the sentence via a substring operation.
Here is a code example:
char[] separators = new char[] { '!', '.', '?' };
string[] sentencesArray = "First sentence. second sentence!lastone.".Split(separators, StringSplitOptions.RemoveEmptyEntries);
var i = 0;
Array.ForEach(sentencesArray, e =>
{
sentencesArray[i] = e.Trim().First().ToString().ToUpper() +
e.Trim().Substring(1);
i++;
});
I have created a method in Groovy that does the same:
String capitalizeFirstCharInSentence(String str) {
String result = ''
str = str.toLowerCase()
List<String> strings = str.tokenize('.')
strings.each { line ->
StringBuilder builder = new StringBuilder(line)
int i = 0
while (i < builder.size() - 1 && !Character.isLowerCase(builder.charAt(i))) {
i++
}
if (Character.isLowerCase(builder.charAt(i))) {
builder.setCharAt(i, builder.charAt(i).toUpperCase())
result += builder.toString() + '.'
}
}
return result
}
I liked the way you formatted your method because it is easy for newer coders to read, so I decided to try to make the code work while keeping the structure. The main problem I saw was that you never stored the uppercased characters back into the array.
//Create method to process string.
private string Sentences(string input)
{
//Create a char array to hold all characters in input.
char[] letters = new char[input.Length];
//Read the characters from input into the array.
for (int i = 0; i < input.Length; i++)
{
letters[i] = input[i];
}
//Capitalize first letter of input.
letters[0] = char.ToUpper(letters[0]);
//Loop through array to test for punctuation and capitalize a character 2 index away.
for (int index = 0; index < letters.Length; index++)
{
if(char.IsPunctuation(letters[index]))
{
if (index + 2 < letters.Length)
{
letters[index + 2] = char.ToUpper(letters[index+ 2]);
}
}
}
// convert array back to string
string results = new string(letters);
return results;
}

C# need to get fragment from Text

Hi, I need to find the longest sequence of words in a text that matches this condition: if a word ends with the letter N, the next word must start with the letter N, where N can be any letter. For example:
Simple Elephant
Apple Strong
The first line matches the condition, so I need to print it to the console.
I can't think of an algorithm for this.
I wrote this code for you. I can answer any questions you have about it.
List<string> sentenses = new List<string>();
sentenses.Add("hi, my name is Sam.");
sentenses.Add("Hi,is,settled,drums.");
sentenses.Add("Add all your sentenses here");
string longestSentense ="";
int longestCount = 0;
foreach(string sentense in sentenses)
{
string[] words = Regex.Split(sentense, "[^a-zA-Z]"); // cut sentense by all not letter character
int count = 0;
for (int i=0;i<words.Length-1;i++)
{
// check if the last letter of words[i] is the same letter as the first of words[i+1]
if(words[i].Equals("") || words[i+1].Equals("")) continue; // don't look to "empty word"
if (words[i][words[i].Length-1].Equals(words[i + 1][0])) count++;
}
// if here is the biggest number of matching words, we save it
if(count>longestCount)
{
longestCount = count;
longestSentense = sentense;
}
}
Console.WriteLine("The sentence that contains the most of matching words : \n"
+ longestSentense + "\n"
+ " with " + longestCount + " junctions between matching words.");
Console.ReadKey();
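The code above counts matching junctions per sentence. If what is wanted is the longest unbroken run of words where each word starts with the letter the previous one ended with, a sketch along the following lines might be closer. The names WordChains and LongestChain are invented for this example, and two things are assumptions: the comparison ignores case (so "Simple Elephant" matches), and the whole text is treated as one stream of words.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WordChains
{
    // Returns the longest run of consecutive words in which every word starts
    // with the letter the previous word ended with (ignoring case).
    public static List<string> LongestChain(string text)
    {
        // Split on every run of non-letter characters.
        string[] words = Regex.Split(text, "[^a-zA-Z]+");
        var best = new List<string>();
        var current = new List<string>();
        string previous = null;

        foreach (string word in words)
        {
            if (word.Length == 0) continue; // skip empty pieces at the edges

            bool continuesChain = previous != null &&
                char.ToLowerInvariant(previous[previous.Length - 1]) ==
                char.ToLowerInvariant(word[0]);

            if (!continuesChain)
                current = new List<string>(); // start a new chain at this word

            current.Add(word);
            previous = word;

            if (current.Count > best.Count)
                best = new List<string>(current); // remember the longest chain so far
        }
        return best;
    }
}
```

For the text "Simple Elephant\nApple Strong" this returns the two words Simple and Elephant. A result with fewer than two words means no pair of words matched.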

For loop reading a file isn't working

I have this:
private void getAccount()
{
string[] acct = File.ReadAllLines(Environment.GetFolderPath(Environment.SpecialFolder.Desktop) + @"\Accts.txt");
for (int i = 0; i < acct[line].Length - 1; i++)
{
foreach (char c in acct[line])
{
if (c.ToString() == ":")
{
onPass = true;
i += 1;
}
if (onPass == false) { user += acct[line][i]; }
if (onPass == true) { pass += acct[line][i]; }
}
}
MessageBox.Show("Username is " + user + ". \n\nPassword is " + pass + ".");
onPass = false;
}
The file has this:
minicl55:mypass
However this outputs this:
These are the following problems:
The characters are repeated a lot
only "mmmmmmm" is considered part of the username, everything up until the colon should be part of the username, after is pass
The : is included in the password, it should be ignored completely (except to tell where the username stops and the password starts)
The first time you go through your for loop, i == 0. Then the foreach loop looks at each character in acct[line], but i never changes, so for all the characters prior to :, the acct[line][i] part keeps returning acct[line][0], or "m" 8 times. That's why the username appears to be "mmmmmmmm".
Then the colon is encountered and i is increased by 1. Now onPass == true, so pass ends up having acct[line][1], which is the character "i". This repeats for the rest of the string, so pass appears to be "iiiiiii" (from the colon to the end).
Now we go back to the for loop. Except i has been increased by 1 inside the loop (bad idea) so now the for loop is actually on i == 2. Again the beginning part executes 8 times (once for each character in the username), but always refers to acct[line][2], so the username is "nnnnnnnn". Except onPass is still true, so it gets appended to the password variable. Then you get 7 more "i"'s after i is increased.
The i variable is increased internally and in the for loop again, so next time you're using acct[line][4], which is "c" (8 times), then i is increased by 1 inside the foreach loop and you get acct[line][5] 7 times, which is "l".
So far, password is "iiiiiiinnnnnnnniiiiiiicccccccclllllll". Hopefully you can see the pattern.
You could eliminate some of the looping and complexity, and just use something like: (untested)
private void getAccount()
{
var allAccounts = File.ReadAllLines(Environment.GetFolderPath(Environment.SpecialFolder.Desktop) + @"\Accts.txt");
foreach (var account in allAccounts)
{
var pieces = account.Split(':');
MessageBox.Show(string.Format("Username is {0}. \n\nPassword is {1}.", pieces[0], pieces[1]));
}
}
Your outer loop is iterating over each char in acct[line]. Then you do the same in your inner loop, you just express it a little differently.
Please show your variables, but here's another approach:
private void getAccount()
{
string user = "";
string pass = "";
string[] user_pass = new string[0];
var accts = System.IO.File.ReadAllLines(Environment.GetFolderPath(Environment.SpecialFolder.Desktop) + @"\Accts.txt");
foreach(var acct in accts)
{
user_pass = acct.Split(':');
}
//Add iteration for multiple lines
if (user_pass.Length > 1) // both a username and a password are present
{
MessageBox.Show("Username is " + user_pass[0] + ". \n\nPassword is " + user_pass[1] + ".");
}
else
{
MessageBox.Show("Chaos: Dogs and Cats Living Together!");
}
}
Well, I see your first loop gets the length of a specific line whose position doesn't change at all.
for (int i = 0; i < acct[line].Length - 1; i++)
And then you loop through every character of that only line
foreach (char c in acct[line])
The thing is that if your acct[line] has X length, you will loop through the acct[line] X times, hence why the repeated characters. You end up reading the same character X times.
As everyone else has commented/answered, your outer and inner loops are pretty much doing the exact same thing. I rewrote the for-loops so the outer loop loops through each line of the array of strings, and then the inside loop will go through all of the characters in that line.
for (int line = 0; line < acct.Length; line++)
{
int i = 0;
foreach (char c in acct[line])
{
if (c.ToString() == ":")
{
onPass = true;
}
else
{
if (!onPass)
user += acct[line][i];
else
pass += acct[line][i];
}
i++;
}
}
I do suggest however, for your own benefit, if you do NEED to loop through all of the characters to use this for the inner loop:
for (int i = 0; i < acct[line].Length; i++)
{
if (acct[line][i].ToString() == ":")
{
onPass = true;
}
else
{
if (!onPass)
user += acct[line][i];
else
pass += acct[line][i];
}
}
Or better yet replace everything with something simpler, and less prone to being broken by small changes:
for (int line = 0; line < acct.Length; line++)
{
if (acct[line].Contains(":"))
{
string[] parts = acct[line].Split(':');
user = parts[0];
pass = parts[1];
MessageBox.Show("Username is " + user + ". \n\nPassword is " + pass + ".");
}
}
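One case the Split-based versions do not cover is a password that itself contains a colon. A possible variation, shown here only as a sketch with hypothetical names (AccountLine, TryParse), is to cut the line at the first colon with IndexOf and keep everything after it as the password.

```csharp
using System;

public static class AccountLine
{
    // Splits "user:pass" at the first colon. Returns false when the line has no colon.
    public static bool TryParse(string line, out string user, out string pass)
    {
        int colon = line.IndexOf(':');
        if (colon < 0)
        {
            user = string.Empty;
            pass = string.Empty;
            return false;
        }
        user = line.Substring(0, colon);   // everything before the first colon
        pass = line.Substring(colon + 1);  // everything after it, further colons included
        return true;
    }
}
```

With the line from the question, "minicl55:mypass", this yields the user "minicl55" and the password "mypass"; the colon itself is dropped.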

Manipulating String in C#

I have some C# code that has to print a label with the name of the seller, but I have a problem.
Each line on the label holds 20 characters, and I have 2 lines for the name.
I need to arrange the seller's name across the 2 lines without cutting words.
For example - Name: JOSE MAURICIO BERTOLOTO MENDES
Line1: JOSE MAURICIO
Line2: BERTOLOTO MENDES
Does anyone know how to do this?
Thanks
EDIT: Based on the answers, I implemented this code:
string[] SellerPrint = Seller.Split(' ');
Line1 = "";
Line2 = "";
foreach (string name in SellerPrint )
{
if (Line1.Length <= 20)
{
if ((Line1 + name).Length <= 20)
Line1 += (Line1.Length == 0) ? name : " " + name;
else
break;
}
}
Line2 = (Seller.Replace(Line1, "").Length <= 20) ? Seller.Replace(Line1+ " ", "") : Seller.Replace(Line1+ " ", "").Remove(20);
Thanks for the help!
You could simply split the string into words using string.Split() and then add each word to the current line as long as it still fits.
I also wouldn't use the character count but use Graphics.MeasureString() instead.
You can split the full name into its individual parts.
var names = fullname.Split(' ');
Which will give you a string[]. From there you can do the math by looking at length of each string.
The idea is that you want to append all parts of the name until you will reach or exceed your 20 character limit on the next token. When that happens, append a new line with that token and continue appending until you hit the character limit once again.
Here is a quick example:
public static string FormatName(string name)
{
const int MaxLength = 20;
if (string.IsNullOrEmpty(name))
throw new ArgumentNullException("name");
if (name.Length <= MaxLength)
return name;
string[] tokens = name.Split(' ');
if (tokens.Length == 0)
return name; //hyphen the name?
StringBuilder sb = new StringBuilder(name.Length);
int len = 0;
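foreach (string token in tokens)
{
// Length of the current line if this token is added (+1 for the separating space).
int needed = (len == 0) ? token.Length : len + 1 + token.Length;
if (needed <= MaxLength)
{
if (len > 0)
sb.Append(' ');
sb.Append(token);
len = needed;
}
else
{
// The token does not fit: start a new line with it.
if (sb.Length > 0)
sb.Append(Environment.NewLine);
sb.Append(token);
len = token.Length;
}
}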
return sb.ToString();
}
Note: I left the case open for when a section of the name, without spaces, is longer than 20 characters. Also, this example will continue on to the Nth line, if the name won't fit onto two lines.
Here is the logic.
Use String.split to split the name into an array. Iterate over the strings in the array, concat them into a line, while the line is less than 20 characters. A recursive function would be a good idea! When you are greater than two lines, drop the rest of the names that put it over.
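The logic described above could look roughly like this. It is a sketch under a few assumptions: it uses a plain loop rather than recursion, the names LabelLines and Arrange are invented for the example, and a single word longer than the limit is not handled (it and everything after it is dropped).

```csharp
using System;

public static class LabelLines
{
    // Distributes the words of a name over at most two lines of maxLength characters.
    // Words that no longer fit on the second line are dropped.
    public static string[] Arrange(string name, int maxLength = 20)
    {
        string[] lines = { "", "" };
        int current = 0;
        string[] words = name.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string word in words)
        {
            // Try the current line first, then move on to the next one.
            while (current < lines.Length)
            {
                string candidate = lines[current].Length == 0
                    ? word
                    : lines[current] + " " + word;
                if (candidate.Length <= maxLength)
                {
                    lines[current] = candidate;
                    break;
                }
                current++;
            }
            if (current == lines.Length)
                break; // both lines are full, drop the remaining words
        }
        return lines;
    }
}
```

For "JOSE MAURICIO BERTOLOTO MENDES" this gives "JOSE MAURICIO" on the first line and "BERTOLOTO MENDES" on the second.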
I'm not sure, but I think you can use a special character: '\n' (without the quotes).
It basically stands for a new line. So, for example, JOSE MAURICIO BERTOLOTO MENDES will become JOSE MAURICIO \n BERTOLOTO MENDES.

Add to string until hits length (noobie C# guy)

I am trying to read a text file, break it into a string array, and then build new strings out of the words, but I don't want any of them to exceed 120 characters in length.
The goal is to write PML that creates a macro for some software I use, and the text can't exceed 120 characters. To take it even further, I need to wrap each string of 120 characters or fewer (broken at the nearest word) in "BTEXT |the string here|", which is the command.
Here is the code:
static void Main(string[] args)
{
int BIGSTRINGLEN = 120;
string readit = File.ReadAllText("C:\\stringtest.txt");
string finish = readit.Replace("\r\n", " ").Replace("\t", "");
string[] seeit = finish.Split(' ');
StringBuilder builder = new StringBuilder(BIGSTRINGLEN);
foreach(string word in seeit)
{
while (builder.Length + " " + word.Length <= BIGSTRINGLEN)
{
builder.Append(word)
}
}
}
Try using an if instead of the while; otherwise you will keep appending the same word over and over!
Rather than read the entire file into memory, you can read it a line at a time. That will reduce your memory requirements and also prevent you having to replace the newlines.
StringBuilder builder = new StringBuilder(BIGSTRINGLEN);
foreach (var line in File.ReadLines(filename))
{
// clean up the line.
// Do you really want to replace tabs with nothing?
// if you want to treat tabs like spaces, change the call to Split
// and include '\t' in the character array.
string finish = line.Replace("\t", string.Empty);
string[] seeit = finish.Split(new char[] {' '}, StringSplitOptions.RemoveEmptyEntries);
foreach (string word in seeit)
{
if (builder.Length + word.Length + 1 <= BIGSTRINGLEN)
{
if (builder.Length != 0)
builder.Append(' ');
builder.Append(word);
}
else
{
// output line
Console.WriteLine(builder.ToString());
// and reset the builder
builder.Length = 0;
// start the next line with the word that did not fit
builder.Append(word);
}
}
}
// and write the last line
if (builder.Length > 0)
Console.WriteLine(builder.ToString());
That code is going to fail if a word is longer than BIGSTRINGLEN. Long words will end up outputting a blank line. I think you can figure out how to handle that case if it becomes a problem.
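To also cover the "BTEXT |...|" wrapping from the question, the same packing idea can be put into a method that returns the finished commands. This is a sketch only: BtextWriter and Wrap are made-up names, the 120-character limit is assumed to apply to the text between the bars rather than to the whole command, tabs are treated as word separators, and a word longer than the limit is emitted on its own over-long line.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class BtextWriter
{
    // Packs words into lines of at most maxLength characters and wraps each
    // line as BTEXT |...|. The limit applies to the text between the bars.
    public static List<string> Wrap(string text, int maxLength = 120)
    {
        var lines = new List<string>();
        var builder = new StringBuilder(maxLength);
        char[] separators = { ' ', '\t', '\r', '\n' };

        foreach (string word in text.Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            // Flush the current line if adding " " + word would exceed the limit.
            if (builder.Length > 0 && builder.Length + 1 + word.Length > maxLength)
            {
                lines.Add("BTEXT |" + builder.ToString() + "|");
                builder.Clear();
            }
            if (builder.Length > 0)
                builder.Append(' ');
            builder.Append(word);
        }

        // Write whatever is left as the last line.
        if (builder.Length > 0)
            lines.Add("BTEXT |" + builder.ToString() + "|");
        return lines;
    }
}
```

For example, Wrap("aaa bbb ccc", 7) returns the two commands "BTEXT |aaa bbb|" and "BTEXT |ccc|".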
Matthew Moon is right - your while loop is not going to work as currently placed.
But that aside, you have some problems in this line
while (builder.Length + " " + word.Length <= BIGSTRINGLEN)
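builder.Length and word.Length are integers: the number of characters in the builder and in the word. " " is not an integer, it's a string. An expression like 10 + " " + 5 is string concatenation and yields the string "10 5", which cannot be compared to BIGSTRINGLEN with <=. You probably want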
while (builder.Length + (" ").Length + word.Length <= BIGSTRINGLEN)
// or
while (builder.Length + 1 + word.Length <= BIGSTRINGLEN)
