Substring Issue: Substring converting to char

I am making a typing game and I need to make a list of each character in a string so I can define what input the code should be expecting.
I tried:
static List<char> chars = "This is my string".ToCharArray().ToList();
But because char does not contain capitalization information it throws this error:
ArgumentException: InputKey named: T is unknown.
I knew char was not going to work, I needed each letter to be a string, not a char. So, I created a method using Substring:
static List<string> ToStringArray(string input)
{
List<string> strings = new List<string>();
for (int i = 0; i < input.Length; i++)
{
strings.Add(input.Substring(i, 1));
}
return strings;
}
static List<string> strings = ToStringArray("This is my string");
But apparently Substring is converting to a char because my code is still throwing the same error, and if I change the length of the substring to 2 my code works again. How can I force Substring to not convert to char? Or should I be approaching this problem in a completely different way?

I think you may be approaching this from a more complex angle than it needs to be.
If you have a:
string testString = "This is my string";
Then you can already access each individual character by index, such as testString[1] (which would be 'h')
If you're worried about case, then you can reference the string with
testString.ToLower();
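A minimal sketch of the point above, assuming the error comes from an input API that only recognizes lower-case key names (the question does not show that API, so this is a guess). Substring(i, 1) already returns a string, and a char does keep its case, so the sketch simply lower-cases the text before splitting it. TypingInput and ExpectedKeys are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TypingInput
{
    // Hypothetical helper: one lower-cased, single-character string per character of text.
    public static List<string> ExpectedKeys(string text)
    {
        return text.ToLowerInvariant()
                   .Select(c => c.ToString())
                   .ToList();
    }
}
```

With this, TypingInput.ExpectedKeys("This") gives the strings "t", "h", "i", "s".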

Related

C# - Input string was not in a correct format

I am working on a simple Windows Forms application in which the user enters a string with delimiters; I parse the string and extract only the numeric values from it.
So for example if the user enters:
2X + 5Y + z^3
I extract the values 2,5 and 3 from the "equation" and simply add them together.
This is how I get the integer values from a string.
int thirdValue;
string temp;
temp = Regex.Match(variables[3], @"\d+").Value;
thirdValue = int.Parse(temp);
variables is just an array of strings I use to store strings after parsing.
However, I get the following error when I run the application:
Input string was not in a correct format

Why is everyone moaning about this question and marking it down? It's incredibly easy to explain what is happening, and the questioner was right to describe it as he did. There is nothing wrong with it whatsoever.
Regex.Match(variables[3], @"\d+").Value
returns an empty string if the string (here it's variables[3]) doesn't contain any digits: the .Match failed, so the .Value is empty. It is the int.Parse(temp) call on the next line that then throws the "Input string was not in a correct format" FormatException. I have also seen this happen when variables[3] could not be read from the array while running as a service, which I suspect is a bug.
Regex itself does the sensible thing and returns a blank string rather than throwing, which makes the exception easy to attribute to the wrong line. I assume this confusion is why astef advised you not to bother with Regex at all. But he got marked down too!
The way round it is to check for a match first, using this simple additional method:
if (Regex.IsMatch(variables[3], @"\d+")){
temp = Regex.Match(variables[3], @"\d+").Value;
}
If this still doesn't work for you, you may not be able to use Regex for this. I have seen a C# service where this didn't work and threw incorrect errors, so I had to stop using Regex there.
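A minimal sketch of a guarded version, assuming the goal is simply to avoid the FormatException when no digits are present. It checks Match.Success and uses int.TryParse instead of int.Parse. NumberExtractor and TryGetFirstNumber are hypothetical names, not part of the question's code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NumberExtractor
{
    // Returns true and sets value when text contains a run of digits that fits in an int.
    public static bool TryGetFirstNumber(string text, out int value)
    {
        value = 0;
        Match match = Regex.Match(text, @"\d+");
        // Match.Success is false when nothing matched; match.Value is then the empty string.
        return match.Success && int.TryParse(match.Value, out value);
    }
}
```

The caller can then decide what to do with a term that has no number, rather than catching an exception.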

I prefer simple and lightweight solutions without Regex:
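using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

static class Program
{
    static void Main()
    {
        Console.WriteLine("2X + 65Y + z^3".GetNumbersFromString().Sum());
        Console.ReadLine();
    }

    // Yields every run of consecutive digits in input as an int.
    static IEnumerable<int> GetNumbersFromString(this string input)
    {
        StringBuilder number = new StringBuilder();
        foreach (char ch in input)
        {
            if (char.IsDigit(ch))
                number.Append(ch);
            else if (number.Length > 0)
            {
                yield return int.Parse(number.ToString());
                number.Clear();
            }
        }
        // Only emit a trailing number if the input actually ended with digits;
        // int.Parse("") would throw a FormatException.
        if (number.Length > 0)
            yield return int.Parse(number.ToString());
    }
}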

You can convert the string to a char array, check whether each char is a digit, and add the digits up. Note that this sums individual digits, so 65 contributes 6 + 5 rather than 65.
string temp = textBox1.Text;
char[] arra = temp.ToCharArray();
int total = 0;
foreach (char t in arra)
{
if (char.IsDigit(t))
{
total += int.Parse(t + "");
}
}
textBox1.Text = total.ToString();

This should solve your problem:
string temp;
temp = Regex.Matches(textBox1.Text, @"\d+", RegexOptions.IgnoreCase)[2].Value;
int thirdValue = int.Parse(temp);

Is there a splitByCharacterType method in C# like there is in Java?

In Java there is a method splitByCharacterType (StringUtils.splitByCharacterType in Apache Commons Lang) that takes a string, for example 0015j8*(, and splits it into "0015","j","8","*","(". Is there a built-in function like this in C#? If not, how would I go about building a function to do this?

public static IEnumerable<string> SplitByCharacterType(string input)
{
if (String.IsNullOrEmpty(input))
throw new ArgumentNullException(nameof(input));
StringBuilder segment = new StringBuilder();
segment.Append(input[0]);
var current = Char.GetUnicodeCategory(input[0]);
for (int i = 1; i < input.Length; i++)
{
var next = Char.GetUnicodeCategory(input[i]);
if (next == current)
{
segment.Append(input[i]);
}
else
{
yield return segment.ToString();
segment.Clear();
segment.Append(input[i]);
current = next;
}
}
yield return segment.ToString();
}
Usage as follows:
string[] split = SplitByCharacterType("0015j8*(").ToArray();
And the result is "0015","j","8","*","("
I recommend you implement it as an extension method.

I don't think such a method exists. You can follow the steps below to create your own utility method:
Create a list to hold split strings
Define strings with all your character types e.g.
string numberString = "0123456789";
string specialChars = "~!@#$%^&*(){}|\\/?";
string alphaChars = "abcde....XYZ";
Define a variable to hold the temporary string
Define a variable to note the type of chars
Traverse your string, one char at a time, check the type of char by checking the presence of the char in predefined type strings.
If the type differs from the previous type (check the type variable), add the temporary string (if not empty) to the list, assign the new type to the type variable, and assign the current char to the temporary string. Otherwise, append the char to the temporary string.
At the end of the traversal, add the temporary string (if not empty) to the list.
Now your list contains the split strings.
Convert the list to a string array and you are done.
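The steps above can be sketched roughly as follows. The character classes here are illustrative assumptions (digits and ASCII letters only); any character not listed falls into a single "other" class, so this version keeps "*(" together instead of splitting it the way the Java method does. CharTypeSplitter, Classes and ClassOf are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CharTypeSplitter
{
    // Illustrative character classes; a char found in none of them gets class -1.
    private static readonly string[] Classes =
    {
        "0123456789",
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    };

    private static int ClassOf(char c)
    {
        for (int i = 0; i < Classes.Length; i++)
        {
            if (Classes[i].IndexOf(c) >= 0) return i;
        }
        return -1;
    }

    public static string[] Split(string input)
    {
        var parts = new List<string>();
        var temp = new StringBuilder();
        int currentType = -2; // sentinel: no character seen yet

        foreach (char c in input)
        {
            int type = ClassOf(c);
            // A change of class closes the segment collected so far.
            if (type != currentType && temp.Length > 0)
            {
                parts.Add(temp.ToString());
                temp.Clear();
            }
            currentType = type;
            temp.Append(c);
        }

        if (temp.Length > 0) parts.Add(temp.ToString());
        return parts.ToArray();
    }
}
```

For the input 0015j8*( this produces "0015", "j", "8", "*(". Splitting the punctuation further would need more classes.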

You could maybe use the Regex class, something like below, but you will need to add support for characters other than numbers and letters.
var chars = Regex.Matches("0015j8*(", @"((?:""[^""\\]*(?:\\.[^""\\]*)*"")|[a-z]|\d+)").Cast<Match>().Select(match => match.Value).ToArray();
Result
0015,j,8
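One possible way to cover the remaining characters, as a sketch rather than a drop-in equivalent of the Java method: match a run of digits, a run of lower-case letters, a run of upper-case letters, or otherwise any single character. The pattern and the name RegexSplitter are assumptions made for illustration. Unlike a split by Unicode category, every other character becomes its own token, so "**" yields two tokens.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexSplitter
{
    public static string[] Split(string input)
    {
        // Alternatives are tried left to right; the final "." catches everything else,
        // and RegexOptions.Singleline lets it match newlines too.
        return Regex.Matches(input, @"\d+|\p{Ll}+|\p{Lu}+|.", RegexOptions.Singleline)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }
}
```

For 0015j8*( this gives "0015", "j", "8", "*", "(".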

Find if string contains at least 2 characters similar to another? C#

I need a method to check whether a string shares two or more adjacent characters with another string. I don't want to find all strings containing the letter "D".
For example, if I have a string "Christopher" and want to see if "Chris" is contained in "Christopher", I want that to return a match. However, if I want to see if "Candy" is in the string "Christopher", I don't want it to match just because it has a "C" in common.
I have tried the .Contains() method but can't give it a rule for 2 or more similar characters. I have thought about using regular expressions, but that might be a bit of overkill. The similar letters must be next to each other.
Thank you :)

This takes each 2-character substring (bigram) of s1 and looks for it in s2.
string s1 = "Chrx";
string s2 = "Christopher";
IsMatchOn2Characters(s1, s2);
static bool IsMatchOn2Characters(string a, string b)
{
string s1 = a.ToLowerInvariant();
string s2 = b.ToLowerInvariant();
for (int i = 0; i < s1.Length - 1; i++)
{
if (s2.IndexOf(s1.Substring(i,2)) >= 0)
return true; // match
}
return false; // no match
}

This looks a lot like the longest common substring problem, which can be solved using dynamic programming in O(m*n).
If you are not worried about performance and don't really want to implement this, you can also go with the brute-force solution of searching for every substring of s1 in s2.
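A minimal sketch of the dynamic-programming approach, assuming a case-insensitive comparison like the other answer uses. It keeps only the previous row of the table, so memory is O(n) while time stays O(m*n). CommonSubstring, LongestCommonLength and SharesAtLeast are hypothetical names.

```csharp
using System;

public static class CommonSubstring
{
    // Length of the longest run of adjacent characters that a and b share (case-insensitive).
    public static int LongestCommonLength(string a, string b)
    {
        string s1 = a.ToLowerInvariant();
        string s2 = b.ToLowerInvariant();

        // previous[j] = length of the common run ending at s1[i - 2] and s2[j - 1]
        int[] previous = new int[s2.Length + 1];
        int best = 0;

        for (int i = 1; i <= s1.Length; i++)
        {
            int[] current = new int[s2.Length + 1];
            for (int j = 1; j <= s2.Length; j++)
            {
                if (s1[i - 1] == s2[j - 1])
                {
                    // Extend the run that ended on the previous pair of characters.
                    current[j] = previous[j - 1] + 1;
                    best = Math.Max(best, current[j]);
                }
            }
            previous = current;
        }
        return best;
    }

    public static bool SharesAtLeast(string a, string b, int minLength)
    {
        return LongestCommonLength(a, b) >= minLength;
    }
}
```

SharesAtLeast("Chris", "Christopher", 2) is true, while SharesAtLeast("Candy", "Christopher", 2) is false because only the single "c" is shared.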

Parse without string split

This is a spin-off from the discussion in some other question.
Suppose I've got to parse a huge number of very long strings. Each string contains a sequence of doubles (in text representation, of course) separated by whitespace. I need to parse the doubles into a List<double>.
The standard parsing technique (using string.Split + double.TryParse) seems to be quite slow: for each of the numbers we need to allocate a string.
I tried to do it the old C-like way: compute the indices of the beginning and the end of the substrings containing the numbers, and parse them "in place", without creating an additional string. (See http://ideone.com/Op6h0; the relevant part is shown below.)
int startIdx, endIdx = 0;
while(true)
{
startIdx = endIdx;
// no find_first_not_of in C#
while (startIdx < s.Length && s[startIdx] == ' ') startIdx++;
if (startIdx == s.Length) break;
endIdx = s.IndexOf(' ', startIdx);
if (endIdx == -1) endIdx = s.Length;
// how to extract a double here?
}
There is an overload of string.IndexOf, searching only within a given substring, but I failed to find a method for parsing a double from substring, without actually extracting that substring first.
Does anyone have an idea?

There is no managed API to parse a double from a substring. My guess is that allocating the string will be insignificant compared to all the floating point operations in double.Parse.
Anyway, you can save the allocation by creating a "buffer" string once, of length 100 and consisting of whitespace only. Then, for every number you want to parse, you copy its chars into this buffer string using unsafe code and fill the rest of the buffer with whitespace. For parsing you can use NumberStyles.AllowTrailingWhite, which causes the trailing whitespace to be ignored.
Getting a pointer to string is actually a fully supported operation:
string l_pos = new string(' ', 100); //don't write to a shared string!
unsafe
{
fixed (char* l_pSrc = l_pos)
{
// do some work
}
}
C# has special syntax to bind a string to a char*.
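On newer runtimes there is an alternative that avoids unsafe code. This sketch assumes .NET Core 2.1 or later, where double.Parse has an overload that accepts a ReadOnlySpan<char>; it is not available on the older .NET Framework. It reuses the index loop from the question and parses each slice without allocating a substring. SpanDoubleParser and ParseAll are hypothetical names, and the invariant culture is an assumption about the input format.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class SpanDoubleParser
{
    public static List<double> ParseAll(string s)
    {
        var result = new List<double>();
        int start, end = 0;

        while (true)
        {
            start = end;
            // Skip the separating spaces.
            while (start < s.Length && s[start] == ' ') start++;
            if (start == s.Length) break;

            end = s.IndexOf(' ', start);
            if (end == -1) end = s.Length;

            // AsSpan creates a view over the original string; no new string is allocated.
            ReadOnlySpan<char> slice = s.AsSpan(start, end - start);
            result.Add(double.Parse(slice, NumberStyles.Float, CultureInfo.InvariantCulture));
        }
        return result;
    }
}
```

Whether this is measurably faster than string.Split for a given workload would need to be benchmarked.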

If you want to do it really fast, I would use a state machine.
This could look like:
enum State
{
    Separator, Sign, Mantissa // etc.
}
State CurrentState = State.Separator;
int Prefix = 0, Exponent = 0, Mantissa = 0;
foreach (char ch in InputString)
{
    switch (CurrentState)
    {
        // set the new CurrentState depending on ch and CurrentState
        case State.Separator:
            // a separator ends the current number, so hand over the collected parts
            GotNewDouble(Prefix, Exponent, Mantissa);
            break;
    }
}

Converting "Bizarre" Chars in String to Roman Chars

I need to be able to convert user input to [a-z] roman characters ONLY (not case sensitive). So, there are only 26 characters that I am interested in.
However, the user can type in any "form" of those characters that they wish. The Spanish "n", the French "e", and the German "u" can all have accents from the user input (which are removed by the program).
I've gotten pretty close with these two extension methods:
public static string LettersOnly(this string Instring)
{
char[] aChar = Instring.ToCharArray();
int intCount = 0;
string strTemp = "";
for (intCount = 0; intCount <= Instring.Length - 1; intCount++)
{
if (char.IsLetter(aChar[intCount]) )
{
strTemp += aChar[intCount];
}
}
return strTemp;
}
public static string RemoveAccentMarks(this string s)
{
string normalizedString = s.Normalize(NormalizationForm.FormD);
StringBuilder sb = new StringBuilder();
char c;
for (int i = 0; i <= normalizedString.Length - 1; i++)
{
c = normalizedString[i];
if (System.Globalization.CharUnicodeInfo.GetUnicodeCategory(c) != System.Globalization.UnicodeCategory.NonSpacingMark)
{
sb.Append(c);
}
}
return sb.ToString();
}
Here is an example test:
string input = "Àlièñ451";
input = input.LettersOnly().RemoveAccentMarks().ToLower();
Console.WriteLine(input);
Result: "alien" (as expected)
This works for 99.9% of the cases. However, a few characters seem to pass all of the checks.
For instance, "ß" (a German double-s, I think). This is considered by .Net to be a letter. This is not considered by the function above to have any accent marks... but it STILL isn't in the range of a-z, like I need it to be. Ideally, I could convert this to a "B" or an "ss" (whichever is appropriate), but I need to convert it to SOMETHING in the range of a-z.
Another example is the diphthong ("æ"). Again, .Net considers this a "letter". The function above doesn't see any accent, but again, it isn't in the roman 26 character alphabet. In this case, I need to convert to the two letters "ae" (I think).
Is there an easy way to convert ANY worldwide input to the closest roman alphabet equivalent? It is expected that this probably won't be a perfectly clean translation, but I need to trust that the inputs at FlipScript.com are ONLY getting the characters a-z... and nothing else.
Any and all help appreciated.

If I were you, I'd create a Dictionary containing the mappings from foreign letters to Roman letters. I'd use this for two reasons:
It makes what you want to do easier to understand for someone reading your code.
There is a small, finite number of these special letters, so you don't need to worry about maintenance of the data structure.
I'd put the mappings into an xml file then load them into the data structure at run-time. That way, you do not need to modify any code which uses the characters, you only need to specify the mappings themselves.
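A minimal sketch of the Dictionary idea, combined with the FormD normalization from the question. The mapping table is a small, incomplete illustration and is hard-coded here for brevity rather than loaded from an XML file; RomanLetters, Map and ToAtoZ are hypothetical names. Anything that is neither a-z after decomposition nor in the table is dropped, which guarantees the a-z-only output at the cost of losing unmapped letters.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class RomanLetters
{
    // Illustrative, incomplete table for letters that normalization does not decompose.
    private static readonly Dictionary<char, string> Map = new Dictionary<char, string>
    {
        { 'ß', "ss" },
        { 'æ', "ae" },
        { 'œ', "oe" },
        { 'ø', "o" }
    };

    public static string ToAtoZ(string input)
    {
        // FormD splits accented letters into a base letter plus combining marks.
        string decomposed = input.ToLowerInvariant().Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder();

        foreach (char c in decomposed)
        {
            if (c >= 'a' && c <= 'z')
            {
                sb.Append(c);
            }
            else if (Map.TryGetValue(c, out string replacement))
            {
                sb.Append(replacement);
            }
            // Everything else (digits, combining marks, unmapped letters) is dropped.
        }
        return sb.ToString();
    }
}
```

With this table, "Àlièñ451" becomes "alien" and "Straße" becomes "strasse".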
