Best way to convert Pascal Case to a sentence - c#

What is the best way to convert from Pascal Case (upper Camel Case) to a sentence?
For example starting with
"AwaitingFeedback"
and converting that to
"Awaiting feedback"
C# preferable but I could convert it from Java or similar.

public static string ToSentenceCase(this string str)
{
return Regex.Replace(str, "[a-z][A-Z]", m => m.Value[0] + " " + char.ToLower(m.Value[1]));
}
In Visual Studio 2015 and later (C# 6), you can use string interpolation instead:
public static string ToSentenceCase(this string str)
{
return Regex.Replace(str, "[a-z][A-Z]", m => $"{m.Value[0]} {char.ToLower(m.Value[1])}");
}
Based on: Converting Pascal case to sentences using regular expression

I prefer to use Humanizer for this. Humanizer is a Portable Class Library that meets all your .NET needs for manipulating and displaying strings, enums, dates, times, timespans, numbers and quantities.
Short Answer
"AwaitingFeedback".Humanize() => Awaiting feedback
Long and Descriptive Answer
Humanizer can do a lot more. Other examples are:
"PascalCaseInputStringIsTurnedIntoSentence".Humanize() => "Pascal case input string is turned into sentence"
"Underscored_input_string_is_turned_into_sentence".Humanize() => "Underscored input string is turned into sentence"
"Can_return_title_Case".Humanize(LetterCasing.Title) => "Can Return Title Case"
"CanReturnLowerCase".Humanize(LetterCasing.LowerCase) => "can return lower case"
Complete code is :
using Humanizer;
using static System.Console;
namespace HumanizerConsoleApp
{
class Program
{
static void Main(string[] args)
{
WriteLine("AwaitingFeedback".Humanize());
WriteLine("PascalCaseInputStringIsTurnedIntoSentence".Humanize());
WriteLine("Underscored_input_string_is_turned_into_sentence".Humanize());
WriteLine("Can_return_title_Case".Humanize(LetterCasing.Title));
WriteLine("CanReturnLowerCase".Humanize(LetterCasing.LowerCase));
}
}
}
Output
Awaiting feedback
Pascal case input string is turned into sentence
Underscored input string is turned into sentence
Can Return Title Case
can return lower case
If you prefer to write your own C# code, you can do that as shown in the other answers.

Here you go...
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace CamelCaseToString
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine(CamelCaseToString("ThisIsYourMasterCallingYou"));
}
private static string CamelCaseToString(string str)
{
if (str == null || str.Length == 0)
return null;
StringBuilder retVal = new StringBuilder(32);
retVal.Append(char.ToUpper(str[0]));
for (int i = 1; i < str.Length; i++ )
{
if (char.IsLower(str[i]))
{
retVal.Append(str[i]);
}
else
{
retVal.Append(" ");
retVal.Append(char.ToLower(str[i]));
}
}
return retVal.ToString();
}
}
}

This works for me:
Regex.Replace(strIn, "([A-Z]{1,2}|[0-9]+)", " $1").TrimStart()

This is just like @SSTA's answer, but is more efficient than calling TrimStart.
Regex.Replace("ThisIsMyCapsDelimitedString", "(\\B[A-Z])", " $1")

Found this in the MvcContrib source, doesn't seem to be mentioned here yet.
return Regex.Replace(input, "([A-Z])", " $1", RegexOptions.Compiled).Trim();

Just because everyone has been using Regex (except this guy), here's an implementation with StringBuilder that was about 5x faster in my tests. It checks for numbers too.
"SomeBunchOfCamelCase2".FromCamelCaseToSentence() == "Some Bunch Of Camel Case 2"
public static string FromCamelCaseToSentence(this string input) {
if(string.IsNullOrEmpty(input)) return input;
var sb = new StringBuilder();
// start with the first character -- consistent camelcase and pascal case
sb.Append(char.ToUpper(input[0]));
// march through the rest of it
for(var i = 1; i < input.Length; i++) {
// any time we hit an uppercase OR number, it's a new word
if(char.IsUpper(input[i]) || char.IsDigit(input[i])) sb.Append(' ');
// add regularly
sb.Append(input[i]);
}
return sb.ToString();
}

Here's a basic way of doing it that I came up with using Regex
public static string CamelCaseToSentence(this string value)
{
var sb = new StringBuilder();
var firstWord = true;
foreach (var match in Regex.Matches(value, "([A-Z][a-z]+)|[0-9]+"))
{
if (firstWord)
{
sb.Append(match.ToString());
firstWord = false;
}
else
{
sb.Append(" ");
sb.Append(match.ToString().ToLower());
}
}
return sb.ToString();
}
It will also split off numbers which I didn't specify but would be useful.

string camel = "MyCamelCaseString";
string s = Regex.Replace(camel, "([A-Z])", " $1").ToLower().Trim();
Console.WriteLine(s.Substring(0,1).ToUpper() + s.Substring(1));
Edit: I didn't notice your casing requirements; modified accordingly. You could use a MatchEvaluator to do the casing, but I think a substring is easier. You could also wrap it in a second regex replace where you change the first character
"^\w"
to upper
\U (i think)
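For what it's worth, here is a minimal sketch of the MatchEvaluator route mentioned above. The names SentenceCaseSketch and ToSentence are made up for illustration. As far as I know, .NET substitution patterns have no case-conversion token such as \U, so the casing is done in a lambda instead:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class SentenceCaseSketch
{
    public static string ToSentence(string camel)
    {
        if (string.IsNullOrEmpty(camel)) return camel;

        // \B keeps the very first capital from getting a leading space.
        // The lambda is the MatchEvaluator: it lower-cases each inner capital.
        string spaced = Regex.Replace(
            camel, @"\B[A-Z]", m => " " + m.Value.ToLowerInvariant());

        // Upper-case the first character so camelCase input also works.
        return char.ToUpperInvariant(spaced[0]) + spaced.Substring(1);
    }
}
```

With this, SentenceCaseSketch.ToSentence("MyCamelCaseString") gives "My camel case string". Like the other single-capital regexes here, it splits acronyms letter by letter.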

I'd use a regex, inserting a space before each upper case character, then lowering all the string.
string spacedString = System.Text.RegularExpressions.Regex.Replace(yourString, @"\B([A-Z])", " $1");
spacedString = spacedString.ToLower();

It is easy to do in JavaScript (or PHP, etc.) where you can define a function in the replace call:
var camel = "AwaitingFeedbackDearMaster";
var sentence = camel.replace(/([A-Z].)/g, function (c) { return ' ' + c.toLowerCase(); });
alert(sentence);
Although I haven't solved the initial cap problem... :-)
Now, for the Java solution:
String ToSentence(String camel)
{
if (camel == null) return ""; // Or null...
String[] words = camel.split("(?=[A-Z])");
if (words == null) return "";
if (words.length == 1) return words[0];
StringBuilder sentence = new StringBuilder(camel.length());
if (words[0].length() > 0) // Just in case of camelCase instead of CamelCase
{
sentence.append(words[0] + " " + words[1].toLowerCase());
}
else
{
sentence.append(words[1]);
}
for (int i = 2; i < words.length; i++)
{
sentence.append(" " + words[i].toLowerCase());
}
return sentence.toString();
}
System.out.println(ToSentence("AwaitingAFeedbackDearMaster"));
System.out.println(ToSentence(null));
System.out.println(ToSentence(""));
System.out.println(ToSentence("A"));
System.out.println(ToSentence("Aaagh!"));
System.out.println(ToSentence("stackoverflow"));
System.out.println(ToSentence("disableGPS"));
System.out.println(ToSentence("Ahh89Boo"));
System.out.println(ToSentence("ABC"));
Note the trick to split the sentence without losing any character...

Pseudo-code:
NewString = "";
Loop through every char of the string (skip the first one)
If char is upper-case ('A'-'Z')
NewString = NewString + ' ' + lowercase(char)
Else
NewString = NewString + char
Better ways can perhaps be done by using regex or by string replacement routines (replace 'X' with ' x')
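A rough C# rendering of the pseudo-code above, in case it helps. The names PseudoCodeSketch and ToSentence are hypothetical, and I am assuming the skipped first character is simply copied through unchanged:

```csharp
using System.Text;

// Hypothetical helper that follows the pseudo-code step by step.
public static class PseudoCodeSketch
{
    public static string ToSentence(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;

        var newString = new StringBuilder(input.Length + 8);
        newString.Append(input[0]); // assumption: first char is kept as-is

        // Loop through every char of the string, skipping the first one.
        for (int i = 1; i < input.Length; i++)
        {
            char c = input[i];
            if (c >= 'A' && c <= 'Z')
                newString.Append(' ').Append(char.ToLowerInvariant(c));
            else
                newString.Append(c);
        }
        return newString.ToString();
    }
}
```

PseudoCodeSketch.ToSentence("AwaitingFeedback") returns "Awaiting feedback". Only 'A'-'Z' count as upper case here, exactly as in the pseudo-code, so accented capitals are left alone.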

An XQuery solution that works for both UpperCamel and lowerCamel case:
To output sentence case (only the first character of the first word is capitalized):
declare function content:sentenceCase($string)
{
let $firstCharacter := substring($string, 1, 1)
let $remainingCharacters := substring-after($string, $firstCharacter)
return
concat(upper-case($firstCharacter),lower-case(replace($remainingCharacters, '([A-Z])', ' $1')))
};
To output title case (first character of each word capitalized):
declare function content:titleCase($string)
{
let $firstCharacter := substring($string, 1, 1)
let $remainingCharacters := substring-after($string, $firstCharacter)
return
concat(upper-case($firstCharacter),replace($remainingCharacters, '([A-Z])', ' $1'))
};

Found myself doing something similar, and I appreciate having a point-of-departure with this discussion. This is my solution, placed as an extension method to the string class in the context of a console application.
using System;
using System.Text;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string piratese = "avastTharMatey";
string ivyese = "CheerioPipPip";
Console.WriteLine("{0}\n{1}\n", piratese.CamelCaseToString(), ivyese.CamelCaseToString());
Console.WriteLine("For Pete\'s sake, man, hit ENTER!");
string strExit = Console.ReadLine();
}
}
public static class StringExtension
{
public static string CamelCaseToString(this string str)
{
StringBuilder retVal = new StringBuilder(32);
if (!string.IsNullOrEmpty(str))
{
string strTrimmed = str.Trim();
if (!string.IsNullOrEmpty(strTrimmed))
{
retVal.Append(char.ToUpper(strTrimmed[0]));
if (strTrimmed.Length > 1)
{
for (int i = 1; i < strTrimmed.Length; i++)
{
if (char.IsUpper(strTrimmed[i])) retVal.Append(" ");
retVal.Append(char.ToLower(strTrimmed[i]));
}
}
}
}
return retVal.ToString();
}
}
}

Most of the preceding answers split acronyms and numbers, adding a space in front of each character. I wanted acronyms and numbers to be kept together so I have a simple state machine that emits a space every time the input transitions from one state to the other.
/// <summary>
/// Add a space before any capitalized letter (but not for a run of capitals or numbers)
/// </summary>
internal static string FromCamelCaseToSentence(string input)
{
if (string.IsNullOrEmpty(input)) return String.Empty;
var sb = new StringBuilder();
bool upper = true;
for (var i = 0; i < input.Length; i++)
{
bool isUpperOrDigit = char.IsUpper(input[i]) || char.IsDigit(input[i]);
// any time we transition to upper or digits, it's a new word
if (!upper && isUpperOrDigit)
{
sb.Append(' ');
}
sb.Append(input[i]);
upper = isUpperOrDigit;
}
return sb.ToString();
}
And here's some tests:
[TestCase(null, ExpectedResult = "")]
[TestCase("", ExpectedResult = "")]
[TestCase("ABC", ExpectedResult = "ABC")]
[TestCase("abc", ExpectedResult = "abc")]
[TestCase("camelCase", ExpectedResult = "camel Case")]
[TestCase("PascalCase", ExpectedResult = "Pascal Case")]
[TestCase("Pascal123", ExpectedResult = "Pascal 123")]
[TestCase("CustomerID", ExpectedResult = "Customer ID")]
[TestCase("CustomABC123", ExpectedResult = "Custom ABC123")]
public string CanSplitCamelCase(string input)
{
return FromCamelCaseToSentence(input);
}

Mostly already answered here
A small change to the accepted answer converts the second and subsequent capitalised letters to lower case. Change
if (char.IsUpper(text[i]))
newText.Append(' ');
newText.Append(text[i]);
to
if (char.IsUpper(text[i]))
{
newText.Append(' ');
newText.Append(char.ToLower(text[i]));
}
else
newText.Append(text[i]);

Here is my implementation. This is the fastest that I got while avoiding creating spaces for abbreviations.
public static string PascalCaseToSentence(string input)
{
if (string.IsNullOrEmpty(input) || input.Length < 2)
return input;
var sb = new char[input.Length + ((input.Length + 1) / 2)];
var len = 0;
var lastIsLower = false;
for (int i = 0; i < input.Length; i++)
{
var current = input[i];
if (current < 97)
{
if (lastIsLower)
{
sb[len] = ' ';
len++;
}
lastIsLower = false;
}
else
{
lastIsLower = true;
}
sb[len] = current;
len++;
}
return new string(sb, 0, len);
}

Related

C# How to generate a new string based on multiple ranged index

Let's say I have a string like this one, left part is a word, right part is a collection of indices (single or range) used to reference furigana (phonetics) for kanjis in my word:
string myString = "子で子にならぬ時鳥,0:こ;2:こ;7-8:ほととぎす"
The pattern in detail:
word,<startIndex>(-<endIndex>):<furigana>
What would be the best way to achieve something like this (with a space in front of the kanji to mark which part is linked to the [furigana]):
子[こ]で 子[こ]にならぬ 時鳥[ほととぎす]
Edit: (thanks for your comments guys)
Here is what I wrote so far:
static void Main(string[] args)
{
string myString = "ABCDEF,1:test;3:test2";
//Split Kanjis / Indices
string[] tokens = myString.Split(',');
//Extract furigana indices
string[] indices = tokens[1].Split(';');
//Dictionary to store furigana indices
Dictionary<string, string> furiganaIndices = new Dictionary<string, string>();
//Collect
foreach (string index in indices)
{
string[] splitIndex = index.Split(':');
furiganaIndices.Add(splitIndex[0], splitIndex[1]);
}
//Processing
string result = tokens[0] + ",";
for (int i = 0; i < tokens[0].Length; i++)
{
string currentIndex = i.ToString();
if (furiganaIndices.ContainsKey(currentIndex)) //add [furigana]
{
string currentFurigana = furiganaIndices[currentIndex].ToString();
result = result + " " + tokens[0].ElementAt(i) + string.Format("[{0}]", currentFurigana);
}
else //nothing to add
{
result = result + tokens[0].ElementAt(i);
}
}
File.AppendAllText(@"D:\test.txt", result + Environment.NewLine);
}
Result:
ABCDEF,A B[test]C D[test2]EF
I struggle to find a way to process ranged indices:
string myString = "ABCDEF,1:test;2-3:test2";
Result : ABCDEF,A B[test] CD[test2]EF
I don't have anything against manually manipulating strings per se. But given that you seem to have a regular pattern describing the inputs, it seems to me that a solution that uses regex would be more maintainable and readable. So with that in mind, here's an example program that takes that approach:
class Program
{
private const string _kinvalidFormatException = "Invalid format for edit specification";
private static readonly Regex
regex1 = new Regex(@"(?<word>[^,]+),(?<edit>(?:\d+)(?:-(?:\d+))?:(?:[^;]+);?)+", RegexOptions.Compiled),
regex2 = new Regex(@"(?<start>\d+)(?:-(?<end>\d+))?:(?<furigana>[^;]+);?", RegexOptions.Compiled);
static void Main(string[] args)
{
string myString = "子で子にならぬ時鳥,0:こ;2:こ;7-8:ほととぎす";
string result = EditString(myString);
}
private static string EditString(string myString)
{
Match editsMatch = regex1.Match(myString);
if (!editsMatch.Success)
{
throw new ArgumentException(_kinvalidFormatException);
}
int ichCur = 0;
string input = editsMatch.Groups["word"].Value;
StringBuilder text = new StringBuilder();
foreach (Capture capture in editsMatch.Groups["edit"].Captures)
{
Match oneEditMatch = regex2.Match(capture.Value);
if (!oneEditMatch.Success)
{
throw new ArgumentException(_kinvalidFormatException);
}
int start, end;
if (!int.TryParse(oneEditMatch.Groups["start"].Value, out start))
{
throw new ArgumentException(_kinvalidFormatException);
}
Group endGroup = oneEditMatch.Groups["end"];
if (endGroup.Success)
{
if (!int.TryParse(endGroup.Value, out end))
{
throw new ArgumentException(_kinvalidFormatException);
}
}
else
{
end = start;
}
text.Append(input.Substring(ichCur, start - ichCur));
if (text.Length > 0)
{
text.Append(' ');
}
ichCur = end + 1;
text.Append(input.Substring(start, ichCur - start));
text.Append(string.Format("[{0}]", oneEditMatch.Groups["furigana"]));
}
if (ichCur < input.Length)
{
text.Append(input.Substring(ichCur));
}
return text.ToString();
}
}
Notes:
This implementation assumes that the edit specifications will be listed in order and won't overlap. It makes no attempt to validate that part of the input; depending on where you are getting your input from you may want to add that. If it's valid for the specifications to be listed out of order, you can also extend the above to first store the edits in a list and sort the list by the start index before actually editing the string. (In similar fashion to the way the other proposed answer works; though, why they are using a dictionary instead of a simple list to store the individual edits, I have no idea…that seems arbitrarily complicated to me.)
I included basic input validation, throwing exceptions where failures occur in the pattern matching. A more user-friendly implementation would add more specific information to each exception, describing what part of the input actually was invalid.
The Regex class actually has a Replace() method, which allows for complete customization. The above could have been implemented that way, using Replace() and a MatchEvaluator to provide the replacement text, instead of just appending text to a StringBuilder. Which way to do it is mostly a matter of preference, though the MatchEvaluator might be preferred if you have a need for more flexible implementation options (i.e. if the exact format of the result can vary).
If you do choose to use the other proposed answer, I strongly recommend you use StringBuilder instead of simply concatenating onto the results variable. For short strings it won't matter much, but you should get into the habit of always using StringBuilder when you have a loop that is incrementally adding onto a string value, because for long strings the performance implications of using concatenation can be very negative.
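To illustrate the notes about out-of-order edits and StringBuilder, here is a minimal sketch that stores the edits in a plain list, sorts them by start index, and rejects overlaps. It is not the code from either answer: FuriganaSketch and Annotate are hypothetical names, it splits on ',' ';' ':' and '-' rather than using the regexes above, and it assumes C# 7 or later for the tuple syntax.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper: sorts edits by start index before applying them.
public static class FuriganaSketch
{
    public static string Annotate(string spec)
    {
        int comma = spec.IndexOf(',');
        if (comma < 0) throw new ArgumentException("Expected 'word,edits'", nameof(spec));
        string word = spec.Substring(0, comma);

        // Collect (start, end, furigana); int.Parse throws on malformed numbers.
        var edits = new List<(int Start, int End, string Text)>();
        string[] parts = spec.Substring(comma + 1)
            .Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string part in parts)
        {
            string[] kv = part.Split(new[] { ':' }, 2);
            if (kv.Length != 2) throw new ArgumentException("Expected 'index:text'", nameof(spec));
            string[] range = kv[0].Split('-');
            int start = int.Parse(range[0]);
            int end = range.Length > 1 ? int.Parse(range[1]) : start;
            edits.Add((start, end, kv[1]));
        }

        var sb = new StringBuilder();
        int cursor = 0; // index of the first character not yet copied
        foreach (var e in edits.OrderBy(x => x.Start))
        {
            if (e.Start < cursor || e.End < e.Start || e.End >= word.Length)
                throw new ArgumentException("Overlapping or out-of-range edit", nameof(spec));

            sb.Append(word, cursor, e.Start - cursor);      // text before the edit
            if (sb.Length > 0) sb.Append(' ');              // marker space
            sb.Append(word, e.Start, e.End - e.Start + 1);  // annotated characters
            sb.Append('[').Append(e.Text).Append(']');
            cursor = e.End + 1;
        }
        sb.Append(word, cursor, word.Length - cursor);      // remaining tail
        return sb.ToString();
    }
}
```

For example, FuriganaSketch.Annotate("ABCDEF,2-3:test2;1:test") returns "A B[test] CD[test2]EF" even though the edits are listed out of order.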
This should do it (and even handle ranged indices), based on the formatting of the input string you have-
using System;
using System.Collections.Generic;
public class stringParser
{
private struct IndexElements
{
public int start;
public int end;
public string value;
}
public static void Main()
{
//input string
string myString = "子で子にならぬ時鳥,0:こ;2:こ;7-8:ほととぎす";
int wordIndexSplit = myString.IndexOf(',');
string word = myString.Substring(0,wordIndexSplit);
string indices = myString.Substring(wordIndexSplit + 1);
string[] eachIndex = indices.Split(';');
Dictionary<int,IndexElements> index = new Dictionary<int,IndexElements>();
string[] elements;
IndexElements e;
int dash;
int n = 0;
int last = -1;
string results = "";
foreach (string s in eachIndex)
{
e = new IndexElements();
elements = s.Split(':');
if (elements[0].Contains("-"))
{
dash = elements[0].IndexOf('-');
e.start = int.Parse(elements[0].Substring(0,dash));
e.end = int.Parse(elements[0].Substring(dash + 1));
}
else
{
e.start = int.Parse(elements[0]);
e.end = e.start;
}
e.value = elements[1];
index.Add(n,e);
n++;
}
//this is the part that takes the "setup" from the parts above and forms the result string
//loop through each of the "indices" parsed above
for (int i = 0; i < index.Count; i++)
{
//if this is the first iteration through the loop, and the first "index" does not start
//at position 0, add the beginning characters before its start
if (last == -1 && index[i].start > 0)
{
results += word.Substring(0,index[i].start);
}
//if this is not the first iteration through the loop, and the previous iteration did
//not stop at the position directly before the start of the current iteration, add
//the intermediary characters
else if (last != -1 && last + 1 != index[i].start)
{
results += word.Substring(last + 1,index[i].start - (last + 1));
}
//add the space before the "index" match, the actual match, and then the formatted "index"
results += " " + word.Substring(index[i].start,(index[i].end - index[i].start) + 1)
+ "[" + index[i].value + "]";
//remember the position of the ending for the next iteration
last = index[i].end;
}
//if the last "index" did not stop at the end of the input string, add the remaining characters
if (index[index.Keys.Count - 1].end + 1 < word.Length)
{
results += word.Substring(index[index.Keys.Count-1].end + 1);
}
//trimming spaces that may be left behind
results = results.Trim();
Console.WriteLine("INPUT - " + myString);
Console.WriteLine("OUTPUT - " + results);
Console.Read();
}
}
input - 子で子にならぬ時鳥,0:こ;2:こ;7-8:ほととぎす
output - 子[こ]で 子[こ]にならぬ 時鳥[ほととぎす]
Note that this should also work with characters from the English alphabet if you wanted to use English instead-
input - iliketocodeverymuch,2:A;4-6:B;9-12:CDEFG
output - il i[A]k eto[B]co deve[CDEFG]rymuch

How to keep only numbers and some special characters in a string?

I have a string containing regular characters, special characters and numbers. I'm trying to remove the regular characters, just keeping the numbers and the special characters. I use a loop to check if a character is a special character or a number. Then, I replace it with an empty string. However, this doesn't seem to work because I get an error "can't apply != to string or char". My code is below. If possible, please give me some ideas to fix this. Thanks.
public string convert_string_to_no(string val)
{
string str_val = "";
int val_len = val.Length;
for (int i = 0; i < val_len; i++)
{
char myChar = Convert.ToChar(val.Substring(i, 1));
if ((char.IsDigit(myChar) == false) && (myChar != "-"))
{
str_val = str_val.replace(str_val.substring(i,1),"");
}
}
return str_val;
}
You can use regular expressions to do that. It's faster and cleaner than using a loop.
String test ="Q1W2-hjkxas1-EE3R4-5T";
Regex rgx = new Regex("[^0-9-]");
Console.WriteLine(rgx.Replace(test, ""));
check the working code here
It seems you need to change "-" to '-' (a char literal instead of a string). It is also better to build up the result string rather than replacing characters.
public string convert_string_to_no(string val)
{
string str_val = "";
int val_len = val.Length;
for (int i = 0; i < val_len; i++)
{
char myChar = Convert.ToChar(val.Substring(i, 1));
if (char.IsDigit(myChar) || myChar == '-')
{
str_val += myChar;
}
}
return str_val;
}
I prefer LINQ:
public static class StringExtensions
{
public static string ToLimitedString(this string instance,
string validCharacters)
{
// null reference checking...
var result = new string(instance
.Where(c => validCharacters.Contains(c))
.ToArray());
return result;
}
}
usage:
var test ="Q1W2-hjkxas1-EE3R4-5T";
var limited = test.ToLimitedString("01234567890-");
Console.WriteLine(limited);
result:
12-1-34-5
DotNetFiddle Example

Reverse a String without using Reverse. It works, but why?

Ok, so a friend of mine asked me to help him out with a string reverse method that can be reused without using String.Reverse (it's a homework assignment for him). Now, I did, below is the code. It works. Splendidly actually. Obviously by looking at it you can see the larger the string the longer the time it takes to work. However, my question is WHY does it work? Programming is a lot of trial and error, and I was more pseudocoding than actual coding and it worked lol.
Can someone explain to me how exactly reverse = ch + reverse; is working? I don't understand what is making it go into reverse :/
class Program
{
static void Reverse(string x)
{
string text = x;
string reverse = string.Empty;
foreach (char ch in text)
{
reverse = ch + reverse;
// this shows the building of the new string.
// Console.WriteLine(reverse);
}
Console.WriteLine(reverse);
}
static void Main(string[] args)
{
string comingin;
Console.WriteLine("Write something");
comingin = Console.ReadLine();
Reverse(comingin);
// pause
Console.ReadLine();
}
}
If the string passed through is "hello", the loop will be doing this:
reverse = 'h' + string.Empty
reverse = 'e' + 'h'
reverse = 'l' + 'eh'
until it's equal to
olleh
If your string is My String, then:
Pass 1, reverse = 'M'
Pass 2, reverse = 'yM'
Pass 3, reverse = ' yM'
You're taking each char and saying "that character and tack on what I had before after it".
I think your question has been answered. My reply goes beyond the immediate question and more to the spirit of the exercise. I remember having this task many decades ago in college, when memory and mainframe (yikes!) processing time was at a premium. Our task was to reverse an array or string, which is an array of characters, without creating a 2nd array or string. The spirit of the exercise was to teach one to be mindful of available resources.
In .NET, a string is an immutable object, so I must use a 2nd string. I wrote up 3 more examples to demonstrate different techniques that may be faster than your method, but which shouldn't be used to replace the built-in .NET Reverse method. I'm partial to the last one.
// StringBuilder inserting at 0 index
public static string Reverse2(string inputString)
{
var result = new StringBuilder();
foreach (char ch in inputString)
{
result.Insert(0, ch);
}
return result.ToString();
}
// Process inputString backwards and append with StringBuilder
public static string Reverse3(string inputString)
{
var result = new StringBuilder();
for (int i = inputString.Length - 1; i >= 0; i--)
{
result.Append(inputString[i]);
}
return result.ToString();
}
// Convert string to array and swap pertinent items
public static string Reverse4(string inputString)
{
var chars = inputString.ToCharArray();
for (int i = 0; i < (chars.Length/2); i++)
{
var temp = chars[i];
chars[i] = chars[chars.Length - 1 - i];
chars[chars.Length - 1 - i] = temp;
}
return new string(chars);
}
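For comparison, a minimal sketch of the built-in route using Array.Reverse on a char array. The names ReverseSketch and ReverseWithArray are made up for illustration:

```csharp
using System;

// Hypothetical helper showing the built-in Array.Reverse approach.
public static class ReverseSketch
{
    public static string ReverseWithArray(string input)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));

        // Strings are immutable, so copy the characters out first.
        char[] chars = input.ToCharArray();
        Array.Reverse(chars);   // reverses the array in place
        return new string(chars);
    }
}
```

ReverseSketch.ReverseWithArray("hello") returns "olleh". Like the other examples here, it reverses UTF-16 code units, so surrogate pairs and combining characters will not survive intact.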
Imagine that your input string is "abc". You can then see that the letters are taken one by one and added to the start of the new string:
reverse = "", ch='a' ==> reverse (ch+reverse) = "a"
reverse= "a", ch='b' ==> reverse (ch+reverse) = b+a = "ba"
reverse= "ba", ch='c' ==> reverse (ch+reverse) = c+ba = "cba"
To test the suggestion by Romoku of using StringBuilder I have produced the following code.
public static void Reverse(string x)
{
string text = x;
string reverse = string.Empty;
foreach (char ch in text)
{
reverse = ch + reverse;
}
Console.WriteLine(reverse);
}
public static void ReverseFast(string x)
{
string text = x;
StringBuilder reverse = new StringBuilder();
for (int i = text.Length - 1; i >= 0; i--)
{
reverse.Append(text[i]);
}
Console.WriteLine(reverse);
}
public static void Main(string[] args)
{
int abcx = 100; // amount of abc's
string abc = "";
for (int i = 0; i < abcx; i++)
abc += "abcdefghijklmnopqrstuvwxyz";
var x = new System.Diagnostics.Stopwatch();
x.Start();
Reverse(abc);
x.Stop();
string ReverseMethod = "Reverse Method: " + x.ElapsedMilliseconds.ToString();
x.Restart();
ReverseFast(abc);
x.Stop();
Console.Clear();
Console.WriteLine("Method | Milliseconds");
Console.WriteLine(ReverseMethod);
Console.WriteLine("ReverseFast Method: " + x.ElapsedMilliseconds.ToString());
System.Console.Read();
}
On my computer these are the speeds I get for each number of alphabets.
100 ABC(s)
Reverse ~5-10ms
ReverseFast ~5-15ms
1000 ABC(s)
Reverse ~120ms
ReverseFast ~20ms
10000 ABC(s)
Reverse ~16,852ms!!!
ReverseFast ~262ms
These timings will vary greatly depending on the computer, but one thing is for certain: if you are processing more than 100k characters, you are insane for not using StringBuilder! On the other hand, if you are processing fewer than 2000 characters, the overhead of StringBuilder seems to cancel out its performance boost.

Parse an integer from a string with trailing garbage

I need to parse a decimal integer that appears at the start of a string.
There may be trailing garbage following the decimal number. This needs to be ignored (even if it contains other numbers.)
e.g.
"1" => 1
" 42 " => 42
" 3 -.X.-" => 3
" 2 3 4 5" => 2
Is there a built-in method in the .NET framework to do this?
int.TryParse() is not suitable. It allows trailing spaces but not other trailing characters.
It would be quite easy to implement this but I would prefer to use the standard method if it exists.
You can use Linq to do this, no Regular Expressions needed:
public static int GetLeadingInt(string input)
{
return Int32.Parse(new string(input.Trim().TakeWhile(c => char.IsDigit(c)).ToArray()));
}
This works for all your provided examples:
string[] tests = new string[] {
"1",
" 42 ",
" 3 -.X.-",
" 2 3 4 5"
};
foreach (string test in tests)
{
Console.WriteLine("Result: " + GetLeadingInt(test));
}
foreach (var m in Regex.Matches(" 3 - .x. 4", @"\d+"))
{
Console.WriteLine(m);
}
Updated per comments
Not sure why you don't like regular expressions, so I'll just post what I think is the shortest solution.
To get first int:
Match match = Regex.Match(" 3 - .x. - 4", @"\d+");
if (match.Success)
Console.WriteLine(int.Parse(match.Value));
There's no standard .NET method for doing this - although I wouldn't be surprised to find that VB had something in the Microsoft.VisualBasic assembly (which is shipped with .NET, so it's not an issue to use it even from C#).
Will the result always be non-negative (which would make things easier)?
To be honest, regular expressions are the easiest option here, but...
public static string RemoveCruftFromNumber(string text)
{
int end = 0;
// First move past leading spaces
while (end < text.Length && text[end] == ' ')
{
end++;
}
// Now move past digits
while (end < text.Length && char.IsDigit(text[end]))
{
end++;
}
return text.Substring(0, end);
}
Then you just need to call int.TryParse on the result of RemoveCruftFromNumber (don't forget that the integer may be too big to store in an int).
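Putting those two steps together, here is a minimal sketch that also accepts an optional sign and returns false on overflow. The names LeadingIntSketch and TryParseLeadingInt are hypothetical, and restricting digits to ASCII '0'-'9' is my assumption rather than something the question specifies:

```csharp
using System;
using System.Globalization;

// Hypothetical helper: scan past whitespace, an optional sign and digits,
// then let int.TryParse handle range checking.
public static class LeadingIntSketch
{
    public static bool TryParseLeadingInt(string text, out int value)
    {
        value = 0;
        if (text == null) return false;

        int start = 0;
        while (start < text.Length && char.IsWhiteSpace(text[start])) start++;

        int end = start;
        if (end < text.Length && (text[end] == '-' || text[end] == '+')) end++;

        int digitsStart = end;
        while (end < text.Length && text[end] >= '0' && text[end] <= '9') end++;
        if (end == digitsStart) return false; // no digits at all

        // TryParse returns false if the number does not fit in an int.
        return int.TryParse(text.Substring(start, end - start),
            NumberStyles.AllowLeadingSign, CultureInfo.InvariantCulture, out value);
    }
}
```

With this, " 3 -.X.-" yields 3, " 2 3 4 5" yields 2, and a string such as "99999999999xyz" makes the method return false instead of throwing.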
I like @Donut's approach.
I'd like to add though, that char.IsDigit and char.IsNumber also allow for some unicode characters which are digits in other languages and scripts (see here).
If you only want to check for the digits 0 to 9 you could use "0123456789".Contains(c).
Three example implementations:
To remove trailing non-digit characters:
var digits = new string(input.Trim().TakeWhile(c =>
("0123456789").Contains(c)
).ToArray());
To remove leading non-digit characters:
var digits = new string(input.Trim().SkipWhile(c =>
!("0123456789").Contains(c)
).ToArray());
To remove all non-digit characters:
var digits = new string(input.Trim().Where(c =>
("0123456789").Contains(c)
).ToArray());
And of course: int.Parse(digits) or int.TryParse(digits, out output)
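A small sketch of the difference described above, using a range comparison as an alternative to "0123456789".Contains(c). The names AsciiDigitSketch, IsAsciiDigit and LeadingAsciiDigits are hypothetical:

```csharp
using System;
using System.Linq;

// Hypothetical helper contrasting char.IsDigit with an ASCII-only check.
public static class AsciiDigitSketch
{
    // True only for '0' through '9'.
    public static bool IsAsciiDigit(char c) => c >= '0' && c <= '9';

    // Leading run of ASCII digits after skipping leading whitespace.
    public static string LeadingAsciiDigits(string input) =>
        new string(input.TrimStart().TakeWhile(IsAsciiDigit).ToArray());
}
```

For example, char.IsDigit('\u0663') (ARABIC-INDIC DIGIT THREE) is true, while AsciiDigitSketch.IsAsciiDigit('\u0663') is false. LeadingAsciiDigits(" 42 ") returns "42".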
This doesn't really answer your question (about a built-in C# method), but you could try chopping off characters at the end of the input string one by one until int.TryParse() accepts it as a valid number:
for (int p = input.Length; p > 0; p--)
{
int num;
if (int.TryParse(input.Substring(0, p), out num))
return num;
}
throw new Exception("Malformed integer: " + input);
Of course, this will be slow if input is very long.
ADDENDUM (March 2016)
This could be made faster by chopping off all non-digit/non-space characters on the right before attempting each parse:
for (int p = input.Length; p > 0; p--)
{
char ch;
do
{
ch = input[--p];
} while ((ch < '0' || ch > '9') && ch != ' ' && p > 0);
p++;
int num;
if (int.TryParse(input.Substring(0, p), out num))
return num;
}
throw new Exception("Malformed integer: " + input);
string s = " 3 -.X.-".Trim();
string collectedNumber = string.Empty;
int i;
for (int x = 0; x < s.Length; x++)
{
// int.TryParse needs a string, so convert the single char first
if (int.TryParse(s[x].ToString(), out i))
collectedNumber += s[x];
else
break; // not a number - that's it - get out.
}
if (int.TryParse(collectedNumber, out i))
Console.WriteLine(i);
else
Console.WriteLine("no number found");
This is how I would have done it in Java:
int parseLeadingInt(String input)
{
NumberFormat fmt = NumberFormat.getIntegerInstance();
fmt.setGroupingUsed(false);
return fmt.parse(input, new ParsePosition(0)).intValue();
}
I was hoping something similar would be possible in .NET.
This is the regex-based solution I am currently using:
int? parseLeadingInt(string input)
{
int result = 0;
Match match = Regex.Match(input, "^[ \t]*\\d+");
if (match.Success && int.TryParse(match.Value, out result))
{
return result;
}
return null;
}
Might as well add mine too.
string temp = " 3 .x£";
string numbersOnly = String.Empty;
int tempInt;
for (int i = 0; i < temp.Length; i++)
{
if (Int32.TryParse(Convert.ToString(temp[i]), out tempInt))
{
numbersOnly += temp[i];
}
}
Int32.TryParse(numbersOnly, out tempInt);
MessageBox.Show(tempInt.ToString());
The message box is just for testing purposes, just delete it once you verify the method is working.
I'm not sure why you would avoid Regex in this situation.
Here's a little hackery that you can adjust to your needs.
" 3 -.X.-".ToCharArray().FindInteger().ToList().ForEach(Console.WriteLine);
public static class CharArrayExtensions
{
public static IEnumerable<char> FindInteger(this IEnumerable<char> array)
{
foreach (var c in array)
{
if(char.IsNumber(c))
yield return c;
}
}
}
EDIT:
That's true about the incorrect result (and the maintenance dev :) ).
Here's a revision:
public static int FindFirstInteger(this IEnumerable<char> array)
{
bool foundInteger = false;
var ints = new List<char>();
foreach (var c in array)
{
if(char.IsNumber(c))
{
foundInteger = true;
ints.Add(c);
}
else
{
if(foundInteger)
{
break;
}
}
}
string s = string.Empty;
ints.ForEach(i => s += i.ToString());
return int.Parse(s);
}
private string GetInt(string s)
{
int i = 0;
s = s.Trim();
while (i<s.Length && char.IsDigit(s[i])) i++;
return s.Substring(0, i);
}
Similar to Donut's above but with a TryParse:
private static bool TryGetLeadingInt(string input, out int output)
{
var trimmedString = new string(input.Trim().TakeWhile(c => char.IsDigit(c)).ToArray());
var canParse = int.TryParse( trimmedString, out output);
return canParse;
}
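The approaches above can be combined into one regex-free helper. The following is only a sketch under a few assumptions: the names `LeadingIntSketch` and `ParseLeadingInt` are made up for illustration, an optional leading sign is accepted (the regex version in the question does not accept one), `TrimStart` skips any leading whitespace rather than just spaces and tabs, and values that overflow `int` yield `null`.

```csharp
using System;
using System.Globalization;

public static class LeadingIntSketch
{
    // Hypothetical helper: returns the integer at the start of the input
    // (after optional whitespace and an optional sign), or null if there is
    // no leading integer or it does not fit in an int.
    public static int? ParseLeadingInt(string input)
    {
        if (input == null) return null;

        string s = input.TrimStart();

        // Assumption: a single leading '+' or '-' is allowed.
        int start = (s.Length > 0 && (s[0] == '-' || s[0] == '+')) ? 1 : 0;

        // Advance past the run of ASCII digits.
        int end = start;
        while (end < s.Length && s[end] >= '0' && s[end] <= '9') end++;

        // No digits found at all.
        if (end == start) return null;

        int value;
        bool ok = int.TryParse(s.Substring(0, end), NumberStyles.AllowLeadingSign,
                               CultureInfo.InvariantCulture, out value);
        return ok ? value : (int?)null;
    }
}
```

Returning `int?` keeps the "no number found" case distinct from a parsed `0`, which the `MessageBox` and `FindInteger` variants above cannot express.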

Converting string to title case

I have a string which contains words in a mixture of upper and lower case characters.
For example: string myData = "a Simple string";
I need to convert the first character of each word (separated by spaces) into upper case. So I want the result as: string myData ="A Simple String";
Is there any easy way to do this? I don't want to split the string and do the conversion (that will be my last resort). Also, it is guaranteed that the strings are in English.
MSDN : TextInfo.ToTitleCase
Make sure that you include: using System.Globalization
string title = "war and peace";
TextInfo textInfo = new CultureInfo("en-US", false).TextInfo;
title = textInfo.ToTitleCase(title);
Console.WriteLine(title) ; //War And Peace
//When text is ALL UPPERCASE...
title = "WAR AND PEACE" ;
title = textInfo.ToTitleCase(title);
Console.WriteLine(title) ; //WAR AND PEACE
//You need to call ToLower to make it work
title = textInfo.ToTitleCase(title.ToLower());
Console.WriteLine(title) ; //War And Peace
Try this:
string myText = "a Simple string";
string asTitleCase =
System.Threading.Thread.CurrentThread.CurrentCulture.TextInfo.
ToTitleCase(myText.ToLower());
As has already been pointed out, using TextInfo.ToTitleCase might not give you the exact results you want. If you need more control over the output, you could do something like this:
IEnumerable<char> CharsToTitleCase(string s)
{
bool newWord = true;
foreach(char c in s)
{
if(newWord) { yield return Char.ToUpper(c); newWord = false; }
else yield return Char.ToLower(c);
if(c==' ') newWord = true;
}
}
And then use it like so:
var asTitleCase = new string( CharsToTitleCase(myText).ToArray() );
Yet another variation. Based on several tips here I've reduced it to this extension method, which works great for my purposes:
public static string ToTitleCase(this string s) =>
CultureInfo.InvariantCulture.TextInfo.ToTitleCase(s.ToLower());
Personally, I tried the TextInfo.ToTitleCase method, but I don't understand why it doesn't work when all chars are upper-cased.
Though I like the util function provided by Winston Smith, let me provide the function I'm currently using:
public static String TitleCaseString(String s)
{
if (s == null) return s;
String[] words = s.Split(' ');
for (int i = 0; i < words.Length; i++)
{
if (words[i].Length == 0) continue;
Char firstChar = Char.ToUpper(words[i][0]);
String rest = "";
if (words[i].Length > 1)
{
rest = words[i].Substring(1).ToLower();
}
words[i] = firstChar + rest;
}
return String.Join(" ", words);
}
Playing with some test strings:
String ts1 = "Converting string to title case in C#";
String ts2 = "C";
String ts3 = "";
String ts4 = " ";
String ts5 = null;
Console.Out.WriteLine(String.Format("|{0}|", TitleCaseString(ts1)));
Console.Out.WriteLine(String.Format("|{0}|", TitleCaseString(ts2)));
Console.Out.WriteLine(String.Format("|{0}|", TitleCaseString(ts3)));
Console.Out.WriteLine(String.Format("|{0}|", TitleCaseString(ts4)));
Console.Out.WriteLine(String.Format("|{0}|", TitleCaseString(ts5)));
Giving output:
|Converting String To Title Case In C#|
|C|
||
| |
||
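On the question of why TextInfo.ToTitleCase leaves all-uppercase input alone: as far as the documentation describes it, a word that is entirely uppercase is treated as an acronym and is not re-cased. The sketch below contrasts the two behaviours; the class and method names are made up for illustration, and the invariant culture is only an assumed default.

```csharp
using System;
using System.Globalization;

public static class TitleCaseAcronymSketch
{
    // Hypothetical helper: title-cases the text but leaves words that are
    // entirely uppercase unchanged, which is what ToTitleCase does by itself.
    public static string KeepAcronyms(string text, CultureInfo culture = null)
    {
        if (string.IsNullOrEmpty(text)) return text;
        TextInfo ti = (culture ?? CultureInfo.InvariantCulture).TextInfo;
        return ti.ToTitleCase(text);
    }

    // Hypothetical helper: lowercases first, so every word is normalized,
    // including words that were written as acronyms.
    public static string NormalizeAll(string text, CultureInfo culture = null)
    {
        if (string.IsNullOrEmpty(text)) return text;
        TextInfo ti = (culture ?? CultureInfo.InvariantCulture).TextInfo;
        return ti.ToTitleCase(ti.ToLower(text));
    }
}
```

For "the NASA report", KeepAcronyms gives "The NASA Report" while NormalizeAll gives "The Nasa Report", so which one is right depends on whether the input is expected to contain real acronyms or shouted text.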
Recently I found a better solution.
If your text contains every letter in uppercase, then TextInfo will not convert it to the proper case. We can fix that by using the lowercase function inside like this:
public static string ConvertTo_ProperCase(string text)
{
TextInfo myTI = new CultureInfo("en-US", false).TextInfo;
return myTI.ToTitleCase(text.ToLower());
}
Now this will convert everything that comes in to proper case.
public static string PropCase(string strText)
{
return new CultureInfo("en").TextInfo.ToTitleCase(strText.ToLower());
}
Use ToLower() first, and then CultureInfo.CurrentCulture.TextInfo.ToTitleCase on the result to get the correct output.
//---------------------------------------------------------------
// Get title case of a string (every word with leading upper case,
// the rest is lower case)
// i.e: ABCD EFG -> Abcd Efg,
// john doe -> John Doe,
// miXEd CaSING - > Mixed Casing
//---------------------------------------------------------------
public static string ToTitleCase(string str)
{
return CultureInfo.CurrentCulture.TextInfo.ToTitleCase(str.ToLower());
}
If anyone is interested in a solution for the Compact Framework:
return String.Join(" ", thestring.Split(' ').Select(i => i.Substring(0, 1).ToUpper() + i.Substring(1).ToLower()).ToArray());
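One caveat with the one-liner above: `Substring(0, 1)` throws on an empty entry, which `Split(' ')` produces whenever the input contains consecutive spaces or is empty. A guarded variant might look like the sketch below. The names `SplitTitleCaseSketch` and `TitleCaseWords` are made up for illustration, the invariant-culture casing calls are an assumption chosen for predictable results, and it has not been checked against the Compact Framework.

```csharp
using System;
using System.Linq;

public static class SplitTitleCaseSketch
{
    // Hypothetical helper: upper-cases the first character of each
    // space-separated word and lower-cases the rest, leaving empty entries
    // (from repeated spaces) untouched so the spacing is preserved.
    public static string TitleCaseWords(string text)
    {
        if (string.IsNullOrEmpty(text)) return text;

        return string.Join(" ", text.Split(' ')
            .Select(w => w.Length == 0
                ? w
                : char.ToUpperInvariant(w[0]) + w.Substring(1).ToLowerInvariant())
            .ToArray());
    }
}
```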
Here's the solution for that problem...
CultureInfo cultureInfo = Thread.CurrentThread.CurrentCulture;
TextInfo textInfo = cultureInfo.TextInfo;
txt = textInfo.ToTitleCase(txt); // txt is the existing string to convert
I needed a way to deal with all caps words, and I liked Ricky AH's solution, but I took it a step further to implement it as an extension method. This avoids the step of having to create your array of chars then call ToArray on it explicitly every time - so you can just call it on the string, like so:
usage:
string newString = oldString.ToProper();
code:
public static class StringExtensions
{
public static string ToProper(this string s)
{
return new string(s.CharsToTitleCase().ToArray());
}
public static IEnumerable<char> CharsToTitleCase(this string s)
{
bool newWord = true;
foreach (char c in s)
{
if (newWord) { yield return Char.ToUpper(c); newWord = false; }
else yield return Char.ToLower(c);
if (c == ' ') newWord = true;
}
}
}
I used the above references and a complete solution is:
Add the System.Globalization namespace:
string str = "INFOA2Z means all information";
// Need result like "Infoa2z Means All Information"
// The string must also be converted to lowercase first; otherwise all-uppercase words are left unchanged.
TextInfo ProperCase = new CultureInfo("en-US", false).TextInfo;
str = ProperCase.ToTitleCase(str.ToLower());
Change a string to proper case in ASP.NET using C#
It's easier to understand by trying the code yourself...
Read more
http://www.stupidcodes.com/2014/04/convert-string-to-uppercase-proper-case.html
1) Convert a String to Uppercase
string lower = "converted from lowercase";
Console.WriteLine(lower.ToUpper());
2) Convert a String to Lowercase
string upper = "CONVERTED FROM UPPERCASE";
Console.WriteLine(upper.ToLower());
3) Convert a String to TitleCase
CultureInfo cultureInfo = Thread.CurrentThread.CurrentCulture;
TextInfo textInfo = cultureInfo.TextInfo;
string txt = textInfo.ToTitleCase(TextBox1.Text);
String TitleCaseString(String s)
{
if (s == null || s.Length == 0) return s;
string[] splits = s.Split(' ');
for (int i = 0; i < splits.Length; i++)
{
switch (splits[i].Length)
{
case 0:
break; // Skip empty entries produced by consecutive spaces
default:
splits[i] = Char.ToUpper(splits[i][0]) + splits[i].Substring(1);
break;
}
}
return String.Join(" ", splits);
}
Without using TextInfo:
public static string TitleCase(this string text, char seperator = ' ') =>
string.Join(seperator, text.Split(seperator).Select(word => new string(
word.Select((letter, i) => i == 0 ? char.ToUpper(letter) : char.ToLower(letter)).ToArray())));
It loops through every letter in each word, converting it to uppercase if it's the first letter otherwise converting it to lowercase.
You can convert a string directly to proper case with this simple method, which first checks for null or empty values to avoid errors:
// Text to proper (Title Case):
public string TextToProper(string text)
{
string ProperText = string.Empty;
if (!string.IsNullOrEmpty(text))
{
ProperText = CultureInfo.CurrentCulture.TextInfo.ToTitleCase(text);
}
else
{
ProperText = string.Empty;
}
return ProperText;
}
Alternative with reference to Microsoft.VisualBasic (handles uppercase strings too):
string properCase = Strings.StrConv(str, VbStrConv.ProperCase);
Here is an implementation, character by character. It should work with "(One Two Three)":
public static string ToInitcap(this string str)
{
if (string.IsNullOrEmpty(str))
return str;
char[] charArray = new char[str.Length];
bool newWord = true;
for (int i = 0; i < str.Length; ++i)
{
Char currentChar = str[i];
if (Char.IsLetter(currentChar))
{
if (newWord)
{
newWord = false;
currentChar = Char.ToUpper(currentChar);
}
else
{
currentChar = Char.ToLower(currentChar);
}
}
else if (Char.IsWhiteSpace(currentChar))
{
newWord = true;
}
charArray[i] = currentChar;
}
return new string(charArray);
}
Try this:
using System.Globalization;
using System.Threading;
public void ToTitleCase(TextBox TextBoxName)
{
int TextLength = TextBoxName.Text.Length;
if (TextLength == 1)
{
CultureInfo cultureInfo = Thread.CurrentThread.CurrentCulture;
TextInfo textInfo = cultureInfo.TextInfo;
TextBoxName.Text = textInfo.ToTitleCase(TextBoxName.Text);
TextBoxName.SelectionStart = 1;
}
else if (TextLength > 1 && TextBoxName.SelectionStart < TextLength)
{
int x = TextBoxName.SelectionStart;
CultureInfo cultureInfo = Thread.CurrentThread.CurrentCulture;
TextInfo textInfo = cultureInfo.TextInfo;
TextBoxName.Text = textInfo.ToTitleCase(TextBoxName.Text);
TextBoxName.SelectionStart = x;
}
else if (TextLength > 1 && TextBoxName.SelectionStart >= TextLength)
{
CultureInfo cultureInfo = Thread.CurrentThread.CurrentCulture;
TextInfo textInfo = cultureInfo.TextInfo;
TextBoxName.Text = textInfo.ToTitleCase(TextBoxName.Text);
TextBoxName.SelectionStart = TextLength;
}
}
Call this method in the TextChanged event of the TextBox.
This is what I use. It works for most cases unless the user decides to override it by pressing Shift or Caps Lock, much like on Android and iOS keyboards.
Private Class ProperCaseHandler
Private Const wordbreak As String = " ,.1234567890;/\-()#$%^&*€!~+=#"
Private txtProperCase As TextBox
Sub New(txt As TextBox)
txtProperCase = txt
AddHandler txt.KeyPress, AddressOf txtTextKeyDownProperCase
End Sub
Private Sub txtTextKeyDownProperCase(ByVal sender As System.Object, ByVal e As Windows.Forms.KeyPressEventArgs)
Try
If Control.IsKeyLocked(Keys.CapsLock) Or Control.ModifierKeys = Keys.Shift Then
Exit Sub
Else
If txtProperCase.TextLength = 0 Then
e.KeyChar = e.KeyChar.ToString.ToUpper()
e.Handled = False
Else
Dim lastChar As String = txtProperCase.Text.Substring(txtProperCase.SelectionStart - 1, 1)
If wordbreak.Contains(lastChar) = True Then
e.KeyChar = e.KeyChar.ToString.ToUpper()
e.Handled = False
End If
End If
End If
Catch ex As Exception
Exit Sub
End Try
End Sub
End Class
As an extension method:
/// <summary>
/// Returns a copy of this string converted to `Title Case`.
/// </summary>
/// <param name="value">The string to convert.</param>
/// <returns>The `Title Case` equivalent of the current string.</returns>
public static string ToTitleCase(this string value)
{
string result = string.Empty;
for (int i = 0; i < value.Length; i++)
{
char p = i == 0 ? char.MinValue : value[i - 1];
char c = value[i];
result += char.IsLetter(c) && ((p is ' ') || p is char.MinValue) ? $"{char.ToUpper(c)}" : $"{char.ToLower(c)}";
}
return result;
}
Usage:
"kebab is DELICIOU's ;d c...".ToTitleCase();
Result:
Kebab Is Deliciou's ;d C...
It works fine even with camel case: 'someText in YourPage'
public static class StringExtensions
{
/// <summary>
/// Title case example: 'Some Text In Your Page'.
/// </summary>
/// <param name="text">Support camel and title cases combinations: 'someText in YourPage'</param>
public static string ToTitleCase(this string text)
{
if (string.IsNullOrEmpty(text))
{
return text;
}
var result = string.Empty;
var splitedBySpace = text.Split(new[]{ ' ' }, StringSplitOptions.RemoveEmptyEntries);
foreach (var sequence in splitedBySpace)
{
// let's check the letters. Sequence can contain even 2 words in camel case
for (var i = 0; i < sequence.Length; i++)
{
var letter = sequence[i].ToString();
// an upper-case letter, or the first letter of a space-separated sequence, starts a new word
if (letter == letter.ToUpper() || i == 0)
{
// add a space between words
result += ' ';
}
result += i == 0 ? letter.ToUpper() : letter;
}
}
return result.Trim();
}
}
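The same camel-case handling can also be sketched with regular expressions, in the spirit of the Pascal-case answers earlier. This is an alternative under stated assumptions, not the method above: the names `CamelCaseTitleSketch` and `SplitAndTitle` are made up for illustration, only ASCII letters are considered, and unlike the loop above it does not insert spaces before digits or punctuation.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CamelCaseTitleSketch
{
    // Hypothetical helper: splits camel-case boundaries and upper-cases the
    // first letter of every resulting word.
    public static string SplitAndTitle(string text)
    {
        if (string.IsNullOrEmpty(text)) return text;

        // Insert a space at every lower-to-upper boundary: "someText" -> "some Text".
        string spaced = Regex.Replace(text, "(?<=[a-z])(?=[A-Z])", " ");

        // Upper-case a lowercase letter at the start of the string or after
        // whitespace; all other characters are left as they are.
        return Regex.Replace(spaced, @"(^|\s)([a-z])",
            m => m.Groups[1].Value + char.ToUpperInvariant(m.Groups[2].Value[0]));
    }
}
```

With the input from the answer, "someText in YourPage", this produces "Some Text In Your Page".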
For those who want to do it automatically on keypress, I did it with the following VB.NET code on a custom TextBox control. You can obviously also do it with a normal TextBox, but I like being able to attach recurring code to specific controls via custom controls, since it suits the concept of OOP.
Imports System.Windows.Forms
Imports System.Drawing
Imports System.ComponentModel
Public Class MyTextBox
Inherits System.Windows.Forms.TextBox
Private LastKeyIsNotAlpha As Boolean = True
Protected Overrides Sub OnKeyPress(e As KeyPressEventArgs)
If _ProperCasing Then
Dim c As Char = e.KeyChar
If Char.IsLetter(c) Then
If LastKeyIsNotAlpha Then
e.KeyChar = Char.ToUpper(c)
LastKeyIsNotAlpha = False
End If
Else
LastKeyIsNotAlpha = True
End If
End If
MyBase.OnKeyPress(e)
End Sub
Private _ProperCasing As Boolean = False
<Category("Behavior"), Description("When Enabled ensures for automatic proper casing of string"), Browsable(True)>
Public Property ProperCasing As Boolean
Get
Return _ProperCasing
End Get
Set(value As Boolean)
_ProperCasing = value
End Set
End Property
End Class
A way to do it in C:
#include <ctype.h>
void proper(char string[])
{
int i;
for(i=0; string[i] != '\0'; i++) // Walk to the terminating null instead of a fixed length
{
if(i == 0 || string[i-1] == ' ') // First character, or the character before is a space
{
string[i] = toupper((unsigned char)string[i]); // Converts word-initial characters to upper case
}
else
{
string[i] = tolower((unsigned char)string[i]); // Converts all other characters to lower case
}
}
}
