Isolating contents between multiple matches with Regex - c#

I have strings that look like the following:
"1y 250 2y 32% 3y otherjibberish".
My ultimate goal is to split it into the following:
"1y 250"
"2y 32%"
"3y otherjibberish"
The main 'separator' between these splits are the "\d+y" patterns. Using Regex (C# 4.0), I can use the Matches function to match a number followed by a 'y', but I don't know how to get everything that follows that match but precedes the next match.
Is there a way to do that?
Hopefully that makes sense.... Much appreciated
- kcross

You can use a "MatchCollection" to split the string according to the occurrences.
The example below does almost what you want: the trailing space at the end of each piece is not removed.
Code:
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;
namespace Q11438740ConApp
{
class Program
{
static void Main(string[] args)
{
string sourceStr = "1y 250 2y 32% 3y otherjibberish";
Regex rx = new Regex(@"\d+y");
string[] splitedArray = SplitByRegex(sourceStr, rx);
for (int i = 0; i < splitedArray.Length; i++)
{
Console.WriteLine(String.Format("'{0}'", splitedArray[i]));
}
Console.ReadLine();
}
public static string[] SplitByRegex(string input, Regex rx)
{
MatchCollection matches = rx.Matches(input);
String[] outArray = new string[matches.Count];
for (int i = 0; i < matches.Count; i++)
{
int length = 0;
if (i == matches.Count - 1)
{
length = input.Length - (matches[i].Index + matches[i].Length);
}
else
{
length = matches[i + 1].Index - (matches[i].Index + matches[i].Length);
}
outArray[i] = matches[i].Value + input.Substring(matches[i].Index + matches[i].Length, length);
}
return outArray;
}
}
}
Output:
'1y 250 '
'2y 32% '
'3y otherjibberish'
"Solution" 7z file: Q11438740ConApp.7z

This was actually quite easy... Just used the Regex.Split() method.
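For anyone wondering what the Regex.Split() call might look like, here is a minimal sketch. The pattern is my assumption, not necessarily what the asker used: it splits on whitespace that is directly followed by a "digits + y" marker, so the marker itself stays in the next piece. The class and method names (SplitDemo, SplitOnMarkers) are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class SplitDemo
{
    // Splits on whitespace that is immediately followed by a "<digits>y" token.
    // The lookahead (?=...) does not consume the token, so it is kept in the output.
    public static string[] SplitOnMarkers(string input)
    {
        return Regex.Split(input, @"\s+(?=\d+y\b)");
    }

    static void Main()
    {
        foreach (string part in SplitOnMarkers("1y 250 2y 32% 3y otherjibberish"))
        {
            Console.WriteLine("'" + part + "'");
        }
        // Prints:
        // '1y 250'
        // '2y 32%'
        // '3y otherjibberish'
    }
}
```

Because the whitespace is part of the separator, the pieces come out without the trailing space that the MatchCollection version leaves behind.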

Related

How to swap first and last letters in each word?

I have a practice session on C#, and I want to know how can I swap first and last characters in each word of a sentence and lower case them. I have created a string array that represents each word, and in an inner for loop, I am iterating each character in each word. There is my code.
using System;
namespace ConsoleApp11
{
class Program
{
static void Main(string[] args)
{
string text = "Hello world";
string[] words = text.Split(" ");
string output = "";
for(int i = 0; i < words.Length; i++)
{
for(int j = 0; j < words[i].Length; j++)
{
if (char.IsUpper(words[i][j]))
{
output += char.ToLower(words[i][j]);
}
else
{
output += words[i][j];
}
}
output += " ";
}
Console.WriteLine(output);
}
}
}
Because it is Sunday :-), here is the complete code (my explanations were too difficult):
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string output = "ABCD EFGH IJKL";
string[] outputs = output.Split(' ');
char[] chars;
string first,last;
string flower,llower;
string result = string.Empty;
for (int i = 0; i < outputs.Length; i++)
{
chars = outputs[i].ToCharArray();
first = new string(chars[0], 1);
last = new string(chars[chars.Length - 1], 1);
flower = first.ToLower();
llower = last.ToLower();
chars[chars.Length - 1] = flower.ToCharArray()[0];
chars[0] = llower.ToCharArray()[0];
result += new string(chars);
result += " ";
}
Console.WriteLine(output);
Console.WriteLine(result);
Console.ReadLine();
}
}
}
Result: dBCa hFGe lJKi
How do I lowercase and reverse the first and last characters in each word.
Solution using a regular expression
We could use the Split() method of String or Regex, to split on non-word characters, but then we wouldn't be able to output the correct characters between each word, unless we only split on a single character.
using System;
using System.Text.RegularExpressions;
namespace CS_Regex {
class Program {
static void Main(string[] args) {
// Match words using a regular expression
string match_word = @"(\w+)";
string match_non_word = @"([^\w]*)";
string pattern = match_non_word + match_word + match_non_word;
Regex rx = new Regex(pattern, RegexOptions.Compiled);
// Do the match on example data
string data = "Hello world";
MatchCollection matches = rx.Matches(data);
// Output the matches
foreach (Match match in matches) {
// Get the text before and after the word
string non_word_before = match.Groups[1].ToString();
string non_word_after = match.Groups[3].ToString();
// Get the matched word
string word = match.Groups[2].ToString();
// Lower case the first and last characters and swap them
string firstchar = (word.Length > 0) ? $"{char.ToLower(word[0])}" : "";
string lastchar = (word.Length > 1) ? $"{char.ToLower(word[word.Length - 1])}" : "";
string middle = (word.Length > 2) ? word.Substring(1, word.Length - 2) : "";
string newword = lastchar + middle + firstchar;
// Output the new word
Console.Write(non_word_before + newword + non_word_after);
}
} // Main
} // class
} // namespace
Output from the proposed solution
oellh dorlw
Links
Regular Expression Language - Quick Reference
Regex Class
Regex.Match Method
One idea is to convert each string into an array of chars. Then, for each array of characters, get the first and last character and convert each into a one-character string in order to use the ToLower function. Finally, write the lower-cased characters back with the indexes inverted (first into the last index, last into index 0) in order to swap them.
Example for the first letter (NO swap, just for explanation):
string output = "ABCD";
char[] chars = output.ToCharArray();
string firt = new string(chars[0],1);
string lower = firt.ToLower();
string result = output.Replace(chars[0].ToString(), lower.ToString());
To move the first letter to the last position:
Here is complete code for the first letter; the result is "ABCa" (the lower-cased first letter overwrites the last character). For the last letter, the idea is the same.
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string output = "ABCD";
char[] chars = output.ToCharArray();
string firt = new string(chars[0],1);
string lower = firt.ToLower();
chars[chars.Length-1] = lower.ToCharArray()[0];
string result = new string(chars);
Console.WriteLine(output);
Console.WriteLine(result);
Console.ReadLine();
}
}
}

C# Reading from file, removing numbers and special characters and add to hashtable

The task is to read from a file, retrieve words (no numbers or special characters) and add them to a hash table. If the same word (key) already exists in the hash table, increment the frequency of the word (value) by 1.
So far, in the code below, all text is retrieved from the file, including words with numbers and special characters, into a string array "words".
I would like to update the values based on a regex to keep only words made of letters, in lowercase.
I have tried the regex in all different ways but it does not work. The Split() method only allows individual characters to be removed. (Eventually, this code will need to be applied to 200 files with an unknown number of special characters and numbers.)
Is there a clean way to read the file, save only words and omit special characters and numbers?
This is what I have so far:
using System;
using System.Collections.Generic;
using System.Collections;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Text.RegularExpressions;
namespace ConsoleApp1
{
class Program
{
static void Main(string[] args)
{
String myLine;
String[] words;
Hashtable hashT = new Hashtable();
TextReader tr = new StreamReader("C:\\file including numbers and spacial charecters.txt");
while ((myLine = tr.ReadLine()) != null)
{
words = myLine.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
string pattern = @"^[a - z] +$";
Regex term = new Regex(pattern);
for (int i = 0; i < words.Length; i++)
{
Console.WriteLine(words[i]);
words[i] = Regex.Replace(words[i], term, "");
if (hashT.ContainsKey(words[0]))
{
hashT[words[i]] = double.Parse(hashT[words[i]].ToString()) + 1;
}
else
{
hashT.Add(words[i], 1.00);
}
}
foreach (String word in hashT.Keys)
{
Console.WriteLine(word + " " + hashT[words]);
}
Console.ReadKey();
}
}
}
}
try this
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
namespace ConsoleApp
{
class Program
{
static void Main(string[] args)
{
//file content read from your file
string fileContent = #"Hello Wor1d
fun f1nd found
";
//split file content to lines
string[] line = fileContent.Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);
Regex r = new Regex("[a-zA-Z]+");
List<string> matchesList = new List<string>();
for (int i = 0; i < line.Length; i++)
{
//split line to string, like Hello Wor1d => string[]{ Hello, Wor1d }
string[] lineData = line[i].Split(' ');
for (int j = 0; j < lineData.Length; j++)
{
string str = lineData[j];
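//get the matches from the string
//if count == 0, the string contains no letters
//if count > 1, the string contains non-letters, because Wor1d has 2 matches => Wor and d
//a single match must also cover the whole string, otherwise "abc1" would pass
MatchCollection found = r.Matches(str);
if (found.Count == 1 && found[0].Length == str.Length)
{
matchesList.Add(str);
}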
}
}
for (int i = 0; i < matchesList.Count; i++)
{
Console.WriteLine($"{matchesList[i]} is ok");
}
Console.ReadLine();
}
}
}
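The answer above only filters the words; the counting part of the question could be sketched roughly as follows. This is a sketch under assumptions, not the asker's final code: I assume a token counts as a word only if it consists entirely of ASCII letters (so "Wor1d" and "Hello," are skipped rather than cleaned), and I use a Dictionary<string, int> instead of Hashtable. The names WordCountDemo and CountWords are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class WordCountDemo
{
    // Counts tokens made up only of ASCII letters; keys are stored in lower case.
    public static Dictionary<string, int> CountWords(IEnumerable<string> lines)
    {
        var counts = new Dictionary<string, int>();
        var lettersOnly = new Regex("^[a-zA-Z]+$");
        foreach (string line in lines)
        {
            string[] tokens = line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
            foreach (string token in tokens)
            {
                if (!lettersOnly.IsMatch(token)) continue; // skips "Wor1d", "f1nd", ...
                string key = token.ToLowerInvariant();
                int current;
                counts.TryGetValue(key, out current); // current stays 0 for a new key
                counts[key] = current + 1;
            }
        }
        return counts;
    }

    static void Main()
    {
        // In a real program the lines could come from File.ReadLines(path).
        var sample = new[] { "Hello Wor1d hello", "fun f1nd found Fun" };
        foreach (KeyValuePair<string, int> pair in CountWords(sample))
        {
            Console.WriteLine(pair.Key + " " + pair.Value);
        }
    }
}
```

Storing the count as an int avoids the double.Parse round trip from the question, and TryGetValue handles both the "new word" and "seen before" cases in one lookup.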

Can I limit the number of times a character appears in a string?

For instance, I have a string and I only want the character '<' to appear 10 times in the string, and create a substring where the cutoff point is the 10th appearance of that character. Is this possible?
A manual solution could be like the following:
class Program
{
static void Main(string[] args)
{
int maxNum = 10;
string initialString = "a<b<c<d<e<f<g<h<i<j<k<l<m<n<o<p<q<r<s<t<u<v<w<x<y<z";
string[] splitString = initialString.Split('<');
string result = "";
Console.WriteLine(splitString.Length);
if (splitString.Length > maxNum)
{
for (int i = 0; i < maxNum; i++) {
result += splitString[i];
result += "<";
}
}
else
{
result = initialString;
}
Console.WriteLine(result);
Console.ReadKey();
}
}
By the way, it may be better to try to do it using Regex (in case you may have other replacement rules in the future, or need to make changes, etc). However, given your problem, something like that will work, too.
You can use TakeWhile for this. Given the string s, your character < as c and your count 10 as count, the following function (which needs using System.Linq) would solve your problem:
public static string foo(string s, char c, int count)
{
var i = 0;
return string.Concat(s.TakeWhile(x => (x == c ? i++ : i) < count));
}
Regex.Matches can be used to count the number of occurrences of a pattern in a string.
Each match also exposes the position of its occurrence through the Capture.Index property.
You can read the Index of the Nth occurrence and cut your string there:
(The RegexOptions are there just in case the pattern is something different. Modify as required.)
int cutAtOccurrence = 10;
string input = "one<two<three<four<five<six<seven<eight<nine<ten<eleven<twelve<thirteen<fourteen<fifteen";
var regx = Regex.Matches(input, "<", RegexOptions.CultureInvariant | RegexOptions.IgnoreCase);
if (regx.Count >= cutAtOccurrence) {
input = input.Substring(0, regx[cutAtOccurrence - 1].Index);
}
input is now:
one<two<three<four<five<six<seven<eight<nine<ten
If you need to use this procedure many times, it's better to build a method that returns a StringBuilder instead.
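If the separator is a single character, the same cut can be done without Regex by calling String.IndexOf in a loop. A minimal sketch, with made-up names (TruncateDemo, CutBeforeNth); like the Regex version above, it cuts just before the Nth occurrence and leaves the string alone when there are fewer than N occurrences:

```csharp
using System;

class TruncateDemo
{
    // Returns the part of s before the n-th occurrence of c,
    // or s unchanged if c occurs fewer than n times.
    public static string CutBeforeNth(string s, char c, int n)
    {
        if (n < 1) throw new ArgumentOutOfRangeException("n");
        int index = -1;
        for (int found = 0; found < n; found++)
        {
            // Continue searching right after the previous hit.
            index = s.IndexOf(c, index + 1);
            if (index < 0) return s;
        }
        return s.Substring(0, index);
    }

    static void Main()
    {
        Console.WriteLine(CutBeforeNth("a<b<c<d", '<', 2)); // prints "a<b"
        Console.WriteLine(CutBeforeNth("a<b", '<', 5));     // prints "a<b"
    }
}
```

To keep the Nth character in the result instead, return s.Substring(0, index + 1).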

Compare two strings ignoring little changes

I want to compare two strings ignoring few words (say three).
Like if I compare these two strings:
"Hello! My name is Alex Jolig. Its nice to meet you."
"My name is Alex. Nice to meet you."
I should get result as True.
Is there any way to do that?
Nothing inbuilt that comes to my mind, but I think you can tokenise both the strings using a delimiter (' ' in your case) & punctuation marks (! & . in your case).
Once both the strings are broken down in ordered tokens you can apply a comparison between individual tokens as per your requirement.
You could split the strings into words and compare them like this;
private bool compareStrings()
{
string stringLeft = "Hello! My name is Alex Jolig. Its nice to meet you.";
string stringRight = "My name is Alex. Nice to meet you.";
List<string> liLeft = stringLeft.Split(' ').ToList();
List<string> liRight = stringRight.Split(' ').ToList();
double totalWordCount = liLeft.Count();
double matchingWordCount = 0;
foreach (var item in liLeft)
{
if(liRight.Contains(item)){
matchingWordCount ++;
}
}
//return bool based on percentage of matching words
return ((matchingWordCount / totalWordCount) * 100) >= 50;
}
This returns a boolean based on the percentage of matching words. You might want to use a Regex or similar to strip punctuation first for more accurate results.
There is an article on Fuzzy String Matching with Edit Distance on CodeProject.
You can probably extend this idea to suit your requirement. It uses Levenshtein's Edit Distance as a Fuzzy String Match.
http://www.codeproject.com/Articles/162790/Fuzzy-String-Matching-with-Edit-Distance
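To make the edit-distance idea concrete for this question, here is a rough sketch that applies the classic Levenshtein dynamic-programming table to words instead of characters. It is not the code from the linked article; the tokenizing rules (lower-casing, splitting on spaces and a few punctuation marks) and all names (FuzzyCompareDemo, WordEditDistance, RoughlyEqual) are my own assumptions.

```csharp
using System;

class FuzzyCompareDemo
{
    // Lower-cases the text and splits it on spaces and common punctuation.
    static string[] Tokenize(string text)
    {
        return text.ToLowerInvariant()
                   .Split(new[] { ' ', '.', '!', '?', ',' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Levenshtein distance counted in whole words:
    // the minimum number of word insertions, deletions and substitutions.
    public static int WordEditDistance(string left, string right)
    {
        string[] a = Tokenize(left);
        string[] b = Tokenize(right);
        var d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= b.Length; j++) d[0, j] = j;
        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                int deleteOrInsert = Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1);
                d[i, j] = Math.Min(deleteOrInsert, d[i - 1, j - 1] + cost);
            }
        }
        return d[a.Length, b.Length];
    }

    public static bool RoughlyEqual(string left, string right, int maxDifferentWords)
    {
        return WordEditDistance(left, right) <= maxDifferentWords;
    }

    static void Main()
    {
        string s1 = "Hello! My name is Alex Jolig. Its nice to meet you.";
        string s2 = "My name is Alex. Nice to meet you.";
        Console.WriteLine(WordEditDistance(s1, s2)); // prints 3 ("hello", "jolig", "its" removed)
        Console.WriteLine(RoughlyEqual(s1, s2, 3));  // prints True
    }
}
```

Unlike the percentage check above, this respects word order, so two sentences with the same words shuffled would count as different.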
Here is my go at an answer.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
var string1 = "Hi thar im a string";
var string2 = "Hi thar im a string";
var string3 = "Hi thar im a similar string";
var string4 = "im a really different string";
var string5 = "Hi thar im a string but have many different words";
Console.WriteLine(StringComparo(string1, string2));
Console.WriteLine(StringComparo(string1, string3));
Console.WriteLine(StringComparo(string1, string4));
Console.WriteLine(StringComparo(string1, string5));
Console.ReadLine();
}
public static bool StringComparo(string str1, string str2, int diffCounterLimiter = 3)
{
var counter = 0;
var arr1 = str1.Split(' ');
var arr2 = str2.Split(' ');
while (counter <= diffCounterLimiter)
{
TreeNode bestResult = null;
for (int i = 0; i < arr1.Length; i++)
{
for (int j = 0; j < arr2.Length; j++)
{
var result = new TreeNode() { arr1Index = i, arr2Index = j };
if (string.Equals(arr1[i], arr2[j]) && (bestResult == null || bestResult.diff < result.diff))
{
bestResult = result;
}
}
}
// no result found
if(bestResult == null)
{
// any left over words plus current counter
return arr1.Length + arr2.Length + counter <= diffCounterLimiter;
}
counter += bestResult.diff;
arr1 = arr1.Where((val, idx) => idx != bestResult.arr1Index).ToArray();
arr2 = arr2.Where((val, idx) => idx != bestResult.arr2Index).ToArray();
}
return false;
}
}
public class TreeNode
{
public int arr1Index;
public int arr2Index;
public int diff => Math.Abs(arr1Index - arr2Index);
}
}
I tried to implement a tree search (I know it's not really a search tree; I may rewrite it a bit).
In essence, it finds the closest matched elements in each string. While under the limit of 3 differences, it removes the matched elements, adds the difference and repeats. Hope it helps.

Best way to convert Pascal Case to a sentence

What is the best way to convert from Pascal Case (upper Camel Case) to a sentence.
For example starting with
"AwaitingFeedback"
and converting that to
"Awaiting feedback"
C# preferable but I could convert it from Java or similar.
public static string ToSentenceCase(this string str)
{
return Regex.Replace(str, "[a-z][A-Z]", m => m.Value[0] + " " + char.ToLower(m.Value[1]));
}
In Visual Studio 2015 and later (C# 6), you can use string interpolation:
public static string ToSentenceCase(this string str)
{
return Regex.Replace(str, "[a-z][A-Z]", m => $"{m.Value[0]} {char.ToLower(m.Value[1])}");
}
Based on: Converting Pascal case to sentences using regular expression
I would prefer to use Humanizer for this. Humanizer is a Portable Class Library that meets all your .NET needs for manipulating and displaying strings, enums, dates, times, timespans, numbers and quantities.
Short Answer
"AwaitingFeedback".Humanize() => Awaiting feedback
Long and Descriptive Answer
Humanizer can do a lot more; other examples are:
"PascalCaseInputStringIsTurnedIntoSentence".Humanize() => "Pascal case input string is turned into sentence"
"Underscored_input_string_is_turned_into_sentence".Humanize() => "Underscored input string is turned into sentence"
"Can_return_title_Case".Humanize(LetterCasing.Title) => "Can Return Title Case"
"CanReturnLowerCase".Humanize(LetterCasing.LowerCase) => "can return lower case"
Complete code is :
using Humanizer;
using static System.Console;
namespace HumanizerConsoleApp
{
class Program
{
static void Main(string[] args)
{
WriteLine("AwaitingFeedback".Humanize());
WriteLine("PascalCaseInputStringIsTurnedIntoSentence".Humanize());
WriteLine("Underscored_input_string_is_turned_into_sentence".Humanize());
WriteLine("Can_return_title_Case".Humanize(LetterCasing.Title));
WriteLine("CanReturnLowerCase".Humanize(LetterCasing.LowerCase));
}
}
}
Output
Awaiting feedback
Pascal case input string is turned into sentence
Underscored input string is turned into sentence
Can Return Title Case
can return lower case
If you prefer to write your own C# code, you can do so as other answers have already shown.
Here you go...
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace CamelCaseToString
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine(CamelCaseToString("ThisIsYourMasterCallingYou"));
}
private static string CamelCaseToString(string str)
{
if (str == null || str.Length == 0)
return null;
StringBuilder retVal = new StringBuilder(32);
retVal.Append(char.ToUpper(str[0]));
for (int i = 1; i < str.Length; i++ )
{
if (char.IsLower(str[i]))
{
retVal.Append(str[i]);
}
else
{
retVal.Append(" ");
retVal.Append(char.ToLower(str[i]));
}
}
return retVal.ToString();
}
}
}
This works for me:
Regex.Replace(strIn, "([A-Z]{1,2}|[0-9]+)", " $1").TrimStart()
This is just like @SSTA's answer, but is more efficient than calling TrimStart.
Regex.Replace("ThisIsMyCapsDelimitedString", "(\\B[A-Z])", " $1")
Found this in the MvcContrib source, doesn't seem to be mentioned here yet.
return Regex.Replace(input, "([A-Z])", " $1", RegexOptions.Compiled).Trim();
Just because everyone has been using Regex (except this guy), here's an implementation with StringBuilder that was about 5x faster in my tests. Includes checking for numbers too.
"SomeBunchOfCamelCase2".FromCamelCaseToSentence() == "Some Bunch Of Camel Case 2"
public static string FromCamelCaseToSentence(this string input) {
if(string.IsNullOrEmpty(input)) return input;
var sb = new StringBuilder();
// start with the first character -- consistent camelcase and pascal case
sb.Append(char.ToUpper(input[0]));
// march through the rest of it
for(var i = 1; i < input.Length; i++) {
// any time we hit an uppercase OR number, it's a new word
if(char.IsUpper(input[i]) || char.IsDigit(input[i])) sb.Append(' ');
// add regularly
sb.Append(input[i]);
}
return sb.ToString();
}
Here's a basic way of doing it that I came up with using Regex
public static string CamelCaseToSentence(this string value)
{
var sb = new StringBuilder();
var firstWord = true;
foreach (var match in Regex.Matches(value, "([A-Z][a-z]+)|[0-9]+"))
{
if (firstWord)
{
sb.Append(match.ToString());
firstWord = false;
}
else
{
sb.Append(" ");
sb.Append(match.ToString().ToLower());
}
}
return sb.ToString();
}
It will also split off numbers which I didn't specify but would be useful.
string camel = "MyCamelCaseString";
string s = Regex.Replace(camel, "([A-Z])", " $1").ToLower().Trim();
Console.WriteLine(s.Substring(0,1).ToUpper() + s.Substring(1));
Edit: didn't notice your casing requirements, modified accordingly. You could use a MatchEvaluator to do the casing, but I think a substring is easier. You could also wrap it in a 2nd regex replace where you change the first character
"^\w"
to upper
\U (i think)
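As far as I know, .NET replacement patterns have no \U-style case conversion, so the second replace would need the MatchEvaluator mentioned above. A minimal sketch of that variant (the names SentenceCaseDemo and ToSentence are made up, and the \B in the first pattern is my addition to avoid a leading space):

```csharp
using System;
using System.Text.RegularExpressions;

class SentenceCaseDemo
{
    public static string ToSentence(string pascal)
    {
        // Insert a space before every capital that is not at a word boundary,
        // then lower-case the whole string.
        string spaced = Regex.Replace(pascal, @"\B([A-Z])", " $1").ToLowerInvariant();
        // Upper-case the first word character using a MatchEvaluator lambda.
        return Regex.Replace(spaced, @"^\w", m => m.Value.ToUpperInvariant());
    }

    static void Main()
    {
        Console.WriteLine(ToSentence("MyCamelCaseString")); // prints "My camel case string"
        Console.WriteLine(ToSentence("AwaitingFeedback"));  // prints "Awaiting feedback"
    }
}
```

Note that this splits acronyms letter by letter ("CustomerID" becomes "Customer i d"), the same as the one-liner it is based on.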
I'd use a regex, inserting a space before each upper case character, then lowering all the string.
string spacedString = System.Text.RegularExpressions.Regex.Replace(yourString, @"\B([A-Z])", " $1");
spacedString = spacedString.ToLower();
It is easy to do in JavaScript (or PHP, etc.) where you can define a function in the replace call:
var camel = "AwaitingFeedbackDearMaster";
var sentence = camel.replace(/([A-Z].)/g, function (c) { return ' ' + c.toLowerCase(); });
alert(sentence);
Although I haven't solved the initial cap problem... :-)
Now, for the Java solution:
String ToSentence(String camel)
{
if (camel == null) return ""; // Or null...
String[] words = camel.split("(?=[A-Z])");
if (words == null) return "";
if (words.length == 1) return words[0];
StringBuilder sentence = new StringBuilder(camel.length());
if (words[0].length() > 0) // Just in case of camelCase instead of CamelCase
{
sentence.append(words[0] + " " + words[1].toLowerCase());
}
else
{
sentence.append(words[1]);
}
for (int i = 2; i < words.length; i++)
{
sentence.append(" " + words[i].toLowerCase());
}
return sentence.toString();
}
System.out.println(ToSentence("AwaitingAFeedbackDearMaster"));
System.out.println(ToSentence(null));
System.out.println(ToSentence(""));
System.out.println(ToSentence("A"));
System.out.println(ToSentence("Aaagh!"));
System.out.println(ToSentence("stackoverflow"));
System.out.println(ToSentence("disableGPS"));
System.out.println(ToSentence("Ahh89Boo"));
System.out.println(ToSentence("ABC"));
Note the trick to split the sentence without losing any character...
Pseudo-code:
NewString = "";
Loop through every char of the string (skip the first one)
If char is upper-case ('A'-'Z')
NewString = NewString + ' ' + lowercase(char)
Else
NewString = NewString + char
Better ways can perhaps be done by using regex or by string replacement routines (replace 'X' with ' x')
An xquery solution that works for both UpperCamel and lowerCamel case:
To output sentence case (only the first character of the first word is capitalized):
declare function content:sentenceCase($string)
{
let $firstCharacter := substring($string, 1, 1)
let $remainingCharacters := substring-after($string, $firstCharacter)
return
concat(upper-case($firstCharacter),lower-case(replace($remainingCharacters, '([A-Z])', ' $1')))
};
To output title case (first character of each word capitalized):
declare function content:titleCase($string)
{
let $firstCharacter := substring($string, 1, 1)
let $remainingCharacters := substring-after($string, $firstCharacter)
return
concat(upper-case($firstCharacter),replace($remainingCharacters, '([A-Z])', ' $1'))
};
Found myself doing something similar, and I appreciate having a point-of-departure with this discussion. This is my solution, placed as an extension method to the string class in the context of a console application.
using System;
using System.Text;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string piratese = "avastTharMatey";
string ivyese = "CheerioPipPip";
Console.WriteLine("{0}\n{1}\n", piratese.CamelCaseToString(), ivyese.CamelCaseToString());
Console.WriteLine("For Pete\'s sake, man, hit ENTER!");
string strExit = Console.ReadLine();
}
}
public static class StringExtension
{
public static string CamelCaseToString(this string str)
{
StringBuilder retVal = new StringBuilder(32);
if (!string.IsNullOrEmpty(str))
{
string strTrimmed = str.Trim();
if (!string.IsNullOrEmpty(strTrimmed))
{
retVal.Append(char.ToUpper(strTrimmed[0]));
if (strTrimmed.Length > 1)
{
for (int i = 1; i < strTrimmed.Length; i++)
{
if (char.IsUpper(strTrimmed[i])) retVal.Append(" ");
retVal.Append(char.ToLower(strTrimmed[i]));
}
}
}
}
return retVal.ToString();
}
}
}
Most of the preceding answers split acronyms and numbers, adding a space in front of each character. I wanted acronyms and numbers to be kept together so I have a simple state machine that emits a space every time the input transitions from one state to the other.
/// <summary>
/// Add a space before any capitalized letter (but not for a run of capitals or numbers)
/// </summary>
internal static string FromCamelCaseToSentence(string input)
{
if (string.IsNullOrEmpty(input)) return String.Empty;
var sb = new StringBuilder();
bool upper = true;
for (var i = 0; i < input.Length; i++)
{
bool isUpperOrDigit = char.IsUpper(input[i]) || char.IsDigit(input[i]);
// any time we transition to upper or digits, it's a new word
if (!upper && isUpperOrDigit)
{
sb.Append(' ');
}
sb.Append(input[i]);
upper = isUpperOrDigit;
}
return sb.ToString();
}
And here are some tests:
[TestCase(null, ExpectedResult = "")]
[TestCase("", ExpectedResult = "")]
[TestCase("ABC", ExpectedResult = "ABC")]
[TestCase("abc", ExpectedResult = "abc")]
[TestCase("camelCase", ExpectedResult = "camel Case")]
[TestCase("PascalCase", ExpectedResult = "Pascal Case")]
[TestCase("Pascal123", ExpectedResult = "Pascal 123")]
[TestCase("CustomerID", ExpectedResult = "Customer ID")]
[TestCase("CustomABC123", ExpectedResult = "Custom ABC123")]
public string CanSplitCamelCase(string input)
{
return FromCamelCaseToSentence(input);
}
Mostly already answered here
Small change to the accepted answer, to convert the second and subsequent capitalised letters to lower case, so change
if (char.IsUpper(text[i]))
newText.Append(' ');
newText.Append(text[i]);
to
if (char.IsUpper(text[i]))
{
newText.Append(' ');
newText.Append(char.ToLower(text[i]));
}
else
newText.Append(text[i]);
Here is my implementation. This is the fastest that I got while avoiding creating spaces for abbreviations.
public static string PascalCaseToSentence(string input)
{
if (string.IsNullOrEmpty(input) || input.Length < 2)
return input;
var sb = new char[input.Length + ((input.Length + 1) / 2)];
var len = 0;
var lastIsLower = false;
for (int i = 0; i < input.Length; i++)
{
var current = input[i];
if (current < 97)
{
if (lastIsLower)
{
sb[len] = ' ';
len++;
}
lastIsLower = false;
}
else
{
lastIsLower = true;
}
sb[len] = current;
len++;
}
return new string(sb, 0, len);
}
