Convert (expand) a formula that contains parentheses - C#

I need to find a way to convert (expand) a formula that uses only digits, letters and parentheses.
For example, for the input '5(2(a)sz)' the output should be 'aaszaaszaaszaaszaasz'.
I tried it this way:
string AddChainDeleteBracks(int open, int close, string input)
{
string to="",from="";
//get the local chain multipule the number in input[open-1]
//the number of the times the chain should be multiplied
for (int i = input[open - 1]; i > 0; i--)
{
//the content
for (int m = open + 1; m < close; m++)
{
to = to + input[m];
}
}
//get the chain i want to replace with "to"
for (int j = open - 1; j <= close; j++)
{
from = from + input[j];
}
String output = input.Replace(from, to);
return output;
}
But it doesn't work. Do you have a better idea for solving this?

You could store the positions of the opening parentheses, along with the number associated with each one, in a stack (last in, first out, e.g. System.Collections.Generic.Stack). When you encounter the next closing parenthesis, pop the top of the stack: this gives you the start and end positions of the substring between the innermost parentheses so far, which is the part you need to repeat. Then replace this portion of the original string (including the repetition number) with the repeated string. Continue until you reach the end of the string.
Things to be aware of:
When you do the replacement, you need to update your current position so that it points to the end of the repeated string in the new (modified) string.
Depending on whether 0 repetitions are allowed, you might need to handle an empty repetition, that is, an empty string.
When you reach the end of the string, the stack should be empty (every opening parenthesis was matched with a closing one).
The stack might become empty in the middle of the string. If you then encounter a closing parenthesis, the input string was malformed.
There might be a way to escape the opening/closing parentheses so that they don't count as part of the repetition pattern. This depends on your requirements.
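The stack idea above can be sketched as follows. This is only a minimal sketch, not the exact procedure described: instead of storing positions and replacing inside the original string, it keeps the partially built output on the stack, which sidesteps the position updates mentioned in the first caveat. The class and method names are made up for illustration, and it assumes that every number is immediately followed by '(' and that a group without a number is repeated once.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class RepeatExpander
{
    // Hypothetical helper: expands "5(2(a)sz)" style input.
    public static string Expand(string input)
    {
        // Each frame holds the text collected before a '(' and the repeat count of that group.
        var frames = new Stack<(StringBuilder Prefix, int Count)>();
        var current = new StringBuilder();
        int count = 0;
        bool hasCount = false;

        foreach (char c in input)
        {
            if (c >= '0' && c <= '9')
            {
                // Accumulate multi-digit counts.
                count = count * 10 + (c - '0');
                hasCount = true;
            }
            else if (c == '(')
            {
                // Assumption: a group without a leading number is repeated once.
                frames.Push((current, hasCount ? count : 1));
                current = new StringBuilder();
                count = 0;
                hasCount = false;
            }
            else
            {
                if (hasCount)
                    throw new FormatException("A number must be followed by '('.");
                if (c == ')')
                {
                    // An empty stack here means the input was malformed.
                    if (frames.Count == 0)
                        throw new FormatException("Unmatched ')'.");
                    var (prefix, times) = frames.Pop();
                    string group = current.ToString();
                    for (int i = 0; i < times; i++)
                        prefix.Append(group);
                    current = prefix;
                }
                else
                {
                    current.Append(c);
                }
            }
        }

        // At the end of the string the stack should be empty.
        if (frames.Count != 0)
            throw new FormatException("Unmatched '('.");
        if (hasCount)
            throw new FormatException("A number must be followed by '('.");
        return current.ToString();
    }
}
```

With this sketch, RepeatExpander.Expand("5(2(a)sz)") gives "aaszaaszaaszaaszaasz", and a count of 0 simply drops the group.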

Since the syntax of your expression is recursive, I suggest a recursive approach.
First split the expression into single tokens. I use Regex to do it and remove empty entries.
Example: "5(2(a)sz)" is split into "5", "(", "2", "(", "a", ")", "sz", ")"
Using an Enumerator enables you to get the tokens one by one. tokens.MoveNext() gets the next token. tokens.Current is the current token.
public string ConvertExpression(string expression)
{
IEnumerator<string> tokens = Regex.Split(expression, @"\b")
.Where(s => s != "")
.GetEnumerator();
if (tokens.MoveNext()) {
return Parse(tokens);
}
return "";
}
Here the main job is done in a recursive way
private string Parse(IEnumerator<string> tokens)
{
string s = "";
while (tokens.Current != ")") {
int n;
if (tokens.Current == "(") {
if (tokens.MoveNext()) {
s += Parse(tokens);
if (tokens.Current == ")") {
tokens.MoveNext();
return s;
}
}
} else if (Int32.TryParse(tokens.Current, out n)) {
if (tokens.MoveNext()) {
string subExpr = Parse(tokens);
var sb = new StringBuilder();
for (int i = 0; i < n; i++) {
sb.Append(subExpr);
}
s += sb.ToString();
}
} else {
s += tokens.Current;
if (!tokens.MoveNext())
return s;
}
}
return s;
}

Here is my second answer. My first answer was a quick shot. Here I tried to create a parser by doing the things one by one.
In order to convert an expression, you need to parse it. This means that you have to analyze its syntax. While analyzing its syntax you can produce an output as well.
1 The first thing to do is to define the syntax of all the valid expressions.
Here I use EBNF to do it. EBNF is simple.
{ and } enclose repetitions (possibly zero).
[ and ] enclose an optional part.
| separates alternatives.
See Extended Backus–Naur Form (EBNF) on Wikipedia for more detailed information on EBNF. (The EBNF variant used here drops the concatenation operator ",").
Our syntax in EBNF
Expression = { Term }.
Term = [ Number ] Factor.
Factor = Text | "(" Expression ")" | Term.
Examples
5(2(a)sz) => aaszaaszaaszaaszaasz
5(2a sz) => aaszaaszaaszaaszaasz
2 3(a 2b)c => abbabbabbabbabbabbc
2 Lexical analysis
Before we analyze the syntax we have to split the whole expression into single lexical tokens (numbers, operators, etc.).
We use an enum to indicate the token type
private enum TokenType
{
None,
LPar,
RPar,
Number,
Text
}
The following fields are used to hold the token information and the Boolean _error which tells whether an error occurred during parsing.
private IEnumerator<Match> _matches;
TokenType _tokenType;
string _text;
int _number;
bool _error;
The method ConvertExpression starts the conversion. It splits the expression into single tokens represented as Regex.Matches.
Those are used by the method GetToken, which in turn converts the Regex.Matches into more useful information. This information is stored in the fields described above.
public string ConvertExpression(string expression)
{
_matches = Regex.Matches(expression, @"\d+|\(|\)|[a-zA-Z]+")
.Cast<Match>()
.GetEnumerator();
_error = false;
return GetToken() ? Expression() : "";
}
private bool GetToken()
{
_number = 0;
_tokenType = TokenType.None;
_text = null;
if (_error || !_matches.MoveNext())
return false;
_text = _matches.Current.Value;
switch (_text[0]) {
case '(':
_tokenType = TokenType.LPar;
break;
case ')':
_tokenType = TokenType.RPar;
break;
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
case '8':
case '9':
_tokenType = TokenType.Number;
_number = Int32.Parse(_text);
break;
default:
_tokenType = TokenType.Text;
break;
}
return true;
}
3 Syntactic and Semantic Analysis
Now we have everything we need to perform the actual parsing and expression conversion. Each of the methods below analyses one EBNF syntax production and returns the result of the conversion as string.
The conversion of EBNF into C# code is straightforward. A repetition in the syntax is converted to a C# loop statement.
An option is converted to an if statement and alternatives are converted to a switch statement.
// Expression = { Term }.
private string Expression()
{
string s = "";
do {
s += Term();
} while (_tokenType != TokenType.RPar && _tokenType != TokenType.None);
return s;
}
// Term = [ Number ] Factor.
private string Term()
{
int n;
if (_tokenType == TokenType.Number) {
n = _number;
if (!GetToken()) {
_error = true;
return " Error: Factor expected.";
}
string factor = Factor();
if (_error) {
return factor;
}
var sb = new StringBuilder(n * factor.Length);
for (int i = 0; i < n; i++) {
sb.Append(factor);
}
return sb.ToString();
}
return Factor();
}
// Factor = Text | "(" Expression ")" | Term.
private string Factor()
{
switch (_tokenType) {
case TokenType.None:
_error = true;
return " Error: Unexpected end of Expression.";
case TokenType.LPar:
if (GetToken()) {
string s = Expression();
if (_tokenType == TokenType.RPar) {
GetToken();
return s;
} else {
_error = true;
return s + " Error ')' expected.";
}
} else {
_error = true;
return " Error: Unexpected end of Expression.";
}
case TokenType.RPar:
_error = true;
GetToken();
return " Error: Unexpected ')'.";
case TokenType.Text:
string t = _text;
GetToken();
return t;
default:
return Term();
}
}

Related

C# calculate multiple values from textbox

I am a student and got a task where I have to make a program that solves first-grade equations. I will start off by making a textbox where I can write an expression using any of the arithmetic operators, for example 3+4*8 (it doesn't have to follow the priority rules), and then when I press the "go" button I get the answer.
I tried using the split method from this question/answer:
C# read and calculate multiple values from textbox, and it worked for addition, but when I tried to adapt the same script for multiplication and subtraction it did not work.
The script I tried is:
string[] parts = tbxTal.Text.Split('+');
int intSumma = 0;
foreach (string item in parts)
{
intSumma = intSumma + Convert.ToInt32(item);
}
lblSvar.Text = intSumma.ToString();
I also tried using a switch on '+' after the split, but it did not work since there is no + left after the split.
Any idea how I can make a textbox that calculates everything inside of it? My teacher gave me a tip to use the split method and the case statement together.
Without giving the answer overtly, I would suggest that you keep a variable for the accumulator, operator, and operand. From there you can use a for loop to keep reading until you've evaluated all of the expression, then return the accumulator.
double Evaluate(string expression) {
    double accumulator = 0;
    double operand = 0;
    string op = string.Empty; // "operator" is a reserved word in C#, so use another name
    int index = 0;
    while (index < expression.Length) {
        operand = ExtractNextNumericValue(ref index, expression);
        op = ExtractNextOperator(ref index, expression);
        // We now have everything we need to do the math
        ...
    }
    return accumulator;
}
public double ExtractNextNumericValue(ref int index, string expression) {
    // Use IndexOf on the string, use the index as a start location
    // Make sure to update ref to be at the end of where you extracted your value
    // You know that the value will come before an operator, so look for '+', '-', '*', '/'
    ...
}
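For reference, here is one way the skeleton above could be filled in. Treat it as a sketch under assumptions rather than the intended solution: it evaluates strictly left to right (the question says precedence can be ignored), accepts only non-negative numbers with the operators + - * /, and the names SimpleCalculator and ReadNumber are invented for this example.

```csharp
using System;
using System.Globalization;

public static class SimpleCalculator
{
    // Evaluates strictly left to right, so "3+4*8" is treated as (3+4)*8.
    public static double Evaluate(string expression)
    {
        string text = expression.Replace(" ", "");
        int index = 0;
        // The first number seeds the accumulator.
        double accumulator = ReadNumber(text, ref index);
        while (index < text.Length)
        {
            char op = text[index++];
            double operand = ReadNumber(text, ref index);
            switch (op)
            {
                case '+': accumulator += operand; break;
                case '-': accumulator -= operand; break;
                case '*': accumulator *= operand; break;
                case '/': accumulator /= operand; break;
                default: throw new FormatException("Unknown operator '" + op + "'.");
            }
        }
        return accumulator;
    }

    // Reads digits (and a decimal point) starting at index and advances index past them.
    private static double ReadNumber(string text, ref int index)
    {
        int start = index;
        while (index < text.Length && (char.IsDigit(text[index]) || text[index] == '.'))
            index++;
        if (index == start)
            throw new FormatException("Number expected at position " + start + ".");
        return double.Parse(text.Substring(start, index - start), CultureInfo.InvariantCulture);
    }
}
```

In a WinForms handler this could be called as lblSvar.Text = SimpleCalculator.Evaluate(tbxTal.Text).ToString(); using the control names from the question.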
One thing that should help you with creating a calculator like this is reverse Polish notation. The accepted answer to this question is a working calculator that can handle order of operations etc.
Code from mentioned post:
static void Main(string[] args)
{
String str = "5 + ( ( 1 + 2 ) * 4 ) - 3";
String result=LengyelFormaKonvertalas(str);
Console.WriteLine(result.ToString());
Console.ReadLine();
}
static String LengyelFormaKonvertalas(String input) // this is the rpn method
{
Stack stack = new Stack();
String str = input.Replace(" ",string.Empty);
StringBuilder formula = new StringBuilder();
for (int i = 0; i < str.Length; i++)
{
char x=str[i];
if (x == '(')
stack.Push(x);
else if (IsOperandus(x)) // is it operand
{
formula.Append(x);
}
else if (IsOperator(x)) // is it operation
{
if (stack.Count>0 && (char)stack.Peek()!='(' && Prior(x)<=Prior((char)stack.Peek()) )
{
char y = (char)stack.Pop();
formula.Append(y);
}
if (stack.Count > 0 && (char)stack.Peek() != '(' && Prior(x) < Prior((char)stack.Peek()))
{
char y = (char)stack.Pop();
formula.Append(y);
}
stack.Push(x);
}
else
{
char y=(char)stack.Pop();
if (y!='(')
{
formula.Append(y);
}
}
}
while (stack.Count>0)
{
char c = (char)stack.Pop();
formula.Append(c);
}
return formula.ToString();
}
static bool IsOperator(char c)
{
return (c=='-'|| c=='+' || c=='*' || c=='/');
}
static bool IsOperandus(char c)
{
return (c>='0' && c<='9' || c=='.');
}
static int Prior(char c)
{
switch (c)
{
case '=':
return 1;
case '+':
return 2;
case '-':
return 2;
case '*':
return 3;
case '/':
return 3;
case '^':
return 4;
default:
throw new ArgumentException("Rossz paraméter");
}
}
}
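Note that the method above only converts the expression to postfix form; evaluating the result is a second pass with a stack of numbers. A minimal sketch of that pass, assuming the postfix tokens are separated by spaces (the conversion above emits them without separators, so as written it is only unambiguous for single-digit operands). PostfixEvaluator is a name made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class PostfixEvaluator
{
    // Evaluates a space-separated postfix expression such as "5 1 2 + 4 * + 3 -".
    public static double Evaluate(string postfix)
    {
        var stack = new Stack<double>();
        string[] tokens = postfix.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string token in tokens)
        {
            // Numbers are pushed; a lone "+" or "-" does not parse as a number.
            if (double.TryParse(token, NumberStyles.Float, CultureInfo.InvariantCulture, out double value))
            {
                stack.Push(value);
                continue;
            }
            if (stack.Count < 2)
                throw new FormatException("Operator '" + token + "' is missing an operand.");
            // The right operand is on top of the stack.
            double right = stack.Pop();
            double left = stack.Pop();
            switch (token)
            {
                case "+": stack.Push(left + right); break;
                case "-": stack.Push(left - right); break;
                case "*": stack.Push(left * right); break;
                case "/": stack.Push(left / right); break;
                default: throw new FormatException("Unknown token '" + token + "'.");
            }
        }
        if (stack.Count != 1)
            throw new FormatException("The expression does not reduce to a single value.");
        return stack.Pop();
    }
}
```

For the example expression, the postfix form "5 1 2 + 4 * + 3 -" evaluates to 14.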
In the code from the article "C# read and calculate multiple values from textbox", change this line:
string[] parts = textBox1.Text.Split('+');
to:
string[] parts = textBox1.Text.Split('+','*','-');
Note that Split discards the separators, so you still need to record which operator stood between the parts.

Better regular expression for ReverseStringFormat

I've been using for a while this neat function found here on SO:
private List<string> ReverseStringFormat(string template, string str)
{
string pattern = "^" + Regex.Replace(template, @"\{[0-9]+\}", "(.*?)") + "$";
Regex r = new Regex(pattern);
Match m = r.Match(str);
List<string> ret = new List<string>();
for (int i = 1; i < m.Groups.Count; i++)
ret.Add(m.Groups[i].Value);
return ret;
}
This function is able to process correctly templates like:
My name is {0} and I'm {1} years old
While it fails with patterns like:
My name is {0} and I'm {1:00} years old
I would like to handle this failing scenario and add fixed length parsing.
The function transforms the (first) template as following:
My name is (.*?) and I'm (.*?) years old
I've been trying to write the above regular expression to limit the number of characters captured for the second group without success. This is my (terrible) attempt:
My name is (.*?) and I'm (.{2}) years old
I've been trying to process inputs like the following but the below PATTERN doesn't work:
PATTERN: My name is (.*?) (.{3})(.{5})
INPUT: My name is John 123ABCDE
EXPECTED OUTPUT: John, 123, ABCDE
Every suggestion is highly appreciated
It is highly unlikely that you will be able to measure the length of a captured group within the same Regex replacement.
I would strongly suggest you look at the following state machine implementation.
Please note that this implementation also solves the multiple curly brace escape feature of string.Format.
First you will need a state enum, very much like this one:
public enum State {
Outside,
OutsideAfterCurly,
Inside,
InsideAfterColon
}
Then you will need a nice way to iterate over each character in a string.
The string chars parameter represents your template parameter while the returning IEnumerable<string> represents consecutive parts of the resulting pattern:
public static IEnumerable<string> InnerTransmogrify(string chars) {
    State state = State.Outside;
    int counter = 0;
    foreach (var @char in chars) {
        switch (state) {
            case State.Outside:
                switch (@char) {
                    case '{':
                        state = State.OutsideAfterCurly;
                        break;
                    default:
                        yield return @char.ToString();
                        break;
                }
                break;
            case State.OutsideAfterCurly:
                switch (@char) {
                    case '{':
                        state = State.Outside;
                        break;
                    default:
                        state = State.Inside;
                        counter = 0;
                        yield return "(.";
                        break;
                }
                break;
            case State.Inside:
                switch (@char) {
                    case '}':
                        state = State.Outside;
                        yield return "*?)";
                        break;
                    case ':':
                        state = State.InsideAfterColon;
                        break;
                    default:
                        break;
                }
                break;
            case State.InsideAfterColon:
                switch (@char) {
                    case '}':
                        state = State.Outside;
                        yield return "{" + counter + "})";
                        break;
                    default:
                        counter++;
                        break;
                }
                break;
        }
    }
}
You could join the parts like so:
public static string Transmogrify(string chars) {
var parts = InnerTransmogrify(chars);
var result = string.Join("", parts);
return result;
}
And then wrap everything up, like you originally intended:
private List<string> ReverseStringFormat(string template, string str) {
string pattern = <<SOME_PLACE>> .Transmogrify(template);
Regex r = new Regex(pattern);
Match m = r.Match(str);
List<string> ret = new List<string>();
for (int i = 1; i < m.Groups.Count; i++)
ret.Add(m.Groups[i].Value);
return ret;
}
Hope you understand why the Regex language isn't expressive enough (at least as far as my understanding is concerned) for this sort of job.
The only way to solve your problem with regular expressions is using a custom matcher to replace the group capture length.
The code below does this for your example:
private static string PatternFromStringFormat(string template)
{
    // replaces only elements like {0}
    string firstPass = Regex.Replace(template, @"\{[0-9]+\}", "(.*?)");
    // replaces elements like {0:000} using a custom matcher
    string secondPass = Regex.Replace(firstPass, @"\{[0-9]+\:(?<len>[0-9]+)\}",
        (match) =>
        {
            // the number of format digits gives the fixed capture length
            var len = match.Groups["len"].Value.Length;
            return "(.{" + len + "})";
        });
    return "^" + secondPass + "$";
}
private static List<string> ReverseStringFormat(string template, string str)
{
string pattern = PatternFromStringFormat(template);
Regex r = new Regex(pattern);
Match m = r.Match(str);
List<string> ret = new List<string>();
for (int i = 1; i < m.Groups.Count; i++)
ret.Add(m.Groups[i].Value);
return ret;
}
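Both answers come down to the same rule: a plain placeholder becomes a lazy group, and a placeholder with a format becomes a fixed-width group. Here is a compact sketch that does this in a single pass and also escapes the literal text of the template, which neither answer above does. The names ReverseFormat, BuildPattern and Parse are made up for this example, and it assumes (as the answers do) that the capture width equals the number of characters after the colon.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class ReverseFormat
{
    // Matches {0} as well as {1:00}; the optional "fmt" group holds the text after the colon.
    private static readonly Regex Placeholder =
        new Regex(@"\{[0-9]+(?::(?<fmt>[^}]*))?\}");

    public static string BuildPattern(string template)
    {
        var pattern = new StringBuilder("^");
        int last = 0;
        foreach (Match m in Placeholder.Matches(template))
        {
            // Literal text between placeholders is escaped so it matches itself.
            pattern.Append(Regex.Escape(template.Substring(last, m.Index - last)));
            // Assumption: the width is the length of the format text, e.g. "00" means 2.
            int width = m.Groups["fmt"].Value.Length;
            pattern.Append(width == 0 ? "(.*?)" : "(.{" + width + "})");
            last = m.Index + m.Length;
        }
        pattern.Append(Regex.Escape(template.Substring(last)));
        return pattern.Append('$').ToString();
    }

    public static List<string> Parse(string template, string input)
    {
        var values = new List<string>();
        Match m = Regex.Match(input, BuildPattern(template));
        if (!m.Success)
            return values; // empty list when the input does not fit the template
        for (int i = 1; i < m.Groups.Count; i++)
            values.Add(m.Groups[i].Value);
        return values;
    }
}
```

For the input from the question, ReverseFormat.Parse("My name is {0} {1:000}{2:00000}", "My name is John 123ABCDE") yields John, 123 and ABCDE.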

translate special character in strings

I have a program that reads from an XML document. In this XML document some of the attributes contain special characters like "\n", "\t", etc.
Is there an easy way to replace all of these strings with the actual character, or do I just have to do it manually for each character like the following example?
Manual example:
s.Replace("\\n", "\n").Replace("\\t", "\t")...
edit:
I'm looking for some way to treat the string like an escaped string like this(even though I know this doesn't work)
s.Replace("\\", "\");
Try Regex.Unescape().
Official docs here:
http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.unescape(v=vs.110).aspx
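A short sketch of how that call might be used. The wrapper class and method names are made up for illustration; the only library call is Regex.Unescape.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDecoder
{
    // Turns the two-character sequences \n and \t (backslash + letter)
    // into the actual newline and tab characters.
    public static string Decode(string raw)
    {
        return Regex.Unescape(raw);
    }
}
```

For example, EscapeDecoder.Decode(@"line1\nline2\tend") returns the text with a real newline and a real tab. Keep in mind that Regex.Unescape follows regular-expression escape rules rather than C# string-literal rules, so it is worth checking how it treats any other backslash sequences your data contains.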
Why not just walk the document and build up the new string in one pass? It saves a lot of duplicate searching and intermediate allocations.
string ConvertSpecialCharacters(string input) {
    var builder = new StringBuilder();
    bool inEscape = false;
    for (int i = 0; i < input.Length; i++) {
        if (inEscape) {
            switch (input[i]) {
                case 'n':
                    builder.Append('\n');
                    break;
                case 't':
                    builder.Append('\t');
                    break;
                default:
                    // unknown escape: keep the backslash and the character as they were
                    builder.Append('\\');
                    builder.Append(input[i]);
                    break;
            }
            inEscape = false;
        }
        else if (input[i] == '\\' && i + 1 < input.Length) {
            inEscape = true;
        }
        else {
            builder.Append(input[i]);
        }
    }
    return builder.ToString();
}

Parsing strings recursively

I am trying to extract information out of a string - a fortran formatting string to be specific. The string is formatted like:
F8.3, I5, 3(5X, 2(A20,F10.3)), 'XXX'
with formatting fields delimited by "," and formatting groups inside brackets, with the number in front of the brackets indicating how many consecutive times the formatting pattern is repeated. So, the string above expands to:
F8.3, I5, 5X, A20,F10.3, A20,F10.3, 5X, A20,F10.3, A20,F10.3, 5X, A20,F10.3, A20,F10.3, 'XXX'
I am trying to make something in C# that will expand a string that conforms to that pattern. I have started going about it with lots of switch and if statements, but am wondering if I am not going about it the wrong way?
I was basically wondering if some Regex wizard thinks that regular expressions can do this in one fell swoop. I know nothing about regular expressions, but if this could solve my problem I am considering putting in some time to learn how to use them... on the other hand, if regular expressions can't sort this out then I'd rather spend my time looking at another method.
This has to be doable with Regex :)
I've expanded my previous example and it tests nicely with your example.
// regex to match the innermost patterns of n(X) and capture the values of n and X.
private static readonly Regex matcher = new Regex(@"(\d+)\(([^(]*?)\)", RegexOptions.None);
// create new string by repeating X n times, separated with ','
private static string Join(Match m)
{
var n = Convert.ToInt32(m.Groups[1].Value); // get value of n
var x = m.Groups[2].Value; // get value of X
return String.Join(",", Enumerable.Repeat(x, n));
}
// expand the string by recursively replacing the innermost values of n(X).
private static string Expand(string text)
{
var s = matcher.Replace(text, Join);
return (matcher.IsMatch(s)) ? Expand(s) : s;
}
// parse a string for occurrences of the n(X) pattern and expand them.
// return the string as a tokenized array.
public static string[] Parse(string text)
{
// Check that the number of parentheses is even.
if (text.Sum(c => (c == '(' || c == ')') ? 1 : 0) % 2 == 1)
throw new ArgumentException("The string contains an odd number of parentheses.");
return Expand(text).Split(new[] { ',', ' ' }, StringSplitOptions.RemoveEmptyEntries);
}
I would suggest using a recursive method like the example below (not tested):
ResultData Parse(String value, ref Int32 index)
{
ResultData result = new ResultData();
Int32 startIndex = index; // Used to get substrings
while (index < value.Length)
{
Char current = value[index];
if (current == '(')
{
index++;
result.Add(Parse(value, ref index));
startIndex = index;
continue;
}
if (current == ')')
{
// Push last result
index++;
return result;
}
// Process all other chars here, then move on to the next one
index++;
}
// We can't find the closing bracket
throw new Exception("String is not valid");
}
You may need to modify some parts of the code, but this is the method I used when writing a simple compiler. It is not complete, just an example.
Personally, I would suggest using a recursive function instead. Every time you hit an opening parenthesis, call the function again to parse that part. I'm not sure if you can use a regex to match a recursive data structure.
(Edit: Removed incorrect regex)
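A sketch of what such a recursive function could look like for the format string in the question. It is one possible reading of the suggestion, not code from this answer: the names FormatExpander and ParseGroup are invented, fields are assumed to be separated by commas, a group without a leading number is assumed to repeat once, and quoted fields that themselves contain commas or brackets are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class FormatExpander
{
    // Expands e.g. "I5, 2(A20,F10.3)" into the list I5, A20, F10.3, A20, F10.3.
    public static List<string> Expand(string format)
    {
        int index = 0;
        return ParseGroup(format, ref index, topLevel: true);
    }

    // Parses fields until the closing bracket of the current group (or the end of the input).
    private static List<string> ParseGroup(string text, ref int index, bool topLevel)
    {
        var fields = new List<string>();
        var token = new StringBuilder();
        while (index < text.Length)
        {
            char c = text[index++];
            if (c == '(')
            {
                // The text collected so far is the repeat count; none means once (assumption).
                string countText = token.ToString().Trim();
                int count = countText.Length == 0 ? 1 : int.Parse(countText);
                token.Clear();
                // Recurse for the bracketed part, then repeat what came back.
                List<string> inner = ParseGroup(text, ref index, topLevel: false);
                for (int i = 0; i < count; i++)
                    fields.AddRange(inner);
            }
            else if (c == ',' || c == ')')
            {
                Flush(token, fields);
                if (c == ')')
                {
                    if (topLevel)
                        throw new FormatException("Unmatched ')'.");
                    return fields;
                }
            }
            else
            {
                token.Append(c);
            }
        }
        if (!topLevel)
            throw new FormatException("Unmatched '('.");
        Flush(token, fields);
        return fields;
    }

    // Adds the pending field, if any, and resets the buffer.
    private static void Flush(StringBuilder token, List<string> fields)
    {
        string field = token.ToString().Trim();
        if (field.Length > 0)
            fields.Add(field);
        token.Clear();
    }
}
```

Calling FormatExpander.Expand("F8.3, I5, 3(5X, 2(A20,F10.3)), 'XXX'") returns the 18 fields listed in the question's expanded string.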
Ended up rewriting this today. It turns out that this can be done in one single method:
private static string ExpandBrackets(string Format)
{
int maxLevel = CountNesting(Format);
for (int currentLevel = maxLevel; currentLevel > 0; currentLevel--)
{
int level = 0;
int start = 0;
int end = 0;
for (int i = 0; i < Format.Length; i++)
{
char thisChar = Format[i];
switch (Format[i])
{
case '(':
level++;
if (level == currentLevel)
{
string group = string.Empty;
int repeat = 0;
/// Isolate the number of repeats if any
/// If there are 0 repeats the set to 1 so group will be replaced by itself with the brackets removed
for (int j = i - 1; j >= 0; j--)
{
char c = Format[j];
if (c == ',')
{
start = j + 1;
break;
}
if (char.IsDigit(c))
repeat = int.Parse(c + (repeat != 0 ? repeat.ToString() : string.Empty));
else
throw new Exception("Non-numeric character " + c + " found in front of the brackets");
}
if (repeat == 0)
repeat = 1;
/// Isolate the format group
/// Parse until the first closing bracket. Level is decremented as this effectively takes us down one level
for (int j = i + 1; j < Format.Length; j++)
{
char c = Format[j];
if (c == ')')
{
level--;
end = j;
break;
}
group += c;
}
/// Substitute the expanded group for the original group in the format string
/// If the group is empty then just remove it from the string
if (string.IsNullOrEmpty(group))
{
Format = Format.Remove(start - 1, end - start + 2);
i = start;
}
else
{
string repeatedGroup = RepeatString(group, repeat);
Format = Format.Remove(start, end - start + 1).Insert(start, repeatedGroup);
i = start + repeatedGroup.Length - 1;
}
}
break;
case ')':
level--;
break;
}
}
}
return Format;
}
CountNesting() returns the highest level of bracket nesting in the format statement, but could be passed in as a parameter to the method. RepeatString() just repeats a string the specified number of times and substitutes it for the bracketed group in the format string.
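The two helpers are not shown in the answer. A possible sketch of both, based only on the description above: the bodies are guesses, and in particular joining the repeated copies with commas is an assumption made so that the fields stay delimited after substitution.

```csharp
using System;
using System.Text;

public static class FormatHelpers
{
    // Returns the deepest level of bracket nesting, e.g. 2 for "3(5X, 2(A20,F10.3))".
    public static int CountNesting(string format)
    {
        int level = 0;
        int max = 0;
        foreach (char c in format)
        {
            if (c == '(')
            {
                level++;
                if (level > max)
                    max = level;
            }
            else if (c == ')')
            {
                level--;
            }
        }
        return max;
    }

    // Repeats the group the given number of times.
    // Assumption: copies are joined with ',' so they remain separate fields.
    public static string RepeatString(string group, int repeat)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < repeat; i++)
        {
            if (i > 0)
                sb.Append(',');
            sb.Append(group);
        }
        return sb.ToString();
    }
}
```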

Best way to convert Pascal Case to a sentence

What is the best way to convert from Pascal Case (upper Camel Case) to a sentence?
For example starting with
"AwaitingFeedback"
and converting that to
"Awaiting feedback"
C# preferable but I could convert it from Java or similar.
public static string ToSentenceCase(this string str)
{
return Regex.Replace(str, "[a-z][A-Z]", m => m.Value[0] + " " + char.ToLower(m.Value[1]));
}
In Visual Studio 2015 and later (C# 6), you can use string interpolation:
public static string ToSentenceCase(this string str)
{
return Regex.Replace(str, "[a-z][A-Z]", m => $"{m.Value[0]} {char.ToLower(m.Value[1])}");
}
Based on: Converting Pascal case to sentences using regular expression
I prefer to use Humanizer for this. Humanizer is a Portable Class Library that meets all your .NET needs for manipulating and displaying strings, enums, dates, times, timespans, numbers and quantities.
Short Answer
"AwaitingFeedback".Humanize() => Awaiting feedback
Long and Descriptive Answer
Humanizer can do a lot more. Other examples:
"PascalCaseInputStringIsTurnedIntoSentence".Humanize() => "Pascal case input string is turned into sentence"
"Underscored_input_string_is_turned_into_sentence".Humanize() => "Underscored input string is turned into sentence"
"Can_return_title_Case".Humanize(LetterCasing.Title) => "Can Return Title Case"
"CanReturnLowerCase".Humanize(LetterCasing.LowerCase) => "can return lower case"
Complete code is :
using Humanizer;
using static System.Console;
namespace HumanizerConsoleApp
{
class Program
{
static void Main(string[] args)
{
WriteLine("AwaitingFeedback".Humanize());
WriteLine("PascalCaseInputStringIsTurnedIntoSentence".Humanize());
WriteLine("Underscored_input_string_is_turned_into_sentence".Humanize());
WriteLine("Can_return_title_Case".Humanize(LetterCasing.Title));
WriteLine("CanReturnLowerCase".Humanize(LetterCasing.LowerCase));
}
}
}
Output
Awaiting feedback
Pascal case input string is turned into sentence
Underscored input string is turned into sentence
Can Return Title Case
can return lower case
If you prefer to write your own C# code, you can do so along the lines of the other answers here.
Here you go...
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace CamelCaseToString
{
class Program
{
static void Main(string[] args)
{
Console.WriteLine(CamelCaseToString("ThisIsYourMasterCallingYou"));
}
private static string CamelCaseToString(string str)
{
if (str == null || str.Length == 0)
return null;
StringBuilder retVal = new StringBuilder(32);
retVal.Append(char.ToUpper(str[0]));
for (int i = 1; i < str.Length; i++ )
{
if (char.IsLower(str[i]))
{
retVal.Append(str[i]);
}
else
{
retVal.Append(" ");
retVal.Append(char.ToLower(str[i]));
}
}
return retVal.ToString();
}
}
}
This works for me:
Regex.Replace(strIn, "([A-Z]{1,2}|[0-9]+)", " $1").TrimStart()
This is just like @SSTA's answer, but is more efficient than calling TrimStart.
Regex.Replace("ThisIsMyCapsDelimitedString", "(\\B[A-Z])", " $1")
Found this in the MvcContrib source, doesn't seem to be mentioned here yet.
return Regex.Replace(input, "([A-Z])", " $1", RegexOptions.Compiled).Trim();
Just because everyone has been using Regex (except this guy), here's an implementation with StringBuilder that was about 5x faster in my tests. Includes checking for numbers too.
"SomeBunchOfCamelCase2".FromCamelCaseToSentence() == "Some Bunch Of Camel Case 2"
public static string FromCamelCaseToSentence(this string input) {
if(string.IsNullOrEmpty(input)) return input;
var sb = new StringBuilder();
// start with the first character -- consistent camelcase and pascal case
sb.Append(char.ToUpper(input[0]));
// march through the rest of it
for(var i = 1; i < input.Length; i++) {
// any time we hit an uppercase OR number, it's a new word
if(char.IsUpper(input[i]) || char.IsDigit(input[i])) sb.Append(' ');
// add regularly
sb.Append(input[i]);
}
return sb.ToString();
}
Here's a basic way of doing it that I came up with using Regex
public static string CamelCaseToSentence(this string value)
{
var sb = new StringBuilder();
var firstWord = true;
foreach (var match in Regex.Matches(value, "([A-Z][a-z]+)|[0-9]+"))
{
if (firstWord)
{
sb.Append(match.ToString());
firstWord = false;
}
else
{
sb.Append(" ");
sb.Append(match.ToString().ToLower());
}
}
return sb.ToString();
}
It will also split off numbers which I didn't specify but would be useful.
string camel = "MyCamelCaseString";
string s = Regex.Replace(camel, "([A-Z])", " $1").ToLower().Trim();
Console.WriteLine(s.Substring(0,1).ToUpper() + s.Substring(1));
Edit: didn't notice your casing requirements, modified accordingly. You could use a MatchEvaluator to do the casing, but I think a substring is easier. You could also wrap it in a second Regex.Replace where you change the first character
"^\w"
to upper
\U (i think)
I'd use a regex, inserting a space before each upper case character, then lowering all the string.
string spacedString = System.Text.RegularExpressions.Regex.Replace(yourString, @"\B([A-Z])", " $1");
spacedString = spacedString.ToLower();
It is easy to do in JavaScript (or PHP, etc.) where you can define a function in the replace call:
var camel = "AwaitingFeedbackDearMaster";
var sentence = camel.replace(/([A-Z].)/g, function (c) { return ' ' + c.toLowerCase(); });
alert(sentence);
Although I haven't solved the initial cap problem... :-)
Now, for the Java solution:
String ToSentence(String camel)
{
if (camel == null) return ""; // Or null...
String[] words = camel.split("(?=[A-Z])");
if (words == null) return "";
if (words.length == 1) return words[0];
StringBuilder sentence = new StringBuilder(camel.length());
if (words[0].length() > 0) // Just in case of camelCase instead of CamelCase
{
sentence.append(words[0] + " " + words[1].toLowerCase());
}
else
{
sentence.append(words[1]);
}
for (int i = 2; i < words.length; i++)
{
sentence.append(" " + words[i].toLowerCase());
}
return sentence.toString();
}
System.out.println(ToSentence("AwaitingAFeedbackDearMaster"));
System.out.println(ToSentence(null));
System.out.println(ToSentence(""));
System.out.println(ToSentence("A"));
System.out.println(ToSentence("Aaagh!"));
System.out.println(ToSentence("stackoverflow"));
System.out.println(ToSentence("disableGPS"));
System.out.println(ToSentence("Ahh89Boo"));
System.out.println(ToSentence("ABC"));
Note the trick to split the sentence without losing any character...
Pseudo-code:
NewString = "";
Loop through every char of the string (skip the first one)
If char is upper-case ('A'-'Z')
NewString = NewString + ' ' + lowercase(char)
Else
NewString = NewString + char
Better ways can perhaps be done by using regex or by string replacement routines (replace 'X' with ' x')
An xquery solution that works for both UpperCamel and lowerCamel case:
To output sentence case (only the first character of the first word is capitalized):
declare function content:sentenceCase($string)
{
let $firstCharacter := substring($string, 1, 1)
let $remainingCharacters := substring-after($string, $firstCharacter)
return
concat(upper-case($firstCharacter),lower-case(replace($remainingCharacters, '([A-Z])', ' $1')))
};
To output title case (first character of each word capitalized):
declare function content:titleCase($string)
{
let $firstCharacter := substring($string, 1, 1)
let $remainingCharacters := substring-after($string, $firstCharacter)
return
concat(upper-case($firstCharacter),replace($remainingCharacters, '([A-Z])', ' $1'))
};
Found myself doing something similar, and I appreciate having a point-of-departure with this discussion. This is my solution, placed as an extension method to the string class in the context of a console application.
using System;
using System.Text;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string piratese = "avastTharMatey";
string ivyese = "CheerioPipPip";
Console.WriteLine("{0}\n{1}\n", piratese.CamelCaseToString(), ivyese.CamelCaseToString());
Console.WriteLine("For Pete\'s sake, man, hit ENTER!");
string strExit = Console.ReadLine();
}
}
public static class StringExtension
{
public static string CamelCaseToString(this string str)
{
StringBuilder retVal = new StringBuilder(32);
if (!string.IsNullOrEmpty(str))
{
string strTrimmed = str.Trim();
if (!string.IsNullOrEmpty(strTrimmed))
{
retVal.Append(char.ToUpper(strTrimmed[0]));
if (strTrimmed.Length > 1)
{
for (int i = 1; i < strTrimmed.Length; i++)
{
if (char.IsUpper(strTrimmed[i])) retVal.Append(" ");
retVal.Append(char.ToLower(strTrimmed[i]));
}
}
}
}
return retVal.ToString();
}
}
}
Most of the preceding answers split acronyms and numbers, adding a space in front of each character. I wanted acronyms and numbers to be kept together so I have a simple state machine that emits a space every time the input transitions from one state to the other.
/// <summary>
/// Add a space before any capitalized letter (but not for a run of capitals or numbers)
/// </summary>
internal static string FromCamelCaseToSentence(string input)
{
if (string.IsNullOrEmpty(input)) return String.Empty;
var sb = new StringBuilder();
bool upper = true;
for (var i = 0; i < input.Length; i++)
{
bool isUpperOrDigit = char.IsUpper(input[i]) || char.IsDigit(input[i]);
// any time we transition to upper or digits, it's a new word
if (!upper && isUpperOrDigit)
{
sb.Append(' ');
}
sb.Append(input[i]);
upper = isUpperOrDigit;
}
return sb.ToString();
}
And here are some tests:
[TestCase(null, ExpectedResult = "")]
[TestCase("", ExpectedResult = "")]
[TestCase("ABC", ExpectedResult = "ABC")]
[TestCase("abc", ExpectedResult = "abc")]
[TestCase("camelCase", ExpectedResult = "camel Case")]
[TestCase("PascalCase", ExpectedResult = "Pascal Case")]
[TestCase("Pascal123", ExpectedResult = "Pascal 123")]
[TestCase("CustomerID", ExpectedResult = "Customer ID")]
[TestCase("CustomABC123", ExpectedResult = "Custom ABC123")]
public string CanSplitCamelCase(string input)
{
return FromCamelCaseToSentence(input);
}
Mostly already answered here
A small change to the accepted answer, to convert the second and subsequent capitalised letters to lower case. Change
if (char.IsUpper(text[i]))
newText.Append(' ');
newText.Append(text[i]);
to
if (char.IsUpper(text[i]))
{
newText.Append(' ');
newText.Append(char.ToLower(text[i]));
}
else
newText.Append(text[i]);
Here is my implementation. This is the fastest that I got while avoiding creating spaces for abbreviations.
public static string PascalCaseToSentence(string input)
{
if (string.IsNullOrEmpty(input) || input.Length < 2)
return input;
var sb = new char[input.Length + ((input.Length + 1) / 2)];
var len = 0;
var lastIsLower = false;
for (int i = 0; i < input.Length; i++)
{
var current = input[i];
if (current < 97)
{
if (lastIsLower)
{
sb[len] = ' ';
len++;
}
lastIsLower = false;
}
else
{
lastIsLower = true;
}
sb[len] = current;
len++;
}
return new string(sb, 0, len);
}
