remove string between "|" and "," in stringbuilder in C# - c#

I use VS2019 in Windows7.
I want to remove string between "|" and "," in a StringBuilder.
That is , I want to convert StringBuilder from
"578.552|0,37.986|317,38.451|356,23"
to
"578.552,37.986,38.451,23"
I have tried Substring but failed, what other method I could use to achieve this?

If you have a huge StringBuilder and that's why converting it into String and applying regular expression is not the option,
you can try implementing Finite State Machine (FSM):
StringBuilder source = new StringBuilder("578.552|0,37.986|317,38.451|356,23");
int state = 0; // 0 - keep character, 1 - discard character
int index = 0;
for (int i = 0; i < source.Length; ++i) {
char c = source[i];
if (state == 0)
if (c == '|')
state = 1;
else
source[index++] = c;
else if (c == ',') {
state = 0;
source[index++] = c;
}
}
source.Length = index;

StringBuilder isn't really setup for much by way of inspection and mutation in the middle. It would be pretty easy to do once you have a string (probably via a Regex), but StringBuilder? not so much. In reality, StringBuilder is mostly intended for forwards-only append, so the answer would be:
if you didn't want those characters, why did you add them?
Maybe just use the string version here; then:
var s = "578.552|0,37.986|317,38.451|356,23";
var t = Regex.Replace(s, #"\|.*?(?=,)", ""); // 578.552,37.986,38.451,23
The regex translation here is "pipe (\|), non-greedy anything (.*?), followed by a comma where the following comma isn't part of the match ((?=,)).

If you don't know very much of Regex patterns, you can write your own custom method to filter out data; its always instructive and a good practicing exercise:
public static String RemoveDelimitedSubstrings(
this StringBuilder s,
char startDelimitter,
char endDelimitter,
char newDelimitter)
{
var buffer = new StringBuilder(s.Length);
var ignore = false;
for (var i = 0; i < s.Length; i++)
{
var currentChar = s[i];
if (currentChar == startDelimitter && !ignore)
{
ignore = true;
}
else if (currentChar == endDelimitter && ignore)
{
ignore = false;
buffer.Append(newDelimitter);
}
else if (!ignore)
buffer.Append(currentChar);
}
return buffer.ToString();
}
And youd obvisouly use it like:
var buffer= new StringBuilder("578.552|0,37.986|317,38.451|356,23");
var filteredBuffer = b.RemoveDelimitedSubstrings('|', ',', ','));

Related

C#: Need to split a string into a string[] and keeping the delimiter (also a string) at the beginning of the string

I think I am too dumb to solve this problem...
I have some formulas which need to be "translated" from one syntax to another.
Let's say I have a formula that goes like that (it's a simple one, others have many "Ceilings" in it):
string formulaString = "If([Param1] = 0, 1, Ceiling([Param2] / 0.55) * [Param3])";
I need to replace "Ceiling()" with "Ceiling(; 1)" (basically, insert "; 1" before the ")").
My attempt is to split the fomulaString at "Ceiling(" so I am able to iterate through the string array and insert my string at the correct index (counting every "(" and ")" to get the right index)
What I have so far:
//splits correct, but loses "CEILING("
string[] parts = formulaString.Split(new[] { "CEILING(" }, StringSplitOptions.None);
//splits almost correct, "CEILING(" is in another group
string[] parts = Regex.Split(formulaString, #"(CEILING\()");
//splits almost every letter
string[] parts = Regex.Split(formulaString, #"(?=[(CEILING\()])");
When everything is done, I concat the string so I have my complete formula again.
What do I have to set as Regex pattern to achieve this sample? (Or any other method that will help me)
part1 = "If([Param1] = 0, 1, ";
part2 = "Ceiling([Param2] / 0.55) * [Param3])";
//part3 = next "CEILING(" in a longer formula and so on...
As I mention in a comment, you almost got it: (?=Ceiling). This is incomplete for your use case unfortunately.
I need to replace "Ceiling()" with "Ceiling(; 1)" (basically, insert "; 1" before the ")").
Depending on your regex engine (for example JS) this works:
string[] parts = Regex.Split(formulaString, #"(?<=Ceiling\([^)]*(?=\)))");
string modifiedFormula = String.join("; 1", parts);
The regex
(?<=Ceiling\([^)]*(?=\)))
(?<= ) Positive lookbehind
Ceiling\( Search for literal "Ceiling("
[^)] Match any char which is not ")" ..
* .. 0 or more times
(?=\)) Positive lookahead for ")", effectively making us stop before the ")"
This regex is a zero-assertion, therefore nothing is lost and it will cut your strings before the last ")" in every "Ceiling()".
This solution would break whenever you have nested "Ceiling()". Then your only solution would be writing your own parser for the same reasons why you can't parse markup with regex.
Regex.Replace(formulaString, #"(?<=Ceiling\()(.*?)(?=\))","$1; 1");
Note: This will not work for nested "Ceilings", but it does for Ceiling(), It will also not work fir Ceiling(AnotherFunc(x)). For that you need something like:
Regex.Replace(formulaString, #"(?<=Ceiling\()((.*\((?>[^()]+|(?1))*\))*|[^\)]*)(\))","$1; 1$3");
but I could not get that to work with .NET, only in JavaScript.
This is my solution:
private string ConvertCeiling(string formula)
{
int ceilingsCount = formula.CountOccurences("Ceiling(");
int startIndex = 0;
int bracketCounter;
for (int i = 0; i < ceilingsCount; i++)
{
startIndex = formula.IndexOf("Ceiling(", startIndex);
bracketCounter = 0;
for (int j = 0; j < formula.Length; j++)
{
if (j < startIndex) continue;
var c = formula[j];
if (c == '(')
{
bracketCounter++;
}
if (c == ')')
{
bracketCounter--;
if (bracketCounter == 0)
{
// found end
formula = formula.Insert(j, "; 1");
startIndex++;
break;
}
}
}
}
return formula;
}
And CountOccurence:
public static int CountOccurences(this string value, string parameter)
{
int counter = 0;
int startIndex = 0;
int indexOfCeiling;
do
{
indexOfCeiling = value.IndexOf(parameter, startIndex);
if (indexOfCeiling < 0)
{
break;
}
else
{
startIndex = indexOfCeiling + 1;
counter++;
}
} while (true);
return counter;
}

Replace character in string with Uppercase of next in line (Pascal Casing)

I want to remove all underscores from a string with the uppercase of the character following the underscore. So for example: _my_string_ becomes: MyString similarly: my_string becomes MyString
Is there a simpler way to do it? I currently have the following (assuming no input has two consecutive underscores):
StringBuilder sb = new StringBuilder();
int i;
for (i = 0; i < input.Length - 1; i++)
{
if (input[i] == '_')
sb.Append(char.ToUpper(input[++i]));
else if (i == 0)
sb.Append(char.ToUpper(input[i]));
else
sb.Append(input[i]);
}
if (i < input.Length && input[i] != '_')
sb.Append(input[i]);
return sb.ToString();
Now I know this is not totally related, but I thought to run some numbers on the implementations provided in the answers, and here are the results in Milliseconds for each implementation using 1000000 iterations of the string: "_my_string_121_a_" :
Achilles: 313
Raj: 870
Damian: 7916
Dmitry: 5380
Equalsk: 574
method utilised:
Stopwatch stp = new Stopwatch();
stp.Start();
for (int i = 0; i < 1000000; i++)
{
sb = Test("_my_string_121_a_");
}
stp.Stop();
long timeConsumed= stp.ElapsedMilliseconds;
In the end I think I'll go with Raj's implementation, because it's just very simple and easy to understand.
This must do it using ToTitleCase using System.Globalization namespace
static string toCamel(string input)
{
TextInfo info = CultureInfo.CurrentCulture.TextInfo;
input= info.ToTitleCase(input).Replace("_", string.Empty);
return input;
}
Shorter (regular expressions), but I doubt if it's better (regular expressions are less readable):
string source = "_my_string_123__A_";
// MyString123A
string result = Regex
// _ + lower case Letter -> upper case letter (thanks to Wiktor Stribiżew)
.Replace(source, #"(_+|^)(\p{Ll})?", match => match.Groups[2].Value.ToUpper())
// all the other _ should be just removed
.Replace("_", "");
Loops over each character and converts to uppercase as necessary.
public string GetNewString(string input)
{
var convert = false;
var sb = new StringBuilder();
foreach (var c in input)
{
if (c == '_')
{
convert = true;
continue;
}
if (convert)
{
sb.Append(char.ToUpper(c));
convert = false;
continue;
}
sb.Append(c);
}
return sb.ToString().First().ToString().ToUpper() + sb.ToString().Substring(1);
}
Usage:
GetNewString("my_string");
GetNewString("___this_is_anewstring_");
GetNewString("___this_is_123new34tring_");
Output:
MyString
ThisIsAnewstring
ThisIs123new34tring
Try with Regex:
var regex = new Regex("^[a-z]|_[a-z]?");
var result = regex.Replace("my_string_1234", x => x.Value== "_" ? "" : x.Value.Last().ToString().ToUpper());
Tested with:
my_string -> MyString
_my_string -> MyString
_my_string_ -> MyString
You need convert snake case to camel case, You can use this code it's working for me
var x ="_my_string_".Split(new[] {"_"}, StringSplitOptions.RemoveEmptyEntries)
.Select(s => char.ToUpperInvariant(s[0]) + s.Substring(1, s.Length - 1))
.Aggregate(string.Empty, (s1, s2) => s1 + s2);
x = MyString
static string toCamel(string input)
{
StringBuilder sb = new StringBuilder();
int i;
for (i = 0; i < input.Length; i++)
{
if ((i == 0) || (i > 0 && input[i - 1] == '_'))
sb.Append(char.ToUpper(input[i]));
else
sb.Append(char.ToLower(input[i]));
}
return sb.ToString();
}

How to add symbols automatically in c#?

I'm beginning programming with C# and have a question:
I have a string of character like abcdef123456789. But the string is too long, therefore I want to add : automatically, after the second, fourth, six .... symbol.
How can I do that?
There is no built in method that does something like that. You just have to put the separators in there by looping through the string.
You can loop though the characters and put them in a StringBuilder, adding a colon at every other character:
string input = "abcdef123456789";
StringBuilder builder = new StringBuilder();
int cnt = 0;
foreach (char c in input) {
if (cnt == 2) {
builder.Append(':');
cnt = 0;
}
builder.Append(c);
cnt++;
}
string output = builder.ToString();
You could try this approach. I've attempted to make it as readable as possible, so hopefully it makes sense to you.
var s = "abcdef123456789";
var charsChanged = new List<char>();
for (var i = 0; i < s.Length; i++)
{
charsChanged.Add(s[i]);
var evenCharacter = i % 2 != 0;
var atEndOfString = i == s.Length - 1;
if (evenCharacter && !atEndOfString)
{
charsChanged.Add(':');
}
}
var updatedString = string.Concat(charsChanged));
updatedString will equal ab:cd:ef:12:34:56:78:9.
This approach makes use of the modulus operator (%) to determine if we're at an even or odd character. For more examples, check this out.
You can use a StringBuilder and do something like this:
string sourceStr = "123456789";
StringBuilder s = new StringBuilder();
foreach(char c in sourceStr){
s.append(c);
s.append(":");
}

Concatenate neighboring characters of a special character "-"

i am developing an application using c#.net in which i need that if a input entered by user contains the character '-'(hyphen) then i want the immediate neighbors of the hyphen(-) to be concatenated for example if a user enters
A-B-C then i want it to be replaced with ABC
AB-CD then i want it to be replaced like BC
ABC-D-E then i want it to be replaced like CDE
AB-CD-K then i want it to be replaced like BC and DK both separated by keyword and
after getting this i have to prepare my query to database.
i hope i made the problem clear but if need more clarification let me know.
Any help will be appreciated much.
Thanks,
Devjosh
Use:
string[] input = {
"A-B-C",
"AB-CD",
"ABC-D-E",
"AB-CD-K"
};
var regex = new Regex(#"\w(?=-)|(?<=-)\w", RegexOptions.Compiled);
var result = input.Select(s => string.Concat(regex.Matches(s)
.Cast<Match>().Select(m => m.Value)));
foreach (var s in result)
{
Console.WriteLine(s);
}
Output:
ABC
BC
CDE
BCDK
Untested, but this should do the trick, or at the very least lead you in the right direction.
private string Prepare(string input)
{
StringBuilder output = new StringBuilder();
char[] chars = input.ToCharArray();
for (int i = 0; i < chars.Length; i++)
{
if (chars[i] == '-')
{
if (i > 0)
{
output.Append(chars[i - 1]);
}
if (++i < chars.Length)
{
output.Append(chars[i])
}
else
{
break;
}
}
}
return output.ToString();
}
If you want each pair to form a separate object in an array, try the following code:
private string[] Prepare(string input)
{
List<string> output = new List<string>();
char[] chars = input.ToCharArray();
for (int i = 0; i < chars.Length; i++)
{
if (chars[i] == '-')
{
string o = string.Empty;
if (i > 0)
{
o += chars[i - 1];
}
if (++i < chars.Length)
{
o += chars[i]
}
output.Add(o);
}
}
return output.ToArray();
}
Correct me if I am wrong but surely all you need to do is remove the '-'?
like this:
"A-B-C".Replace("-","");
You can even solve this with a one-liner (although a bit ugly):
String.Join(String.Empty, input.Split('-').Select(q => (q.Length == 0 ? String.Empty : (q.Length > 1 ? (q.First() + q.Last()).ToString() : q.First().ToString())))).Substring(((input[0] + input[1]).ToString().Contains('-') ? 0 : 1), input.Length - ((input[0] + input[1]).ToString().Contains('-') ? 0 : 1) - ((input[input.Length - 1] + input[input.Length - 2]).ToString().Contains('-') ? 0 : 1));
first it splits the string to an array on each '-', then it concatenates only the first and the last character of each string (or just the only character if there's only one, and it leaves the empty string if there's nothing there), and then it concatenates the resulting enumerable to a String. Finally we strip the first and the last letter, if they are not in the needed range.
I know, it's ugly, I'm just saying that it's possible..
Probably it's way better to just use a simple
new Regex(#"\w(?=-)|(?<=-)\w", RegexOptions.Compiled)
and then work with that..
EDIT #Kirill Polishchuk was faster.. his solution should work..
EDIT 2
After the Question has been updated, here's a snippet that should do the trick:
string input = "A-B-C";
string s2;
string s3 = "";
string s4 = "";
var splitted = input.Split('-');
foreach(string s in splitted) {
if (s.Length == 0)
s2 = String.Empty;
else
if (s.Length > 1)
s2 = (s.First() + s.Last()).ToString();
else
s2 = s.First().ToString();
s3 += s4 + s2;
s4 = " and ";
}
int beginning;
int end;
if (input.Length > 1)
{
if ((input[0] + input[1]).ToString().Contains('-'))
beginning = 0;
else
beginning = 1;
if ((input[input.Length - 1] + input[input.Length - 2]).ToString().Contains('-'))
end = 0;
else
end = 1;
}
else
{
if ((input[0]).ToString().Contains('-'))
beginning = 0;
else
beginning = 1;
if ((input[input.Length - 1]).ToString().Contains('-'))
end = 0;
else
end = 1;
}
string result = s3.Substring(beginning, s3.Length - beginning - end);
It's not very elegant, but it should work (not tested though..). it works nearly the same as the one-liner above...

How do I extract text that lies between parentheses (round brackets)?

I have a string User name (sales) and I want to extract the text between the brackets, how would I do this?
I suspect sub-string but I can't work out how to read until the closing bracket, the length of text will vary.
If you wish to stay away from regular expressions, the simplest way I can think of is:
string input = "User name (sales)";
string output = input.Split('(', ')')[1];
A very simple way to do it is by using regular expressions:
Regex.Match("User name (sales)", #"\(([^)]*)\)").Groups[1].Value
As a response to the (very funny) comment, here's the same Regex with some explanation:
\( # Escaped parenthesis, means "starts with a '(' character"
( # Parentheses in a regex mean "put (capture) the stuff
# in between into the Groups array"
[^)] # Any character that is not a ')' character
* # Zero or more occurrences of the aforementioned "non ')' char"
) # Close the capturing group
\) # "Ends with a ')' character"
Assuming that you only have one pair of parenthesis.
string s = "User name (sales)";
int start = s.IndexOf("(") + 1;
int end = s.IndexOf(")", start);
string result = s.Substring(start, end - start);
Use this function:
public string GetSubstringByString(string a, string b, string c)
{
return c.Substring((c.IndexOf(a) + a.Length), (c.IndexOf(b) - c.IndexOf(a) - a.Length));
}
and here is the usage:
GetSubstringByString("(", ")", "User name (sales)")
and the output would be:
sales
Regular expressions might be the best tool here. If you are not famililar with them, I recommend you install Expresso - a great little regex tool.
Something like:
Regex regex = new Regex("\\((?<TextInsideBrackets>\\w+)\\)");
string incomingValue = "Username (sales)";
string insideBrackets = null;
Match match = regex.Match(incomingValue);
if(match.Success)
{
insideBrackets = match.Groups["TextInsideBrackets"].Value;
}
string input = "User name (sales)";
string output = input.Substring(input.IndexOf('(') + 1, input.IndexOf(')') - input.IndexOf('(') - 1);
A regex maybe? I think this would work...
\(([a-z]+?)\)
using System;
using System.Text.RegularExpressions;
private IEnumerable<string> GetSubStrings(string input, string start, string end)
{
Regex r = new Regex(Regex.Escape(start) +`"(.*?)"` + Regex.Escape(end));
MatchCollection matches = r.Matches(input);
foreach (Match match in matches)
yield return match.Groups[1].Value;
}
int start = input.IndexOf("(") + 1;
int length = input.IndexOf(")") - start;
output = input.Substring(start, length);
Use a Regular Expression:
string test = "(test)";
string word = Regex.Match(test, #"\((\w+)\)").Groups[1].Value;
Console.WriteLine(word);
input.Remove(input.IndexOf(')')).Substring(input.IndexOf('(') + 1);
The regex method is superior I think, but if you wanted to use the humble substring
string input= "my name is (Jayne C)";
int start = input.IndexOf("(");
int stop = input.IndexOf(")");
string output = input.Substring(start+1, stop - start - 1);
or
string input = "my name is (Jayne C)";
string output = input.Substring(input.IndexOf("(") +1, input.IndexOf(")")- input.IndexOf("(")- 1);
var input = "12(34)1(12)(14)234";
var output = "";
for (int i = 0; i < input.Length; i++)
{
if (input[i] == '(')
{
var start = i + 1;
var end = input.IndexOf(')', i + 1);
output += input.Substring(start, end - start) + ",";
}
}
if (output.Length > 0) // remove last comma
output = output.Remove(output.Length - 1);
output : "34,12,14"
Here is a general purpose readable function that avoids using regex:
// Returns the text between 'start' and 'end'.
string ExtractBetween(string text, string start, string end)
{
int iStart = text.IndexOf(start);
iStart = (iStart == -1) ? 0 : iStart + start.Length;
int iEnd = text.LastIndexOf(end);
if(iEnd == -1)
{
iEnd = text.Length;
}
int len = iEnd - iStart;
return text.Substring(iStart, len);
}
To call it in your particular example you can do:
string result = ExtractBetween("User name (sales)", "(", ")");
I'm finding that regular expressions are extremely useful but very difficult to write. So, I did some research and found this tool that makes writing them so easy.
Don't shy away from them because the syntax is difficult to figure out. They can be so powerful.
This code is faster than most solutions here (if not all), packed as String extension method, it does not support recursive nesting:
public static string GetNestedString(this string str, char start, char end)
{
int s = -1;
int i = -1;
while (++i < str.Length)
if (str[i] == start)
{
s = i;
break;
}
int e = -1;
while(++i < str.Length)
if (str[i] == end)
{
e = i;
break;
}
if (e > s)
return str.Substring(s + 1, e - s - 1);
return null;
}
This one is little longer and slower, but it handles recursive nesting more nicely:
public static string GetNestedString(this string str, char start, char end)
{
int s = -1;
int i = -1;
while (++i < str.Length)
if (str[i] == start)
{
s = i;
break;
}
int e = -1;
int depth = 0;
while (++i < str.Length)
if (str[i] == end)
{
e = i;
if (depth == 0)
break;
else
--depth;
}
else if (str[i] == start)
++depth;
if (e > s)
return str.Substring(s + 1, e - s - 1);
return null;
}
I've been using and abusing C#9 recently and I can't help throwing in Spans even in questionable scenarios... Just for the fun of it, here's a variation on the answers above:
var input = "User name (sales)";
var txtSpan = input.AsSpan();
var startPoint = txtSpan.IndexOf('(') + 1;
var length = txtSpan.LastIndexOf(')') - startPoint;
var output = txtSpan.Slice(startPoint, length);
For the OP's specific scenario, it produces the right output.
(Personally, I'd use RegEx, as posted by others. It's easier to get around the more tricky scenarios where the solution above falls apart).
A better version (as extension method) I made for my own project:
//Note: This only captures the first occurrence, but
//can be easily modified to scan across the text (I'd prefer Slicing a Span)
public static string ExtractFromBetweenChars(this string txt, char openChar, char closeChar)
{
ReadOnlySpan<char> span = txt.AsSpan();
int firstCharPos = span.IndexOf(openChar);
int lastCharPos = -1;
if (firstCharPos != -1)
{
for (int n = firstCharPos + 1; n < span.Length; n++)
{
if (span[n] == openChar) firstCharPos = n; //This allows the opening char position to change
if (span[n] == closeChar) lastCharPos = n;
if (lastCharPos > firstCharPos) break;
//This would correctly extract "sales" from this [contrived]
//example: "just (a (name (sales) )))(test"
}
return span.Slice(firstCharPos + 1, lastCharPos - firstCharPos - 1).ToString();
}
return "";
}
Much similar to #Gustavo Baiocchi Costa but offset is being calculated with another intermediate Substring.
int innerTextStart = input.IndexOf("(") + 1;
int innerTextLength = input.Substring(start).IndexOf(")");
string output = input.Substring(innerTextStart, innerTextLength);
I came across this while I was looking for a solution to a very similar implementation.
Here is a snippet from my actual code. Starts substring from the first char (index 0).
string separator = "\n"; //line terminator
string output;
string input= "HowAreYou?\nLets go there!";
output = input.Substring(0, input.IndexOf(separator));

Categories