C# replace string except when preceded by another - c#

I want to replace all occurrences of " with \" in a string, except where the " is already preceded by a \
For example, the string hello "World\" should become hello \"World\"
Is it possible without using regex?
If I do have to use regex, what pattern should I use?
Thanks for help,
regards,

You could use a lookbehind:
var output = Regex.Replace(input, @"(?<!\\)""", @"\""");
Or you could just make the preceding character optional, for example:
var output = Regex.Replace(input, @"\\?""", @"\""");
This works because " is replaced with \" (which is what you wanted), and \" is replaced with \", so no change.
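To check that the two patterns above agree, here is a minimal sketch. The class name QuoteEscaping and its method names are made up for illustration, and the patterns are written as verbatim strings.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class QuoteEscaping
{
    // Negative lookbehind: match a quote only when the previous
    // character is not a backslash.
    public static string WithLookbehind(string input) =>
        Regex.Replace(input, @"(?<!\\)""", @"\""");

    // Optional backslash: match either \" or ", and rewrite both to \".
    public static string WithOptionalBackslash(string input) =>
        Regex.Replace(input, @"\\?""", @"\""");
}
```

For the input hello "World\" both methods should return hello \"World\".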

The regex for this would be:
(?<!\\)"

Without a regex this should do:
yourStringVar.Replace(@"""", @"\""").Replace(@"\\""", @"\""");

It is possible without using regex:
str = str.Replace(" \"", "\\\"");

Since you explicitly asked whether it's possible without regex: it is not as simple, and it is impossible with pure String.Replace approaches. You could use a loop and a StringBuilder:
// Requires: using System.Text;
// Assumes text is non-empty; text[0] throws on an empty string.
StringBuilder builder = new StringBuilder();
builder.Append(text[0] == '"' ? "\\\"" : text.Substring(0, 1));
for (int i = 1; i < text.Length; i++)
{
    char next = text[i];
    char last = text[i - 1];
    if (next == '"' && last != '\\')
        builder.Append("\\\"");   // unescaped quote: emit \"
    else
        builder.Append(next);     // anything else is copied as-is
}
string result = builder.ToString();
Edit: here's a demo (difficult to create that string literal): http://ideone.com/Xmeh1w
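The same loop idea can be wrapped in a method that also copes with an empty string. This is only a sketch: the class name QuoteLoop and the method name EscapeQuotes are invented here, and it tracks the previous character in a variable instead of indexing back into the text.

```csharp
using System;
using System.Text;

// Hypothetical helper, shown only to illustrate the loop approach.
public static class QuoteLoop
{
    public static string EscapeQuotes(string text)
    {
        var builder = new StringBuilder(text.Length + 8);
        char previous = '\0'; // placeholder: nothing precedes the first character

        foreach (char current in text)
        {
            // Insert a backslash before a quote that is not already escaped.
            if (current == '"' && previous != '\\')
                builder.Append('\\');

            builder.Append(current);
            previous = current;
        }

        return builder.ToString();
    }
}
```

Because the loop body never runs for an empty string, no special case is needed for it.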

Related

Replacing only single "\n" in string occurrence

I have a string in C# which can have multiple \n characters. For example:
string tp = "Hello\nWorld \n\n\n !!";
If there is a single occurrence of \n I want to replace it with something, but if more than one \n appear together in the same place I want to leave them alone. So for the tp string above I want to replace the \n between Hello and World, because there is only one at that place, and leave the three \n nearer the end of the string alone, because they appear in a group.
If I try to use the Replace() method in C# it replaces all of them. How can I resolve this issue?
You can try using regular expressions: let's change \n into "*" whenever \n is single:
using System.Text.RegularExpressions;
...
string tp = "Hello\nWorld \n\n\n !!";
// "Hello*World \n\n\n !!";
string result = Regex.Replace(tp, "\n+", match =>
    match.Value.Length > 1
        ? match.Value   // multiple (>1) \n in a row: leave intact
        : "*");         // single occurrence: change into "*"
A solution using loops:
// Pad with a sentinel on both sides so c[i-1] and c[i+1] are always valid.
char[] c = ("\t" + tp + "\t").ToCharArray();
for (int i = 1; i < c.Length - 1; i++)
    if (c[i] == '\n' && c[i-1] != '\n' && c[i+1] != '\n')
        c[i] = 'x';
tp = new string(c, 1, c.Length - 2);
Use regular expressions and combine negative lookbehind and lookahead:
var test = "foo\nbar...foo\n\nbar\n\n\nfoo\r\nbar";
var replaced = System.Text.RegularExpressions.Regex.Replace(test, "(?<!\n)\n(?!\n)", "_");
// only first and last \n have been replaced
While searching through the input the regex "stops" at any "\n" it finds and verifies if no "\n" is one character behind the current position or ahead.
Thus only single "\n" will be replaced.
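As a small sketch of the lookbehind/lookahead combination described above, the pattern can be wrapped in a reusable method. The class and method names are hypothetical. A MatchEvaluator is used here so that a replacement containing $ is inserted literally instead of being read as a substitution token.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the lookaround pattern.
public static class SingleNewlines
{
    // Replaces only those \n that have no \n directly before or after them.
    public static string ReplaceIsolated(string text, string replacement) =>
        Regex.Replace(text, "(?<!\n)\n(?!\n)", _ => replacement);
}
```

With the tp string from the question and "*" as the replacement, only the newline between Hello and World should change.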

Regex Ignore first and last terminator

I have a string in text that uses | as a delimiter.
Example:
|2P|1|U|F8|
I want the result to be 2P|1|U|F8. How can I do that?
The regex is very easy, but why not just use Trim():
var str = "|2P|1|U|F8|";
str = str.Trim(new[] {'|'});
or just without new[] {...}:
str = str.Trim('|');
Output:
2P|1|U|F8
In case there are leading/trailing whitespaces, you can use chained Trims:
var str = "\r\n |2P|1|U|F8| \r\n";
str = str.Trim().Trim('|');
Output will be the same.
You can use String.Substring:
string str = "|2P|1|U|F8|";
string newStr = str.Substring(1, str.Length - 2);
Just remove the starting and the ending delimiter.
@"^\||\|$"
Use the regex above and then replace the match with an empty string.
Regex rgx = new Regex(@"^\||\|$");
string result = rgx.Replace(input, "");
Use the multiline modifier m when you're dealing with multiple lines.
Regex rgx = new Regex(@"(?m)^\||\|$");
Since | is a special char in regex, you need to escape it in order to match a literal | symbol.
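To illustrate the multiline modifier, here is a minimal sketch that strips the leading and trailing | from every line of a multi-line string. The class and method names are invented for the example, and RegexOptions.Multiline is used in place of the inline (?m) form. One caveat: in .NET, $ in multiline mode matches before \n only, so input with \r\n line endings would need extra handling.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating ^ and $ in multiline mode.
public static class PipeTrimmer
{
    // Removes one leading and one trailing '|' from each \n-terminated line.
    public static string TrimPipesPerLine(string input) =>
        Regex.Replace(input, @"^\||\|$", "", RegexOptions.Multiline);
}
```

Without the Multiline option, only the very first and very last | of the whole string would be removed.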
string input = "|2P|1|U|F8|";
foreach (string item in input.Split("|".ToCharArray(), StringSplitOptions.RemoveEmptyEntries))
{
    Console.WriteLine(item);
}
Result is:
2P
1
U
F8
^\||\|$
You can try this. Replace the match with an empty string, and use a verbatim string for the pattern. See the demo:
https://regex101.com/r/oF9hR9/14
For completeness' sake, you can also use Mid (from Microsoft.VisualBasic):
string s = "|2P|1|U|F8|";
Strings.Mid(s, 2, s.Length - 2)
This cuts out the part from the second character to the second-to-last one and produces the correct output.
I'm assuming that at some point you will want to parse the string to extract its '|' separated components, so here goes another alternative that goes in that direction:
string.Join("|", theString.Split(new[] {'|'}, StringSplitOptions.RemoveEmptyEntries))

What is the regular expression to replace white space with a specified character?

I have searched a lot of questions and answers, but I only found lengthy and complicated expressions. I want to replace all white space in a string with ',' (a comma). I know it can be done with regex, but I don't know enough about regex to do it. I have checked some links but didn't find an exact answer. If you know of an existing question or answer like this, please point me to it.
My string is defined as below.
string sText = "BankMaster AccountNo decimal To varchar";
and the result should be return as below.
"BankMaster,AccountNo,decimal,To,varchar"
Full Code:
string sItems = Clipboard.GetText();
string[] lines = sItems.Split('\n');
for (int iLine = 0; iLine < lines.Length; iLine++)
{
    string sLine = lines[iLine];
    sLine = //CODE TO REPLACE WHITE SPACE WITH ','
    string[] cells = sLine.Split(',');
    grdGrid.Rows.Add(iLine, cells[0], cells[1], cells[2], cells[4]);
}
Additional Details
I have more than 16,000 lines in a list, all formatted like the example above. So I am going to use a regular expression instead of a loop and recursive function calls. If you know a faster way than regex, please suggest it.
string result = Regex.Replace(sText, "\\s+", ",");
\s+ matches one or more consecutive whitespace characters of any kind.
The regex engine treats space ( ), tab (\t), newline (\n) and carriage return (\r), among others, as whitespace.
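Since the goal in the question is to end up with an array of cells, a sketch like the following may help. It shows the \s+ replacement next to Regex.Split, which skips the intermediate comma-separated string. The class and method names are hypothetical, and the Trim() call is an added assumption so that leading or trailing whitespace does not produce an empty first or last cell.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for turning whitespace-separated text into cells.
public static class WhitespaceCells
{
    // Collapses every run of whitespace into a single comma.
    public static string ToCommaSeparated(string line) =>
        Regex.Replace(line.Trim(), @"\s+", ",");

    // Splits directly on runs of whitespace, without building the comma string.
    public static string[] ToCells(string line) =>
        Regex.Split(line.Trim(), @"\s+");
}
```

Whether this is faster than a hand-written loop for 16,000 lines is something to measure rather than assume.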
string a = "Some text with spaces";
Regex rgx = new Regex("\\s+");
string result = rgx.Replace(a, ",");
Console.WriteLine(result);
The code above will replace all the white spaces with ',' character
There are lots of examples of doing this with regular expressions:
Flex: replace all spaces with comma,
Regex replace all commas with value,
http://www.perlmonks.org/?node_id=896548,
http://www.dslreports.com/forum/r20971008-sed-help-whitespace-to-comma
Try this:
string str = "BankMaster AccountNo decimal To varchar";
StringBuilder temp = new StringBuilder();
str = str.Trim(); // trim before the loop to avoid any trailing/leading whitespace
foreach (char ch in str)
{
    if (ch == ' ' && temp[temp.Length - 1] != ',')
    {
        temp.Append(",");
    }
    else if (ch != ' ')
    {
        temp.Append(ch.ToString());
    }
}
Console.WriteLine(temp);
Output:
BankMaster,AccountNo,decimal,To,varchar
Try this:
sText = Regex.Replace(sText, @"\s+", ",");

Is there a method for removing whitespace characters from a string?

Is there a string class member function (or something else) for removing all spaces from a string? Something like Python's str.strip() ?
You could simply do:
myString = myString.Replace(" ", "");
If you want to remove all white space characters you could use Linq, even if the syntax is not very appealing for this use case:
myString = new string(myString.Where(c => !char.IsWhiteSpace(c)).ToArray());
String.Trim method removes trailing and leading white spaces. It is the functional equivalent of Python's strip method.
LINQ feels like overkill here: converting a string to a list, filtering the list, then turning it back into a string. For removal of all white space, I would go for a regular expression: Regex.Replace(s, @"\s", ""). This is a common idiom and has probably been optimized.
If you want to remove the spaces at the start of the string or at its end, you might want to have a look at TrimStart(), TrimEnd() and Trim().
If you're looking to replace all whitespace in a string (not just leading and trailing whitespace) based on .NET's determination of what's whitespace or not, you could use a pretty simple LINQ query to make it work.
string whitespaceStripped = new string((from char c in someString
where !char.IsWhiteSpace(c)
select c).ToArray());
Yes, Trim.
String a = "blabla ";
var b = a.Trim(); // or TrimEnd or TrimStart
Yes, String.Trim().
var result = " a b ".Trim();
gives "a b" in result. By default all whitespace is trimmed. If you want to remove only space you need to type
var result = " a b ".Trim(' ');
If you want to remove all spaces in a string you can use string.Replace().
var result = " a b ".Replace(" ", "");
gives "ab" in result. But that is not equivalent to str.strip() in Python.
I don't know much about Python...
If str.strip() just removes whitespace at the start and the end, then you could use str = str.Trim() in .NET; otherwise you could use str = str.Replace(" ", "") to remove all spaces.
If it removes all whitespace, then use
str = new string((from c in str where !char.IsWhiteSpace(c) select c).ToArray());
There are many different ways, some faster than others:
public static string StripTabsAndNewlines(this string s) {
    // StringBuilder (fast)
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < s.Length; i++) {
        if (!Char.IsWhiteSpace(s[i])) {
            sb.Append(s[i]);
        }
    }
    return sb.ToString();
    // Alternatives with the same result; use one of these in place of the loop above:
    // LINQ (faster?)
    // return new string(s.ToCharArray().Where(c => !Char.IsWhiteSpace(c)).ToArray());
    // Regex (slow)
    // return Regex.Replace(s, @"\s+", "");
}
you could use
StringVariable.Replace(" ","")
I'm surprised no one mentioned this:
String.Join("", " all manner\tof\ndifferent\twhite spaces!\n".Split())
string.Split by default splits along the characters that are char.IsWhiteSpace so this is a very similar solution to filtering those characters out by the direct use of char.IsWhiteSpace and it's a one-liner that works in pre-LINQ environments as well.
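A short sketch comparing the Split/Join one-liner with the char.IsWhiteSpace filter from the earlier answers; the class and method names are made up for the comparison.

```csharp
using System;
using System.Linq;

// Hypothetical helper comparing two ways of removing all whitespace.
public static class WhitespaceRemoval
{
    // Split() with no arguments splits on whitespace; joining the pieces
    // with an empty separator drops both the whitespace and the empty entries.
    public static string ViaSplitJoin(string s) => string.Join("", s.Split());

    // Keeps only the characters that char.IsWhiteSpace rejects.
    public static string ViaFilter(string s) =>
        new string(s.Where(c => !char.IsWhiteSpace(c)).ToArray());
}
```

For the sample string in the answer above, both methods should yield allmannerofdifferentwhitespaces!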
Strip spaces? Strip whitespaces? Why should it matter? It only matters if we're searching for an existing implementation, but let's not forget how fun it is to program the solution rather than search MSDN (boring).
You should be able to strip any chars from any string by using 1 of the 2 functions below.
You can remove any chars like this:
static string RemoveCharsFromString(string textChars, string removeChars)
{
    string tempResult = "";
    foreach (char c in textChars)
    {
        if (!removeChars.Contains(c))
        {
            tempResult = tempResult + c;
        }
    }
    return tempResult;
}
or you can enforce a character set (so to speak) like this:
static string EnforceCharLimitation(string textChars, string allowChars)
{
    string tempResult = "";
    foreach (char c in textChars)
    {
        if (allowChars.Contains(c))
        {
            tempResult = tempResult + c;
        }
    }
    return tempResult;
}

Replace char in a string

How can I change
XXX#YYY.ZZZ into XXX_YYY_ZZZ?
One way I know is to use the string.Replace(char, char) method,
but I want to replace both "#" and ".", and that method replaces just one char.
One more case: what if I have XX.X#YYY.ZZZ?
I still want the output to look like XX.X_YYY_ZZZ.
Is this possible? Any suggestions are welcome. Thanks.
So, if I'm understanding correctly, you want to replace # with _, and . with _, but only if . comes after #? If there is a guaranteed # (assuming you're dealing with e-mail addresses?):
string e = "XX.X#YYY.ZZZ";
e = e.Substring(0, e.IndexOf('#')) + "_" + e.Substring(e.IndexOf('#')+1).Replace('.', '_');
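The Substring approach above relies on the # being present. A minimal sketch that also handles input without it could look like this; the class and method names are hypothetical, and returning the string unchanged when there is no # is an assumption rather than something the question specifies.

```csharp
using System;

// Hypothetical helper: replace the marker and every dot after it with '_'.
public static class SeparatorReplacer
{
    public static string ReplaceFromMarker(string s, char marker = '#')
    {
        int index = s.IndexOf(marker);
        if (index < 0)
            return s; // assumption: leave strings without the marker untouched

        // Text before the marker is kept as-is; dots are replaced only after it.
        return s.Substring(0, index) + "_" + s.Substring(index + 1).Replace('.', '_');
    }
}
```

Calling IndexOf once and reusing the result also avoids scanning the string twice.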
Here's a complete regex solution that covers both your cases. The key to your second case is to match dots after the # symbol by using a positive look-behind.
string[] inputs = { "XXX#YYY.ZZZ", "XX.X#YYY.ZZZ" };
string pattern = @"#|(?<=#.*?)\.";
foreach (var input in inputs)
{
    string result = Regex.Replace(input, pattern, "_");
    Console.WriteLine("Original: " + input);
    Console.WriteLine("Modified: " + result);
    Console.WriteLine();
}
Although this works, the task is simple enough to accomplish with a couple of string Replace calls. Efficiency is something you will need to test, depending on the text size and the number of replacements the code will make.
You can use the Regex.Replace method:
http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.replace(v=VS.90).aspx
You can use the following extension method to do your replacement without creating too many temporary strings (as occurs with Substring and Replace) or incurring regex overhead. It skips to the # symbol, and then iterates through the remaining characters to perform the replacement.
public static string CustomReplace(this string s)
{
    var sb = new StringBuilder(s);
    for (int i = Math.Max(0, s.IndexOf('#')); i < sb.Length; i++)
        if (sb[i] == '#' || sb[i] == '.')
            sb[i] = '_';
    return sb.ToString();
}
You can chain Replace calls, although this also replaces any dots before the #:
var newstring = "XX.X#YYY.ZZZ".Replace("#","_").Replace(".","_");
Create an array with the characters you want to have replaced, loop through the array and do the replace based off the index.
Assuming the data format is like XX.X#YYY.ZZZ, here is another alternative with String.Split(char separator):
string[] tmp = "XX.X#YYY.ZZZ".Split('#');
string newstr = tmp[0] + "_" + tmp[1].Replace(".", "_");
