RegEx.Replace is only replacing first occurrence, need all - c#

I'm having an issue with Regex.Replace in C# as it doesn't seem to be replacing all occurrences of the matched pattern.
private string ReplaceBBCode(string inStr)
{
var outStr = Regex.Replace(inStr, @"\[(b|i|u)\](.*?)\[/\1\]", @"<$1>$2</$1>", RegexOptions.IgnoreCase | RegexOptions.Multiline);
outStr = Regex.Replace(outStr, "(\r|\n)+", "<br />");
return outStr;
}
The input string:
[b]Saint Paul's Food Kitchen[/b] [b] [/b]Saint Paul's food kitchen opens weekly to provide food to those in need.
The result:
<b>Saint Paul's Food Kitchen</b> [b] [/b]Saint Paul's food kitchen opens weekly to provide food to those in need.
I've tested this in regexhero.net and it works exactly as it should there.
EDIT:
Sorry, copied the wrong version of the function. It now shows the correct code, that behaves incorrectly for me.

The output I get is completely different from what you report.
The biggest problem I can see is that you probably don't want your regex to be greedy.
Try replacing the .* with .*? so that each match stops at the first closing tag instead of the last one.
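To make the greedy/lazy difference concrete, here is a minimal sketch. The class name BBCodeDemo and its method names are made up for illustration; the patterns are the one from the question and its greedy variant.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original question.
public static class BBCodeDemo
{
    // Greedy: (.*) runs to the LAST closing tag, so two tag pairs collapse into one match.
    public static string ReplaceGreedy(string input)
    {
        return Regex.Replace(input, @"\[(b|i|u)\](.*)\[/\1\]", "<$1>$2</$1>", RegexOptions.IgnoreCase);
    }

    // Lazy: (.*?) stops at the FIRST closing tag, so each pair is matched on its own.
    public static string ReplaceLazy(string input)
    {
        return Regex.Replace(input, @"\[(b|i|u)\](.*?)\[/\1\]", "<$1>$2</$1>", RegexOptions.IgnoreCase);
    }
}
```

For the input "[b]one[/b] and [b]two[/b]", ReplaceGreedy returns "<b>one[/b] and [b]two</b>", while ReplaceLazy returns "<b>one</b> and <b>two</b>".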

No need for Regex:
private static string ReplaceBBCode(string inStr)
{
return inStr.Replace("[b]", "<b>").Replace("[/b]", "</b>")
.Replace("[i]", "<i>").Replace("[/i]", "</i>")
.Replace("[u]", "<u>").Replace("[/u]", "</u>")
.Replace("\r\n", "\n")
.Replace("\n", "<br />");
}
I like this one better:
private static string ReplaceBBCode(string inStr)
{
StringBuilder outStr = new StringBuilder();
bool addBR = false;
for(int i=0; i<inStr.Length; i++){
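if (addBR && inStr[i] != '\r' && inStr[i] != '\n'){ // one <br /> per run of line breaks, so "\r\n" is not doubled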
outStr.Append("<br />");
addBR = false;
}
if (inStr[i] == '\r' || inStr[i] == '\n'){
if (!addBR)
addBR = true;
}
else {
addBR = false;
if (i+2 < inStr.Length && inStr[i] == '['
&& (inStr[i+1] == 'b' || inStr[i+1] == 'i' || inStr[i+1] == 'u')
&& inStr[i+2] == ']'){
outStr.Append("<").Append(inStr[i+1]).Append(">");
i+=2;
}
else if(i+3 < inStr.Length && inStr[i] == '[' && inStr[i+1] == '/'
&& (inStr[i+2] == 'b' || inStr[i+2] == 'i' || inStr[i+2] == 'u')
&& inStr[i+3] == ']'){
outStr.Append("</").Append(inStr[i+2]).Append(">");
i+=3;
}
else
outStr.Append(inStr[i]);
}
}
return outStr.ToString();
}

This solved the issue, and it also handles nested tags. I'm not sure why, but even after rebuilding over and over the original code was still causing errors. It's possible that our VS2010 installation is corrupted and not building properly, or that the framework is corrupted. I don't know the cause of the problem, but this solved it:
private string ReplaceBBCode(string inStr)
{
var outStr = inStr;
var bbre = new Regex(@"\[(b|i|u)\](.*?)\[/\1\]", RegexOptions.IgnoreCase | RegexOptions.Multiline);
while( bbre.IsMatch(outStr))
outStr = bbre.Replace(outStr, @"<$1>$2</$1>");
outStr = Regex.Replace(outStr, "(\r|\n)+", "<br />");
return outStr;
}
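Why the loop helps with nested tags can be sketched as follows. This is only an illustration under made-up names (NestedBBCodeDemo, SinglePass, UntilStable); it does not claim to explain the original build problem. In a single pass, an inner tag pair sits inside the outer match's $2 and is copied through unchanged, so another pass is needed to convert it.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class for illustration only.
public static class NestedBBCodeDemo
{
    private static readonly Regex Tag =
        new Regex(@"\[(b|i|u)\](.*?)\[/\1\]", RegexOptions.IgnoreCase);

    // One pass: the inner tag is part of the outer match's $2 and is copied verbatim.
    public static string SinglePass(string input)
    {
        return Tag.Replace(input, "<$1>$2</$1>");
    }

    // Repeat until no tag pair is left, which picks up tags nested inside earlier matches.
    public static string UntilStable(string input)
    {
        string output = input;
        while (Tag.IsMatch(output))
        {
            output = Tag.Replace(output, "<$1>$2</$1>");
        }
        return output;
    }
}
```

For "[b][i]x[/i][/b]", SinglePass returns "<b>[i]x[/i]</b>" and UntilStable returns "<b><i>x</i></b>".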

Related

How to find character after pipe "|" in C#

I would like to find the character after | and check it in a loop in C#.
For example, I have Test|T1. After the pipe the character could be anything, like Test|T2 or Test|T3.
The first value is Table|Y and second value could be Test|T1, Test|T2 or Test|T3.
So I would like to check the character after | in else block.
foreach (var testing in TestQuestion.Split(','))
{
if(testing.Equals("Table|Y"))
{
order.OrderProperties.Add(new OrderProperty("A")
{
...
});
}
else
{
//check the value after "|"
}
}
Like this:
var s = "XXX|T1234";
var idx = s.IndexOf("|");
if (idx > -1) // found
{
var nextChar = s.Substring(idx + 1, 1);
}
If it's possible that '|' is the last character, you need to check for that first, because Substring(idx + 1, 1) would then throw an ArgumentOutOfRangeException.
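A minimal sketch of that extra check, assuming a helper of my own naming (PipeHelper.CharAfterPipe is hypothetical) that returns null when there is nothing to read after the pipe:

```csharp
using System;

// Hypothetical helper, not part of the original answer.
public static class PipeHelper
{
    // Returns the character right after the first '|', or null when the
    // input is null, has no pipe, or the pipe is the last character.
    public static char? CharAfterPipe(string s)
    {
        if (s == null)
            return null;

        int idx = s.IndexOf('|');
        if (idx < 0 || idx == s.Length - 1)
            return null;

        return s[idx + 1];
    }
}
```

With this, CharAfterPipe("Test|T1") gives 'T', while "Test|" and "Test" both give null instead of throwing.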
Another way to do this:
var tokens = test.Split('|');
if (tokens.FirstOrDefault() == "Test" && tokens.LastOrDefault() == "T1")
{
}
You can do it this way (to test if value is T1 for example):
if (testing.Split('|')[0] == "Test" && testing.Split('|')[1] == "T1")
{
}

C#: How to add double quotes in variable

I am developing a Silverlight application. Right now I have a problem passing a string variable, together with double quotes, to a SQL statement. I tried it this way:
string itemList = "";
string commaString = @"""";
for (int i = 0; i < TestList.Count; i++)
{
itemList += commaString + TestList[i].Code + commaString + " || ";
}
itemList = "x.Code == " + itemList;
itemList = itemList.Remove(itemList.Length - 3);
But the variable that gets passed looks like this:
" x.Code == \" 11001-111001 \" || x.Code == \" 11016-111001 \"
"
I want it to be this way:
x.Code == "11001-111001" || x.Code == "11016-111001"
I don't want the backslashes. I want to pass this variable to a SQL select statement. Is it possible to do this? Can anybody help me? Thank you.
Update:
My TestList currently has 2 values:
"11001-111001"
"11016-111001"
and then I want to combine these values in itemList to be:
x.Code == "11001-111001" || x.Code == "11016-111001"
since I want to use this in a SQL statement. When combined, it now becomes
" x.Code == \"11001-111001\" || x.Code == \"11016-111001\" "
Below is how I want to use the variable. I want to replace the codes with itemList:
private void GetAccountCodes()
{
var r = _svc.AccountCodes.Where(x => x.Code == "11001-111001" || x.Code == "11016-111001").Select(x => x);
_company.AccountCodes.LoadCompleted -= new EventHandler<LoadCompletedEventArgs>(AccountCodes_LoadCompleted);
_company.AccountCodes.LoadCompleted += new EventHandler<LoadCompletedEventArgs>(AccountCodes_LoadCompleted);
_company.AccountCodes.Clear(true);
_company.AccountCodes.LoadAsync(r);
}
private void AccountCodes_LoadCompleted(object sender, LoadCompletedEventArgs e)
{
if (_company.AccountCodes!= null && _company.AccountCodes.Count() > 0)
{
//do something. but right now it returns no record
}
}
Edit:
You can use Contains to pass a list of items to a LINQ database query (here itemList is the collection of codes, not the concatenated string):
.Where(x=> itemList.Contains(x.Code)).ToList()
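As a sketch of the Contains idea, here it is against an in-memory sequence. The names CodeFilter and FilterByCodes are hypothetical, and whether a particular remote query provider can translate Contains is an assumption you would need to verify for your own service.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper working on in-memory data only.
public static class CodeFilter
{
    // Keeps only the codes that appear in wantedCodes, preserving the original order.
    public static List<string> FilterByCodes(IEnumerable<string> allCodes, IEnumerable<string> wantedCodes)
    {
        var wanted = new HashSet<string>(wantedCodes);
        return allCodes.Where(code => wanted.Contains(code)).ToList();
    }
}
```

This replaces the hand-built chain of x.Code == "..." || x.Code == "..." conditions with a single membership test, so no quoting or string concatenation is involved.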
Debugger escape characters:
As for the \" part: this is only the debugger showing the value like that.
Try clicking the small magnifier icon or printing the value to the console, and you'll get:
"11001-111001" || x.Code == "11016-111001" || "
Those escape backslashes are added by the debugger only.
To add a double quote, you can use one of the approaches below:
string commaString = "\"";
Or:
string commaString = @"""";
user3185569 is right. It's just the debugger that shows that value. Nevertheless it might not be the best idea to concatenate the strings that way (remember: strings are immutable). I would use a StringBuilder() and string.Format() as follows:
StringBuilder itemList = new StringBuilder();
for (int i = 0; i < TestList.Count; i++)
{
itemList.Append(string.Format("\"{0}\" || ", TestList[i].Code));
}
This also improves readability.

Checking open parenthesis close parenthesis in c#

I am creating a compiler.
When I write input code for my compiler, if there is a missing parenthesis, the compiler should show an error. For that I use this code:
Stack<int> openingbraces = new Stack<int>();
string output = string.Empty;
for (int i = 0; i < MELEdtior.Length; i++)
{
if (MELEdtior[i] == '{')
{
openingbraces.Push(i);
output="close braces missing";
}
else if (MELEdtior[i] == '}')
{
openingbraces.Push(i);
output = "Open Braces missing";
}
}
if(openingbraces.Count==2)
{
output = "Build Successfull";
}
else
{
output = "brace missing";
}
When I give simple input like function{} it works perfectly. But my input is:
{global gHCIRCIN = OBSNOW("Head circumf")}
{IF gHCIRCCM <> "" AND HeadCircsDifferrev() THEN
OBSNOW("Head circumf",str(rnd(ConvertCMtoIN(gHCIRCCM),2))) ELSE "" ENDIF }
Here my compiler should check the correctness of all parentheses, and show an error message.
My idea to achieve this is to separate opening and closing parentheses first and then pair them, if any pair is missing, my compiler should throw an error message. How can I implement this?
Here is a mini program solving the problem. Based on o_weisman's comment.
class Program {
static void Main(string[] args) {
int currentOpenBracketNum = 0;
string message = "Brackets OK";
string input = @"{globa} }{IF gHCIRCCM <> """" AND HeadCircsDifferrev() THEN OBSNOW(""Head circumf"",str(rnd(ConvertCMtoIN(gHCIRCCM),2))) ELSE """" ENDIF }";
foreach (char c in input) {
if (c.Equals('{')) currentOpenBracketNum++;
else if (c.Equals('}')) {
if (currentOpenBracketNum > 0) {
currentOpenBracketNum--;
} else {
message = "Missing open bracket";
}
}
}
if (currentOpenBracketNum > 0) {
message = "Missing close bracket";
}
Console.WriteLine(message);
Console.ReadKey(); // suspend screen
}
}
Note: if you want to address the issue xanatos points out, you could track whether you are inside " characters and skip any brackets found there, since they are part of a string.
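That quote tracking could look roughly like the sketch below. The class and method names (BraceChecker.Check) are made up, and it assumes that quoted strings in the input never contain an escaped " character.

```csharp
using System;

// Hypothetical checker; assumes quotes are never escaped inside strings.
public static class BraceChecker
{
    public static string Check(string input)
    {
        int open = 0;           // number of '{' still waiting for a '}'
        bool inQuotes = false;  // true while we are between two " characters

        foreach (char c in input)
        {
            if (c == '"')
            {
                inQuotes = !inQuotes;
                continue;
            }
            if (inQuotes)
                continue;       // braces inside a quoted string do not count

            if (c == '{')
            {
                open++;
            }
            else if (c == '}')
            {
                if (open == 0)
                    return "Missing open bracket";
                open--;
            }
        }
        return open > 0 ? "Missing close bracket" : "Brackets OK";
    }
}
```

Unlike the program above, this version returns as soon as it sees a closing brace without a matching opening one, so that message cannot be overwritten later.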
To avoid the problems with brackets inside quoted areas, I would just use RegEx to replace the quoted parts first. Then you can count the signs:
char quoteChar = '"';
string s1 = "{global gHCIRCIN = OBSNOW(\"Head circumf\")} {IF gHCIRCCM <> \"\" AND HeadCircsDifferrev() THEN OBSNOW(\"Head circumf\",str(rnd(ConvertCMtoIN(gHCIRCCM),2))) ELSE \"\" ENDIF }";
string s2 = Regex.Replace(s1, quoteChar + ".*?" + quoteChar, "This_was_quoted");
int countOpening = s2.Count(c => c == '{');
int countClosing = s2.Count(c => c == '}');
MessageBox.Show(string.Format("There are {0} opening and {1} closing }}-signs.", countOpening, countClosing));

Remove comma's between the [] brackets in string using c#

I want to remove only the commas between the square brackets [], not every comma in the string.
My string is:
string result= "a,b,c,[c,d,e],f,g,[h,i,j]";
Expected output:
a,b,c,[cde],f,g,[hij]
Thanks in advance.
As I've written, you need a simple state machine (inside brackets, outside brackets)... Then for each character, you analyze it and if necessary you change the state of the state machine and decide if you need to output it or not.
public static string RemoveCommas(string str)
{
int bracketLevel = 0;
var sb = new StringBuilder(str.Length);
foreach (char ch in str)
{
switch (ch) {
case '[':
bracketLevel++;
sb.Append(ch);
break;
case ']':
if (bracketLevel > 0) {
bracketLevel--;
}
sb.Append(ch);
break;
case ',':
if (bracketLevel == 0) {
sb.Append(ch);
}
break;
default:
sb.Append(ch);
break;
}
}
return sb.ToString();
}
Use it like:
string result = "a,b,c,[c,d,e],f,g,[h,i,j]";
Console.WriteLine(RemoveCommas(result));
Note that to "save" the state of the state machine I'm using an int, so that it works with recursive brackets, like a,b,[c,d,[e,f]g,h]i,j
Just as an interesting exercise, it can be done with a slower LINQ expression:
string result2 = result.Aggregate(new
{
BracketLevel = 0,
Result = string.Empty,
}, (state, ch) => new {
BracketLevel = ch == '[' ?
state.BracketLevel + 1 :
ch == ']' && state.BracketLevel > 0 ?
state.BracketLevel - 1 :
state.BracketLevel,
Result = ch != ',' || state.BracketLevel == 0 ? state.Result + ch : state.Result
}).Result;
In the end the code is very similar: a state (the BracketLevel) is carried along, plus the string (Result) that is being built. Please don't use it; it is written only as an amusing piece of LINQ.
Regex approach
string stringValue = "a,b,c,[c,d,e],f,g,[h,i,j]";
var result = Regex.Replace(stringValue, @",(?![^\]]*(?:\[|$))", string.Empty);
This works if you don't have nested brackets.
You can try this:
var output = new string(result
.Where((s, index) => s != ',' ||
IsOutside(result.Substring(0, index)))
.ToArray()
);
//output: a,b,c,[cde],f,g,[hij]
And
private static bool IsOutside(string value)
{
return value.Count(i => i == '[') <= value.Count(i => i == ']');
}
But remember, this is not an efficient way of doing the job.

How to not include line breaks when comparing two strings

I am comparing updates to two strings. I did a:
string1 != string2
and they turn out different. I put them in "Add Watch" and I see the only difference is that one has line breaks and the other doesn't:
string1 = "This is a test. \nThis is a test";
string2 = "This is a test. This is a test";
I basically want to do a compare that doesn't include line breaks, so if a line break is the only difference, the strings should be considered equal.
A quick and dirty way, when performance isn't much of an issue:
string1.Replace("\n", "") != string2.Replace("\n", "")
I'd suggest regex to reduce every space, tab, \r, \n to a single space :
Regex.Replace(string1, @"\s+", " ") != Regex.Replace(string2, @"\s+", " ")
Assuming:
The sort of direct char-value-for-char-value comparison of != and == is what is wanted here, except for the matter of newlines.
The strings are, or may be, large enough or compared often enough to make just replacing "\n" with an empty string too inefficient.
Then:
public bool LinelessEquals(string x, string y)
{
//deal with quickly handlable cases quickly.
if(ReferenceEquals(x, y))//same instance
return true; // - generally happens often in real code,
//and is a fast check, so always worth doing first.
//We already know they aren't both null as
//ReferenceEquals(null, null) returns true.
if(x == null || y == null)
return false;
IEnumerator<char> eX = x.Where(c => c != '\n').GetEnumerator();
IEnumerator<char> eY = y.Where(c => c != '\n').GetEnumerator();
while(eX.MoveNext())
{
if(!eY.MoveNext()) //y is shorter
return false;
if(eX.Current != eY.Current)
return false;
}
return !eY.MoveNext(); //check if y was longer.
}
This is defined as equality rather than inequality, so you could easily adapt it to be an implementation of IEqualityComparer<string>.Equals. Your question for a linebreak-less string1 != string2 becomes: !LinelessEquals(string1, string2)
Here's an equality comparer for strings that ignores certain characters, such as \r and \n.
This implementation doesn't allocate any heap memory during execution, helping its performance. It also avoids virtual calls through IEnumerable and IEnumerator.
public sealed class SelectiveStringComparer : IEqualityComparer<string>
{
private readonly string _ignoreChars;
public SelectiveStringComparer(string ignoreChars = "\r\n")
{
_ignoreChars = ignoreChars;
}
public bool Equals(string x, string y)
{
if (ReferenceEquals(x, y))
return true;
if (x == null || y == null)
return false;
var ix = 0;
var iy = 0;
while (true)
{
while (ix < x.Length && _ignoreChars.IndexOf(x[ix]) != -1)
ix++;
while (iy < y.Length && _ignoreChars.IndexOf(y[iy]) != -1)
iy++;
if (ix >= x.Length)
return iy >= y.Length;
if (iy >= y.Length)
return false;
if (x[ix] != y[iy])
return false;
ix++;
iy++;
}
}
public int GetHashCode(string obj)
{
throw new NotSupportedException();
}
}
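Because GetHashCode throws here, the comparer as written cannot be used as a key comparer in a Dictionary or HashSet. If that is needed, a hash that skips the same ignored characters stays consistent with Equals. The following is a minimal sketch under my own naming (SelectiveHash.Compute is hypothetical, and the 17/31 constants are just a common choice, not something from the answer above):

```csharp
using System;

// Hypothetical helper; a comparer's GetHashCode could delegate to it.
public static class SelectiveHash
{
    // Hashes only the characters that are NOT ignored, so two strings that
    // are equal under the selective comparison produce the same hash.
    public static int Compute(string s, string ignoreChars = "\r\n")
    {
        if (s == null)
            return 0;

        unchecked // integer overflow is fine for hashing
        {
            int hash = 17;
            foreach (char c in s)
            {
                if (ignoreChars.IndexOf(c) != -1)
                    continue; // skip ignored characters
                hash = hash * 31 + c;
            }
            return hash;
        }
    }
}
```

Like the Equals method above, this walks the string once and allocates nothing.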
A cleaner approach would be to use:
string1.Replace(Environment.NewLine, String.Empty) != string2.Replace(Environment.NewLine, String.Empty);
This is a generalized and tested version of Jon Hanna's answer.
/// <summary>
/// Compares two character enumerables one character at a time, ignoring those specified.
/// </summary>
/// <param name="x"></param>
/// <param name="y"></param>
/// <param name="ignoreThese"> If not specified, the default is to ignore carriage return and line feed: {'\r', '\n'} </param>
/// <returns></returns>
public static bool EqualsIgnoreSome(this IEnumerable<char> x, IEnumerable<char> y, params char[] ignoreThese)
{
// First deal with quickly handlable cases quickly:
// Same instance - generally happens often in real code, and is a fast check, so always worth doing first.
if (ReferenceEquals(x, y))
return true;
// We already know they aren't both null as ReferenceEquals(null, null) returns true.
if (x == null || y == null)
return false;
// Default ignore is newlines:
if (ignoreThese == null || ignoreThese.Length == 0)
ignoreThese = new char[] { '\r', '\n' };
// Filters by specifying enumerator.
IEnumerator<char> eX = x.Where(c => !ignoreThese.Contains(c)).GetEnumerator();
IEnumerator<char> eY = y.Where(c => !ignoreThese.Contains(c)).GetEnumerator();
// Compares.
while (eX.MoveNext())
{
if (!eY.MoveNext()) //y is shorter
return false;
if (eX.Current != eY.Current)
return false;
}
return !eY.MoveNext(); //check if y was longer.
}
Can't you just strip out the line breaks before comparing the strings?
For example:
string1.Replace("\n", "") != string2.Replace("\n", "")
Here's a version in VB.NET based on Drew Noakes's answer:
Dim g_sIgnore As String = " " & vbNewLine & vbTab 'space, CR/LF and tab
Public Function StringCompareIgnoringWhitespace(s1 As String, s2 As String) As Boolean
Dim i1 As Integer = 0
Dim i2 As Integer = 0
Dim s1l As Integer = s1.Length
Dim s2l As Integer = s2.Length
Do
While i1 < s1l AndAlso g_sIgnore.IndexOf(s1(i1)) <> -1
i1 += 1
End While
While i2 < s2l AndAlso g_sIgnore.IndexOf(s2(i2)) <> -1
i2 += 1
End While
If i1 = s1l And i2 = s2l Then
Return True
Else
If i1 < s1l AndAlso i2 < s2l AndAlso s1(i1) = s2(i2) Then
i1 += 1
i2 += 1
Else
Return False
End If
End If
Loop
Return False
End Function
I also tested it with
Try
Debug.Assert(Not StringCompareIgnoringWhitespace("a", "z"))
Debug.Assert(Not StringCompareIgnoringWhitespace("aa", "zz"))
Debug.Assert(StringCompareIgnoringWhitespace("", ""))
Debug.Assert(StringCompareIgnoringWhitespace(" ", ""))
Debug.Assert(StringCompareIgnoringWhitespace("", " "))
Debug.Assert(StringCompareIgnoringWhitespace(" a", "a "))
Debug.Assert(StringCompareIgnoringWhitespace(" aa", "aa "))
Debug.Assert(StringCompareIgnoringWhitespace(" aa ", " aa "))
Debug.Assert(StringCompareIgnoringWhitespace(" aa a", " aa a"))
Debug.Assert(Not StringCompareIgnoringWhitespace("a", ""))
Debug.Assert(Not StringCompareIgnoringWhitespace("", "a"))
Debug.Assert(Not StringCompareIgnoringWhitespace("ccc", ""))
Debug.Assert(Not StringCompareIgnoringWhitespace("", "ccc"))
Catch ex As Exception
Console.WriteLine(ex.ToString)
End Try
I've run into this problem a number of times when I'm writing unit tests that need to compare multiple line expected strings with the actual output strings.
For example, if I'm writing a method that outputs a multi-line string I care about what each line looks like, but I don't care about the particular newline character used on a Windows or Mac machine.
In my case I just want to assert that each line is equal in my unit tests and bail out if one of them isn't.
public static void AssertAreLinesEqual(string expected, string actual)
{
using (var expectedReader = new StringReader(expected))
using (var actualReader = new StringReader(actual))
{
while (true)
{
var expectedLine = expectedReader.ReadLine();
var actualLine = actualReader.ReadLine();
Assert.AreEqual(expectedLine, actualLine);
if(expectedLine == null || actualLine == null)
break;
}
}
}
Of course, you could also make the method a little more generic and write it to return a bool instead.
public static bool AreLinesEqual(string expected, string actual)
{
using (var expectedReader = new StringReader(expected))
using (var actualReader = new StringReader(actual))
{
while (true)
{
var expectedLine = expectedReader.ReadLine();
var actualLine = actualReader.ReadLine();
if (expectedLine != actualLine)
return false;
if(expectedLine == null || actualLine == null)
break;
}
}
return true;
}
What surprises me most is that there isn't a method like this included in any unit testing framework I've used.
I've had this issue with line endings in a unit test (Java with JUnit in my case).
//compare files ignoring line ends
org.junit.Assert.assertEquals(
read.readPayload("myFile.xml")
.replace("\n", "")
.replace("\r", ""),
values.getFile()
.replace("\n", "")
.replace("\r", ""));
I usually don't like making this kind of comparison (comparing the whole file), since validating the individual fields would be a better approach. But it answers the question here, as it removes the line endings used on most systems (the replace calls are the trick).
PS: read.readPayload reads a text file from the resources folder and puts it into a String, and values is a structure that contains a String with the raw content of a file (as String) in its attributes.
PS2: No performance was considered, since it was just an ugly fix for unit test
