How to use regex to extract a string with multiple curly braces? - c#

I have this sample string
`{1:}{2:}{3:}{4:\r\n-}{5:}`
and I want to extract out only {4:\r\n-}
This is my code but it is not working.
var str = "{1:}{2:}{3:}{4:\r\n-}{5:}";
var regex = new Regex(@"{4:*.*-}");
var match = regex.Match(str);

You need to escape the special regex characters in the pattern: the opening and closing braces and, if the input contains a literal backslash followed by r and n rather than a real line break, the backslashes too. A verbatim string (@"...") keeps the escapes readable. This captures just that part:
var regex = new Regex(@"\{4:\\r\\n-\}");
... or, if you want anything up to and including the hyphen before the closing brace (which looks like what you are trying to do):
var regex = new Regex(@"\{4:[^-]*-\}");

You just need to escape the \r and \n sequences in your regular expression so that they are matched literally. The Regex.Escape() method does this for you: it returns a copy of the string in which regex metacharacters, such as the backslashes and the opening brace, are converted to their escaped form.
Working example: https://dotnetfiddle.net/6GLZrl
using System;
using System.Text.RegularExpressions;
public class Program
{
    public static void Main()
    {
        string str = @"{1:}{2:}{3:}{4:\r\n-}{5:}";
        string regex = @"{4:\r\n-}"; //Original regex
        Match m = Regex.Match(str, Regex.Escape(regex));
        if (m.Success)
        {
            Console.WriteLine("Found '{0}' at position {1}.", m.Value, m.Index);
        }
        else
        {
            Console.WriteLine("No match found");
        }
    }
}
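Whether the pattern needs `\\r\\n` or `\r\n` depends on what the input string actually holds. The following is a minimal sketch, not taken from either answer (the class and helper names are made up for illustration): a verbatim string contains the four literal characters `\r\n`, while a regular string literal contains a real carriage return and line feed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BraceExtractDemo
{
    // Hypothetical helper: returns the matched text, or an empty string if nothing matches.
    public static string Extract(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : string.Empty;
    }

    public static void Main()
    {
        string literal = @"{1:}{2:}{3:}{4:\r\n-}{5:}"; // backslash, r, backslash, n as plain text
        string control = "{1:}{2:}{3:}{4:\r\n-}{5:}";  // real CR and LF characters

        // Literal text: escape the braces and the backslashes.
        Console.WriteLine(Extract(literal, @"\{4:\\r\\n-\}"));

        // Same search, with the pattern produced by Regex.Escape.
        Console.WriteLine(Extract(literal, Regex.Escape(@"{4:\r\n-}")));

        // Real CR/LF: here \r and \n are regex escapes for those control characters.
        Console.WriteLine(Extract(control, @"\{4:\r\n-\}").Length > 0);
    }
}
```

If a pattern that looks right still finds nothing, checking which of the two forms the input uses is a reasonable first step.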

Related

How can I replace "XX,XXX" with "XX XXX"?

I need to replace a string like "XX,XXX" with "XX XXX". The string "XX,XXX" is embedded in another string, e.g.:
"-1299-5,"XXX,XXXX",trft,4,0,10800"
The string is read from a text file. I want to split it by ",", but the comma inside the quoted substring leads to the wrong result.
Each X represents a character. I think a regex can help; can anyone give me the right expression?
This expression,
(.*"[^,]*),([^,]*".*)
with a replacement of $1 $2 might work.
Example
using System;
using System.Text.RegularExpressions;
public class Example
{
    public static void Main()
    {
        string pattern = @"(.*""[^,]*),([^,]*"".*)";
        // .NET replacement strings reference groups as $1 and $2, not \1 and \2.
        string substitution = @"$1 $2";
        string input = @"-1299-5,""XXX,XXXX"",trft,4,0,10800";
        RegexOptions options = RegexOptions.Multiline;
        Regex regex = new Regex(pattern, options);
        string result = regex.Replace(input, substitution);
        Console.WriteLine(result);
    }
}
Simply use String.Replace to replace the character in your string.
var test = "XXX,XXXX";
var filtered = test.Replace(',', ' ');
Console.WriteLine(filtered);
Output :
XXX XXXX
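The regex answer above handles one quoted field per line, and String.Replace changes every comma it is given. A possible middle ground, sketched below under the assumption that fields are wrapped in plain double quotes with no escaped quotes inside them, is to match each quoted field and replace commas only within the match. The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotedCommaDemo
{
    // Replaces commas with spaces inside every "..." field and leaves
    // the separating commas untouched.
    public static string ReplaceCommasInQuotes(string line)
    {
        return Regex.Replace(line, "\"[^\"]*\"", m => m.Value.Replace(',', ' '));
    }

    public static void Main()
    {
        string input = "-1299-5,\"XXX,XXXX\",trft,4,0,10800";
        string cleaned = ReplaceCommasInQuotes(input);
        Console.WriteLine(cleaned);
        // After the replacement the line can be split on commas.
        Console.WriteLine(cleaned.Split(',').Length);
    }
}
```

For files that follow real CSV quoting rules (doubled quotes, embedded line breaks), a dedicated CSV parser is likely a safer choice than a regex.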

Regex - replacing chars in C# string in specific cases

I want to replace curly braces with square brackets in my input string, but only when there are no digits between them. I wrote this working sample:
string pattern = @"(\{[^0-9]*?\})";
MatchCollection matches = Regex.Matches(inputString, pattern);
if (matches != null)
{
    foreach (var match in matches)
    {
        string outdateMatch = match.ToString();
        string updateMatch = outdateMatch.Replace('{', '[').Replace('}', ']');
        inputString = inputString.Replace(outdateMatch, updateMatch);
    }
}
So for:
string inputString = "{0}/{something}/{1}/{other}/something"
The result will be:
inputString = "{0}/[something]/{1}/[other]/something"
Is it possible to do this in one line using the Regex.Replace() method?
You may use
var output = Regex.Replace(input, @"\{([^0-9{}]*)}", "[$1]");
Details
\{ - a { char
([^0-9{}]*) - Capturing group 1: 0 or more chars other than digits, { and }
} - a } char.
The replacement is [$1], the contents of Group 1 enclosed with square brackets.
Regex.Replace(input, @"\{0}/\{(.+)\}/\{1\}/\{(.+)}/(.+)", "{0}/[$1]/{1}/[$2]/$3")
Could you do this?
Regex.Replace(inputString, @"\{([^0-9]*)\}", "[$1]");
That is, capture the part between the braces (which must not contain digits), then return it wrapped in square brackets.
Not sure if this is exactly what you are after, but it seems to fit the question :)
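As a quick check of the one-line approach against the sample from the question, here is a small sketch using the `[^0-9{}]*` variant (the wrapper class and method are hypothetical names for illustration):

```csharp
using System;
using System.Text.RegularExpressions;

public static class BraceSwapDemo
{
    // Swaps {...} for [...] only when the braces enclose no digits
    // and no nested braces.
    public static string SwapBraces(string input)
    {
        return Regex.Replace(input, @"\{([^0-9{}]*)\}", "[$1]");
    }

    public static void Main()
    {
        // Prints: {0}/[something]/{1}/[other]/something
        Console.WriteLine(SwapBraces("{0}/{something}/{1}/{other}/something"));
    }
}
```

Excluding `{` and `}` from the character class keeps a single match from spanning two placeholders, which is the main difference from the plain `[^0-9]*` version.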

Regex to parse URL from an excel formula

I have a formula in Excel which, when read from C# code, looks like this:
"=HYPERLINK(CONCATENATE(\"https://abc.efghi.rtyui.com/#/wqeqwq/\",#REF!,\"/asdasd\"), \"View asdas\")"
I want to use regex to fetch the URL from this string, i.e.
https://abc.efghi.rtyui.com/#/wqeqwq/#REF!/asdasd
The url can be different but the format of the formula will remain the same.
"=HYPERLINK(CONCATENATE(\"{SOME_STRING}\",#REF!,\"{SOME_STRING}\"), \"View asdas\")"
Try it like this:
(?<=HYPERLINK\(CONCATENATE\(")[^"]+
The positive lookbehind lets us exclude the part in front of the URL from the full match.
If an arbitrary amount of whitespace can appear in between, add some \s* to the pattern; the = at the beginning of the string can be matched as an escaped \= as well.
Sample Code:
using System;
using System.Text.RegularExpressions;
public class Example
{
    public static void Main()
    {
        string pattern = @"(?<=HYPERLINK\(CONCATENATE\("")[^""]+";
        string input = @"=HYPERLINK(CONCATENATE(""https://abc.efghi.rtyui.com/#/wqeqwq/"",#REF!,""/asdasd""), ""View asdas"")";
        RegexOptions options = RegexOptions.Multiline;
        foreach (Match m in Regex.Matches(input, pattern, options))
        {
            Console.WriteLine("'{0}' found at index {1}.", m.Value, m.Index);
        }
    }
}
Addendum: Here is another technique that uses capturing groups and regex Replace to extract the resulting URL string (after CONCATENATE would have happened):
^\=HYPERLINK\(CONCATENATE\("([^"]+)",([^,]+),"([^"]+)".*$
string pattern = @"^\=HYPERLINK\(CONCATENATE\(""([^""]+)"",([^,]+),""([^""]+)"".*$";
string substitution = @"$1$2$3";
string input = @"=HYPERLINK(CONCATENATE(""https://abc.efghi.rtyui.com/#/wqeqwq/"",#REF!,""/asdasd""), ""View asdas"")";
Regex regex = new Regex(pattern);
string result = regex.Replace(input, substitution, 1);
You can extract the URL from the formula using capturing groups in regular expression as given below:
string inputString = "=HYPERLINK(CONCATENATE(\"https://abc.efghi.rtyui.com/#/wqeqwq/\",#REF!,\"/asdasd\"), \"View asdas\")";
string regex = "CONCATENATE\\(\"([\\S]+)\",#REF!,\"([\\S]+)\"\\)";
Regex substringRegex = new Regex(regex, RegexOptions.IgnoreCase);
Match substringMatch = substringRegex.Match(inputString);
if (substringMatch.Success)
{
    string url = substringMatch.Groups[1].Value + "#REF!" + substringMatch.Groups[2].Value;
}
I have defined two capturing groups in my regular expression. One for extracting part of the URL before #REF! and the other for extracting part of the URL after #REF!. Then I am concatenating all the extracted parts with #REF! to get the final URL.
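The capturing-group idea can also be written with named groups and without hard-coding #REF! as the middle argument. The sketch below is one possible variant, assuming the CONCATENATE call always has exactly three arguments with the first and last in double quotes; the helper name and group names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HyperlinkUrlDemo
{
    // Hypothetical helper: returns prefix + middle + suffix,
    // or an empty string when the formula does not match.
    public static string ExtractUrl(string formula)
    {
        Match m = Regex.Match(
            formula,
            @"CONCATENATE\(\s*""(?<prefix>[^""]*)""\s*,\s*(?<middle>[^,]+?)\s*,\s*""(?<suffix>[^""]*)""\s*\)",
            RegexOptions.IgnoreCase);
        if (!m.Success)
        {
            return string.Empty;
        }
        return m.Groups["prefix"].Value + m.Groups["middle"].Value + m.Groups["suffix"].Value;
    }

    public static void Main()
    {
        string input = @"=HYPERLINK(CONCATENATE(""https://abc.efghi.rtyui.com/#/wqeqwq/"",#REF!,""/asdasd""), ""View asdas"")";
        // Prints: https://abc.efghi.rtyui.com/#/wqeqwq/#REF!/asdasd
        Console.WriteLine(ExtractUrl(input));
    }
}
```

The optional `\s*` around the commas tolerates extra spaces between arguments, in line with the whitespace remark in the first answer.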

How to match names with slash in C# regex?

I have a long text which contains strings like these:
...
1.1SMITH/JOHN 2.1SMITH/SARA
...
1.1Parker/Sara/Amanda.CH07/Elizabeth.IN03
...
Is there a regular expression in C# that can match these names? The clue is to search for [A-Z] sequences that are separated by '/'.
You can try this:
[a-zA-Z\/]+
c# sample:
using System;
using System.Text.RegularExpressions;
public class Example
{
    public static void Main()
    {
        string pattern = @"[a-zA-Z\/]+";
        string input = @"...
1.1SMITH/JOHN 2.1SMITH/SARA
...
1.1Parker/Sara/Amanda.CH07/Elizabeth.IN03";
        foreach (Match m in Regex.Matches(input, pattern))
        {
            Console.WriteLine("'{0}' found at index {1}.", m.Value, m.Index);
        }
    }
}
You can use
[a-z\/]+
which matches any combination of letters and slashes.
Make sure you are matching case-insensitively.
var expression = new Regex(@"[a-z\/]+", RegexOptions.IgnoreCase);
var names = expression.Matches(theText);
Do you want to capture any [A-Za-z] sequence whose previous or next character is '/'?
Try this:
(?<=\/)[A-Za-z]+|[A-Za-z]+(?=\/)
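The character-class patterns above also match letter runs that contain no slash at all. If only slash-separated names are wanted, one option is to require at least one `/` between letter runs. This is a sketch under that assumption, with hypothetical class and method names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SlashNameDemo
{
    // Letters, then one or more "/letters" segments.
    public static List<string> FindNames(string text)
    {
        return Regex.Matches(text, @"[A-Za-z]+(?:/[A-Za-z]+)+")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToList();
    }

    public static void Main()
    {
        // Prints SMITH/JOHN and SMITH/SARA on separate lines.
        foreach (string name in FindNames("1.1SMITH/JOHN 2.1SMITH/SARA"))
        {
            Console.WriteLine(name);
        }
    }
}
```

A segment that follows digits, such as Elizabeth after CH07/ in the second sample line, is not picked up by this pattern, so it may need adjusting depending on which parts of that line count as names.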

Problems with regex in c# only returning a single match

I'm building a regex and I'm missing something, as it's not working properly.
My regex is meant to find everything of the form #anychars# and return all the matches in the sentence, not a single match.
Here are a few examples:
1- #_Title_# and #_Content_# should return two matches: #_Title_# and #_Content_#.
2- Product #_TemplateName_# #_Full_Product_Name_# more text. text text #_Short_Description_# should return 3 matches: #_TemplateName_# #_Full_Product_Name_# and #_Short_Description_#
and so on. Here is what my regex looks like: ^(.*#_.*_#.*)+$
Any thoughts on what I'm doing wrong?
Something as simple as:
#.*?#
Or:
#_.*?_#
if you are trying to match the underscores too (it wasn't clear in the original version of the question). Or:
#_(.*?)_#
which makes it easier to extract the token between your #_ and _# delimiters as a group.
Any of these should work. The *? is key: it's non-greedy. Otherwise you match everything between the first and last #.
So for example:
var str = "Product #_TemplateName_# #_Full_Product_Name_# more text. text text #_Short_Description_#";
var r = new Regex("#_(.*?)_#");
foreach (Match m in r.Matches(str))
{
    Console.WriteLine(m.Value + "\t" + m.Groups[1].Value);
}
Outputs:
#_TemplateName_#     TemplateName
#_Full_Product_Name_#    Full_Product_Name
#_Short_Description_#    Short_Description
Try this:
// Requires: using System.Linq;
string[] inputs = {
    "#Title# and #Content#",
    "Product #TemplateName# #_Full_Product_Name_# more text. text text #_Short_Description_#"
};
string pattern = "(?'string'#[^#]+#)";
foreach (string input in inputs)
{
    MatchCollection matches = Regex.Matches(input, pattern);
    Console.WriteLine(string.Join(",", matches.Cast<Match>().Select(x => x.Groups["string"].Value).ToArray()));
}
Console.ReadLine();
Your regular expression is not correct. In addition, you need to loop through the matches if you want all of them.
static void Main(string[] args)
{
    string input = "Product #_TemplateName_# #_Full_Product_Name_# more text. text text #_Short_Description_#",
        pattern = "#_[a-zA-Z_]*_#";
    Match match = Regex.Match(input, pattern);
    while (match.Success)
    {
        Console.WriteLine(match.Value);
        match = match.NextMatch();
    }
    Console.ReadLine();
}
Don't use anchors and change your regex to:
(#[^#]+#)
In regex the [^#] expression means any character BUT #
using System;
using System.Text.RegularExpressions;
public class Example
{
    public static void Main()
    {
        string pattern = @"(#[^#]+#)";
        Regex rgx = new Regex(pattern);
        string sentence = "#blah blah# asdfasdfaf #somethingelse#";
        foreach (Match match in rgx.Matches(sentence))
            Console.WriteLine("Found '{0}' at position {1}",
                match.Value, match.Index);
    }
}
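To see why the original pattern returns a single match, it can help to count matches for each variant on the same input. This is a small sketch with a hypothetical helper, not code from any of the answers:

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchCountDemo
{
    // Hypothetical helper: number of matches of pattern in input.
    public static int Count(string input, string pattern)
    {
        return Regex.Matches(input, pattern).Count;
    }

    public static void Main()
    {
        string input = "Product #_TemplateName_# #_Full_Product_Name_# more text. text text #_Short_Description_#";

        // Anchored with ^ and $: the whole line is consumed as one match.
        Console.WriteLine(Count(input, @"^(.*#_.*_#.*)+$")); // 1

        // Greedy .* runs from the first #_ to the last _#.
        Console.WriteLine(Count(input, @"#_.*_#"));          // 1

        // Lazy .*? stops at the nearest closing _#.
        Console.WriteLine(Count(input, @"#_(.*?)_#"));       // 3

        // A negated class cannot run past the next #.
        Console.WriteLine(Count(input, @"#[^#]+#"));         // 3
    }
}
```

The lazy quantifier and the negated character class reach the same count here; the negated class does so without backtracking over the rest of the line.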
