How to Regex replace match group item with method result - c#

The input string is something like this:
LineA: 50
LineB: 120
LineA: 12
LineB: 53
I would like to replace the LineB values with a result of MultiplyCalculatorMethod(LineAValue), where LineAValue is the value of the line above LineB and MultiplyCalculatorMethod is my other, complicated C# method.
In semi-code, I would like to do something like this:
int MultiplyCalculatorMethod(int value)
{
return 2 * Math.Max(3,value);
}
string ReplaceValues(string Input)
{
Matches mat = Regex.Match(LineA:input_value\r\nLineB:output_value)
foreach (Match m in mat)
{
m.output_value = MultiplyCalculatorMethod(m.input_value)
}
return m.OutputText;
}
Example:
string Text = "LineA:5\r\nLineB:2\r\nLineA:2\r\nLineB:7";
string Result = ReplaceValues(Text);
//Result = "LineA:5\r\nLineB:10\r\nLineA:2\r\nLineB:6";
I wrote a Regex.Match to match LineA: value\r\nLineB: value and get these values in groups. But when I use Regex.Replace, I can only provide a "static" result that is combining groups from the match, but I can not use C# methods there.
So my question is: how can I use Regex.Replace so that the replacement is the result of a C# method whose input is the LineA value?

You can use a MatchEvaluator like this:
public static class Program
{
public static void Main()
{
string input = "LineA:5\r\nLineB:2\r\nLineA:2\r\nLineB:7";
string output = Regex.Replace(input, @"LineA:(?<input_value>\d+)\r\nLineB:\d+", new MatchEvaluator(MatchEvaluator));
Console.WriteLine(output);
}
private static string MatchEvaluator(Match m)
{
int inputValue = Convert.ToInt32(m.Groups["input_value"].Value);
int outputValue = MultiplyCalculatorMethod(inputValue);
return string.Format("LineA:{0}\r\nLineB:{1}", inputValue, outputValue);
}
static int MultiplyCalculatorMethod(int value)
{
return 2 * Math.Max(3, value);
}
}

Try using the following Replace overload:
public static string Replace( string input, string pattern, MatchEvaluator evaluator);
MatchEvaluator has access to Match contents and can call any other methods to return the replacement string.
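The evaluator does not have to be a named method; a lambda converts to MatchEvaluator as well. The following is a minimal sketch rather than a definitive implementation: MultiplyCalculatorMethod is copied from the question, while the class name LineReplacer and the group name head are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineReplacer
{
    // Stand-in for the asker's "complicated" method (taken from the question).
    public static int MultiplyCalculatorMethod(int value)
    {
        return 2 * Math.Max(3, value);
    }

    public static string ReplaceValues(string input)
    {
        // "head" (hypothetical group name) captures the LineA line plus the
        // "LineB:" label so it can be emitted unchanged; only the LineB
        // number is recomputed from the LineA value.
        return Regex.Replace(
            input,
            @"(?<head>LineA:(?<input_value>\d+)\r\nLineB:)\d+",
            m => m.Groups["head"].Value
                 + MultiplyCalculatorMethod(int.Parse(m.Groups["input_value"].Value)));
    }
}
```

Capturing the unchanged part in a group avoids having to rebuild the "LineA:...LineB:" text by hand inside the evaluator.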

Related

What is the best practice to resolve placeholders in a plain text?

I need to resolve a huge load of placeholders (about 250) in a plain text.
A placeholder is defined as %ThisIsAPlaceholder%, an example would be %EmailSender%.
Now it gets a bit tricky: the code should handle case-insensitive placeholders too. So %EmailSender%, %EMAILSENDER% and %emailsender% are the same placeholder. I think that's where it gets complicated.
My first approach was something like:
public string ResolvePlaceholders(string text)
{
var placeholders = new List<string>
{
"%EmailSender%",
"%ErrorMessage%",
"%ActiveUser%"
};
var resolvedText = text;
foreach(var placeholder in placeholders)
{
if(!resolvedText.Contains(placeholder))
continue;
var value = GetValueByPlaceholder(placeholder);
resolvedText = resolvedText.Replace(placeholder, value);
}
return resolvedText;
}
But, as you may notice, I can't handle case-insensitive placeholders.
Also, I check every placeholder (whether or not it is used in the text). With more than 200 placeholders in a text of about 10,000 words, I don't think this solution is very fast.
How can this be solved in a better way? A solution that supports case insensitive placeholders would be appreciated.
A really basic but efficient replacement scheme for your case would be something like this:
private readonly static Regex regex = new Regex("%(?<name>.+?)%");
private static string Replace(string input, ISet<string> replacements)
{
string result = regex.Replace(input, m => {
string name = m.Groups["name"].Value;
string value;
if (replacements.Contains(name))
{
return GetValueByPlaceholder(name);
}
else
{
return m.Captures[0].Value;
}
});
return result;
}
public static void Main(string[] args)
{
var replacements = new HashSet<string>(StringComparer.CurrentCultureIgnoreCase)
{
"EmailSender", "ErrorMessage", "ActiveUser"
};
string text = "Hello %ACTIVEUSER%, There is a message from %emailsender%. %errorMessage%";
string result = Replace(text, replacements);
Console.WriteLine(result);
}
It will use a regular expression to go through the input text once. Note that we are getting case-insensitive comparisons via the equality comparer passed to the HashSet that we constructed in Main. Any unrecognized items will be ignored. For more general cases, the Replace method could take a dictionary:
private static string Replace(string input, IDictionary<string, string> replacements)
{
string result = regex.Replace(input, m => {
string name = m.Groups["name"].Value;
string value;
if (replacements.TryGetValue(name, out value))
{
return value;
}
else
{
return m.Captures[0].Value;
}
});
return result;
}
A typical recommendation when matching using quantifiers on input from an untrusted source (e.g. users over the internet) is to specify a match timeout for the regular expression. You would have to catch the RegexMatchTimeoutException that is thrown and do something in that case.
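As a rough sketch of that recommendation: the Regex constructor has an overload that takes a TimeSpan match timeout. The class name PlaceholderResolver, the lookup delegate, the one-second limit, and the fallback of returning the text unresolved are all assumptions made for illustration, not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PlaceholderResolver
{
    // The 1-second limit is an arbitrary value chosen for illustration.
    private static readonly Regex PlaceholderRegex =
        new Regex("%(?<name>.+?)%", RegexOptions.None, TimeSpan.FromSeconds(1));

    // lookup (hypothetical) returns the value for a placeholder name,
    // or null when the name is unknown.
    public static string Resolve(string input, Func<string, string> lookup)
    {
        try
        {
            // Unknown placeholders are left in the text as they were.
            return PlaceholderRegex.Replace(
                input, m => lookup(m.Groups["name"].Value) ?? m.Value);
        }
        catch (RegexMatchTimeoutException)
        {
            // Assumed fallback policy: give up and return the text unresolved.
            return input;
        }
    }
}
```

Whether to return the input, log, or rethrow on a timeout depends on the application; the point is only that the exception has to be handled somewhere.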
Regex solution
private static string ReplaceCaseInsensitive(string input, string search, string replacement)
{
string result = Regex.Replace(
input,
Regex.Escape(search),
replacement.Replace("$","$$"),
RegexOptions.IgnoreCase
);
return result;
}
Non regex solution
public static string Replace(this string str, string old, string @new, StringComparison comparison)
{
@new = @new ?? "";
if (string.IsNullOrEmpty(str) || string.IsNullOrEmpty(old) || old.Equals(@new, comparison))
return str;
int foundAt = 0;
// Use the caller's comparison and resume searching after each inserted replacement,
// so a replacement that contains the search text cannot loop forever.
while ((foundAt = str.IndexOf(old, foundAt, comparison)) != -1)
{
str = str.Remove(foundAt, old.Length).Insert(foundAt, @new);
foundAt += @new.Length;
}
return str;
}
Seems like a duplicate question / answer
String.Replace ignoring case

Combine TrimStart and TrimEnd for a String

I have a string with some letters and numbers. Here is an example:
OG000134W4.11
I have to trim all the first letters and the first zeros to get this :
134W4.11
I also need to cut everything from the first letter encountered after that, to finally retrieve:
134
I know I can do this with more than one "trim" but I want to know if there was an efficient way to do that.
Thanks.
If you don't want to use regex, then LINQ is your friend:
[Test]
public void TrimTest()
{
var str = "OG000134W4.11";
var ret = new string(str.SkipWhile(x => char.IsLetter(x) || x == '0').TakeWhile(x => !char.IsLetter(x)).ToArray());
Assert.AreEqual("134", ret);
}
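Here is the regex I would use (the leading ^[^1-9]* consumes the letters and zeros in front, so that Replace removes them as well):
^[^1-9]*([1-9][0-9]*).*$
Here is some C# code you could try:
var input = "OG000134W4.11";
var result = new Regex(@"^[^1-9]*([1-9][0-9]*).*$").Replace(input, "$1"); // "134"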
using System;
using System.Text.RegularExpressions;
namespace regex
{
class MainClass
{
public static void Main (string[] args)
{
string result = matchTest("OG000134W4.11");
Console.WriteLine(result);
}
public static string matchTest (string input)
{
Regex rx = new Regex(@"([1-9][0-9]+)\w*[0-9]*\.[0-9]*");
Match match = rx.Match(input);
if (match.Success){
return match.Groups[1].Value;
}else{
return string.Empty;
}
}
}
}

Extracting Numbers From Between String [duplicate]

This question already has answers here:
what's the quickest way to extract a 5 digit number from a string in c#
(8 answers)
Closed 9 years ago.
What's the best way to extract the number part from this string? I looked at RegularExpressions but they confuse the hell out of me. Is it possible with SubString?
/store/457987680928164?id=2
All I require is the numbers.
RegEx is a good way to go with this problem, but if you're set on using SubString...
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string test = "/store/457987680928164?id=2";
int start = test.IndexOfAny("0123456789".ToCharArray());
int end = test.IndexOf("?");
Console.WriteLine(test.Substring(start, end - start));
Console.ReadLine();
}
}
}
You should get RadExpression Designer and teach yourself RegEx's with the cheat sheets.
If RegEx's frighten you, simply do it in a loop which will often be quicker:
string s = "/store/457987680928164?id=2";
string numericInput = string.Empty;
foreach(char c in s)
{
if (char.IsDigit(c))
numericInput += c;
}
I'd prefer Regex, but this approach, which doesn't use Substring, ran faster than Regex in my examples:
static long GrabFirstLongFromString(string input)
{
string intAsString = String.Empty;
bool startedInt = false;
foreach(char c in input)
{
if (Char.IsDigit(c))
{
startedInt = true; //really only care about the first digit
intAsString += c;
}
else if (startedInt)
return long.Parse(intAsString);
}
return startedInt ? long.Parse(intAsString) : -1; // the input may end with the digits; otherwise default to -1, since a valid result is never negative
}
This does it with regular expressions
private static Regex digitsOnly = new Regex(@"[^\d]");
public static string RemoveNonNumbers(string input)
{
return digitsOnly.Replace(input, "");
}
Or just simple Regex:
Regex r = new Regex(@"\d+");
MatchCollection m = r.Matches("/store/457987680928164?id=2");
if (m.Count > 1) // both m[0] and m[1] are used below
{
Console.WriteLine(string.Format("Big number: {0} - Little number: {1}", m[0], m[1]));
}
The above prints:
Big number: 457987680928164 - Little number: 2
I would definitely recommend using RegEx for this. String pattern matching and extraction is really an ideal scenario for Regular Expressions.
Here is a RegEx that will match the string sample you provided, with capturing parenthesis for the numeric parts of the String:
^/store/(\d+)\?id=(\d+)
You can verify it here in the Regex Tester. I tested it using your sample String and the RegEx I wrote above.
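To show how the two capture groups might be read in C#, here is a small sketch. The class name StoreUrlParser and the TryParse shape are illustrative assumptions; only the pattern comes from the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StoreUrlParser
{
    // Pattern from the answer: group 1 is the store number, group 2 the id.
    private static readonly Regex StoreUrl = new Regex(@"^/store/(\d+)\?id=(\d+)");

    // Returns false (and empty strings) when the text does not have the expected shape.
    public static bool TryParse(string url, out string storeNumber, out string id)
    {
        Match m = StoreUrl.Match(url);
        storeNumber = m.Success ? m.Groups[1].Value : string.Empty;
        id = m.Success ? m.Groups[2].Value : string.Empty;
        return m.Success;
    }
}
```

The values are kept as strings on purpose; whether the store number fits into a long is a separate decision for the caller.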
I came up with these extension methods to simplify common string parsing tasks:
private static string Substring(string str, string value, bool isLastValue, bool isAfterValue, StringComparison comparisonType, string defaultValue)
{
int pos = isLastValue ? str.LastIndexOf(value, comparisonType) : str.IndexOf(value, comparisonType);
if (pos == -1) return defaultValue;
return isAfterValue ? str.Substring(pos + value.Length) : str.Substring(0, pos);
}
public static string SubstringBeforeFirst(this string str, string value, StringComparison comparisonType = StringComparison.CurrentCulture, string defaultValue = "")
{
return Substring(str, value, false, false, comparisonType, defaultValue);
}
public static string SubstringBeforeLast(this string str, string value, StringComparison comparisonType = StringComparison.CurrentCulture, string defaultValue = "")
{
return Substring(str, value, true, false, comparisonType, defaultValue);
}
public static string SubstringAfterFirst(this string str, string value, StringComparison comparisonType = StringComparison.CurrentCulture, string defaultValue = "")
{
return Substring(str, value, false, true, comparisonType, defaultValue);
}
public static string SubstringAfterLast(this string str, string value, StringComparison comparisonType = StringComparison.CurrentCulture, string defaultValue = "")
{
return Substring(str, value, true, true, comparisonType, defaultValue);
}
As for getting the number from your example:
string s = "/store/457987680928164?id=2";
string number = s.SubstringAfterLast("/").SubstringBeforeFirst("?");
YARS (Yet Another Regex Solution)
The following:
var str = "/store/457987680928164?id=2";
var regex = new Regex(@"\d+");
foreach (Match match in regex.Matches(str))
{
Console.WriteLine(match.Value);
}
outputs this:
457987680928164
2
string str = "/store/457987680928164?id=2";
string num = new string(str.Remove(str.IndexOf("?")).Where(a => char.IsDigit(a)).ToArray());

Insert in regex expression

I have following string:
10-5*tan(40)-cos(0)-40*sin(90);
I have extracted the math functions and calculated their values:
tan(40) = 1.42;
cos(0) = 1;
sin(90) = 0;
I want to insert these values back into the expression string as:
10-5*(1.42)-(1)-40*(0);
Please assist
I would use Regex.Replace and then use a custom MatchEvaluator to convert your values and insert these, check:
http://msdn.microsoft.com/en-us/library/cft8645c(v=vs.110).aspx
Which would look something like:
class Program
{
static string ConvertMathFunc(Match m)
{
Console.WriteLine(m.Groups["mathfunc"]);
Console.WriteLine(m.Groups["argument"]);
double arg;
if (!double.TryParse(m.Groups["argument"].Value, out arg))
throw new Exception(String.Format("Math function argument '{0}' could not be parsed to double", m.Groups["argument"].Value));
switch (m.Groups["mathfunc"].Value)
{
case "tan": return Math.Tan(arg).ToString();
case "cos": return Math.Cos(arg).ToString();
case "sin": return Math.Sin(arg).ToString();
default:
throw new Exception(String.Format("Unknown math function '{0}'", m.Groups["mathfunc"].Value));
}
}
static void Main(string[] args)
{
string input = "10 - 5 * tan(40) - cos(0) - 40 * sin(90);";
Regex pattern = new Regex(@"(?<mathfunc>(tan|cos|sin))\((?<argument>[0-9]+)\)");
string output = pattern.Replace(input, new MatchEvaluator(Program.ConvertMathFunc));
Console.WriteLine(output);
}
}
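The question asks for each value to be wrapped in parentheses, and the values were already computed elsewhere. A variation along those lines might look like the sketch below. The names MathFuncInliner and Inline and the evaluate delegate are hypothetical; the delegate stands in for however the values are actually calculated. Formatting with the invariant culture is my own addition, so that the decimal separator stays a dot regardless of machine settings.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class MathFuncInliner
{
    private static readonly Regex FuncCall =
        new Regex(@"(?<mathfunc>tan|cos|sin)\((?<argument>[0-9]+)\)");

    // evaluate (hypothetical) receives the function name and its argument
    // and returns the value to insert.
    public static string Inline(string expression, Func<string, double, double> evaluate)
    {
        return FuncCall.Replace(expression, m =>
        {
            double arg = double.Parse(m.Groups["argument"].Value, CultureInfo.InvariantCulture);
            double value = evaluate(m.Groups["mathfunc"].Value, arg);
            // Wrap the result in parentheses, as in the desired output.
            return "(" + value.ToString(CultureInfo.InvariantCulture) + ")";
        });
    }
}
```

Passing the calculation in as a delegate keeps the regex code independent of whether the arguments are meant as degrees or radians.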

Replace named group in regex with value

I want to use a regular expression the same way as string.Format. Let me explain.
I have:
string pattern = "^(?<PREFIX>abc_)(?<ID>[0-9])+(?<POSTFIX>_def)$";
string input = "abc_123_def";
Regex regex = new Regex(pattern, RegexOptions.IgnoreCase);
string replacement = "456";
Console.WriteLine(regex.Replace(input, string.Format("${{PREFIX}}{0}${{POSTFIX}}", replacement)));
This works, but I must provide "input" to regex.Replace, and I do not want that. I want to use the pattern for matching, but also for creating strings the same way as with string.Format, replacing the named group "ID" with a value. Is that possible?
I'm looking for something like:
string pattern = "^(?<PREFIX>abc_)(?<ID>[0-9])+(?<POSTFIX>_def)$";
string result = ReplaceWithFormat(pattern, "ID", 999);
Result will contain "abc_999_def". How to accomplish this?
Yes, it is possible:
public static class RegexExtensions
{
public static string Replace(this string input, Regex regex, string groupName, string replacement)
{
return regex.Replace(input, m =>
{
return ReplaceNamedGroup(input, groupName, replacement, m);
});
}
private static string ReplaceNamedGroup(string input, string groupName, string replacement, Match m)
{
string capture = m.Value;
capture = capture.Remove(m.Groups[groupName].Index - m.Index, m.Groups[groupName].Length);
capture = capture.Insert(m.Groups[groupName].Index - m.Index, replacement);
return capture;
}
}
Usage:
Regex regex = new Regex("^(?<PREFIX>abc_)(?<ID>[0-9]+)(?<POSTFIX>_def)$");
string oldValue = "abc_123_def";
var result = oldValue.Replace(regex, "ID", "456");
Result is: abc_456_def
No, it's not possible to use a regular expression without providing input. It has to have something to work with, the pattern can not add any data to the result, everything has to come from the input or the replacement.
Instead of using String.Format, you can use a lookbehind and a lookahead to specify the part between "abc_" and "_def", and replace it:
string result = Regex.Replace(input, @"(?<=abc_)\d+(?=_def)", "999");
There was a problem in user1817787's answer, and I had to modify the ReplaceNamedGroup function as follows.
private static string ReplaceNamedGroup(string input, string groupName, string replacement, Match m)
{
string capture = m.Value;
capture = capture.Remove(m.Groups[groupName].Index - m.Index, m.Groups[groupName].Length);
capture = capture.Insert(m.Groups[groupName].Index - m.Index, replacement);
return capture;
}
Another edited version of the original method by @user1817787. This one supports multiple instances of the named group, and it also includes a fix similar to the one @Justin posted (it returns the result using {match.Index, match.Length} instead of {0, input.Length}):
public static string ReplaceNamedGroup(
string input, string groupName, string replacement, Match match)
{
var sb = new StringBuilder(input);
var matchStart = match.Index;
var matchLength = match.Length;
var captures = match.Groups[groupName].Captures.OfType<Capture>()
.OrderByDescending(c => c.Index);
foreach (var capt in captures)
{
if (capt == null)
continue;
matchLength += replacement.Length - capt.Length;
sb.Remove(capt.Index, capt.Length);
sb.Insert(capt.Index, replacement);
}
var end = matchStart + matchLength;
sb.Remove(end, sb.Length - end);
sb.Remove(0, matchStart);
return sb.ToString();
}
I shortened ReplaceNamedGroup, still supporting multiple captures.
private static string ReplaceNamedGroup(string input, string groupName, string replacement, Match m)
{
string result = m.Value;
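// Replace from the last capture to the first so that earlier indexes stay valid (needs System.Linq).
foreach (Capture cap in m.Groups[groupName].Captures.Cast<Capture>().OrderByDescending(c => c.Index))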
{
result = result.Remove(cap.Index - m.Index, cap.Length);
result = result.Insert(cap.Index - m.Index, replacement);
}
return result;
}
The simple solution is to refer to the matched groups in replacement. So the Prefix is $1 and Postfix is $3.
I haven't tested the code below, but it should work similarly to a regex I've written recently:
string pattern = "^(?<PREFIX>abc_)(?<ID>[0-9])+(?<POSTFIX>_def)$";
string input = "abc_123_def";
Regex regex = new Regex(pattern, RegexOptions.IgnoreCase);
string replacement = String.Format("${{1}}{0}${{3}}", "456"); // "${1}456${3}"; the braces keep the group number from running into the digits that follow
Console.WriteLine(regex.Replace(input, replacement));
In case this helps anyone, I enhanced the answer with the ability to replace multiple named capture groups in one go, which this answer helped massively to achieve.
public static class RegexExtensions
{
public static string Replace(this string input, Regex regex, Dictionary<string, string> captureGroupReplacements)
{
string temp = input;
foreach (var key in captureGroupReplacements.Keys)
{
temp = regex.Replace(temp, m =>
{
return ReplaceNamedGroup(key, captureGroupReplacements[key], m);
});
}
return temp;
}
private static string ReplaceNamedGroup(string groupName, string replacement, Match m)
{
string capture = m.Value;
capture = capture.Remove(m.Groups[groupName].Index - m.Index, m.Groups[groupName].Length);
capture = capture.Insert(m.Groups[groupName].Index - m.Index, replacement);
return capture;
}
}
Usage:
var regex = new Regex(@"C={BasePath:""(?<basePath>[^\""].*)"",ResultHeadersPath:""ResultHeaders"",CORS:(?<cors>true|false)");
content = content.Replace(regex, new Dictionary<string, string>
{
{ "basePath", "www.google.com" },
{ "cors", "false" }
});
All credit should go to user1817787 for this one.
You should check the documentation about RegEx replace here
I created this to replace a named group. I cannot use a solution that loops over all group names, because I have cases where not every part of the expression is in a group.
public static string ReplaceNamedGroup(this Regex regex, string input, string namedGroup, string value)
{
var replacement = Regex.Replace(regex.ToString(),
@"((?<GroupPrefix>\(\?)\<(?<GroupName>\w*)\>(?<Eval>.[^\)]+)(?<GroupPostfix>\)))",
@"${${GroupName}}").TrimStart('^').TrimEnd('$');
replacement = replacement.Replace("${" + namedGroup + "}", value);
return Regex.Replace(input, regex.ToString(), replacement);
}
