Substitutions in Regular Expressions, and Replacement pattern - c#

I have spent 4 hours on this and it is still not clear to me how it should work.
I want to use the logic from this link. I want to transform
Some123Grouping TO GroupingSome123
I have 3 parts and need to change their order using a replacement pattern ($1, $2, $3).
I also need something to transform
name@gmail.com TO name
It is not clear to me how to define the replacement, or what is captured in my case.
Thanks for any help, I would really appreciate it.

$1, $2, etc. refer to capturing groups, numbered in the order in which their opening parentheses appear in the pattern. So you need to define groups in your regex, which you do with parentheses. For example:
Regex.Replace("Some123Grouping", @"(Some)(123)(Grouping)", @"$3$1$2")
yields "GroupingSome123".
Note that for better readability, groups can also be named and then referenced by their name. For example:
Regex.Replace("mr.smith@gmail.com", @"(?<name>.*)(@gmail\.com)", @"${name}")
yields "mr.smith".
BTW, if you are looking for a general (non .NET specific but great) introduction to Regexes, I recommend Regular-Expressions.info.
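As a rough sketch of how the named-group form could be applied to the reordering case: the class and method names (`ReorderDemo`, `Reorder`) and the group names are made up for illustration, and the pattern assumes the input is exactly letters, then digits, then letters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReorderDemo
{
    // Hypothetical helper: moves the trailing letters in front of the
    // leading letters and digits. Input that does not have the shape
    // letters-digits-letters is returned unchanged.
    public static string Reorder(string input)
    {
        return Regex.Replace(
            input,
            @"^(?<head>[A-Za-z]+)(?<digits>\d+)(?<tail>[A-Za-z]+)$",
            "${tail}${head}${digits}");
    }

    public static void Main()
    {
        Console.WriteLine(Reorder("Some123Grouping")); // GroupingSome123
        Console.WriteLine(Reorder("nodigits"));        // nodigits (no match, unchanged)
    }
}
```

The braces also work with numbers (`${1}`), which helps when a group reference is directly followed by a literal digit.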

Simply using your requirement yields
Regex.Replace("name@gmail.com", @"(name)(@gmail.com)", @"$1")
but I suspect what you want is more along the lines of
Regex.Replace("name@gmail.com", @"(\w*)(@.*)", @"$1")

If I understood correctly, the pattern is text followed by numbers followed by text. If so, this should match it:
string pattern = @"([A-Za-z]+)(\d+)([A-Za-z]+)";
The next step is getting the groups out of it:
Regex rx = new Regex(pattern);
var match = rx.Match(input);
Then your result may be obtained in 2 ways, the short version:
result = rx.Replace(input, "$3$1$2");
And the long version:
using System;
using System.Text.RegularExpressions;
public class Program
{
    public static void Main()
    {
        string input = "Some123Grouping";
        string pattern = @"([A-Za-z]+)(\d+)([A-Za-z]+)";
        Regex rx = new Regex(pattern);
        var match = rx.Match(input);
        // Groups[0] is the whole match, so it is not counted here
        Console.WriteLine("{0} groups found in:\n {1}",
            match.Groups.Count - 1,
            input);
        // Reassemble in the order 3, 1, 2 to get "GroupingSome123"
        var newInput = "";
        foreach (int i in new[] { 3, 1, 2 })
        {
            newInput += match.Groups[i].Value;
        }
        Console.WriteLine(newInput);
    }
}
Regarding your second issue, it seems to be as simple as:
var result = "name@gmail.com".Split('@')[0];
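If a regex is preferred for the second case, one possible sketch is to capture everything before the first `@` and read the group from the match. `LocalPartDemo` and `LocalPart` are hypothetical names, and returning the input unchanged when there is no `@` is an assumption about the desired behaviour.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LocalPartDemo
{
    // Hypothetical helper: returns the text before the first '@',
    // or the input itself when no '@' is present.
    public static string LocalPart(string address)
    {
        Match m = Regex.Match(address, @"^(?<name>[^@]+)@");
        return m.Success ? m.Groups["name"].Value : address;
    }

    public static void Main()
    {
        Console.WriteLine(LocalPart("name@gmail.com"));     // name
        Console.WriteLine(LocalPart("mr.smith@gmail.com")); // mr.smith
    }
}
```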

Related

How to use grouped regex content from one file to another?

How do I get the matched regex group value from one file and paste it into a different file?
I've tried something like this:
var doc=File.ReadAllText(@"D:\Project\12345\database\xyz.txt");
Regex r=new Regex(@"<ttl>(\w+)</ttl>");
Match m=r.Match(doc);
string gr=m.Groups[1].Value;
File.WriteAllText(@"E:\Final\12345\2017\xyz.txt", File.ReadAllText(@"E:\Final\12345\2017\123.txt").Replace("<ce-title>[^<]+</ce-title>","<ce-title>"+gr+"</ce-title>"));
Console.WriteLine("Done");
Console.ReadLine();
But it does not work for some reason, and I can't figure out what is wrong.
I'm basically trying to get the content inside the first <ttl> element of one file and paste that value into another file's <ce-title> element using regex.
NOTE: I'm aware that this can be done using xml/html parsing techniques but I want to know how I can do this simple thing using regex.
Can anyone help me on this?
You are using String.Replace() rather than Regex.Replace.
Re-write your code as follows:
var doc = File.ReadAllText(@"D:\Project\12345\database\xyz.txt");
var r = new Regex(@"<ttl>(\w+)</ttl>");
Match m = r.Match(doc);
if (m.Success)
{
    var gr = m.Groups[1].Value;
    var rx = new Regex("<ce-title>[^<]+</ce-title>");
    File.WriteAllText(@"E:\Final\12345\2017\xyz.txt",
        rx.Replace(
            File.ReadAllText(@"E:\Final\12345\2017\123.txt"), // Input
            string.Format("<ce-title>{0}</ce-title>", gr),    // Replacement
            1                                                 // Number of occurrences
        )
    );
}
Console.WriteLine("Done");
Console.ReadLine();
Since gr only consists of word chars, it is safe to use string.Format("<ce-title>{0}</ce-title>", gr) as a replacement. Otherwise, if gr may contain arbitrary characters, you need to escape dollar signs so they are not read as substitutions: string.Format("<ce-title>{0}</ce-title>", gr.Replace("$", "$$")).
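Another way to sidestep the `$` escaping, sketched here under assumptions, is the `Regex.Replace` overload that takes a `MatchEvaluator`: the string returned by the delegate is inserted literally, so substitution syntax in it is not interpreted. `TitleSwapDemo` and `ReplaceFirstTitle` are hypothetical names, and the input strings are invented sample data rather than the files from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TitleSwapDemo
{
    private static readonly Regex CeTitle = new Regex("<ce-title>[^<]+</ce-title>");

    // Hypothetical helper: replaces only the first <ce-title> element.
    // The lambda's return value is used as-is, so a "$" in the title
    // does not need to be doubled.
    public static string ReplaceFirstTitle(string target, string title)
    {
        return CeTitle.Replace(target, m => "<ce-title>" + title + "</ce-title>", 1);
    }

    public static void Main()
    {
        string target = "<ce-title>old</ce-title><ce-title>keep</ce-title>";
        Console.WriteLine(ReplaceFirstTitle(target, "Cost: $1"));
        // <ce-title>Cost: $1</ce-title><ce-title>keep</ce-title>
    }
}
```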

Parsing a list of functions and their parameters from a string

I have a string which contains some functions (I know their names) and their parameters like this:
translate(700 210) rotate(-30)
I would like to parse each one of them into a string array starting with the function name followed by the parameters.
I don't know much about regex and so far I got this:
MatchCollection matches = Regex.Matches(attribute.InnerText, @"((translate|rotate|scale|matrix)\s*\(\s*(-?\d+\s*\,*\s*)+\))*");
for (int i = 0; i < matches.Count; i++)
{
Console.WriteLine(matches[i].Value);
}
What this returns is:
translate(700 210)
[blank space]
rotate(-30)
[blank space]
This works for me because I can run another regular expression on each row of the resulting collection and get the contents. What I don't understand is why blank rows are returned between the methods.
Also, is running a regex twice (once to separate the methods and once to actually parse them) a good approach?
Thanks!
Regex.Matches will match your entire regular expression multiple times. It finds one match for the whole thing, then finds the next match for the whole thing.
The outermost parens with * indicate that you're willing to accept zero or more of the preceding group's contents as a match. So when it finds none of them, it happily returns that. That is not your intent. You want exactly one.
The blanks are harmless, but "zero or more" also includes two. Consider this string, with no space between the two functions:
var text = "translate(700 210)rotate(-30)";
That's one match, according to the regex you provided. You'll get "rotate" and "-30". If the missing space is an error, detect it and warn the user. If you're not going to do that, parse it correctly.
So let's get rid of the outermost parens and that *. We'll also name the capturing groups, for readability.
var matches = Regex.Matches(text, @"(?<funcName>translate|rotate|scale|matrix)\s*\(\s*(?<param>-?\s*\d+\s*\,*\s*)+\)");
foreach (Match match in matches)
{
    if (match.Groups["funcName"].Success)
    {
        var funcName = match.Groups["funcName"].Value;
        // Value holds only the last parameter the repeated group captured
        var param = Int32.Parse(match.Groups["param"].Value);
        Console.WriteLine($"{funcName}( {param} )");
    }
}
I also stuck in \s* after the optional -, just in case.
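A repeated group such as `param` keeps every value it matched in its `Captures` collection, while `Value` only returns the last one. The following is a minimal sketch of collecting the function name plus all parameters into a string array, as the question asks. `TransformParser` and `Parse` are hypothetical names, and the pattern is a simplified variant that assumes integer parameters separated by whitespace or a single comma.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TransformParser
{
    // The inner named group captures just the number; separators stay
    // outside it so each capture is a clean token.
    private static readonly Regex FunctionCall = new Regex(
        @"(?<funcName>translate|rotate|scale|matrix)\s*\(\s*(?:(?<param>-?\d+)\s*,?\s*)+\)");

    // Returns one array per function call: [name, param1, param2, ...]
    public static List<string[]> Parse(string text)
    {
        var result = new List<string[]>();
        foreach (Match match in FunctionCall.Matches(text))
        {
            var parts = new List<string> { match.Groups["funcName"].Value };
            parts.AddRange(match.Groups["param"].Captures.Cast<Capture>().Select(c => c.Value));
            result.Add(parts.ToArray());
        }
        return result;
    }

    public static void Main()
    {
        foreach (string[] call in Parse("translate(700 210) rotate(-30)"))
        {
            Console.WriteLine(string.Join(" ", call));
        }
        // translate 700 210
        // rotate -30
    }
}
```

With this approach a single pass is enough, so the second regex per row is not needed.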
I like using Regex with a dictionary
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
namespace ConsoleApplication56
{
    class Program
    {
        static void Main(string[] args)
        {
            Dictionary<string, string> dict = new Dictionary<string, string>();
            string input = "translate(700 210) rotate(-30)";
            string pattern = @"(?'command'[^\(]+)\((?'value'[^\)]+)\)";
            MatchCollection matches = Regex.Matches(input, pattern);
            foreach (Match match in matches.Cast<Match>())
            {
                // Trim: for every call after the first, the command capture starts with the separating space
                dict.Add(match.Groups["command"].Value.Trim(), match.Groups["value"].Value);
            }
        }
    }
}

Regular Expression to match the pattern

I am looking for a regular expression pattern to find data between $< and >$.
string pattern = "\b\$<[^>]*>\$";
is not working.
Thanks,
You can make use of a tempered greedy token:
\$<(?:(?!\$<|>\$)[\s\S])*>\$
See demo
This way, you will match only the closest boundaries.
Your regex does not match because you do not allow > in-between your markers, and you are using \b where you most probably do not have a word boundary.
If you do not want to get the delimiters in the output, use capturing group:
\$<((?:(?!\$<|>\$)[\s\S])*)>\$
   ^                      ^
And the result will be in Group 1.
In C#, you should consider declaring all regex patterns (whenever possible) as verbatim string literals (with @"") because you won't have to worry about doubling backslashes:
var rx = new Regex(@"\$<(?:(?!\$<|>\$)[\s\S])*>\$");
Or, since there is a singleline flag (and this is preferable):
var rx = new Regex(@"\$<((?:(?!\$<|>\$).)*)>\$", RegexOptions.Singleline | RegexOptions.CultureInvariant);
var res = rx.Matches(text).Cast<Match>().Select(p => p.Groups[1].Value).ToList(); // needs using System.Linq
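To illustrate what the tempered greedy token buys, here is a small sketch that extracts the content of every marker, including one whose content contains a lone `>`. `MarkerDemo`, `Extract` and the sample text are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class MarkerDemo
{
    // Each character is consumed only if neither "$<" nor ">$" starts at
    // that position, so a match always ends at the closest ">$".
    private static readonly Regex Marker = new Regex(
        @"\$<((?:(?!\$<|>\$).)*)>\$",
        RegexOptions.Singleline | RegexOptions.CultureInvariant);

    public static List<string> Extract(string text)
    {
        return Marker.Matches(text)
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)
            .ToList();
    }

    public static void Main()
    {
        foreach (string value in Extract("a $<one>$ b $<x > y>$ c"))
        {
            Console.WriteLine(value);
        }
        // one
        // x > y
    }
}
```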
This pattern will do the work:
(?<=\$<).*(?=>\$)
Demo: https://regex101.com/r/oY6mO2/1
To find this pattern in PHP, you can use the following regex (the dollar signs must be escaped, otherwise they are treated as end-of-line anchors):
/\$<(.*?)>\$/s
For example:
$arrayWhichStoreKeyValueArrayOfYourPattern = array();
preg_match_all('/\$<(.*?)>\$/s',
    $yourcontentinwhichyoufind,
    $arrayWhichStoreKeyValueArrayOfYourPattern);
// Start from the original text and apply each replacement to the running result
$content = $yourcontentinwhichyoufind;
for ($i = 0; $i < count($arrayWhichStoreKeyValueArrayOfYourPattern[0]); $i++)
{
    $content =
        str_replace(
            $arrayWhichStoreKeyValueArrayOfYourPattern[0][$i],
            constant($arrayWhichStoreKeyValueArrayOfYourPattern[1][$i]),
            $content);
}
Using this example, each marker found in $yourcontentinwhichyoufind is replaced by the value of the constant with the same name, and the result ends up in $content.
For example, suppose you have a string like this, whose marker names match declared constants.
**global.php**
//in this file my constant declared.
define("MYNAME","Hiren Raiyani");
define("CONSTANT_VAL","contant value");
**demo.php**
$content="Hello this is $<MYNAME>$ and this is simple demo to replace $<CONSTANT_VAL>$";
$myarray = array();
preg_match_all('/\$<(.*?)>\$/s', $content, $myarray);
for($i=0;$i<count($myarray[0]);$i++)
{
$content=str_replace(
$myarray[0][$i],
constant($myarray[1][$i]),
$content);
}
As far as I know, that's all.

Regex that returns a list

I have a string that I am looking up that can have two possible values:
stuff 1
grouped stuff 1-3
I am not very familiar with using regex, but I know it can be very powerful when used correctly, so forgive me if this question sounds ridiculous in any way. I was wondering if it would be possible to have some sort of regex code that leaves only the numbers of my string (in this case 1 and 1-3). For the 1-3 example, perhaps I could return the 1 and the 3 separately and pass them into a function to get the values in between.
I hope I am making sense. It is hard to put what I am looking for into words. If anyone needs any further clarification I would be more than happy to answer questions/edit my own question.
To create a list of numbers in string y, use the following:
var listOfNumbers = Regex.Matches(y, @"\d+")
.OfType<Match>()
.Select(m => m.Value)
.ToList();
This is entirely possible, but best done with two separate regexes, say SingleRegex and RangedRegex: check for one or the other, and pass the values into a function when the match comes from RangedRegex.
As long as you're checking for "numbers in a specific place", extra numbers won't confuse your algorithm. There are also several regex testers out there; a simple Google search will give you an interface for checking syntax and matches.
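As a sketch of the "get the in between" step: note that this swaps the two separate regexes suggested above for a single pattern with an optional second number, which is one possible simplification rather than the approach described. `RangeDemo` and `Expand` are hypothetical names, only the first number or range in the string is considered, and the numbers are assumed to fit in an `int`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RangeDemo
{
    // Matches "N" or "N-M"; the "end" group only succeeds for a range.
    private static readonly Regex NumberOrRange =
        new Regex(@"(?<start>\d+)(?:-(?<end>\d+))?");

    // Returns every number from start to end inclusive,
    // or a single-element list when there is no range.
    public static List<int> Expand(string text)
    {
        Match m = NumberOrRange.Match(text);
        if (!m.Success)
        {
            return new List<int>();
        }
        int start = int.Parse(m.Groups["start"].Value);
        int end = m.Groups["end"].Success ? int.Parse(m.Groups["end"].Value) : start;
        // A reversed range such as "3-1" yields an empty list here.
        return Enumerable.Range(start, Math.Max(0, end - start + 1)).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Expand("stuff 1")));           // 1
        Console.WriteLine(string.Join(",", Expand("grouped stuff 1-3"))); // 1,2,3
    }
}
```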
Are you just wanting to loop through all of the numbers in the string?
Here's one way you can loop through each match of a regular expression.
using System;
using System.Text.RegularExpressions;
public class Program
{
    public static void Main()
    {
        Regex r = new Regex(@"\d+");
        string s = "grouped stuff 1-3";
        Match m = r.Match(s);
        while (m.Success)
        {
            string matchText = m.Groups[0].Value;
            Console.WriteLine(matchText);
            m = m.NextMatch();
        }
    }
}
This outputs
1
3

Regex replacing inside of

Well, I have this code:
StreamReader sr = new StreamReader(@"main.cl", true);
String str = sr.ReadToEnd();
Regex r = new Regex(@"&");
string[] line = r.Split(str);
foreach (string val in line)
{
    string Change = val.Replace("puts","System.Console.WriteLine()");
    Console.Write(Change);
}
As you can see, I'm trying to replace puts (content) with Console.WriteLine(content), but that would need regular expressions and I haven't found a good article on how to do this.
Basically, taking * as the value that is coming, I'd like to do this:
string Change = val.Replace("puts *","System.Console.WriteLine(*)");
Then, if I receive:
puts "Hello World";
I want to get:
System.Console.WriteLine("Hello World");
You need to use Regex.Replace to capture part of the input by using a capturing group and include the captured match into the output. Example:
Regex.Replace(
"puts 'foo'", // input
"puts (.*)", // .* means "any number of characters"
"System.Console.WriteLine($1)") // $1 stands for whatever (.*) matched
If the input always ends in a semicolon you would want to move that semicolon outside the WriteLine parens. One way to do that is:
Regex.Replace(
"puts 'foo';", // input
"puts (.*);", // ; outside parens -- now it's not captured
"System.Console.WriteLine($1);") // manually adding the fixed ; at the end
If you intend to adapt these examples it's a good idea to consult a technical reference first; you can find a very good one here.
What you want to do is look at Grouping Expressions. Give the following a try
Regex.Replace(val, "puts (.*);", "System.Console.WriteLine(${1});");
Note that you can also name your groups, as opposed to using their indexes for replacement. You can do this like so:
Regex.Replace(val, "puts (?<str>.*);", "System.Console.WriteLine(${str});");
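Putting the pieces together, a hedged sketch of a helper that rewrites every `puts ...;` statement in a piece of source text might look like this. `PutsDemo` and `Translate` are made-up names, and the pattern assumes the argument itself never contains a semicolon (for example inside a string literal).

```csharp
using System;
using System.Text.RegularExpressions;

public static class PutsDemo
{
    // [^;]* stops at the first semicolon, so several statements on one
    // line are each rewritten separately (unlike a greedy .*).
    public static string Translate(string source)
    {
        return Regex.Replace(
            source,
            @"\bputs\s+(?<arg>[^;]*);",
            "System.Console.WriteLine(${arg});");
    }

    public static void Main()
    {
        Console.WriteLine(Translate("puts \"Hello World\";"));
        // System.Console.WriteLine("Hello World");
    }
}
```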
