Regex to find all placeholder occurrences in text

I'm struggling to create a Regex that finds all placeholder occurrences in a given text. Placeholders will have the following format:
[{PRE.Word1.Word2}]
Rules:
Delimited by "[{PRE." and "}]" ("PRE" upper case)
2 words (at least 1 char long each) separated by a dot. All chars valid on each word apart from newline.
word1: min 1 char, max 15 chars
word2: min 1 char, max 64 chars
word1 cannot contain dots; if there are more than 2 dots inside the placeholder, the extra ones are part of word2. With fewer than 2 dots, the placeholder is invalid.
Looking to get all valid placeholders regardless of what the 2 words are.
I'm not being lazy; I just spent a horrible amount of time building the rule on regexr.com, but was unable to satisfy all these rules.
Looking forward to checking your suggestions.
The closest I got was the pattern below, and any attempt to expand on it breaks all valid matches.
\[\{OEP\.*\.*\}\]
Much appreciated!
Sample text where Regex should find matches:
Random text here
[{Test}] -- NO MATCH
[{PRE.TestTest3}] --NO MATCH
[{PRE.TooLong.12345678901234567890}] --NO MATCH
[{PRE.Address.Country}] --MATCH
[{PRE.Version.1.0}] --MATCH
Random text here

You can use
\[{PRE\.([^][{}.]{1,15})\.(.{1,64}?)}]
Details
\[{ - a [{ string
PRE\. - PRE. text
([^][{}.]{1,15}) - Group 1: any one to fifteen chars other than [, ], {, } and .
\. - a dot
(.{1,64}?) - any one to 64 chars other than line break chars as few as possible
}] - a }] text.
If you need to get all matches in C#, you can use
var pattern = @"\[{PRE\.([^][{}.]{1,15})\.(.{1,64}?)}]";
var matches = Regex.Matches(text, pattern);
See this C# demo:
using System;
using System.Collections;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public class Test
{
    public static void Main()
    {
        var text = "[{PRE.Word1.Word2}] and [{PRE.Word 3.Word..... 2 %%%}]";
        var pattern = @"\[{PRE\.([^][{}.]{1,15})\.(.{1,64}?)}]";
        var matches = Regex.Matches(text, pattern);
        // Collect both captured words from every match
        var props = new List<Property>();
        foreach (Match m in matches)
            props.Add(new Property(m.Groups[1].Value, m.Groups[2].Value));
        foreach (var item in props)
            Console.WriteLine("Word1 = " + item.Word1 + ", Word2 = " + item.Word2);
    }

    public class Property
    {
        public string Word1 { get; set; }
        public string Word2 { get; set; }
        public Property()
        {}
        public Property(string w1, string w2)
        {
            this.Word1 = w1;
            this.Word2 = w2;
        }
    }
}
Output:
Word1 = Word1, Word2 = Word2
Word1 = Word 3, Word2 = Word..... 2 %%%
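As a rough check, the sketch below runs the same pattern over four of the sample lines from the question. It is written as top-level statements (C# 9 or later); the FindPlaceholders helper and the samples array are names made up for this illustration, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the (word1, word2) pair of every placeholder found in text.
static List<(string Word1, string Word2)> FindPlaceholders(string text)
{
    var pattern = @"\[{PRE\.([^][{}.]{1,15})\.(.{1,64}?)}]";
    var result = new List<(string Word1, string Word2)>();
    foreach (Match m in Regex.Matches(text, pattern))
        result.Add((m.Groups[1].Value, m.Groups[2].Value));
    return result;
}

// Sample lines taken from the question.
var samples = new[]
{
    "[{Test}]",
    "[{PRE.TestTest3}]",
    "[{PRE.Address.Country}]",
    "[{PRE.Version.1.0}]",
};

foreach (var sample in samples)
{
    var found = FindPlaceholders(sample);
    Console.WriteLine(found.Count == 0
        ? sample + " -> no match"
        : sample + " -> Word1 = " + found[0].Word1 + ", Word2 = " + found[0].Word2);
}
```

The first two samples should report no match, and the last one should yield Word1 = Version and Word2 = 1.0, since the extra dot ends up in the second group.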

string input = "[{PRE.Word1.Word2}]";
// language=regex
string pattern = @"\[{ PRE \. (?'group1' .{1,15}? ) \. (?'group2' .{1,64}? ) }]";
var match = Regex.Match(input, pattern, RegexOptions.IgnorePatternWhitespace);
Console.WriteLine(match.Groups["group1"].Value);
Console.WriteLine(match.Groups["group2"].Value);

Related

Get particular parts from a string

I'm trying to get particular parts from a string. I have to get the part which starts after '#' and contains only letters from the Latin alphabet.
I suppose that I have to create a regex pattern, but I don't know how.
string test = "PQ#Alderaa1:30000!A!->20000";
var planet = "Alderaa"; //what I want to get
string test2 = "#Cantonica:3000!D!->4000NM";
var planet2 = "Cantonica";
There are some other parts which I have to get, but I will try to get them myself. (starts after ':' and is an Integer; may be "A" (attack) or "D" (destruction) and must be surrounded by "!" (exclamation mark); starts after "->" and should be an Integer)

You could get the separate parts using capturing groups:
#([a-zA-Z]+)[^:]*:(\d+)!([AD])!->(\d+)
That will match:
#([a-zA-Z]+) Match # and capture in group 1 1+ times a-zA-Z
[^:]*: Match 0+ times not a : using a negated character class, then match a : (If what follows could be only optional digits, you might also match 0+ times a digit [0-9]*)
(\d+) Capture in group 2 1+ digits
!([AD])! Match !, capture A or D in group 3, then match !
->(\d+) Match -> and capture in group 4 1+ digits
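A minimal sketch of reading those four groups in C#, written as top-level statements (C# 9 or later). The ParseParts helper is a hypothetical name introduced only for this example; it returns an empty array when the input does not match.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the four captured parts, or an empty array if there is no match.
static string[] ParseParts(string input)
{
    Match m = Regex.Match(input, @"#([a-zA-Z]+)[^:]*:(\d+)!([AD])!->(\d+)");
    if (!m.Success) return Array.Empty<string>();
    return new[] { m.Groups[1].Value, m.Groups[2].Value, m.Groups[3].Value, m.Groups[4].Value };
}

// The two inputs from the question.
foreach (var input in new[] { "PQ#Alderaa1:30000!A!->20000", "#Cantonica:3000!D!->4000NM" })
    Console.WriteLine(string.Join(" | ", ParseParts(input)));
// Alderaa | 30000 | A | 20000
// Cantonica | 3000 | D | 4000
```

Note that the trailing "NM" in the second input is simply left unmatched, because the last group only takes digits.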

You can use this regex. It uses a positive lookbehind to ensure the matched text is preceded by #, captures one or more letters with [a-zA-Z]+, and uses a positive lookahead to ensure the match is followed by some optional text, a colon, one or more digits, a !, either A or D, and another !.
(?<=#)[a-zA-Z]+(?=[^:]*:\d+![AD]!)
C# code demo
string test = "PQ#Alderaa1:30000!A!->20000";
Match m1 = Regex.Match(test, @"(?<=#)[a-zA-Z]+(?=[^:]*:\d+![AD]!)");
Console.WriteLine(m1.Groups[0].Value);
test = "#Cantonica:3000!D!";
m1 = Regex.Match(test, @"(?<=#)[a-zA-Z]+(?=[^:]*:\d+![AD]!)");
Console.WriteLine(m1.Groups[0].Value);
Prints,
Alderaa
Cantonica

You already have some good answers, but I would like to add a new one to show named capturing groups.
You can create a class for your planets like
class Planet
{
    public string Name;
    public int Value1; // name is not clear from context
    public string Category; // as above: rename it
    public string Value2; // same problem
}
Now you can use regex with named groups
#(?<name>[a-z]+)[^:]*:(?<value1>\d+)!(?<category>[^!]+)!->(?<value2>[\da-z]+)
Usage:
var input = new[]
{
    "PQ#Alderaa1:30000!A!->20000",
    "#Cantonica:3000!D!->4000NM",
};
var regex = new Regex("#(?<name>[a-z]+)[^:]*:(?<value1>\\d+)!(?<category>[^!]+)!->(?<value2>[\\da-z]+)",
    RegexOptions.IgnoreCase | RegexOptions.Compiled);
var planets = input
    .Select(p => regex.Match(p))
    .Select(m => new Planet
    {
        Name = m.Groups["name"].Value, // here and below, each part of the input string is accessed by group name
        Value1 = int.Parse(m.Groups["value1"].Value),
        Category = m.Groups["category"].Value,
        Value2 = m.Groups["value2"].Value
    })
    .ToList();

How to match a specific sentence with Regex

I'm new to Regex and I couldn't cope with matching this sort of sentence: Band Name #Venue 30 450, where the digits at the end represent price and quantity.
string input = "Band Name #City 25 3500";
Match m = Regex.Match(input, @"^[A-Za-z]+\s+[A-Za-z]+\s+[\d+]+\s+[\d+]$");
if (m.Success)
{
Console.WriteLine("Success!");
}

You can use Regex and leverage named groups. This makes it easier to extract the data later if you need it. An example:
string pattern = @"(Band) (?<Band>[A-Za-z ]+) (?<City>#[A-Za-z ]+) (?<Price>\d+) (?<Quantity>\d+)";
string input = "Band Name #City 25 3500";
Match match = Regex.Match(input, pattern);
Console.WriteLine(match.Groups["Band"].Value);
Console.WriteLine(match.Groups["City"].Value.TrimStart('#'));
Console.WriteLine(match.Groups["Price"].Value);
Console.WriteLine(match.Groups["Quantity"].Value);
If you look at the pattern, there are a few regex groups named with ?<GroupName>. It is just a basic example, which can be tweaked to fulfill your actual needs.

This one should work:
[A-Za-z ]+ [A-Za-z ]+ #[A-Za-z ]+ \d+ \d+
With your code it'd be:
string input = "Band Name #City 25 3500";
Match m = Regex.Match(input, @"[A-Za-z ]+ [A-Za-z ]+ #[A-Za-z ]+ \d+ \d+");
if (m.Success)
{
Console.WriteLine("Success!");
}

Here is an older, more elaborate way (1st way):
string txt = "Band Name #City 25 3500"; // sample input from the question
string re1 = ".*?"; // Here the part before #
string re2 = "(#)"; // The literal # character
string re3 = "((?:[a-z][a-z]+))"; // Word 1, here city
string re4 = "(\\s+)"; // White Space 1
string re5 = "(\\d+)"; // Integer Number 1, here 25
string re6 = "(\\s+)"; // White Space 2
string re7 = "(\\d+)"; // Integer Number 2, here 3500
Regex r = new Regex(re1 + re2 + re3 + re4 + re5 + re6 + re7, RegexOptions.IgnoreCase | RegexOptions.Singleline);
Match m = r.Match(txt);
if (m.Success)
{
    String c1 = m.Groups[1].ToString();
    String word1 = m.Groups[2].ToString();
    String ws1 = m.Groups[3].ToString();
    String int1 = m.Groups[4].ToString();
    String ws2 = m.Groups[5].ToString();
    String int2 = m.Groups[6].ToString();
    Console.Write("(" + c1 + ")" + "(" + word1 + ")" + "(" + ws1 + ")" + "(" + int1 + ")" + "(" + ws2 + ")" + "(" + int2 + ")" + "\n");
}
With this approach you can store each value separately. For example, Groups[6] holds 3500, or whatever value appears at that position in this format.
In short, the other answers are right (2nd way): just create the regex with
@"([A-Za-z ]+) ([A-Za-z ]+) #([A-Za-z ]+) (\d+) (\d+)"
and match it against any string in this format.

That is the answer to what I was trying to do:
string input = "Band Name #Location 25 3500";
Match m = Regex.Match(input, @"([A-Za-z ]+) (#[A-Za-z ]+) (\d+) (\d+)");
if (m.Success)
{
Console.WriteLine("Success!");
}

regex to strip number from var in string

I have a long string and I have a var inside it
var abc = '123456'
Now I wish to get the 123456 from it.
I have tried a regex but it's not working properly
Regex regex = new Regex("(?<abc>+)=(?<var>+)");
Match m = regex.Match(body);
if (m.Success)
{
string key = m.Groups["var"].Value;
}
How can I get the number from the var abc?
Thanks for your help and time

var body = @" fsd fsda f var abc = '123456' fsda fasd f";
Regex regex = new Regex(@"var (?<name>\w*) = '(?<number>\d*)'");
Match m = regex.Match(body);
Console.WriteLine("name: " + m.Groups["name"]);
Console.WriteLine("number: " + m.Groups["number"]);
prints:
name: abc
number: 123456

Your regex is not correct:
(?<abc>+)=(?<var>+)
The + signs are quantifiers, meaning that the preceding character is repeated at least once (and here there is no preceding character, since (?<...>...) is a named capture group and is not itself considered a character).
You perhaps meant:
(?<abc>.+)=(?<var>.+)
And a better regex might be:
(?<abc>[^=]+)=\s*'(?<var>[^']+)'
[^=]+ will match any character except an equal sign.
\s* means any number of space characters (will also match tabs, newlines and form feeds though)
[^']+ will match any character except a single quote.
To specifically match the variable abc, you then put it like this:
(?<abc>abc)\s*=\s*'(?<var>[^']+)'
(I added some more allowances for spaces)
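For illustration, here is a short sketch of using that last pattern from C#, written as top-level statements (C# 9 or later). The ExtractAbc helper name and the sample body string are assumptions made for this example; the helper returns an empty string when nothing matches.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the quoted value assigned to abc, or "" if it is not found.
static string ExtractAbc(string body)
{
    Match m = Regex.Match(body, @"(?<abc>abc)\s*=\s*'(?<var>[^']+)'");
    return m.Success ? m.Groups["var"].Value : string.Empty;
}

Console.WriteLine(ExtractAbc("fsd fsda f var abc = '123456' fsda fasd f")); // 123456
Console.WriteLine(ExtractAbc("var abc='42'"));                              // 42
```

Because the pattern uses [^']+ rather than \d+, it returns whatever sits between the quotes, so the caller would still need int.TryParse if a number is required.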

From the example you provided the number can be gotten such as
Console.WriteLine(
    Regex.Match("var abc = '123456'", @"(?<var>\d+)").Groups["var"].Value); // 123456
\d+ means 1 or more numbers (digits).
But I surmise your data doesn't look like your example.

Try this:
var body = @"my word 1, my word 2, my word var abc = '123456' 3, my word x";
Regex regex = new Regex(@"(?<=var \w+ = ')\d+");
Match m = regex.Match(body);

How to grab specific elements out of a string

I need to be able to grab specific elements out of a string that start and end with curly brackets. If I had a string:
"asjfaieprnv{1}oiuwehern{0}oaiwefn"
How could I grab just the 1 followed by the 0.

Regex is very useful for this.
What you want to match is:
\{ # a curly bracket
# - we need to escape this with \ as it is a special character in regex
[^}] # then anything that is not a curly bracket
# - this is a 'negated character class'
+ # (at least one time)
\} # then a closing curly bracket
# - this also needs to be escaped as it is special
We can collapse this to one line:
\{[^}]+\}
Next, you can capture and extract the inner contents by surrounding the part you want to extract with parentheses to form a group:
\{([^}]+)\}
In C# you'd do:
var matches = Regex.Matches(input, @"\{([^}]+)\}");
foreach (Match match in matches)
{
    var groupContents = match.Groups[1].Value;
}
Group 0 is the whole match (in this case including the { and }), group 1 the first parenthesized part, and so on.
A full example:
var input = "asjfaieprnv{1}oiuwehern{0}oaiwef";
var matches = Regex.Matches(input, @"\{([^}]+)\}");
foreach (Match match in matches)
{
    var groupContents = match.Groups[1].Value;
    Console.WriteLine(groupContents);
}
Outputs:
1
0

Use the IndexOf method:
int openBracePos = yourstring.IndexOf("{");
int closeBracePos = yourstring.IndexOf("}");
// take only the text between the braces
string stringIWant = yourstring.Substring(openBracePos + 1, closeBracePos - openBracePos - 1);
That will get your first occurrence. You need to slice your string so that the first occurrence is no longer there, then repeat the above procedure to find your 2nd occurrence:
yourstring = yourstring.Substring(closeBracePos + 1);
Note: curly braces do not need escaping in an ordinary C# string literal such as "{".
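The find-then-repeat idea can also be sketched as a loop that passes a start index to IndexOf instead of slicing the string each time. This is written as top-level statements (C# 9 or later), and the BetweenBraces helper is a hypothetical name used only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: collects the text between each "{" and the next "}" without regex.
static List<string> BetweenBraces(string s)
{
    var result = new List<string>();
    int open = s.IndexOf('{');
    while (open >= 0)
    {
        // Look for the closing brace only after the opening one.
        int close = s.IndexOf('}', open + 1);
        if (close < 0) break; // unmatched "{": stop
        result.Add(s.Substring(open + 1, close - open - 1));
        // Continue searching after the brace pair just consumed.
        open = s.IndexOf('{', close + 1);
    }
    return result;
}

Console.WriteLine(string.Join(", ", BetweenBraces("asjfaieprnv{1}oiuwehern{0}oaiwefn"))); // 1, 0
```

Unlike the \{([^}]+)\} pattern, this version also returns an empty string for an empty pair such as {}.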

This looks like a job for regular expressions
using System;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string str = "asjfaieprnv{1}oiuwe{}hern{0}oaiwefn";
            // Lazy .*? stops at the first closing brace, so {} yields an empty string
            Regex regex = new Regex(@"\{(.*?)\}");
            foreach (Match match in regex.Matches(str))
            {
                Console.WriteLine(match.Groups[1].Value);
            }
        }
    }
}

Get specific numbers from string

In my current project I have to work a lot with substrings, and I'm wondering if there is an easier way to get numbers out of a string.
Example:
I have a string like this:
12 text text 7 text
I want to be able to get the first number set or the second number set.
So if I ask for number set 1 I will get 12 in return and if I ask for number set 2 I will get 7 in return.
Thanks!

This will create an array of integers from the string:
using System.Linq;
using System.Text.RegularExpressions;

class Program {
    static void Main() {
        string text = "12 text text 7 text";
        int[] numbers = (from Match m in Regex.Matches(text, @"\d+") select int.Parse(m.Value)).ToArray();
    }
}

Try using regular expressions: you can match [0-9]+, which matches any run of digits within your string. The C# code to use this regex is roughly as follows:
Match match = Regex.Match(input, "[0-9]+");
// Here we check the Match instance.
if (match.Success)
{
    // the pattern has no capturing groups, so the first match is read from match.Value
    string value = match.Value;
}
You will of course still have to parse the returned strings.

Looks like a good match for Regex.
The basic regular expression would be \d+ to match on (one or more digits).
You would iterate through the Matches collection returned from Regex.Matches and parse each returned match in turn.
var matches = Regex.Matches(input, @"\d+");
foreach (Match match in matches)
{
    myIntList.Add(int.Parse(match.Value));
}

You could use regex:
Regex regex = new Regex(@"^[0-9]+$");

You can split the string into parts using string.Split, and then traverse the list with a foreach, applying int.TryParse, something like this:
string test = "12 text text 7 text";
var numbers = new List<int>();
int i;
foreach (string s in test.Split(' '))
{
    if (int.TryParse(s, out i)) numbers.Add(i);
}
Now numbers has the list of valid values
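To answer the "number set 1 / number set 2" part of the question directly, the regex approach can be wrapped in a small lookup. This is only a sketch, written as top-level statements (C# 9 or later); GetNumberSet is a hypothetical name, the index is assumed to be 1-based as in the question, and null is returned when there are fewer number sets than requested.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the n-th run of digits (1-based), or null if there is none.
static int? GetNumberSet(string text, int n)
{
    MatchCollection matches = Regex.Matches(text, @"\d+");
    if (n < 1 || n > matches.Count) return null;
    // int.Parse throws OverflowException for digit runs that do not fit in an int.
    return int.Parse(matches[n - 1].Value);
}

Console.WriteLine(GetNumberSet("12 text text 7 text", 1)); // 12
Console.WriteLine(GetNumberSet("12 text text 7 text", 2)); // 7
```

Unlike the Split approach, this also picks up digits that are attached to other characters, such as the 7 in "text7".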
