Regex to split words containing brackets based on group [closed] - c#

I have the following terms, which are treated as group keywords:
Create, Set and Delete.
For the below input
Create(Apple | Banana(Tree) | Mango (Tree) ) | Delete(Guava)|Set(Orange(Tree))
the expected split should be as follows
Create(Apple | Banana(Tree) | Mango (Tree) )
Delete(Guava)
Set(Orange(Tree))
I could come up with the following regex which is not giving the correct split.
(Create|Set|Delete)\(.*\)\s*\|

What if you use:
\s*\|\s*(?=\b(?:Create|Set|Delete)\b)
See the online demo
\s*\|\s* - A literal pipe-symbol surrounded by zero or more spaces (greedy).
(?= - Positive lookahead:
\b - Word-boundary.
(?: - Open non-capturing group:
Create|Set|Delete - Match either of these alternatives literally.
) - Close non-capturing group.
\b - Word-boundary.
) - Close positive lookahead.
Note: Just add the other "Associate" and "Disassociate" as alternatives as per your own attempt.
In c# code:
using System;
using System.Text.RegularExpressions;
public class Example
{
    public static void Main()
    {
        string pattern = @"\s*\|\s*(?=\b(?:Create|Set|Delete)\b)";
        string input = "Create(Apple | Banana(Tree) | Mango (Tree) ) | Delete(Guava)|Set(Orange(Tree))";
        string[] result = Regex.Split(input, pattern,
                                      RegexOptions.IgnoreCase,
                                      TimeSpan.FromMilliseconds(500));
        for (int ctr = 0; ctr < result.Length; ctr++) {
            Console.Write("'{0}'", result[ctr]);
            if (ctr < result.Length - 1)
                Console.Write(", ");
        }
        Console.WriteLine();
    }
}
Outputs:
'Create(Apple | Banana(Tree) | Mango (Tree) )', 'Delete(Guava)', 'Set(Orange(Tree))'
Try it over here.

You can use a balancing group construct:
\b(?:Create|Set|Delete)\((?>[^()]+|(?<c>)\(|(?<-c>)\))*(?(c)(?!))\)
See the .NET regex demo.
Details
\b - a word boundary
(?:Create|Set|Delete) - one of the alternatives listed in the non-capturing group
\( - a ( char
(?>[^()]+|(?<c>)\(|(?<-c>)\))* - zero or more occurrences of any one or more chars other than ( and ) (see [^()]+), or a ( char (with an empty value pushed onto Group "c" stack), or a ) char (with a value popped from the Group "c" stack), then
(?(c)(?!)) - a conditional failing the match if Group "c" stack is not empty
\) - a ) char.
See the C# demo:
using System;
using System.Linq;
using System.Text.RegularExpressions;
var reg = @"\b(?:Create|Set|Delete)\((?>[^()]+|(?<c>)\(|(?<-c>)\))*(?(c)(?!))\)";
var text = "Create(Apple | Banana(Tree) | Mango (Tree) ) | Delete(Guava)|Set(Orange(Tree))";
var result = Regex.Matches(text, reg).Cast<Match>().Select(x => x.Value).ToList();
foreach (var s in result)
    Console.WriteLine(s);
Output:
Create(Apple | Banana(Tree) | Mango (Tree) )
Delete(Guava)
Set(Orange(Tree))


Append arrays and lists

For example, if the entered input is:
1 2 3 |4 5 6 | 7 8
we should manipulate it to
1 2 3|4 5 6|7 8
Another example:
7 | 4 5|1 0| 2 5 |3
we should manipulate it to
7|4 5|1 0|2 5|3
My goal is to be able to exchange some of the subarrays (7; 4 5; 1 0; 2 5; 3).
I'm not sure this code works or that it is a good base for what I want to do, but here is what I have so far:
static void Main(string[] args)
{
List<string> arrays = Console.ReadLine()
.Split(' ', StringSplitOptions.RemoveEmptyEntries)
.ToList();
foreach (var element in arrays)
{
Console.WriteLine("element: " + element);
}
}
You need to split your input by "|" first and then by space. After this, you can reassemble your input with string.Join. Try this code:
var input = "1 2 3 |4 5 6 | 7 8";
var result = string.Join("|", input.Split('|')
.Select(part => string.Join(" ",
part.Trim().Split(new []{' '}, StringSplitOptions.RemoveEmptyEntries))));
// now result is "1 2 3|4 5 6|7 8"
You could do this with a simple regular expression:
var result = Regex.Replace(input, @"\s?\|\s?", "|");
This will match any (optional) white space character, followed by a | character, followed by an (optional) white space character and replace it with a single | character.
Alternatively, if you need to potentially strip out multiple spaces around the |, replace the zero-or-one quantifiers (?) with zero-or-more quantifiers (*):
var result = Regex.Replace(input, @"\s*\|\s*", "|");
To also deal with multiple spaces between numbers (not just around | characters), I'd recommend something like this:
var result = Regex.Replace(input, @"\s*([\s|])\s*", "$1");
This will match any occurrence of zero or more white space characters, followed by either a white space character or a | character (captured in group 1), followed by zero or more white space characters and replace it with whatever was captured in group 1.
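As a minimal sketch of the last pattern applied to both inputs from the question: the class name PipeNormalizer and the Normalize helper are made up for illustration, and the extra Trim() call is an assumption about wanting no leading or trailing whitespace.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PipeNormalizer
{
    // Removes whitespace around '|' and collapses each remaining
    // whitespace run to one whitespace character (the one captured in group 1).
    public static string Normalize(string input)
    {
        return Regex.Replace(input.Trim(), @"\s*([\s|])\s*", "$1");
    }

    public static void Main()
    {
        Console.WriteLine(Normalize("1 2 3 |4 5 6 | 7 8"));  // 1 2 3|4 5 6|7 8
        Console.WriteLine(Normalize("7 | 4 5|1 0| 2 5 |3")); // 7|4 5|1 0|2 5|3
    }
}
```

Note that a whitespace run is replaced by its last character, so a run ending in a tab collapses to a tab rather than a space.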

c# regex matches exclude first and last character

I know nothing about regex so I am asking this great community to help me out.
With the help of SO I managed to write this regex:
string input = "((isoCode=s)||(isoCode=a))&&(title=s)&&((ti=2)&&(t=2))||(t=2&&e>5)";
string pattern = @"\((?>\((?<DEPTH>)|\)(?<-DEPTH>)|.?)*(?(DEPTH)(?!))\)|&&|\|\|";
MatchCollection matches = Regex.Matches(input, pattern);
foreach (Match match in matches)
{
Console.WriteLine(match.Groups[0].Value);
}
And the result is:
((isoCode=s)||(isoCode=a))
&&
(title=s)
&&
((ti=2)&&(t=2))
||
(t=2&&e>5)
but I need result like this (without first/last "(", ")"):
(isoCode=s)||(isoCode=a)
&&
title=s
&&
(ti=2)&&(t=2)
||
t=2&&e>5
Can it be done? I know I can do it with substring (removing first and last character), but I want to know if it can be done with regex.
You may use
\((?<R>(?>\((?<DEPTH>)|\)(?<-DEPTH>)|[^()]+)*(?(DEPTH)(?!)))\)|(?<R>&&|\|\|)
See the regex demo, grab Group "R" value.
Details
\( - an open (
(?<R> - start of the R named group:
(?> - start of the atomic group:
\((?<DEPTH>)| - an open ( and an empty string is pushed on the DEPTH group stack or
\)(?<-DEPTH>)| - a closing ) and an empty string is popped off the DEPTH group stack or
[^()]+ - 1+ chars other than ( and )
)* - zero or more repetitions
(?(DEPTH)(?!)) - a conditional construct that checks if the number of close and open parentheses is balanced
) - end of R named group
\) - a closing )
| - or
(?<R>&&|\|\|) - another occurrence of Group R matching either of the 2 subpatterns:
&& - a && substring
| - or
\|\| - a || substring.
C# code:
var pattern = @"\((?<R>(?>\((?<DEPTH>)|\)(?<-DEPTH>)|[^()]+)*(?(DEPTH)(?!)))\)|(?<R>&&|\|\|)";
var results = Regex.Matches(input, pattern)
    .Cast<Match>()
    .Select(m => m.Groups["R"].Value)
    .ToList();
Brief
You can use the regex below, but I'd still strongly suggest you write a proper parser for this instead of using regex.
Code
See regex in use here
\(((?>\((?<DEPTH>)|\)(?<-DEPTH>)|.?)*(?(DEPTH)(?!)))\)|&{2}|\|{2}
Usage
See regex in use here
using System;
using System.Text.RegularExpressions;
public class Test
{
public static void Main()
{
string input = "((isoCode=s)||(isoCode=a))&&(title=s)&&((ti=2)&&(t=2))||(t=2&&e>5)";
string pattern = @"\(((?>\((?<DEPTH>)|\)(?<-DEPTH>)|.?)*(?(DEPTH)(?!)))\)|&{2}|\|{2}";
MatchCollection matches = Regex.Matches(input, pattern);
foreach (Match match in matches)
{
Console.WriteLine(match.Groups[1].Success ? match.Groups[1].Value : match.Groups[0].Value);
}
}
}
Result
(isoCode=s)||(isoCode=a)
&&
title=s
&&
(ti=2)&&(t=2)
||
t=2&&e>5
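For comparison with the "write a proper parser" suggestion, here is a rough sketch of a non-regex approach that tracks parenthesis depth by hand. It is not a full parser: the names TopLevelSplitter, Split and StripOuter are hypothetical, and it assumes every top-level operand is wrapped in one pair of parentheses as a whole, as in the question's input.

```csharp
using System;
using System.Collections.Generic;

public static class TopLevelSplitter
{
    // Splits on "&&" and "||" found at parenthesis depth 0 and returns
    // operands and operators in their original order.
    public static List<string> Split(string input)
    {
        var tokens = new List<string>();
        int depth = 0;
        int start = 0;
        for (int i = 0; i < input.Length; i++)
        {
            char ch = input[i];
            if (ch == '(') depth++;
            else if (ch == ')') depth--;
            else if (depth == 0 && (ch == '&' || ch == '|')
                     && i + 1 < input.Length && input[i + 1] == ch)
            {
                tokens.Add(StripOuter(input.Substring(start, i - start)));
                tokens.Add(input.Substring(i, 2));
                i++;            // skip the second operator character
                start = i + 1;
            }
        }
        tokens.Add(StripOuter(input.Substring(start)));
        return tokens;
    }

    // Removes one enclosing pair of parentheses, if present.
    // Assumption: the operand is wrapped as a whole, e.g. "(a&&b)", not "(a)+(b)".
    private static string StripOuter(string s)
    {
        s = s.Trim();
        return s.Length >= 2 && s[0] == '(' && s[s.Length - 1] == ')'
            ? s.Substring(1, s.Length - 2)
            : s;
    }

    public static void Main()
    {
        string input = "((isoCode=s)||(isoCode=a))&&(title=s)&&((ti=2)&&(t=2))||(t=2&&e>5)";
        foreach (string token in Split(input))
            Console.WriteLine(token);
    }
}
```

For the question's input this prints the same seven lines as the Result above. Unlike the balancing-group regex, it does not reject unbalanced input; it simply carries the depth counter along.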

Making a group with spaces between words and decimals

I'm not new to the concept of regex but the syntax and semantics of everything get confusing for me at times. I have been trying to create a pattern to recognize
Ambient Relative Humidity: 31.59
With the grouping
Ambient Relative Humidity (Group 1)
31.59 (Group 2)
But I also need to be able to match things such as
Operator: Bob
With the grouping
Operator (Group 1)
Bob (Group 2)
Or
Sensor Number: 0001
With the grouping
Sensor Number (Group 1)
0001 (Group 2)
Here is the current pattern I created which works for the examples involving operator and sensor number but does not match with the first example (ambient humidity)
\s*([A-Za-z0-9]*\s*?[A-Za-z0-9]*)\s*:\s*([A-Za-z0-9]*)
You have to add more space separated key parts to the regex.
Also, you have to add an option for decimal numbers in the value.
Something like this ([A-Za-z0-9]*(?:\s*[A-Za-z0-9]+)*)\s*:\s*((?:\d+(?:\.\d*)?|\.\d+)|[A-Za-z0-9]+)?
https://regex101.com/r/fl0wtb/1
Explained
( # (1 start), Key
[A-Za-z0-9]*
(?: \s* [A-Za-z0-9]+ )*
) # (1 end)
\s* : \s*
( # (2 start), Value
(?: # Decimal number
\d+
(?: \. \d* )?
| \. \d+
)
| # or,
[A-Za-z0-9]+ # Alpha num's
)? # (2 end)
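A minimal sketch of this pattern run against the three sample lines from the question. The class name KeyValueDemo and the Parse helper are made up for illustration, and the 500 ms match timeout is an added assumption, included because the nested quantifiers in the key part may backtrack heavily on long lines that contain no colon.

```csharp
using System;
using System.Text.RegularExpressions;

public static class KeyValueDemo
{
    // The pattern from the answer above, with a match timeout as a safety net.
    private static readonly Regex Pair = new Regex(
        @"([A-Za-z0-9]*(?:\s*[A-Za-z0-9]+)*)\s*:\s*((?:\d+(?:\.\d*)?|\.\d+)|[A-Za-z0-9]+)?",
        RegexOptions.None,
        TimeSpan.FromMilliseconds(500));

    // Returns { key, value } for the first pair found in the line, or null if none.
    public static string[] Parse(string line)
    {
        Match m = Pair.Match(line);
        return m.Success
            ? new[] { m.Groups[1].Value, m.Groups[2].Value }
            : null;
    }

    public static void Main()
    {
        string[] lines =
        {
            "Ambient Relative Humidity: 31.59",
            "Operator: Bob",
            "Sensor Number: 0001"
        };
        foreach (string line in lines)
        {
            string[] pair = Parse(line);
            Console.WriteLine("'{0}' -> '{1}'", pair[0], pair[1]);
        }
        // 'Ambient Relative Humidity' -> '31.59'
        // 'Operator' -> 'Bob'
        // 'Sensor Number' -> '0001'
    }
}
```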
I may have posted too soon without thinking, I now have the following expression
\s*([A-Za-z0-9]*\s*[A-Za-z0-9]*\s*[A-Za-z0-9]*)\s*:\s*([A-Za-z0-9.]*)
The only thing is that it includes spaces sometimes that I was trying to avoid but I can just trim those later. Sorry for posting so soon!
var st = "Ambient Relative Humidity: 31.59 Operator: Bob Sensor Number: 0001";
var li = Regex.Matches(st, @"([\w]+?:)\s+(\d+\.?\d+|\w+)").Cast<Match>().ToList();
foreach (var t in li)
{
Console.WriteLine($"Group 1 {t.Groups[1]}");
Console.WriteLine($"Group 2 {t.Groups[2]}");
}
//Group 1 Humidity:
//Group 2 31.59
//Group 1 Operator:
//Group 2 Bob
//Group 1 Number:
//Group 2 0001

Regex valid AdresseEmail c#

I'm currently developing an application and I have a problem with Regex.
I have a text file that contains emails like this:
test@test.uk
test1@test.uk
My function loademail must import the emails from the text file and add them to the result list,
but it does not add any email.
This is my code:
public class Loademail
{
public EmailAddress email;
public List<Loademail> loademail()
{
var result = new List<Loademail>();
string fileSocks = Path.GetFullPath(Path.Combine(Application.StartupPath, "liste.txt"));
var input = File.ReadAllText(fileSocks);
var r = new Regex(@"^(([\w-]+\.)+[\w-]+|([a-zA-Z]{1}|[\w-]{2,}))@"
+ @"((([0-1]?[0-9]{1,2}|25[0-5]|2[0-4][0-9])\.([0-1]?
[0-9]{1,2}|25[0-5]|2[0-4][0-9])\."
+ @"([0-1]?[0-9]{1,2}|25[0-5]|2[0-4][0-9])\.([0-1]?
[0-9]{1,2}|25[0-5]|2[0-4][0-9])){1}|"
+ @"([a-zA-Z0-9]+[\w-]+\.)+[a-zA-Z]{1}[a-zA-Z0-9-]{1,23})$", RegexOptions.IgnoreCase);
foreach (Match match in r.Matches(input))
{
string Email = match.Groups[1].Value;
Loademail bi = new Loademail();
bi.email = EmailAddress.Parse(Email);
result.Add(bi);
//result.Add(Email);
}
return result;
}
}
What should I do? Thanks.
Use RegexOptions.IgnorePatternWhitespace.
Edit
Try it using a while () { next match ...}
Like this
Match _mData = Rx.Match( Input );
while (_mData.Success)
{
if (_mData.Groups[1].Success )
Console.WriteLine("{0} \r\n", _mData.Groups[1].Value);
_mData = _mData.NextMatch();
}
// -------------------
Regex Rx = new Regex(
// The pattern is kept on one line: whitespace inside a character class
// or a {n,m} quantifier is not ignored, even with IgnorePatternWhitespace.
@"
^(([\w-]+\.)+[\w-]+|([a-zA-Z]{1}|[\w-]{2,}))@((([0-1]?[0-9]{1,2}|25[0-5]|2[0-4][0-9])\.([0-1]?[0-9]{1,2}|25[0-5]|2[0-4][0-9])\.([0-1]?[0-9]{1,2}|25[0-5]|2[0-4][0-9])\.([0-1]?[0-9]{1,2}|25[0-5]|2[0-4][0-9])){1}|([a-zA-Z0-9]+[\w-]+\.)+[a-zA-Z]{1}[a-zA-Z0-9-]{1,23})$
",
RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace );
Use a good tool to format and process large expressions.
Formatted:
^
( # (1 start)
( [\w-]+ \. )+ # (2)
[\w-]+
| ( [a-zA-Z]{1} | [\w-]{2,} ) # (3)
) # (1 end)
@
( # (4 start)
( # (5 start)
( # (6 start)
[0-1]? [0-9]{1,2}
| 25 [0-5]
| 2 [0-4] [0-9]
) # (6 end)
\.
( # (7 start)
[0-1]?
[0-9]{1,2}
| 25 [0-5]
| 2 [0-4] [0-9]
) # (7 end)
\.
( # (8 start)
[0-1]? [0-9]{1,2}
| 25 [0-5]
| 2 [0-4] [0-9]
) # (8 end)
\.
( # (9 start)
[0-1]?
[0-9]{1,2}
| 25 [0-5]
| 2 [0-4] [0-9]
) # (9 end)
){1} # (5 end)
|
( [a-zA-Z0-9]+ [\w-]+ \. )+ # (10)
[a-zA-Z]{1} [a-zA-Z0-9-]{1,23}
) # (4 end)
$
As a side note, this is a good email regex as well.
# http://www.w3.org/TR/html5/forms.html#valid-e-mail-address
# ^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$
^
[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+
@
[a-zA-Z0-9]
(?:
[a-zA-Z0-9-]{0,61}
[a-zA-Z0-9]
)?
(?:
\.
[a-zA-Z0-9]
(?:
[a-zA-Z0-9-]{0,61}
[a-zA-Z0-9]
)?
)*
$
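One possible way to apply the HTML5-style pattern above is to check each line on its own, which sidesteps the fact that ^ and $ anchor only at the start and end of the whole text unless RegexOptions.Multiline is set. This is a sketch, not the asker's class: EmailFilter and ValidAddresses are hypothetical names, and the sample lines stand in for the contents of liste.txt (in real code they might come from File.ReadLines).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class EmailFilter
{
    // HTML5-style e-mail pattern from the answer above, written on a single line.
    private static readonly Regex EmailPattern = new Regex(
        @"^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$");

    // Returns the trimmed lines that look like e-mail addresses.
    public static List<string> ValidAddresses(IEnumerable<string> lines)
    {
        var result = new List<string>();
        foreach (string line in lines)
        {
            string candidate = line.Trim();
            if (EmailPattern.IsMatch(candidate))
                result.Add(candidate);
        }
        return result;
    }

    public static void Main()
    {
        // Stand-in for the file contents; one entry per line.
        string[] lines = { "test@test.uk", "test1@test.uk", "not an email", "" };
        foreach (string address in ValidAddresses(lines))
            Console.WriteLine(address);
        // test@test.uk
        // test1@test.uk
    }
}
```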

How to get text between nested parentheses?

I need a regular expression for getting the text between parentheses ( ). I tried the following, but it does not give the result I need:
Regex.Match(script, @"\((.*?)\)").Value
Example:-
add(mul(a,add(b,c)),d) + e - sub(f,g)
Output =>
1) mul(a,add(b,c)),d
2) f,g
.NET regular expressions do not support recursion, but they can match nested constructs with balancing groups. See Balancing Group Definitions
var input = @"add(mul(a,add(b,c)),d) + e - sub(f,g)";
var regex = new Regex(@"
\( # Match (
(
[^()]+ # all chars except ()
| (?<Level>\() # or if ( then Level += 1
| (?<-Level>\)) # or if ) then Level -= 1
)+ # Repeat (to go from inside to outside)
(?(Level)(?!)) # fail if the Level stack is not empty (unbalanced parentheses)
\) # Match )",
RegexOptions.IgnorePatternWhitespace);
foreach (Match c in regex.Matches(input))
{
Console.WriteLine(c.Value.Trim('(', ')'));
}
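A variant sketch that captures the inner text in a named group instead of calling Trim('(', ')'), since Trim removes every leading and trailing parenthesis and would turn "((a))" into "a" rather than "(a)". The names ParenContents, Extract and the group name inner are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ParenContents
{
    private static readonly Regex Balanced = new Regex(@"
        \(                      # opening parenthesis
        (?<inner>               # capture the text up to the matching close
          (?>
              [^()]+            # a run of non-parenthesis characters
            | \( (?<Level>)     # an opening parenthesis pushes onto the Level stack
            | \) (?<-Level>)    # a closing parenthesis pops from the Level stack
          )*
        )
        (?(Level)(?!))          # fail if an opening parenthesis is still unclosed
        \)                      # the matching closing parenthesis",
        RegexOptions.IgnorePatternWhitespace);

    // Returns the contents of each outermost balanced parenthesis pair.
    public static List<string> Extract(string input)
    {
        var result = new List<string>();
        foreach (Match m in Balanced.Matches(input))
            result.Add(m.Groups["inner"].Value);
        return result;
    }

    public static void Main()
    {
        foreach (string s in Extract("add(mul(a,add(b,c)),d) + e - sub(f,g)"))
            Console.WriteLine(s);
        // mul(a,add(b,c)),d
        // f,g
    }
}
```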
