Regular Expression Help - Difficulty Matching String

I need to match a string that:
always starts with 'P#' (case-insensitive),
always contains 'Z#',
and ends with a new line (\r or \n or \r\n).
Example strings:
P#M1RE2Z#
P#M2S0Z#M2SX0
P#M3S12Z#
Here is what I have figured out so far, but I still need to match 'Z#' in between:
(P#.*?(\r|\n|\r\n))

This one should work for you:
^P\#.*Z\#.*[\n\r]+
Note: I put \ before # because # starts a comment when RegexOptions.IgnorePatternWhitespace is enabled (the escape is harmless otherwise).
This regex will match only if the line ends with \n or \r.
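
A minimal C# sketch of a variation on the pattern above, assuming the whole input is held in one string with mixed line endings. The sample text and the names input, pattern and hits are illustrative and not from the question. It swaps . for [^\r\n] because . also matches \r in .NET, uses RegexOptions.IgnoreCase for the case-insensitive requirement, and uses RegexOptions.Multiline so that ^ anchors at each line start.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical sample input: the question's lines plus one line without Z#.
string input = "P#M1RE2Z#\r\np#M2S0Z#M2SX0\nP#M3S12\r\nP#M3S12Z#\r\n";

// [^\r\n]* keeps each match on a single line; the line ending itself
// is matched by the non-capturing group and stays out of group 1.
var pattern = new Regex(@"^(P\#[^\r\n]*Z\#[^\r\n]*)(?:\r\n|\r|\n)",
    RegexOptions.IgnoreCase | RegexOptions.Multiline);

string[] hits = pattern.Matches(input)
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)
    .ToArray();

foreach (string hit in hits)
    Console.WriteLine(hit);
```

The line P#M3S12 is skipped because it contains no Z#.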

This will work:
\bP#(?=.*Z#)(?=.*[\r\n]+)\b
Note that both lookaheads are zero-width, so the match itself is only the leading P#.

Related

Regex filtering Regex, extra additional final \r

I want to filter a regex with a ... regex ...
My target is in a file which content is
...
information 1...
Entity1=^\|1[\s\t]+[\S]+[\s\t]+(.*)$
information 2...
...
The file is read into mystring with the method ReadAllText(path), where path is the path to the text file.
I use the code
// Retrieve a regex such as ^\|1[\s\t]+[\S]+[\s\t]+(.*)$ from the line Entity1=^\|1[\s\t]+[\S]+[\s\t]+(.*)$
// \d matches the digit that precedes =
// .+ matches one or more characters, terminated by a whitespace character
Match m = Regex.Match(mystring, @"Entity\d=(.+)\s");
string regex = m.Groups[1].Value;
which works almost fine.
What I get (as seen in the debugger) is
^\|1[\s\t]+[\S]+[\s\t]+(.*)$\r
There is an additional \r at the end of the result. It causes an unwanted extra newline in other parts of the code.
Trying @"Entity\d=(.+)" (i.e. removing the final \s) does not help.
Any idea how to avoid the additional \r gracefully? (If possible, I do not want to track down the final \r and remove it.)
Online regex testers such as regex101 did not let me foresee this problem before moving to C# code.

Use a negated character class to make sure \r is not matched:
m = Regex.Match(mystring, @"Entity\d=([^\r\n]+)");
The [^\r\n] class matches any character other than a carriage return or a line feed.
It is true that regex101 does not keep carriage returns. You can see the \r being matched at regexhero.net.
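
To make the difference visible, here is a small C# sketch under the assumption that the file uses Windows (\r\n) line endings. The string below is a hypothetical stand-in for the result of ReadAllText(path), and withDot and withClass are illustrative names.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical file content with \r\n line endings.
string mystring = "information 1...\r\nEntity1=^\\|1[\\s\\t]+[\\S]+[\\s\\t]+(.*)$\r\ninformation 2...\r\n";

// '.' matches every character except \n, so the capture ends with \r.
string withDot = Regex.Match(mystring, @"Entity\d=(.+)").Groups[1].Value;

// The negated class stops before \r or \n, whichever comes first.
string withClass = Regex.Match(mystring, @"Entity\d=([^\r\n]+)").Groups[1].Value;

Console.WriteLine(withDot.EndsWith("\r", StringComparison.Ordinal));   // True
Console.WriteLine(withClass.EndsWith("\r", StringComparison.Ordinal)); // False
```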

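Check if this works:
@"Entity\d=(.+)(?=(\r|\n))";
(?=(\r|\n)) is a positive lookahead and means that the \r or \n won't be included in the result.
Edit: the quantifier has to be lazy, otherwise .+ still consumes the \r and the lookahead only succeeds at the \n:
@"Entity\d=(.+?)(?=\r|\n)";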
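
A short C# sketch comparing the greedy and the lazy version of the lookahead pattern. The input string and the names text, greedy and lazy are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input with a Windows line ending.
string text = "Entity1=abc\r\nnext line";

// Greedy: .+ runs up to the \n (the only character '.' cannot match),
// so the \r is part of the capture and the lookahead sees the \n.
string greedy = Regex.Match(text, @"Entity\d=(.+)(?=\r|\n)").Groups[1].Value;

// Lazy: .+? stops at the first position where the lookahead succeeds,
// which is right before the \r.
string lazy = Regex.Match(text, @"Entity\d=(.+?)(?=\r|\n)").Groups[1].Value;

Console.WriteLine(greedy.Length); // 4: "abc" plus \r
Console.WriteLine(lazy.Length);   // 3: "abc"
```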

How to match an expression that starts and ends with a carriage return and belongs to a group?

Sample string
+ABC:108\r\nmessage a\r\n+ABC:117\r\nmessage b\r\n
here is my initial regex
+ABC:(\d+)\r\n(.+)\r\n
Groups
Group 1: Index
Group 2: Message

Where exactly is your problem?
I see these points:
If you want to match the + literally, you have to escape it in your regex: \+ABC:(\d+)\r\n(.+)\r\n. You need only one \ if you use a verbatim string (@"regex") to define your regex.
If you don't use the Singleline option, there should be no greediness problem, since the . will not match \n.
Are you sure that \r\n is your newline? Maybe use \r?\n to be more flexible.
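
The points above could be combined roughly as follows. This is a sketch, not a definitive answer: the names sample, pattern and pairs are illustrative, and the message group uses [^\r\n]+ instead of .+ as an extra precaution, because with an optional \r? a greedy .+ would keep the \r inside group 2.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Sample string from the question, with real CR LF characters.
string sample = "+ABC:108\r\nmessage a\r\n+ABC:117\r\nmessage b\r\n";

// \+ matches a literal plus sign; \r?\n accepts Windows and Unix line endings.
var pattern = new Regex(@"\+ABC:(\d+)\r?\n([^\r\n]+)\r?\n");

// Group 1 is the index, group 2 is the message.
var pairs = pattern.Matches(sample)
    .Cast<Match>()
    .Select(m => (Index: int.Parse(m.Groups[1].Value), Message: m.Groups[2].Value))
    .ToList();

foreach (var pair in pairs)
    Console.WriteLine($"{pair.Index}: {pair.Message}");
```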

Regex to exclude part of string on split

I asked a similar question a few weeks ago on how to split a string based on a specific substring. However, I now want to do something a little different. I have a line that looks like this (sorry about the formatting):
What I want to do is split this line at all the newline \r\n sequences. However, I do not want to do this if there is a PA42 after one of the PA41 lines. I want the PA41 and the PA42 line that follows it to be on the same line. I have tried using several regex expressions to no avail. The output that I am looking for will ideally look like this:
This is the regex that I am currently using, but it does not quite accomplish what I am looking for.
string[] p = Regex.Split(parameterList[selectedIndex], @"[\r\n]+(?=PA41)");
If you need any clarifications, please feel free to ask.

You're using a positive lookahead, but you want a negative one. (A positive lookahead ensures that the pattern does follow, whereas a negative one ensures that it does not.)
(\\r\\n)(?!PA42)
Works for me.
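
A minimal C# sketch of splitting with a negative lookahead. The sample input is hypothetical, since the question's data was not preserved, and it assumes real CR LF characters in the string, so the pattern is written as \r\n rather than the doubled backslashes used in the answers here. The names subjectString, parts and lines are illustrative.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical input; real CR LF characters are assumed.
string subjectString = "PA41 first\r\nPA42 second\r\nPA41 third\r\nPA41 fourth\r\nPA42 fifth";

// Split at every CR LF that is NOT immediately followed by "PA42".
string[] parts = Regex.Split(subjectString, @"\r\n(?!PA42)");

// Optionally fold each PA41/PA42 pair that was kept together onto one line.
string[] lines = parts.Select(p => p.Replace("\r\n", " ")).ToArray();

foreach (string line in lines)
    Console.WriteLine(line);
```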

string[] splitArray = Regex.Split(subjectString, @"\\r\\n(?!PA42)");
This should work. It uses a negative lookahead assertion to ensure that a \r\n sequence is not followed by PA42.
Explanation:
@"
\\ # Match the character “\” literally
r # Match the character “r” literally
\\ # Match the character “\” literally
n # Match the character “n” literally
(?! # Assert that it is impossible to match the regex below starting at this position (negative lookahead)
PA42 # Match the characters “PA42” literally
)
"

Regex to match full lines of text excluding crlf

How would a regex pattern to match each line of a given text be?
I'm trying ^(.+)$ but it includes crlf...

Just use RegexOptions.Multiline.
Multiline mode. Changes the meaning of
^ and $ so they match at the beginning
and end, respectively, of any line,
and not just the beginning and end of
the entire string.
Example:
var lineMatches = Regex.Matches("Multi\r\nlines", "^(.+)$", RegexOptions.Multiline);

I'm not sure what you mean by "match each line of a given text", but you can use a character class to exclude the CR and LF characters:
[^\r\n]+
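
To see why the character class helps with CRLF input, here is a small C# comparison. In .NET, $ in Multiline mode matches before \n only, and . matches \r, so ^(.+)$ still captures the carriage return. The names text, withAnchors and withClass are illustrative.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string text = "Multi\r\nlines";

// Multiline anchors: the first capture is "Multi\r" because '.' matches \r.
string[] withAnchors = Regex.Matches(text, "^(.+)$", RegexOptions.Multiline)
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)
    .ToArray();

// Negated class: never matches \r or \n, so each match is one line's text.
string[] withClass = Regex.Matches(text, @"[^\r\n]+")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine(withAnchors[0].Length); // 6: "Multi" plus \r
Console.WriteLine(withClass[0].Length);   // 5: "Multi"
```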

The wording of your question seems a little unclear, but it sounds like you want RegexOptions.Multiline (in the System.Text.RegularExpressions namespace). It's an option you set when you create your Regex object. It makes ^ and $ match at the beginning and end of each line rather than of the entire string.
For example:
Regex re = new Regex("^(.+)$", RegexOptions.Compiled | RegexOptions.Multiline);

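Have you tried:
^(.+?)\r?\n
That way the match group includes everything except the CRLF. The pattern requires a line feed to be present (Unix default) but accepts a carriage return in front of it (Windows default). The quantifier is lazy because a greedy .+ would swallow the \r before the optional \r? could match it. Use RegexOptions.Multiline so that ^ matches at the start of every line.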

I assume you're using the Multiline option? In that case you'll want to match the newline explicitly with "\n". (substitute "\r\n" as appropriate.)

What is wrong with my regex (simple)?

I am trying to make a regex that matches all occurrences of words that are at the start of a line and begin with #.
For example in:
#region #like
#hey
It would match #region and #hey.
This is what I have right now:
^#\w*
I apologize for posting this question. I'm sure it has a very simple answer, but I have been unable to find it. I admit that I am a regex noob.

What you've got should work, depending on what flags you pass for RegexOptions. You need to make sure you pass RegexOptions.Multiline:
var matches = Regex.Matches(input, @"^#\w*", RegexOptions.Multiline);
From the documentation:
Multiline mode. Changes the meaning of ^ and $ so they match at the beginning and end, respectively, of any line, and not just the beginning and end of the entire string.
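
A quick C# sketch of the effect of the flag, using the example text from the question. The names input, withMultiline and withoutMultiline are illustrative.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Example text from the question, as two lines.
string input = "#region #like\n#hey";

// With Multiline, ^ matches at the start of every line.
string[] withMultiline = Regex.Matches(input, @"^#\w*", RegexOptions.Multiline)
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

// Without it, ^ matches only at the very start of the string.
string[] withoutMultiline = Regex.Matches(input, @"^#\w*")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine(string.Join(", ", withMultiline));    // #region, #hey
Console.WriteLine(string.Join(", ", withoutMultiline)); // #region
```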

The regex looks fine. Make sure you're using a verbatim string literal (@ prefix) to define your regex, i.e. @"^#\w*"; otherwise the backslash will be treated as the start of a string escape sequence.

Use this regex:
^#.+?\b
.+ ensures at least one character after #, and \b indicates a word boundary. The ? makes the + operator non-greedy, so that it does not match the whole string #region #like.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.
