I need to match a string:-
that always starts with 'P#' (case-insensitive)
that always contains 'Z#'
and ends with new line (\r or \n or \r\n)
Example strings:
P#M1RE2Z#
P#M2S0Z#M2SX0
P#M3S12Z#
Here is what I have figured out so far, but I still need to match 'Z#' in between:
(P#.*?(\r|\n|\r\n))
This one should work for you:
^P\#.*Z\#.*[\n\r]+
Note: I put \ before # because # starts a comment in a regex when the whitespace-ignoring (extended) mode is on; the escape is harmless otherwise.
This regex will match only if the line ends with \n or \r.
This will work
\bP#(?=.*Z#)(?=.*[\r\n]+)\b
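For comparison, here is a minimal C# sketch of the same idea with the line kept explicit. The input string is made up from the examples in the question, and the choice of [^\r\n]* instead of .* is my own assumption so that a match cannot drift past the 'Z#' line; treat it as one possible approach, not the definitive pattern.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input: three lines with "Z#" and one without it.
string input = "P#M1RE2Z#\r\np#M2S0Z#M2SX0\nP#M3S12\r\nP#M3S12Z#\r\n";

// Multiline makes ^ anchor at the start of every line (in .NET, after each '\n').
// IgnoreCase covers the case-insensitive 'P#' (it also makes 'Z#' case-insensitive).
// [^\r\n]* keeps each match on a single line; the last group requires a line terminator.
var pattern = new Regex(@"^P#[^\r\n]*Z#[^\r\n]*(?:\r\n|\r|\n)",
    RegexOptions.Multiline | RegexOptions.IgnoreCase);

foreach (Match m in pattern.Matches(input))
    Console.WriteLine(m.Value.TrimEnd('\r', '\n'));
// P#M1RE2Z#
// p#M2S0Z#M2SX0
// P#M3S12Z#
```

The line "P#M3S12" is skipped because it has no 'Z#' before its line ending.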
I want to filter a regex with a ... regex ...
My target is in a file whose content is
...
information 1...
Entity1=^\|1[\s\t]+[\S]+[\s\t]+(.*)$
information 2...
...
The file is transferred to mystring with the method ReadAllText(path); where path is the path to the text file.
I use the code
//Retrieve a regex like ^\|1[\s\t]+[\S]+[\s\t]+(.*)$ from Entity1=^\|1[\s\t]+[\S]+[\s\t]+(.*)$
//\d matches any digit, followed by =
//(.+) matches any character one or more times, ended by a whitespace character
m = Regex.Match(mystring, @"Entity\d=(.+)\s");
string regex = m.Groups[1].Value;
which works almost fine.
What I get is (seen from inside the debugger):
^\|1[\s\t]+[\S]+[\s\t]+(.*)$\r
There is an additional \r at the end of the result. It causes an unwanted extra newline in other parts of the code.
Trying @"Entity\d=(.+)" (i.e. removing the final \s) does not help.
Any idea how to avoid the additional \r gracefully? (I do not want, if possible, to track down the final \r and remove it.)
Online regex testers like regex101 did not let me foresee this problem before moving to C# code.
Use a negated character class to make sure \r is not matched:
m = Regex.Match(mystring, @"Entity\d=([^\r\n]+)");
The [^\r\n] class means match any character other than a carriage return and a line feed.
It is true that regex101 does not keep carriage returns. You can see the \r being matched at regexhero.net.
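To see the difference without an online tester, here is a small sketch. The file content is a made-up stand-in for the text described in the question, using Windows line endings.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical file content with \r\n line endings, mirroring the question.
string mystring =
    "information 1...\r\nEntity1=^\\|1[\\s\\t]+[\\S]+[\\s\\t]+(.*)$\r\ninformation 2...\r\n";

// '.' matches every character except '\n', so (.+) also swallows the '\r'.
string withDot = Regex.Match(mystring, @"Entity\d=(.+)").Groups[1].Value;

// The negated class stops at the first '\r' or '\n'.
string withClass = Regex.Match(mystring, @"Entity\d=([^\r\n]+)").Groups[1].Value;

Console.WriteLine(withDot.EndsWith("\r", StringComparison.Ordinal));   // True
Console.WriteLine(withClass.EndsWith("\r", StringComparison.Ordinal)); // False
```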
Check if this works:
@"Entity\d=(.+)(?=(\r|\n))";
(?=(\r|\n)) is a positive lookahead and means that the \r or \n won't be included in the result.
Edit:
@"Entity\d=(.+?)(?=\r|\n)";
Sample string
+ABC:108\r\nmessage a\r\n+ABC:117\r\nmessage b\r\n
here is my initial regex
+ABC:(\d+)\r\n(.+)\r\n
Groups
Group 1: Index
Group 2: Message
Where is your exact problem?
I see these points:
If you want to match the + literally, you have to escape it in your regex: \+ABC:(\d+)\r\n(.+)\r\n. Use only one \ if you define your regex with a verbatim string (@"regex").
If you don't use the Singleline option, there should be no greediness problem, since the . will not match newline characters.
Are you sure that \r\n is your newline? Maybe use \r?\n to be more flexible.
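Putting those points together, a sketch could look like the following. The second record deliberately uses a bare \n to show the effect of \r?\n, and I replaced (.+) with ([^\r\n]+), which is my own change so that the message group cannot pick up a trailing \r.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample string from the question, with the second record switched to bare '\n'.
string input = "+ABC:108\r\nmessage a\r\n+ABC:117\nmessage b\n";

// \+ matches a literal plus; \r?\n accepts both Windows and Unix line endings.
var pattern = new Regex(@"\+ABC:(\d+)\r?\n([^\r\n]+)\r?\n");

foreach (Match m in pattern.Matches(input))
    Console.WriteLine($"{m.Groups[1].Value}: {m.Groups[2].Value}");
// 108: message a
// 117: message b
```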
I asked a similar question a few weeks ago on how to split a string based on a specific substring. However, I now want to do something a little different. I have a line that looks like this (sorry about the formatting):
What I want to do is split this line at all the newline \r\n sequences. However, I do not want to do this if there is a PA42 after one of the PA41 lines. I want the PA41 and the PA42 line that follows it to be on the same line. I have tried using several regex expressions to no avail. The output that I am looking for will ideally look like this:
This is the regex that I am currently using, but it does not quite accomplish what I am looking for.
string[] p = Regex.Split(parameterList[selectedIndex], @"[\r\n]+(?=PA41)");
If you need any clarifications, please feel free to ask.
You're trying a positive look-ahead; you want a negative one. (Positive ensures that the pattern does follow, whereas negative ensures it does not.)
(\\r\\n)(?!PA42)
Works for me.
string[] splitArray = Regex.Split(subjectString, @"\\r\\n(?!PA42)");
This should work. It uses a negative lookahead assertion to ensure that a \r\n sequence is not followed by PA42.
Explanation :
@"
\\ # Match the character “\” literally
r # Match the character “r” literally
\\ # Match the character “\” literally
n # Match the character “n” literally
(?! # Assert that it is impossible to match the regex below starting at this position (negative lookahead)
PA42 # Match the characters “PA42” literally
)
"
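Note that the pattern explained above matches the four literal characters \, r, \, n. If the string holds real carriage-return/line-feed characters instead, single backslashes are needed in the verbatim string. A minimal sketch, assuming real CR LF characters and a made-up input (the original line from the question is not shown):

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input: PA41 records, two of them followed by a PA42 line.
string line = "PA41 first\r\nPA42 continuation\r\nPA41 second\r\nPA41 third\r\nPA42 more";

// Split at every CR LF that is NOT immediately followed by "PA42".
string[] parts = Regex.Split(line, @"\r\n(?!PA42)");

Console.WriteLine(parts.Length); // 3
foreach (string part in parts)
    Console.WriteLine(part.Replace("\r\n", " ")); // join each PA41/PA42 pair on one line
```

Each element still contains the inner \r\n between a PA41 line and its PA42 line; the Replace call is only there to print the pair on one line.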
How would a regex pattern to match each line of a given text be?
I'm trying ^(.+)$ but it includes crlf...
Just use RegexOptions.Multiline.
Multiline mode. Changes the meaning of
^ and $ so they match at the beginning
and end, respectively, of any line,
and not just the beginning and end of
the entire string.
Example:
var lineMatches = Regex.Matches("Multi\r\nlines", "^(.+)$", RegexOptions.Multiline);
I'm not sure what you mean by "match each line of a given text", but you can use a character class to exclude the CR and LF characters:
[^\r\n]+
The wording of your question seems a little unclear, but it sounds like you want RegexOptions.Multiline (in the System.Text.RegularExpressions namespace). It's an option you have to set on your Regex object. That should make ^ and $ match the beginning and end of each line rather than of the entire string.
For example:
Regex re = new Regex("^(.+)$", RegexOptions.Compiled | RegexOptions.Multiline);
Have you tried:
^(.+)\r?\n$
That way the match group includes everything except the CRLF, and requires that a new line be present (Unix default), but accepts the carriage return in front (Windows default).
I assume you're using the Multiline option? In that case you'll want to match the newline explicitly with "\n". (substitute "\r\n" as appropriate.)
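A short sketch comparing two of the suggestions above on text with Windows line endings. In .NET, $ with Multiline matches just before \n, and . matches \r, so the plain ^(.+)$ still keeps the carriage return; the character class does not.

```csharp
using System;
using System.Text.RegularExpressions;

string text = "Multi\r\nlines";

// ^(.+)$ with Multiline: the first match is "Multi" plus the trailing '\r'.
MatchCollection withDot = Regex.Matches(text, @"^(.+)$", RegexOptions.Multiline);

// Excluding CR and LF explicitly leaves only the visible characters.
MatchCollection withClass = Regex.Matches(text, @"^[^\r\n]+", RegexOptions.Multiline);

Console.WriteLine(withDot[0].Value.Length);   // 6
Console.WriteLine(withClass[0].Value.Length); // 5
```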
I am trying to make a regex that matches all occurrences of words that are at the start of a line and begin with #.
For example in:
#region #like
#hey
It would match #region and #hey.
This is what I have right now:
^#\w*
I apologize for posting this question. I'm sure it has a very simple answer, but I have been unable to find it. I admit that I am a regex noob.
What you've got should work, depending on what flags you pass for RegexOptions. You need to make sure you pass RegexOptions.Multiline:
var matches = Regex.Matches(input, @"^#\w*", RegexOptions.Multiline);
See the documentation I linked to above:
Multiline: Multiline mode. Changes the meaning of ^ and $ so they match at the beginning and end, respectively, of any line, and not just the beginning and end of the entire string.
The regex looks fine. Make sure you're using a verbatim string literal (@ prefix) to define your regex, i.e. @"^#\w*"; otherwise the backslash will be treated as the start of a C# escape sequence.
Use this regex
^#.+?\b
.+ will ensure at least one character after #, and \b indicates a word boundary. ? adds non-greediness to the + operator so as to avoid matching the whole string #region #like
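As a quick check of the original pattern ^#\w* with RegexOptions.Multiline, here is a sketch that uses the example text from the question (the \r\n line ending is my assumption):

```csharp
using System;
using System.Text.RegularExpressions;

string input = "#region #like\r\n#hey";

// Multiline lets ^ match at the start of each line, not only at the start of the string.
MatchCollection matches = Regex.Matches(input, @"^#\w*", RegexOptions.Multiline);

foreach (Match m in matches)
    Console.WriteLine(m.Value);
// #region
// #hey
```

Without RegexOptions.Multiline, only #region would be found, because ^ would match only at the very start of the input.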