Replace exact word with an optional letter prefix - c#

I need to replace the word PARAM_DATETIME in the string:
string input = "^FT734,274^A0I,28,28^FH\\^FDPARAM_DATETIME^FS";
I'm trying with:
string newstr = Regex.Replace("^FT734,274^A0I,28,28^FH\\^FDPARAM_DATETIME^FS", @"\bPARAM_DATETIME\b", "27-01-2022");
but it doesn't work.
The goal is to match the word PARAM_DATETIME even when it is preceded, at the start of the word, by F followed by any uppercase letter.

You can use
Regex.Replace(text, @"\b(F[A-Z])?PARAM_DATETIME\b", "${1}27-01-2022")
See the regex demo. Details:
\b - a word boundary
(F[A-Z])? - Group 1 (optional): F and then any one ASCII uppercase letter
PARAM_DATETIME - a word
\b - a word boundary
The match is replaced with Group 1 value (${1}) and the hardcoded string.
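As a minimal sketch of the replacement described above, the following wraps it in a small helper. The class name ZplTemplate and the method name FillDate are hypothetical, chosen only for illustration; the pattern and the ${1} substitution are the ones from this answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ZplTemplate // hypothetical helper name
{
    // Optional Group 1 = "F" + one ASCII uppercase letter, then the placeholder as a whole word.
    private static readonly Regex Placeholder =
        new Regex(@"\b(F[A-Z])?PARAM_DATETIME\b");

    public static string FillDate(string template, string value)
    {
        // ${1} writes the captured prefix back (it is empty when Group 1 did not participate).
        return Placeholder.Replace(template, "${1}" + value);
    }

    public static void Main()
    {
        string input = "^FT734,274^A0I,28,28^FH\\^FDPARAM_DATETIME^FS";
        Console.WriteLine(FillDate(input, "27-01-2022"));
        // prints ^FT734,274^A0I,28,28^FH\^FD27-01-2022^FS
    }
}
```

Note that the value is concatenated into a replacement pattern, so a value containing $ would be read as substitution syntax; in that case a MatchEvaluator callback is probably the safer choice.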

If you only need to change the word PARAM_DATETIME, isn't it easier to use String.Replace?
string input = "^FT734,274^A0I,28,28^FH\\^FDPARAM_DATETIME^FS";
input = input.Replace("PARAM_DATETIME", "27-01-2022");

Related

How to negate filename after a specific term in a regex

I have a regex that detects URLs:
@"((http|ftp|https)\:\/\/)?([\w_-]+(?:(?:\.[\w_-]+)+))([\w.,@?^=%&:/~+#-]*[\w@?^=%&/~+#-])?";
I am using it with Regex.Replace to remove URLs from text.
I do not want it to replace any word that starts with /images.
For example, if the text is "this is my text here is a link http://dfdf.com and my is /images/dd.gif",
I need http://dfdf.com replaced but not /images/dd.gif.
My regex replaces the dd.gif,
so I want to exclude anything after images/.
Any idea how I can fix this?
You may start matching after a word boundary, and fail the match if it is immediately preceded with a whole "word" images/ using
\b(?<!\bimages/)(?:(?:http|ftp)s?://)?([\w-]+(?:\.[\w-]+)+)([\w.,@?^=%&:/~+#-]*[\w@?^=%&/~+#-])?
See the regex demo. Details:
\b - a word boundary
(?<!\bimages/) - no images/ as a whole word is allowed immediately on the left
(?:(?:http|ftp)s?://)? - an optional sequence of either http or ftp followed with an optional s and then :// substring
([\w-]+(?:\.[\w-]+)+) - Group 1: one or more word or hyphen chars followed with one or more sequences of a . and then one or more word or hyphen chars
([\w.,@?^=%&:/~+#-]*[\w@?^=%&/~+#-])? - an optional Group 2: zero or more word chars or chars from the .,@?^=%&:/~+#- set and then a word char or a char from the @?^=%&/~+#- set.
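A short sketch of how this pattern might be used with Regex.Replace on the text from the question. The class name UrlStripper and the method name RemoveUrls are hypothetical; the character classes are written with @, which is assumed to be what the original pattern intended.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlStripper // hypothetical helper name
{
    // \b plus the negative lookbehind rejects a match that starts right after "images/".
    private static readonly Regex Url = new Regex(
        @"\b(?<!\bimages/)(?:(?:http|ftp)s?://)?([\w-]+(?:\.[\w-]+)+)([\w.,@?^=%&:/~+#-]*[\w@?^=%&/~+#-])?");

    public static string RemoveUrls(string text)
    {
        // Every match is a URL-like token, so it is replaced with an empty string.
        return Url.Replace(text, string.Empty);
    }

    public static void Main()
    {
        var s = "this is my text here is a link http://dfdf.com and my is /images/dd.gif";
        // The URL is removed; /images/dd.gif is left in place.
        Console.WriteLine(RemoveUrls(s));
    }
}
```

The space on each side of the removed URL remains, so the output contains a double space where the link used to be.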
As an alternative solution, you could match what you don't want to remove and capture what you do want to remove.
You can use a callback with Replace and test for the existence of group 1. If it is there, return an empty string. If it is not there, return the match to leave it unchanged.
\S*/images\S*|(?<!\S)((?:(?:https?|ftp)://)?[\w-]+(?:(?:\.[\w-]+)+)(?:[\w.,@?^=%&:/~+#-]*[\w@?^=%&/~+#-])?)
Explanation
\S*/images\S* Match /images preceded and followed by optional non-whitespace chars that you want to keep
| Or
(?<!\S) Assert a whitespace boundary to the left
((?:(?:https?|ftp)://)?[\w-]+(?:(?:\.[\w-]+)+)(?:[\w.,@?^=%&:/~+#-]*[\w@?^=%&/~+#-])?) The pattern that you tried with some minor changes to make it a bit shorter
Regex demo (Click on the Table tab to see the matches)
For example
var s = @"this is my text here is a link http://dfdf.com and my is /images/dd.gif";
var regex = new Regex(@"\S*/images\S*|(?<!\S)((?:(?:https?|ftp)://)?[\w-]+(?:(?:\.[\w-]+)+)(?:[\w.,@?^=%&:/~+#-]*[\w@?^=%&/~+#-])?)");
var result = regex.Replace(s, match => match.Groups[1].Success ? "" : match.Value);
Console.WriteLine(result);
See a C# demo

Alternate regex with -SDR?

I have the following regex in my c#:
(?<!\w)M20A\w+
Actual code:
string regex = $@"(?<!\w){prefix}\w+";
Notice the prefix var matches strings such as M20A and X50G.
It perfectly matches the following cases:
M20A0820
M20A1234
M20A7U8V
But now I got a new requirement from the business to match, for example:
M20A-SDR
It will be the prefix followed by the exact string "-SDR". Not just a dash followed by 3 alphanumerics, but literally "-SDR". The existing matches need to still work, but prefix + "-SDR" must also be matched.
What would be the regex that would match the following:
M20A0820
M20A1234
M20A7U8V
M20A-SDR
You may use
string regex = $@"(?<!\w){prefix}\w*(?:-SDR)?";
See the regex demo.
Or, to match as a whole word, you may use word boundaries:
string regex = $@"\b{prefix}\w*(?:-SDR)?\b";
See this regex demo
The \b word boundary at the start will work if all the values in prefix start with a word char (a letter, digit or _). The word boundary at the end makes sense if no more word chars can follow -SDR.
The (?:-SDR)? will match a -SDR string optionally.
Details
\b - word boundary
M20A - a literal string
\w* - 0+ word chars
(?:-SDR)? - a non-capturing group that matches 1 or 0 times (as there is a ? after it) an -SDR substring
\b - a word boundary.
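The word-boundary variant above can be sketched as a small factory that builds the regex from a prefix. The names CodeMatcher and Build are hypothetical, and the Regex.Escape call is an added precaution (an assumption, not part of the original answer) in case a prefix ever contains regex metacharacters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CodeMatcher // hypothetical helper name
{
    public static Regex Build(string prefix)
    {
        // Regex.Escape is an assumption added here; the original interpolates the prefix directly.
        return new Regex($@"\b{Regex.Escape(prefix)}\w*(?:-SDR)?\b");
    }

    public static void Main()
    {
        Regex regex = Build("M20A");
        foreach (string sample in new[] { "M20A0820", "M20A1234", "M20A7U8V", "M20A-SDR" })
        {
            // Each sample from the question is matched in full.
            Console.WriteLine(regex.Match(sample).Value);
        }
    }
}
```

Because \w+ became \w*, this version also matches the bare prefix (M20A on its own), which the original \w+ pattern did not.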

Regex that removes the 2 trailing letters from a string not preceded with other letters

This is in C#. I've been racking my brain but no luck so far.
So for example
123456BVC --> 123456BVC (keep the same)
123456BV --> 123456 (remove trailing letters)
12345V --> 12345V (keep the same)
12345 --> 12345 (keep the same)
ABC123AB --> ABC123 (remove trailing letters)
It can start with anything.
I've tried @".*[a-zA-Z]{2}$" but no luck
This is in C# so that I always return a string removing the two trailing letters if they do exist and are not preceded with another letter.
Match result = Regex.Match(mystring, pattern);
return result.Value;
Your @".*[a-zA-Z]{2}$" regex matches any 0+ characters other than a newline (as many as possible) and 2 ASCII letters at the end of the string. You do not check the context, so the 2 letters are matched regardless of what comes before them.
You need a regex that will match the last two letters not preceded with a letter:
(?<!\p{L})\p{L}{2}$
See this regex demo.
Details:
(?<!\p{L}) - fails the match if a letter (\p{L}) is found before the current position (you may use [a-zA-Z] if you only want to deal with ASCII letters)
\p{L}{2} - 2 letters
$ - end of string.
In C#, use
var result = Regex.Replace(mystring, @"(?<!\p{L})\p{L}{2}$", string.Empty);
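To check the lookbehind pattern against the samples from the question, here is a minimal sketch. The class name TrailingLetters and the method name StripPair are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrailingLetters // hypothetical helper name
{
    // Two letters at the end of the string that are not preceded by another letter.
    private static readonly Regex Pair = new Regex(@"(?<!\p{L})\p{L}{2}$");

    public static string StripPair(string value)
    {
        // When the pattern does not match, Replace returns the input unchanged.
        return Pair.Replace(value, string.Empty);
    }

    public static void Main()
    {
        foreach (string sample in new[] { "123456BVC", "123456BV", "12345V", "12345", "ABC123AB" })
        {
            Console.WriteLine(sample + " --> " + StripPair(sample));
        }
    }
}
```

Using Replace rather than Match means an input without a qualifying letter pair comes back as-is instead of producing an empty match.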
If you're looking to remove the last two letters regardless of what precedes them, you can simply do this:
string result = Regex.Replace(originalString, @"[A-Za-z]{2}$", string.Empty);
Remember that in regex $ matches at the end of the input or just before a final newline.

Regex search for string like "$12,56,45" using c#

I want to search for a string like "$12,56,450" using Regex in C#, but my pattern doesn't match it.
Here is my code:
string input = "Total earn for the year $12,56,450";
string pattern = @"\b(?mi)($12,56,450)\b";
Regex regex = new Regex(pattern);
if (regex.Match(input).Success)
{
return true;
}
This Regex will do the job, (?mi)(\$\d{2},\d{2},\d{3}), and here's a Regex 101 to prove it.
Now let's break it down a little:
\$ matches the literal $ at the beginning of the amount
\d{2} matches any two digits
, matches the literal ,
\d{2} matches any two digits
, matches the literal ,
\d{3} matches any three digits
Now, for the purposes of the demonstration I removed the word boundaries, \b, and you don't need them here anyway. In fact, the leading \b can never match in your input, because neither $ nor the space before it is a word character. Consider where a word boundary can occur:
Before the first character in the string, if the first character is a word character.
After the last character in the string, if the last character is a word character.
Between two characters in the string, where one is a word character and the other is not a word character.
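A quick check of that definition against the input from the question, as a minimal sketch. The class name CurrencyBoundaryDemo and its members are hypothetical; the second pattern is the one proposed in this answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CurrencyBoundaryDemo // hypothetical helper name
{
    public const string Input = "Total earn for the year $12,56,450";

    // Escaped $, but still wrapped in \b ... \b as in the question.
    public static bool WithLeadingBoundary()
    {
        return Regex.IsMatch(Input, @"\b\$12,56,450\b");
    }

    // The pattern from the answer: escaped $ and no word boundaries.
    public static bool WithoutBoundaries()
    {
        return Regex.IsMatch(Input, @"(?mi)(\$\d{2},\d{2},\d{3})");
    }

    public static void Main()
    {
        Console.WriteLine(WithLeadingBoundary()); // prints False: no boundary between ' ' and '$'
        Console.WriteLine(WithoutBoundaries());   // prints True
    }
}
```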
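You need to escape $ and the other special regex characters. Also drop the leading \b: $ is not a word character, so there is no word boundary between the preceding space and $.
Try this: @"(?mi)(\$12,56,450)\b";
If you want, you can use \d to match a digit, and \d{2,3} to match a run of two or three digits.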

regex check for white space in middle of string

I want to validate that the characters are alpha numeric:
Regex aNum = Regex("[a-z][A-Z][0-9]");
I want to add the option that there might be a white space, so it would be a two word expression:
Regex aNum = Regex("[a-z][A-Z][0-9]["\\s]");
but couldn't find the correct syntax.
I'd appreciate any insight.
[A-Za-z0-9\s]{1,} should work for you. It matches any string which contains alphanumeric or whitespace characters and is at least one char long. If you accept underscores too, you can shorten it to [\w\s]{1,}.
You should add ^ and $ to verify the whole string matches and not only a part of the string:
^[A-Za-z0-9\s]{1,}$ or ^[\w\s]{1,}$.
Exactly two words with single space:
Regex aNum = new Regex(@"^[a-zA-Z0-9]+\s[a-zA-Z0-9]+$");
OR any number of words having any number of spaces:
Regex aNum = new Regex(@"^[a-zA-Z0-9\s]+$");
"[A-Za-z0-9\s]*"
matches alphanumeric characters and whitespace. If you want a word that can contain whitespace but want to ensure it starts and ends with an alphanumeric character you could try
"[A-Za-z0-9][A-Za-z0-9\s]*[A-Za-z0-9]|[A-Za-z0-9]"
To not allow empty strings then
Regex.IsMatch(s ?? "",@"^[\w\s]+$");
and to allow empty strings
Regex.IsMatch(s ?? "",@"^[\w\s]*$");
I added the ?? "" as IsMatch does not accept null arguments
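Putting the pieces from these answers together, here is a minimal sketch of a validator that allows whitespace only in the middle. The class name AlnumPhrase and the method name IsValid are hypothetical; the pattern is the "starts and ends with an alphanumeric character" alternation from above, with ^ and $ anchors and the null guard added here as assumptions about the intended use.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlnumPhrase // hypothetical helper name
{
    // Either a single alphanumeric char, or alphanumeric at both ends
    // with alphanumerics and whitespace in between.
    private static readonly Regex Pattern = new Regex(
        @"^(?:[A-Za-z0-9][A-Za-z0-9\s]*[A-Za-z0-9]|[A-Za-z0-9])$");

    public static bool IsValid(string input)
    {
        // IsMatch throws on null, so null is treated as an empty string.
        return Pattern.IsMatch(input ?? "");
    }

    public static void Main()
    {
        Console.WriteLine(IsValid("hello world")); // prints True
        Console.WriteLine(IsValid("hello "));      // prints False
        Console.WriteLine(IsValid(null));          // prints False
    }
}
```

Unlike the [\w\s] variants, this one rejects underscores and leading or trailing whitespace, which may or may not be what the validation needs.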
If you want to check for white space in the middle of a string, you can use these patterns:
"(\w\s)+" : matches a word character followed by a whitespace character, one or more times, so the input must contain at least one whitespace right after a word character.
"(\w\s)+$" : the same, but the match must reach the end of the string, so the string must finish with whitespace.
"[\w\s]+" : matches any run of word characters, whitespace, or both.
