Escaping characters in RegEx pattern string [duplicate] - c#

This question already has answers here:
Escape Special Character in Regex
(3 answers)
Closed 7 years ago.
I'm trying to extract MTQ0ODQ3NjcyNDoxNDQ4NDc2NzI0OjE6LTM4OTc1OTc2MjM4MDc1OTM2NjY6MTQ0ODQ3NjAwMzowOjA6NTQw from the string below.
I am having issues with the \\ (backslash) characters. How do I escape these in C#? Is there any documentation that shows which characters need escaping in regex patterns, and how to escape them?
first_cursor\\&quot;:\\&quot;MTQ0ODQ3NjcyNDoxNDQ4NDc2NzI0OjE6LTM4OTc1OTc2MjM4MDc1OTM2NjY6MTQ0ODQ3NjAwMzowOjA6NTQw\\&quot;
I've tried the following to no avail. I tried to avoid having to escape the backslashes altogether:
MatchCollection matches = Regex.Matches(content, "first_cursor*.quot;([-0-9A-Za-z]+)");
Any help would be much appreciated.

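In a regular C# string literal, each literal backslash that the regex should match is written as \\\\: the C# compiler reduces it to \\, which the regex engine reads as one escaped backslash.
You can use the following:
MatchCollection matches = Regex.Matches(content, "first_cursor\\\\{2}&quot;:\\\\{2}&quot;([-0-9A-Za-z]+)");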

I prefer to use verbatim string literals when writing RegEx strings in C#:
string pattern = @"first_cursor\\\\&quot;:\\\\&quot;([-0-9A-Za-z]+)\\\\&quot;";
This saves you from escaping the backslashes twice: once for C# and again for the RegEx engine.
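To check that the two literal styles describe the same pattern, a minimal sketch along these lines can help. It assumes the input contains two literal backslashes followed by the entity &quot; before and after the value; the class name, the helper and the shortened sample value are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CursorDemo
{
    // Regular literal: each regex-level backslash is doubled again for C#.
    public const string Regular = "first_cursor\\\\{2}&quot;:\\\\{2}&quot;([-0-9A-Za-z]+)";

    // Verbatim literal: only the regex-level escaping remains.
    public const string Verbatim = @"first_cursor\\{2}&quot;:\\{2}&quot;([-0-9A-Za-z]+)";

    // Hypothetical helper: returns the captured value, or an empty string if nothing matched.
    public static string Extract(string content)
    {
        Match m = Regex.Match(content, Verbatim);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }
}
```

Both constants hold the same characters at run time, so either can be passed to Regex.Match; the verbatim form is simply easier to read.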
As an aside, this syntax is also useful when storing paths in strings:
string logFile = @"C:\Temp\mylog.txt";
And it even supports multi-line strings, for SQL commands and such:
string query = @"
SELECT *
FROM tblStudents
WHERE FirstName = 'Bobby'
AND LastName = 'Tables'
";

You can use negative lookbehind to eliminate some of the contenders:
var example = @"first_cursor\\&quot;:\\&quot;MTQ0ODQ3NjcyNDoxNDQ4NDc2NzI0OjE6LTM4OTc1OTc2MjM4MDc1OTM2NjY6MTQ0ODQ3NjAwMzowOjA6NTQw\\&quot;";
var regex = new Regex("(?<!&[-0-9A-Za-z]*)(?<!_[-0-9A-Za-z]*)[-0-9A-Za-z]+");
var matches = regex.Matches(example);
foreach (Match match in matches)
{
    if (match.Value != "first")
    {
        Console.WriteLine(match.Value);
    }
}
This gives you two matches: one for "first" and one for the string you are looking for. Iterate over the matches and skip "first"; the remaining match is the value you want.

Related

Regex replace problems with \\ [duplicate]

This question already has answers here:
Regular expression to allow backslash in C#
(3 answers)
Closed 3 years ago.
I created my own regex, and everything works fine except the backslash handling. I tried several versions, but none of them helped.
var regexItem = new Regex("[^A-Za-z0-9_.,&/*:;=+{}()'\"\\ -]+");
string temp2 = "";
while ((@line = file2.ReadLine()) != null)
{
    if (regexItem.IsMatch(line) && (line.Contains(".jpg") || line.Contains(".png") || line.Contains(".jpeg") || line.Contains(".svg")))
    {
        @temp2 = Regex.Replace(line, "[^A-Za-z0-9_.,&/\\*:;=+{}()'\" -]+", "");
        postki.WriteLine(@temp2);
        Console.WriteLine(@"{0} ==> {1}", @line, @temp2);
    }
    else
    {
        postki.WriteLine(@line);
    }
}
In C# string literals, special characters (like ") need to be escaped. C# uses the \ to escape those characters.
For example - as you already did - a string containing a " would be declared like that:
string quote = "\"";
So of course the backslash itself needs escaping, too. A string containing a backslash would look like this:
string backslash = "\\";
So with two backslashes in the literal, you have one real backslash in the string.
But a backslash is also a special character in regular expressions (it is used to escape other symbols; for example, to match a literal . instead of the any-character dot, you use \.). So to match a single literal backslash in the search string, you need 4 backslashes in your C# string literal:
string regexBackSlash = "\\\\";
In your regex you probably meant:
var regexItem = new Regex("[^A-Za-z0-9_.,&/*:;=+{}()'\"\\\\ -]+");
// difference here: \\\\ (four backslashes) instead of \\
Or you may use verbatim string literals (with a leading @):
var regexItem = new Regex(@"[^A-Za-z0-9_.,&/*:;=+{}()'""\\ -]+");
(but then you need to escape the " by doubling it).
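A small sketch can confirm that both spellings produce the same regex and that the character class now keeps backslashes. The class name, the Clean helper and the sample file name are illustrative assumptions, not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BackslashDemo
{
    // A regex that matches exactly one literal backslash, written both ways.
    public const string Regular = "\\\\";  // C# reduces this to the two characters \\
    public const string Verbatim = @"\\";  // the same two characters, no C# escaping

    // Hypothetical helper: removes every character that is not in the allowed set.
    // The \\ inside the class puts the backslash itself into the allowed set.
    public static string Clean(string line)
    {
        return Regex.Replace(line, @"[^A-Za-z0-9_.,&/*:;=+{}()'""\\ -]+", "");
    }
}
```

With this class, a backslash in the input survives Clean, while a character outside the set (such as #) is removed.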

RegEx in C# Replace Method [duplicate]

This question already has answers here:
C#: How to Delete the matching substring between 2 strings?
(6 answers)
Closed 4 years ago.
I am trying to write the RegEx for replacing the "name" part in the string below.
\profile\name\details
Where name:
- can have special characters
- has no spaces
Let's say I want to replace "name" in the above path with ABCD; the result would be
\profile\ABCD\details
What would be the RegEx to be used in Replace for this?
I have tried [a-zA-Z0-9@#$%&*+\-_(),+':;?.,!\[\]\s\\/]+$ but it's not working.
As your dynamic part is surrounded by two static parts, you can use them to find it.
\\profile\\(.*)\\details
Now if you want to replace only the middle part, you can either use lookarounds:
string pattern = @"(?<=\\profile\\).*(?=\\details)";
string substitution = @"titi";
string input = @"\profile\name\details
\profile\name\details
";
RegexOptions options = RegexOptions.Multiline;
Regex regex = new Regex(pattern, options);
string result = regex.Replace(input, substitution);
Or use the replacement pattern $GroupIndex:
string pattern = @"(\\profile\\)(.*)(\\details)";
string substitution = @"$1Replacement$3";
string input = @"\profile\name\details
\profile\name\details
";
RegexOptions options = RegexOptions.Multiline;
Regex regex = new Regex(pattern, options);
string result = regex.Replace(input, substitution);
For readability, named-group substitution is also a possibility.
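A named-group variant might look like the sketch below. The group names prefix and suffix, the class name and the helper are assumptions chosen for illustration; ${name} is the .NET substitution syntax for a named group, and $$ produces a literal dollar sign.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PathDemo
{
    // Replaces whatever sits between \profile\ and \details with newName.
    public static string ReplaceName(string path, string newName)
    {
        string pattern = @"(?<prefix>\\profile\\).*(?<suffix>\\details)";

        // The name may contain special characters, so any $ is doubled
        // to stop it being read as part of a substitution.
        string replacement = "${prefix}" + newName.Replace("$", "$$") + "${suffix}";

        return Regex.Replace(path, pattern, replacement);
    }
}
```

For example, PathDemo.ReplaceName(@"\profile\name\details", "ABCD") yields \profile\ABCD\details.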

Split a string by Regex [duplicate]

This question already has answers here:
Regular expression to extract text between square brackets
(15 answers)
Closed 5 years ago.
I'm currently working out how to split this kind of string using regex in C#.
[01,01,01][02,03,00][03,07,00][04,06,00][05,02,00][06,04,00][07,08,00][08,05,00]
Can someone knowledgeable about regex point me to how to achieve this?
Sample regex pattern that doesn't work:
[\dd,\dd,\dd]
sample output:
[01,01,01]
[02,03,00]
[03,07,00]
[04,06,00]
[05,02,00]
[06,04,00]
[07,08,00]
[08,05,00]
This will do the job in C# (\[.+?\]), e.g.:
var s = @"[01,01,01][02,03,00][03,07,00][04,06,00][05,02,00][06,04,00][07,08,00][08,05,00]";
var reg = new Regex(@"(\[.+?\])");
var matches = reg.Matches(s);
foreach(Match m in matches)
{
Console.WriteLine($"{m.Value}");
}
EDIT: This is how the expression (\[.+?\]) works:
first, the outer parentheses, ( and ), capture whatever the inside pattern matched
then the escaped square brackets, \[ and \], match the [ and ] in the source string
finally, .+? matches one or more characters, but as few as possible, so that it won't match everything between the first [ and the last ]
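The effect of the lazy quantifier is easiest to see by running the greedy and lazy forms side by side. This is only a sketch; the class and method names are invented for the comparison.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BracketDemo
{
    // Returns the text of every match of pattern in input.
    public static string[] Find(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }
}
```

Calling BracketDemo.Find("[01,01,01][02,03,00]", @"\[.+?\]") returns the two bracketed groups separately, whereas the greedy @"\[.+\]" returns a single match spanning from the first [ to the last ].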
I know you stipulated Regex; however, it's worth looking at Split again, if only for academic purposes:
Code
var input = "[01,01,01][02,03,00][03,07,00][04,06,00][05,02,00][06,04,00][07,08,00][08,05,00]";
var output = input.Split(']',StringSplitOptions.RemoveEmptyEntries)
.Select(x => x + "]") // add the bracket back
.ToList();
foreach(var o in output)
Console.WriteLine(o);
Output
[01,01,01]
[02,03,00]
[03,07,00]
[04,06,00]
[05,02,00]
[06,04,00]
[07,08,00]
[08,05,00]
The Regex solution below is restricted to 3 values of exactly 2 digits each, separated by commas. Inside the foreach loop you can access the matching value via match.Value.
Remember to include using System.Text.RegularExpressions;
var input = "[01,01,01][02,03,00][03,07,00][04,06,00][05,02,00][06,04,00][07,08,00][08,05,00]";
foreach (Match match in Regex.Matches(input, @"(\[\d{2},\d{2},\d{2}\])"))
{
// do stuff
}
Thanks all for the answers. I also got it working by using this code:
string pattern = @"\[\d\d,\d\d,\d\d]";
Regex rgx = new Regex(pattern, RegexOptions.IgnoreCase);
MatchCollection matches = rgx.Matches(myResult);
Debug.WriteLine(matches.Count);
foreach (Match match in matches)
Debug.WriteLine(match.Value);

Replace text place holders with Regular Expression [duplicate]

This question already has answers here:
Extract string between braces using RegEx, ie {{content}}
(3 answers)
Closed 6 years ago.
I have a text template that has text variables wrapped with {{ and }}.
I need a regular expression that gives me all the matches, including the {{ and }} themselves.
For example if I have {{FirstName}} in my text I want to get {{FirstName}} back as a match to be able to replace it with the actual variable.
I already found a regular expression that probably gives me what is INSIDE { and } but I don't know how can I modify it to return what I want.
/\{([^)]+)\}/
This pattern should do the trick:
string str = "{{FirstName}} {{LastName}}";
Regex rgx = new Regex("{{.*?}}");
foreach (var match in rgx.Matches(str))
{
// {{FirstName}}
// {{LastName}}
}
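Since the goal is to replace each placeholder with an actual value, one possible next step is Regex.Replace with a match evaluator. The sketch below assumes the values live in a dictionary keyed by placeholder name; the class name, the Fill method and the sample values are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TemplateDemo
{
    // Replaces each {{Name}} with values["Name"]; unknown placeholders are left as they are.
    public static string Fill(string template, IDictionary<string, string> values)
    {
        // The whole match includes the braces; group 1 is the name inside them.
        return Regex.Replace(template, @"\{\{(.*?)\}\}", m =>
        {
            string key = m.Groups[1].Value;
            string value;
            return values.TryGetValue(key, out value) ? value : m.Value;
        });
    }
}
```

Here m.Value is the full {{FirstName}} text, which is what the question asks to get back, and m.Groups[1].Value is the bare name used for the lookup.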
Maybe:
alert(/^\{{2}[\w|\s]+\}{2}$/.test('{{FirstName}}'))
^: At the beginning.
$: At the end.
\{{2}: Character { 2 times.
[\w|\s]+: Word characters (letters, digits, underscore), whitespace, or a literal | (inside a character class, | is not alternation), 1 or more times.
\}{2}: Character } 2 times.
UPDATE:
alert(/(^\{{2})?[\w|\s]+(\}{2})?$/.test('FirstName'))

C# Regex for retrieving capital string in quotation mark

Given a string, I want to retrieve a string that is in between the quotation marks, and that is fully capitalized.
For example, if a string of
oqr"awr"q q"ASRQ" asd "qIKQWIR"
has been entered, the regex would only evaluate "ASRQ" as matching string.
What is the best way to approach this?
Edit: Forgot to mention the string takes a numeric input as well I.E: "IO8917AS" is a valid input
EDIT: If you actually want "one or more characters, and none of the characters is a lower-case letter" then you probably want:
Regex regex = new Regex("\"\\P{Ll}+\"");
That will then allow digits as well... and punctuation. If you want to allow digits and upper case letters but nothing else, you can use:
Regex regex = new Regex("\"[\\p{Lu}\\d]+\"");
Or in verbatim string literal form (makes the quotes more confusing, but the backslashes less so):
Regex regex = new Regex(@"""[\p{Lu}\d]+""");
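The difference between the two patterns shows up on input that contains punctuation. A rough comparison, with a class name and field names made up for the purpose:

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotedUpperDemo
{
    // Anything except lower-case letters between the quotes (digits and punctuation pass).
    public static readonly Regex NoLowercase = new Regex(@"""\P{Ll}+""");

    // Only upper-case letters and digits between the quotes.
    public static readonly Regex UpperOrDigit = new Regex(@"""[\p{Lu}\d]+""");
}
```

A quoted value such as "IO-8917" is accepted by NoLowercase but rejected by UpperOrDigit because of the hyphen, while "IO8917AS" is accepted by both.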
Original answer (before digits were required)
Sounds like you just want (within the pattern)
"[A-Z]*"
So something like:
Regex regex = new Regex("\"[A-Z]*\"");
Or for full Unicode support, use the Lu Unicode character category:
Regex regex = new Regex("\"\\p{Lu}*\"");
EDIT: As noted, if you don't want to match an empty string in quotes (which is still "a string where everything is upper case") then use + instead of *, e.g.
Regex regex = new Regex("\"\\p{Lu}+\"");
Short but complete example of finding and displaying the first match:
using System;
using System.Text.RegularExpressions;

class Program
{
    public static void Main()
    {
        Regex regex = new Regex("\"\\p{Lu}+\"");
        string text = "oqr\"awr\"q q\"ASRQ\" asd \"qIKQWIR\"";
        Match match = regex.Match(text);
        Console.WriteLine(match.Success); // True
        Console.WriteLine(match.Value); // "ASRQ"
    }
}
Like this:
"\"[A-Z]+\""
The outermost quotes are not part of the regex, they delimit a C# string.
This requires at least one uppercase character between quotes and works for the English language.
Please try the following:
[\w]*"([A-Z0-9]+)"
