I need some help with regex. I need to find a word that is surrounded by some delimiter, for example *. But I need to match it only if it has spaces or nothing on either side. For example, if it is at the start of the text there can't be a space before it, and the same goes for the end.
Here is what I came up with:
string myString = "You will find *me*, and *me* also!";
string findString = @"(\*(.*?)\*)";
string foundText;
MatchCollection matchCollection = Regex.Matches(myString, findString);
foreach (Match match in matchCollection)
{
foundText = match.Value.Replace("*", "");
myString = myString.Replace(match.Value, "->" + foundText + "<-");
match.NextMatch();
}
Console.WriteLine(myString);
You will find ->me<-, and ->me<- also!
This works correctly. The problem is when I add a * in the middle of the text; I don't want it to match then.
Example: You will find *m*e*, and *me* also!
Output: You will find ->m<-e->, and <-me* also!
How can I fix that?
Try the following pattern:
string findString = @"(?<=\s|^)\*(.*?)\*(?=\s|$)";
(?<=\s|^)X will match any X only if preceded by a space-char (\s), or the start-of-input, and
X(?=\s|$) matches any X if followed by a space-char (\s), or the end-of-input.
Note that it will not match *me* in foo *me*, bar since the second * has a , after it! If you want to match that too, you need to include the comma like this:
string findString = @"(?<=[\s,]|^)\*(.*?)\*(?=[\s,]|$)";
You'll need to expand the set [\s,] as you see fit, of course. You might want to add !, ? and . at the very least: [\s,!?.] (and no, . and ? do not need to be escaped inside a character-set!).
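As a rough sketch of how the lookaround pattern could be plugged into the original arrow-replacement task: the class name StarMarkup and the method Convert below are made up for illustration, and the set [\s,!?.] is simply the one suggested above. Note that a * in the middle of a word stays inside the match rather than ending it.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StarMarkup
{
    // Opening * must follow whitespace/punctuation or the start of input;
    // closing * must precede whitespace/punctuation or the end of input.
    private const string Pattern = @"(?<=[\s,!?.]|^)\*(.*?)\*(?=[\s,!?.]|$)";

    // Hypothetical helper: wraps each delimited word in -> and <-.
    public static string Convert(string text)
    {
        return Regex.Replace(text, Pattern, "->$1<-");
    }

    public static void Main()
    {
        // You will find ->me<-, and ->me<- also!
        Console.WriteLine(Convert("You will find *me*, and *me* also!"));
        // You will find ->m*e<-, and ->me<- also!
        Console.WriteLine(Convert("You will find *m*e*, and *me* also!"));
    }
}
```

Using Regex.Replace with $1 also avoids the manual string.Replace loop from the question, which replaces every occurrence of the matched text rather than only the matched position.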
EDIT
A small demo:
string Txt = "foo *m*e*, bar";
string Pattern = @"(?<=[\s,]|^)\*(.*?)\*(?=[\s,]|$)";
Console.WriteLine(Regex.Replace(Txt, Pattern, ">$1<"));
which would print:
foo >m*e<, bar
You can add "beginning of line or space" and "space or end of line" around your match:
(^|\s)\*(.*?)\*(\s|$)
You'll now need to refer to the middle capture group for the match string.
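A minimal sketch of this group-based variant, with hypothetical names (GroupVariant, Convert). Because the surrounding whitespace is consumed by groups 1 and 3, it has to be written back in the replacement, and two delimited words separated by a single space cannot both match, since the space is already used up by the first match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupVariant
{
    // Group 1 = leading whitespace (empty at start of input),
    // group 2 = the delimited word,
    // group 3 = trailing whitespace (empty at end of input).
    private const string Pattern = @"(^|\s)\*(.*?)\*(\s|$)";

    public static string Convert(string text)
    {
        // $1 and $3 restore the whitespace that the match consumed.
        return Regex.Replace(text, Pattern, "$1->$2<-$3");
    }

    public static void Main()
    {
        Console.WriteLine(Convert("foo *me* bar")); // foo ->me<- bar
        Console.WriteLine(Convert("*a* *b*"));      // ->a<- *b*
    }
}
```

If adjacent words like the second example matter, the lookaround form is probably the safer choice because it does not consume the whitespace.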
I'm attempting to replace all instances of any special characters between each occurrence of a set of delimiters in a string. I believe the solution will include some combination of a regular expression match to retrieve the text between each set of delimiters and a regular expression replace to replace each offending character within the match with a space. Here’s what I have so far:
string input = "***XX*123456789~N3*123 E. Fake St. Apt# 456~N4*Beverly Hills*CA*902122405~REF*EI*902122405~HL*1*1*50*0~SBR*P*18*******MA~NM1*IL*1*Tom*Thompson*T***MI*123456789A~N3*456 W. False Ave.*Apt. #6B~N4*Beverly Hills*CA*90210~DMG*";
string matchPattern = "(~N3\\*)(.*?)(~N4\\*)";
string replacePattern = "[^0-9a-zA-Z ]?";
var matches = Regex.Matches(input, matchPattern);
foreach (Match match in matches)
{
match.Value = "~N3*" + Regex.Replace(match.Value, replacePattern, " ") + "~N4*";
}
MessageBox.Show(input);
I would expect the message box to show the following:
"***XX*123456789~N3*123 E Fake St Apt 456~N4*Beverly Hills*CA*902122405~REF*EI*902122405~HL*1*1*50*0~SBR*P*18*******MA~NM1*IL*1*Tom*Thompson*T***MI*123456789A~N3*456 W False Ave *Apt 6B~N4*Beverly Hills*CA*90210~DMG*"
Obviously this isn’t working because I can’t assign to the matched value inside the loop, but I hope you can follow my thought process. It is important that any characters which are not between the delimiters remain unchanged. Any direction or advice would be helpful. Thank you so much!
Use a Regex.Replace with a match evaluator where you may call the second Regex.Replace:
string input = "***XX*123456789~N3*123 E. Fake St. Apt# 456~N4*Beverly Hills*CA*902122405~REF*EI*902122405~HL*1*1*50*0~SBR*P*18*******MA~NM1*IL*1*Tom*Thompson*T***MI*123456789A~N3*456 W. False Ave.*Apt. #6B~N4*Beverly Hills*CA*90210~DMG*";
string matchPattern = @"(~N3\*)(.*?)(~N4\*)";
string replacePattern = "[^0-9a-zA-Z ]";
string res = Regex.Replace(input, matchPattern, m =>
    string.Format("{0}{1}{2}",
        m.Groups[1].Value,
        Regex.Replace(m.Groups[2].Value, replacePattern, " "), // Here, you modify only the text inside the first regex's matches
        m.Groups[3].Value));
Console.Write(res); // Just to print the demo result
// Each removed character becomes one space, so runs like ". " turn into two spaces:
// => ***XX*123456789~N3*123 E  Fake St  Apt  456~N4*Beverly Hills*CA*902122405~REF*EI*902122405~HL*1*1*50*0~SBR*P*18*******MA~NM1*IL*1*Tom*Thompson*T***MI*123456789A~N3*456 W  False Ave  Apt   6B~N4*Beverly Hills*CA*90210~DMG*
See the C# demo
Actually, since ~N3* and ~N4* are literal strings, you may use a single capturing group in the pattern and then add those delimiters as hard-coded in the match evaluator, but it is up to you to decide what suits you best.
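The single-group variant mentioned above might look roughly like this. The class name SegmentCleaner, the method Clean, and the short sample input are assumptions made for the sketch; only the delimiters and the character class come from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SegmentCleaner
{
    public static string Clean(string input)
    {
        // Only the text between the delimiters is captured; the literal
        // delimiters are re-added by hand in the match evaluator.
        return Regex.Replace(input, @"~N3\*(.*?)~N4\*", m =>
            "~N3*" + Regex.Replace(m.Groups[1].Value, "[^0-9a-zA-Z ]", " ") + "~N4*");
    }

    public static void Main()
    {
        // Made-up sample; text outside ~N3*...~N4* is left untouched.
        // Prints: ID*1~N3*12 E  Main St ~N4*Town*CA~
        Console.WriteLine(Clean("ID*1~N3*12 E. Main St.~N4*Town*CA~"));
    }
}
```

Each offending character is replaced by exactly one space, so ". " becomes two spaces; collapsing runs of spaces would need an extra step.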
I have a string to parse. First I have to check whether the string contains a special pattern:
I want to know whether there are substrings which start with "$(",
and end with ")",
and between those start and end markers there should not be
any whitespace,
and there should not be a "$" character.
I have a little regex for it in C#
string input = "$(abc)";
string pattern = @"\$\(([^$][^\s]*)\)";
Regex rgx = new Regex(pattern, RegexOptions.IgnoreCase);
MatchCollection matches = rgx.Matches(input);
foreach (var match in matches)
{
Console.WriteLine("value = " + match);
}
It works for many cases but fails for the input $(a$(), where the second expression is empty. I do NOT want a match when the input is $() (that is, when there is nothing between the start and end identifiers).
What is wrong with my regex?
Note: [^$] matches any single character except $.
Use the below regex if you want to match $()
\$\(([^\s$]*)\)
Use the below regex if you don't want to match $(),
\$\(([^\s$]+)\)
* repeats the preceding token zero or more times.
+ repeats the preceding token one or more times.
Your regex \$\(([^$][^\s]*)\) is wrong. It won't allow $ as the first character inside (), but it allows it as the second, third, etc. See the demo here. You need to combine the negated classes in your regex in order to match any character that is neither whitespace nor $.
Your current regex does not match $() because [^$] must match at least one character. The only way I can think of for such a match to occur is when the input contains more than one pair of parentheses, like:
$()(something)
In those cases, you will also need to exclude at least the closing paren:
string pattern = @"\$\(([^$\s)]+)\)";
The above matches for example:
abc in $(abc) and
abc and def in $(def)$()$(abc)(something).
Simply replace the * with a + and merge the options.
string pattern = @"\$\(([^$\s]+)\)";
+ means 1 or more
* means 0 or more
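To see how the merged character class with + behaves, here is a small sketch. The class PlaceholderFinder and its Find method are hypothetical names; the pattern is the variant above that also excludes the closing parenthesis.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PlaceholderFinder
{
    // Returns the text between "$(" and ")" for every non-empty
    // placeholder that contains no whitespace, "$" or ")".
    public static List<string> Find(string input)
    {
        var names = new List<string>();
        foreach (Match m in Regex.Matches(input, @"\$\(([^$\s)]+)\)"))
        {
            names.Add(m.Groups[1].Value);
        }
        return names;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Find("$(abc)")));  // abc
        Console.WriteLine(Find("$()").Count);                 // 0
        Console.WriteLine(Find("$(a$()").Count);              // 0
        Console.WriteLine(string.Join(",", Find("$(def)$()$(abc)(something)"))); // def,abc
    }
}
```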
I am trying to find the correct regex syntax for matching and splitting on a word that is surrounded by double brackets.
const string originalString = "I love to [[verb]] while I [[verb]].";
I tried
var arrayOfStrings = Regex.Split(originalString, @"\[\[(.+)\]\]");
But it did not work correctly. I don't know what I am doing wrong.
I would like the arrayOfStrings to come out like so
arrayOfStrings[0] = "I love to "
arrayOfStrings[1] = "[[verb]]"
arrayOfStrings[2] = " while I "
arrayOfStrings[3] = "[[verb]]"
arrayOfStrings[4] = "."
I think that is what you need.
string input = "I love to [[verb]] while I [[verb]].";
string pattern = @"(\[\[.+?\]\])";
string[] matches = Regex.Split( input, pattern );
foreach (string match in matches)
{
Console.WriteLine(match);
}
The answer which will produce exactly what you want is @"(?=\[\[.*?\]\])|(?<=\]\])".
This has two parts to it, separated by the | "or" symbol.
(?=\[\[.*?\]\]) matches the position that is immediately followed by [[, some characters, and ]], and so splits just before the first [.
(?<=\]\]) matches the position that is immediately preceded by ]], and so splits just after the last ].
These are called "lookahead" and "lookbehind", and you can find more variants of them here.
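A short sketch of the lookaround-based split, using the string from the question. BracketSplitter and SplitKeepingTags are names invented for this example. Since both alternatives are zero-width, nothing is removed from the input; the string is only cut at the matched positions.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BracketSplitter
{
    // Cuts before each [[...]] tag and after each closing ]].
    public static string[] SplitKeepingTags(string text)
    {
        return Regex.Split(text, @"(?=\[\[.*?\]\])|(?<=\]\])");
    }

    public static void Main()
    {
        string[] parts = SplitKeepingTags("I love to [[verb]] while I [[verb]].");
        foreach (string part in parts)
        {
            // Brackets around each piece make the spaces visible.
            Console.WriteLine("<" + part + ">");
        }
    }
}
```

For this input the pieces are "I love to ", "[[verb]]", " while I ", "[[verb]]" and ".". If a tag sits at the very start or end of the input, Regex.Split includes an empty string at that end of the array, which may need filtering.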
I am trying to match a string in the following pattern with a regex.
string text = "'Emma','The Last Leaf','Gulliver's travels'";
string pattern = @"'(.*?)',?";
foreach (Match match in Regex.Matches(text,pattern,RegexOptions.IgnoreCase))
{
Console.WriteLine(match + " " + match.Index);
Console.WriteLine(match.Groups[1].Captures[0]);
}
This matches "Emma" and "The Last Leaf" correctly; however, the third match is "Gulliver", while the desired match is "Gulliver's travels". How can I build a regex for a pattern like this?
Since , is your delimiter, you can try changing your pattern like this. It should work.
string pattern = @"'(.*?)'(?:,|$)";
The way this works is, it looks for a single quote followed by a comma or end of the line.
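A quick sketch of that pattern applied to the sample text; QuotedList and Parse are hypothetical names. It relies on the assumption that a closing quote is always followed by a comma or the end of the input, so an item that itself contains the sequence quote-comma would still be cut short.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QuotedList
{
    public static List<string> Parse(string text)
    {
        var items = new List<string>();
        // The closing quote only counts when a comma or end of input follows it.
        foreach (Match m in Regex.Matches(text, @"'(.*?)'(?:,|$)"))
        {
            items.Add(m.Groups[1].Value);
        }
        return items;
    }

    public static void Main()
    {
        foreach (string item in Parse("'Emma','The Last Leaf','Gulliver's travels'"))
        {
            Console.WriteLine(item);
        }
        // Emma
        // The Last Leaf
        // Gulliver's travels
    }
}
```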
I think this can work as a regular expression: '(.*?)',|'(.*)'
You may also consider using lookbehind/lookahead:
"(?<=^'|',').*?(?='$|',')"
test with grep:
kent$ echo "'Emma','The Last Leaf','Gulliver's travels'"|grep -Po "(?<=^'|',').*?(?='$|',')"
Emma
The Last Leaf
Gulliver's travels
You can't, if you have single-quote delimited strings and Gulliver's contains a single, unescaped quote there's no way to distinguish it from the end of a string. You could always just split it by commas and trim 's from either side but I'm not sure that's what you want:
string text = "'Emma','The Last Leaf','Gulliver's travels'";
foreach (string s in text.Split(new char[] {','})) {
Console.WriteLine(s.Trim('\''));
}
I am trying to create a regex in C# to extract the artist, track number and song title from a filename named like: 01.artist - title.mp3
Right now I can't get the thing to work, and am having problems finding much relevant help online.
Here is what I have so far:
string fileRegex = "(?<trackNo>\\d{1,3})\\.(<artist>[a-z])\\s-\\s(<title>[a-z])\\.mp3";
Regex r = new Regex(fileRegex);
Match m = r.Match(song.Name); // song.Name is the filename
if (m.Success)
{
Console.WriteLine("Artist is {0}", m.Groups["artist"]);
}
else
{
Console.WriteLine("no match");
}
I'm not getting any matches at all, and all help is appreciated!
You might want to put ?'s before the <> tags in all your groupings, and put a + sign after your [a-z]'s, like so:
string fileRegex = "(?<trackNo>\\d{1,3})\\.(?<artist>[a-z]+)\\s-\\s(?<title>[a-z]+)\\.mp3";
Then it should work. The ?'s are required so that the contents of the angled brackets <> are interpreted as a grouping name, and the +'s are required to match 1 or more repetitions of the last element, which is any character between (and including) a-z here.
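A small sketch of the corrected pattern in use. TrackNameParser and Parse are illustrative names, and the file names are made-up samples following the format from the question. It also shows a limitation: [a-z]+ accepts lowercase letters only, so names with capitals or spaces will not match unless the character class is widened.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrackNameParser
{
    // (?<name>...) declares a named group; + lets [a-z] match one or more letters.
    private static readonly Regex FilePattern = new Regex(
        @"(?<trackNo>\d{1,3})\.(?<artist>[a-z]+)\s-\s(?<title>[a-z]+)\.mp3");

    public static Match Parse(string fileName)
    {
        return FilePattern.Match(fileName);
    }

    public static void Main()
    {
        Match m = Parse("01.artist - title.mp3");
        if (m.Success)
        {
            // Track 01, artist artist, title title
            Console.WriteLine("Track {0}, artist {1}, title {2}",
                m.Groups["trackNo"].Value, m.Groups["artist"].Value, m.Groups["title"].Value);
        }

        // Capitals and spaces are outside [a-z], so this prints False.
        Console.WriteLine(Parse("01.Pink Floyd - Time.mp3").Success);
    }
}
```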
Your artist and title groups are matching exactly one character. Try:
"(?<trackNo>\\d{1,3})\\.(?<artist>[a-z]+)\\s-\\s(?<title>[a-z]+)\\.mp3"
I really recommend http://www.ultrapico.com/Expresso.htm for building regular expressions. It's brilliant and free.
P.S. I like to type my regex string literals like so:
@"(?<trackNo>\d{1,3})\.(?<artist>[a-z]+)\s-\s(?<title>[a-z]+)\.mp3"
Maybe try:
"(?<trackNo>\\d{1,3})\\.(?<artist>[a-z]*)\\s-\\s(?<title>[a-z]*)\\.mp3";
CODE
String fileName = #"01. Pink Floyd - Another Brick in the Wall.mp3";
String regex = @"^(?<TrackNumber>[0-9]{1,3})\. ?(?<Artist>(?:(?! - ).)+) - (?<Title>.+)\.mp3$"; // Artist takes characters up to the first " - "
Match match = Regex.Match(fileName, regex);
if (match.Success)
{
Console.WriteLine(match.Groups["TrackNumber"]);
Console.WriteLine(match.Groups["Artist"]);
Console.WriteLine(match.Groups["Title"]);
}
OUTPUT
01
Pink Floyd
Another Brick in the Wall