I have an HTML string that I'm parsing, which looks like the one below. I need to get the value of #Footer.
strHTML = "<html><html>\r\n\r\n<head>\r\n<meta http-equiv=Content-Type
content=\"text/html; charset=windows-1252\">\r\n
<meta name=Generator content=\"Microsoft Word 14></head></head><body>
<p>#Footer=CONFIDENTIAL<p></body></html>"
I have tried the code below. How do I get the value?
Regex m = new Regex("#Footer", RegexOptions.Compiled);
foreach (Match VariableMatch in m.Matches(strHTML.ToString()))
{
Console.WriteLine(VariableMatch);
}
You need to capture the value after the =. This will work, as long as the value cannot contain any < characters:
Regex m = new Regex("#Footer=([^<]+)", RegexOptions.Compiled);
foreach (Match VariableMatch in m.Matches(strHTML.ToString()))
{
Console.WriteLine(VariableMatch.Groups[1].Value);
}
You can do this with regex, but it's not necessary. One simple way to do this would be:
var match = strHTML.Split(new string[] { "#Footer=" }, StringSplitOptions.None).Last();
match = match.Substring(0, match.IndexOf("<"));
This assumes that your html string only has one #Footer.
Your regex will match the string "#Footer". The value of the match will be "#Footer".
Your regex should look like this instead:
Regex regex = new Regex(@"#Footer=[\w]+");
Match match = regex.Match(strHTML);
string value = match.Value.Split('=')[1];
Use a matching group.
Regex.Match(strHTML, @"#Footer=(?<VAL>([^<\n\r]+))").Groups["VAL"].Value;
If that's your whole string, string methods can solve it without any regex:
var result = strHTML.Split(new string[]{"#Footer=", "<p>"}, StringSplitOptions.RemoveEmptyEntries)[1];
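For what it's worth, the capture-group answers above can be folded into one small helper. This is only a sketch: the names FooterExample and GetFooter are made up for illustration, and it assumes the value never contains a < character or a line break.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FooterExample
{
    // Returns the text after "#Footer=" up to the next '<' or line break,
    // or null when the marker is not present in the input.
    public static string GetFooter(string html)
    {
        Match m = Regex.Match(html, @"#Footer=(?<val>[^<\r\n]+)");
        return m.Success ? m.Groups["val"].Value.Trim() : null;
    }

    public static void Main()
    {
        // Shortened stand-in for the HTML string in the question.
        string html = "<html><body>\r\n<p>#Footer=CONFIDENTIAL<p></body></html>";
        Console.WriteLine(GetFooter(html));
    }
}
```

Checking m.Success first avoids reading a group from a failed match when the marker is missing.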
Suppose I have the following string:
string str = "<tag>text</tag>";
And I would like to change 'tag' to 'newTag' so the result would be:
"<newTag>text</newTag>"
What is the best way to do it?
I tried to search for <[/]*tag> but then I don't know how to keep the optional [/] in my result...
Why use regex when you can do:
string newstr = str.Replace("tag", "newtag");
or
string newstr = str.Replace("<tag>","<newtag>").Replace("</tag>","</newtag>");
Edited in response to @RaYell's comment
To make it optional, simply add a "?" after the "/", like this:
</?tag>
string str = "<tag>text</tag>";
string newValue = new XElement("newTag", XElement.Parse(str).Value).ToString();
Your most basic regex could read something like:
// find '<', find an optional '/', take all chars until the next '>' and call it
// tagname, then take '>'.
<(/?)(?<tagname>[^>]*)>
If you need to match every tag.
Or use positive lookahead like:
<(/?)(?=(tag|othertag))(?<tagname>[^>]*)>
if you only want tag and othertag tags.
Then iterate through all the matches:
string str = "<tag>hoi</tag><tag>second</tag><sometag>otherone</sometag>";
Regex matchTag = new Regex("<(/?)(?<tagname>[^>]*)>");
foreach (Match m in matchTag.Matches(str))
{
string tagname = m.Groups["tagname"].Value;
str = str.Replace(m.Value, m.Value.Replace(tagname, "new" + tagname));
}
var input = "<tag>text</tag>";
var result = Regex.Replace(input, "(</?).*?(>)", "$1newtag$2");
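If it helps, here is a rough sketch that keeps the optional slash by capturing it and writing it back. TagRenameExample and RenameTag are hypothetical names, and it assumes the tags carry no attributes (it only matches <tag> and </tag> exactly).

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagRenameExample
{
    // Renames <oldName> and </oldName>, leaving every other tag untouched.
    public static string RenameTag(string input, string oldName, string newName)
    {
        // Group 1 captures the optional '/', so it can be re-emitted as-is.
        string pattern = "<(/?)" + Regex.Escape(oldName) + ">";
        return Regex.Replace(input, pattern,
            m => "<" + m.Groups[1].Value + newName + ">");
    }

    public static void Main()
    {
        Console.WriteLine(RenameTag("<tag>text</tag>", "tag", "newTag"));
    }
}
```

Using a match evaluator instead of a "$1" replacement string means a new name that contains digits or a $ sign is not misread as a group reference.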
I have a string like this.
*>-0.0532*>-0.0534*>-0.0534*>-0.0532*>-0.0534*>-0.0534*>-0.0532*>-0.0532*>-0.0534*>-0.0534*>-0.0534*>-0.0532*>-0.0534*
I want to extract the values between the *> and * characters.
I tried the pattern below, which is wrong:
string pattern = "\\*\\>..\\*";
Regex rgx = new Regex(pattern, RegexOptions.IgnoreCase);
MatchCollection matches = rgx.Matches(seriGelen);
if (matches.Count > 0)
{
foreach (Match match in matches)
MessageBox.Show("{0}", match.Value);
}
You can use a simple regex:
(?<=\*>).*?(?=\*)
Sample code:
string text = "*>-0.0532*>-0.0534*>-0.0534*>-0.0532*>-0.0534*>-0.0534*>-0.0532*>-0.0532*>-0.0534*>-0.0534*>-0.0534*>-0.0532*>-0.0534*";
string[] values = Regex.Matches(text, @"(?<=\*>).*?(?=\*)")
.Cast<Match>()
.Select(m => m.Value)
.ToArray();
It looks like the values can vary widely (update: there was a positive integer value), so I won't check the number format. I'll also treat *>, >, and * as variants of the same delimiter.
I'd like to suggest the following solution.
(?<=[>\*])([^>\*]+?)(?=[>\*]+)
(http://regex101.com/r/mM7nK1)
I'm not sure it is ideal. It only works if your input starts and ends with delimiters, but it lets you use matches instead of groups, as your code does.
========
But why not just use the String.Split function?
var toprint = seriGelen.Split(new [] {'>', '*'}, StringSplitOptions.RemoveEmptyEntries);
Is there an error at the beginning of the string? Missing an asterisk after first number? >-0.0532>-0.0534*>
If not, try this:
>([-+]?[0-9]*\.?[0-9]+)\*
C# Code
string strRegex = @">([-+]?[0-9]*\.?[0-9]+)\*";
Regex myRegex = new Regex(strRegex, RegexOptions.IgnoreCase | RegexOptions.Singleline);
string strTargetString = @">-0.0532>-0.0534*>-0.0534*>-0.0532*>-0.0534*>-0.0534*>-0.0532*>-0.0532*>-0.0534*>-0.0534*>-0.0534*>-0.0532*>-0.0534*";
foreach (Match myMatch in myRegex.Matches(strTargetString))
{
if (myMatch.Success)
{
// Add your code here
}
}
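If the goal is to end up with numbers rather than strings, something along these lines might work. It is a sketch built on the lookaround pattern from the earlier answer; DelimitedValuesExample and ParseValues are hypothetical names, and it assumes every value between *> and * is a valid decimal number written with a dot.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class DelimitedValuesExample
{
    // Collects every value that sits between "*>" and the next "*".
    public static List<double> ParseValues(string text)
    {
        var values = new List<double>();
        foreach (Match m in Regex.Matches(text, @"(?<=\*>)[^*]+(?=\*)"))
        {
            // InvariantCulture forces '.' as the decimal separator,
            // whatever the machine's regional settings are.
            values.Add(double.Parse(m.Value, CultureInfo.InvariantCulture));
        }
        return values;
    }

    public static void Main()
    {
        foreach (double v in ParseValues("*>-0.0532*>-0.0534*>-0.0534*"))
        {
            Console.WriteLine(v.ToString(CultureInfo.InvariantCulture));
        }
    }
}
```

Parsing with the invariant culture matters on systems whose regional settings use a comma as the decimal separator, where a plain double.Parse could reject or misread these values.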
I have a string:
productDescription
In it are some custom tags such as:
[MM][/MM]
For example the string might read:
This product is [MM]1000[/MM] long
Using a regular expression how can I find those MM tags, take the content of them and replace everything with another string? So for example the output should be:
This product is 10 cm long
I think you'll need to pass a delegate to the regex for that.
Regex theRegex = new Regex(@"\[MM\](\d+)\[/MM\]");
text = theRegex.Replace(text, delegate(Match thisMatch)
{
int mmLength = Convert.ToInt32(thisMatch.Groups[1].Value);
int cmLength = mmLength / 10;
return cmLength.ToString() + "cm";
});
Using RegexDesigner.NET:
using System.Text.RegularExpressions;
// Regex Replace code for C#
void ReplaceRegex()
{
// Regex search and replace
RegexOptions options = RegexOptions.None;
Regex regex = new Regex(@"\[MM\](?<value>.*)\[\/MM\]", options);
string input = @"[MM]1000[/MM]";
string replacement = @"10 cm";
string result = regex.Replace(input, replacement);
// TODO: Do something with result
System.Windows.Forms.MessageBox.Show(result, "Replace");
}
Or if you want the original text back in the replacement:
Regex regex = new Regex(@"\[MM\](?<theText>.*)\[\/MM\]", options);
string replacement = @"${theText} cm";
A regex like this
\[(\w+)\](\d+)\[\/\w+\]
will find and collect the units (like MM) and the values (like 1000). That would at least allow you to use the pairs of parts intelligently to do the conversion. You could then put the replacement string together, and do a straightforward string replacement, because you know the exact string you're replacing.
I don't think you can do a simple RegEx.Replace, because you don't know the replacement string at the point you do the search.
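One way the unit/value pairs could be put to work is sketched below, using a match evaluator so the replacement is computed per match. Everything named here is an assumption for illustration: UnitTagExample, ConvertToCm, and the conversion table (MM as millimetres, M as metres, converted by plain arithmetic) are not from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class UnitTagExample
{
    // Hypothetical table: factor that converts each unit to centimetres.
    private static readonly Dictionary<string, decimal> ToCm =
        new Dictionary<string, decimal>
        {
            { "MM", 0.1m },
            { "M", 100m },
        };

    public static string ConvertToCm(string text)
    {
        // \1 requires the closing tag to name the same unit as the opening tag.
        return Regex.Replace(text, @"\[(\w+)\](\d+)\[/\1\]", m =>
        {
            decimal factor;
            if (!ToCm.TryGetValue(m.Groups[1].Value, out factor))
            {
                return m.Value; // unknown unit: leave the text untouched
            }
            decimal cm = decimal.Parse(m.Groups[2].Value, CultureInfo.InvariantCulture) * factor;
            return cm.ToString("0.##", CultureInfo.InvariantCulture) + " cm";
        });
    }

    public static void Main()
    {
        Console.WriteLine(ConvertToCm("This product is [MM]1000[/MM] long"));
    }
}
```

With these factors, [MM]1000[/MM] comes out as 100 cm; swap in whatever conversion rule your data actually needs.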
Regex rex = new Regex(@"\[MM\]([0-9]+)\[\/MM\]");
string s = "This product is [MM]1000[/MM] long";
MatchCollection mc = rex.Matches(s);
Will match only integers.
mc[n].Groups[1].Value;
will then give the numeric part of nth match.
How can I take the string foo[]=1&foo[]=5&foo[]=2 and return a collection with the values 1, 5, 2, in that order? I am looking for an answer using regex in C#. Thanks.
In C# you can use capturing groups
private void RegexTest()
{
String input = "foo[]=1&foo[]=5&foo[]=2";
String pattern = @"foo\[\]=(\d+)";
Regex regex = new Regex(pattern);
foreach (Match match in regex.Matches(input))
{
Console.Out.WriteLine(match.Groups[1]);
}
}
I don't know C#, but...
In java:
String[] nums = yourString.split("&?foo\\[\\]=");
The argument to the split() method is a regex telling the method where to split the String. Because the input starts with a delimiter, the first element of the array is an empty string.
I'd use this particular pattern:
string re = @"foo\[\]=(?<value>\d+)";
So something like (not tested):
Regex reValues = new Regex(re, RegexOptions.Compiled);
List<int> values = new List<int>();
foreach (Match m in reValues.Matches(...putInputStringHere...))
{
values.Add(int.Parse(m.Groups["value"].Value));
}
Use the Regex.Split() method with an appropriate regex. This will split on parts of the string that match the regular expression and return the results as a string[].
Assuming you want all the values in your querystring without checking if they're numeric, (and without just matching on names like foo[]) you could use this: "&?[^&=]+="
string[] values = Regex.Split("foo[]=1&foo[]=5&foo[]=2", "&?[^&=]+=");
Incidentally, if you're playing with regular expressions the site http://gskinner.com/RegExr/ is fantastic (I'm just a fan).
Assuming you're dealing with numbers this pattern should match:
/=(\d+)&?/
This should do:
using System.Text.RegularExpressions;
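Regex.Replace(s, @"[^0-9]", "");
where s is the string you want to extract the numbers from. This removes every non-digit character, so the digits come back as one string (152 here) rather than as separate values.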
Just make sure to escape the ampersand like so:
/=(\d+)\&/
Here's an alternative solution using the built-in string.Split function:
string x = "foo[]=1&foo[]=5&foo[]=2";
string[] separator = new string[2] { "foo[]=", "&" };
string[] vals = x.Split(separator, StringSplitOptions.RemoveEmptyEntries);
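Pulling the regex answers together, a sketch that returns the values as integers in their original order could look like this. QueryValuesExample and GetValues are hypothetical names, and it assumes every value for the key is a non-negative integer that fits in an int.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class QueryValuesExample
{
    // Returns the numeric values of every "key=value" pair, in order.
    public static List<int> GetValues(string query, string key)
    {
        var result = new List<int>();
        // (?:^|&) ties the key to the start of a pair, so "xfoo[]=" is ignored.
        string pattern = @"(?:^|&)" + Regex.Escape(key) + @"=(\d+)";
        foreach (Match m in Regex.Matches(query, pattern))
        {
            result.Add(int.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture));
        }
        return result;
    }

    public static void Main()
    {
        List<int> values = GetValues("foo[]=1&foo[]=5&foo[]=2", "foo[]");
        Console.WriteLine(string.Join(",", values));
    }
}
```

Regex.Escape takes care of the square brackets in the key, so the caller can pass foo[] as plain text.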