.Net regex for string between \" "\ - c#

I have been trying to get the id from the following text
<body id=\"body\" runat=\"server\">
In C# using Substring or even Regex, but nothing seems to work. No matter what regex I use, I always get the whole line back. I have tried ^id, ^id.*, ^id=\\\\\\\\.* and id=.*, but they either don't work or don't give me the desired output. Is there any way I can get the id portion from this text, which is enclosed between the characters \" "\?

Try this:
string htmlString = "<body id=\"body\" runat=\"server\">";
Regex regex = new Regex("id=\"(.*?)\"");
Match m = regex.Match(htmlString);
Group g = m.Groups[1];
string id = g.ToString();
Console.WriteLine(id); //body
Test here:
http://rextester.com/BQSF93427
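A slightly more defensive variant of the same idea, as a rough sketch: it checks Match.Success before reading the group and uses a negated character class instead of a lazy wildcard. The class and method names (HtmlAttributes, GetAttribute) are made up for illustration, and it assumes double-quoted attribute values only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlAttributes
{
    // Hypothetical helper: returns the value of a double-quoted attribute,
    // or null when the attribute is not present.
    public static string GetAttribute(string html, string name)
    {
        // \b keeps "id" from matching inside a longer word such as "grid".
        // [^"]* matches everything up to the next double quote.
        string pattern = @"\b" + Regex.Escape(name) + "=\"([^\"]*)\"";
        Match m = Regex.Match(html, pattern);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

Calling HtmlAttributes.GetAttribute(htmlString, "id") on the string from the question should give body, and asking for an attribute that is not there gives null rather than an empty group.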

Related

Grab text between two patterns

Check the code below. I want to grab everything between id="a-popover-sp-info-popover- and the closing ". I already tried the following Regex.Match call, but it has a syntax error and is not valid C#. How can I do this properly? My goal is to grab the text ABC123.
string foo = @id="a-popover-sp-info-popover-ABC123";
string output = Regex.Match(foo, @"id="a-popover-sp-info-popover-(.*)"").Groups[1].Value;
I need to grab only the text: ABC123
Since your pattern is so rigid, the string.Split method could also do the trick:
string output1 = foo.Split(new string[] {"info-popover-"}, StringSplitOptions.RemoveEmptyEntries)
.Last()
.TrimEnd('"');
Console.WriteLine(output1);
Output:
ABC123
You have to make sure to surround your strings with quotation marks ".
If you want to have quotation marks inside of your string you have to escape them with a backslash:
string foo = "id=\"a-popover-sp-info-popover-ABC123\"";
string output = Regex.Match(foo, "id=\"a-popover-sp-info-popover-(.*)\"").Groups[1].Value;
string pattern = "id=\"a-popover-sp-info-popover-[A-Z]{3}[1-9]{3}\"";
string input = "id=\"a-popover-sp-info-popover-ABC123\"";
Match m = Regex.Match(input, pattern);
if (m.Success) Console.WriteLine("Found '{0}'", m.Value);
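For comparison with the backslash-escaped versions above, here is a sketch of the same pattern written as a verbatim string, where a quote is doubled instead of escaped. The names PopoverId and Extract are hypothetical; the lazy (.*?) is my own choice so the match stops at the first closing quote if more text follows.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PopoverId
{
    // Hypothetical helper: returns the part after the fixed prefix, or null if no match.
    public static string Extract(string input)
    {
        // In a verbatim string, "" stands for one literal double quote.
        // The lazy (.*?) stops at the first closing quote.
        Match m = Regex.Match(input, @"id=""a-popover-sp-info-popover-(.*?)""");
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

With the foo string from the answer above, PopoverId.Extract(foo) should return ABC123.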

Troubles with finding/replacing in string

I am having some trouble finding/replacing a value in a string. I don't know if I should do it with Regex or if C# has some nifty feature to make it work. Regex gives me a headache.
The problem:
<doc name="tester" value="p1,p2,p3" />
So I want to take the "value" (p1,p2,p3) and replace it with the current value + ",p4".
Any help appreciated.
Although you get Regex headache, this is actually very simple to do with the following regex:
@"(?<=value=\"")[^""]+"
It starts by looking back for 'value="', then it matches all characters up to the ending double quote.
string test = @"<doc name=""tester"" value=""p1,p2,p3"" />";
Regex regex = new Regex(@"(?<=value=\"")[^""]+");
string result = regex.Replace(test, "p1,p2,p3,p4");
// result will be: @"<doc name=""tester"" value=""p1,p2,p3,p4"" />";
Edit:
You can of course capture the original content, simply by calling:
string match = regex.Match(test).Value;
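If the new value should be built from whatever is currently there (as the question asks) rather than hard-coded, one option is to pass a MatchEvaluator to Replace. This is only a sketch; ValueAppender and Append are names I made up, and the pattern is the same lookbehind as above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ValueAppender
{
    // Hypothetical helper: appends suffix to the content of every value="..." attribute.
    public static string Append(string xml, string suffix)
    {
        // Look back for value=" and match up to (not including) the closing quote.
        Regex regex = new Regex(@"(?<=value="")[^""]+");
        // The lambda receives each match, so the existing value is kept
        // and a $ inside suffix is not treated as a substitution token.
        return regex.Replace(xml, m => m.Value + suffix);
    }
}
```

ValueAppender.Append(test, ",p4") should turn value="p1,p2,p3" into value="p1,p2,p3,p4" without repeating the old value in the code.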

Regex to replace a string between special characters in c#

I have a situation where I have a string in which I have to replace a part that lies between special characters.
I can do the same using substrings and lengths, but that is the dirty way.
Is there any better way of doing this using regex?
An example of the string:
string str1 = "This is the <![CDATA[ SampleDataThatNeedsToBeReplaced ]]";
string repl = "Replacement Text";
I need a regex to get the output as
This is the Replacement Text
I did try a few regex like the following
result = Regex.Replace(str1, @"(?<=CDATA\[)(\w+?)(?=\]\])", repl);
I also tried
Regex x = new Regex("(\\[CDATA\\])(.*?)(\\[\\]\\]\\])");
string Result = str1.Replace(text, "$1" + repl + "$3");
did not get any results.
Any help is appreciated.
Regex.Replace (
"This is the <![CDATA[ SampleDataThatNeedsToBeReplaced ]]",
@"<!\[CDATA\[(.+)]]",
"Replacement Text");
Note that, in case you need it, the old text (between the inner brackets) is available as group 1 and can be referenced in the replacement string as $1.
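One thing to watch with the greedy (.+) is a string that contains more than one marked section, since a single match would then run from the first opening marker to the last closing one. A possible variation, sketched under the assumption that each section should be replaced separately (CdataReplacer is a hypothetical name, and the markers follow the ones in the question, which close with ]] only):

```csharp
using System;
using System.Text.RegularExpressions;

public static class CdataReplacer
{
    // Hypothetical helper: replaces every <![CDATA[ ... ]] span with the given text.
    public static string Replace(string input, string replacement)
    {
        // .+? is lazy, so each opening marker pairs with the nearest closing ]].
        // Singleline lets . match line breaks inside the span.
        return Regex.Replace(input, @"<!\[CDATA\[.+?\]\]",
            m => replacement, RegexOptions.Singleline);
    }
}
```

For the single-section string in the question this should give the same result as the answer above: This is the Replacement Text.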

Regex, match text inside a tag, then match all text not in that tag both from same string?

I suck at Regex's and am surprised I was able to get as far as I did by myself.
So far I've got this:
string text = "Whoa here is some very cool text.<phone>222-222-5555</phone><timesCalled>6</timescalled>";
Regex phoneRegex = new Regex(@"<phone>(.*?)<\/phone>");
Regex calledRegex = new Regex(@"<timesCalled>(.*?)<\/timesCalled>");
string phone = phoneRegex.Match(text).Value;
string timesCalled = calledRegex.Match(text).Value;
Both of these give me the full tags and the value inside, how do I make it so it only returns what is inside the tag? Also I need a final regex that would return all the text not inside those tags, so Whoa here is some very cool text. from the above example. The special tags would always appear after the normal text, if that matters.
Edit: Thanks for the answers all, I still need the final regex though (bolded above).
So far I tried this:
string pattern = @"^" + phoneRegex.Match(text).Value + calledRegex.Match(text).Value;
Regex textRegex = new Regex(pattern);
string normalText = textRegex.Match(text).Groups[1].Value;
but that is returning nothing.
You want to get the value of the group:
calledRegex.Match(text).Groups[1].Value
Capture groups are numbered from 1; Groups[0] is the whole match.
How about reading/parsing XML with an Xml class?
var doc = XElement.Parse("<root>" + text + "</root>");
string phone = doc.Descendants("phone").First().Value;
Here is my suggestion; it gives you the chance to search for more tags with values.
string text = "Whoa here is some very cool text.<phone>222-222-5555</phone><timesCalled>6</timesCalled>";
Regex regex = new Regex(@"<(?<tag>[^>]*)>(?<value>[^<]*)</\k<tag>>");
// Each tag pair is a separate match, so collect them all keyed by tag name (needs System.Linq and System.Collections.Generic).
Dictionary<string, string> values = regex.Matches(text).Cast<Match>()
.ToDictionary(m => m.Groups["tag"].Value, m => m.Groups["value"].Value);
string phone = values["phone"];
string timesCalled = values["timesCalled"];
The Value of a match is everything that matched the pattern. If you only want the grouped contents (the stuff within the tags), you'd have to access them through the Groups property.
string phone = phoneRegex.Match(text).Groups[1].Value;
string timesCalled = calledRegex.Match(text).Groups[1].Value;
In the case of inline XML/HTML I would also ignore case, since tag capitalization can be wonky sometimes.
string text = "Whoa here is some very cool text.<phone>222-222-5555</phone><timesCalled>6</timesCalled>";
Regex phoneRegex = new Regex(@"<phone>(.*?)<\/phone>", RegexOptions.IgnoreCase);
Regex calledRegex = new Regex(@"<timesCalled>(.*?)<\/timesCalled>", RegexOptions.IgnoreCase);
string phone = phoneRegex.Match(text).Groups[1].Value;
string timesCalled = calledRegex.Match(text).Groups[1].Value;
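For the remaining part of the question (the text that is not inside any tag), one possible approach is to delete every tag pair and keep what is left. This is a sketch only; TagStripper and TextOutsideTags are hypothetical names, and it assumes simple, non-nested tags without attributes, as in the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagStripper
{
    // Hypothetical helper: removes every <name>...</name> pair and returns the rest.
    public static string TextOutsideTags(string text)
    {
        // \k<tag> requires the closing tag to repeat the opening name;
        // IgnoreCase tolerates capitalization differences between the two.
        return Regex.Replace(text, @"<(?<tag>\w+)>.*?</\k<tag>>", "",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }
}
```

Applied to the text string above, this should leave only: Whoa here is some very cool text.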

Get the number of an href url parameter from downloaded html page?

I am trying to get an ID from a url parameter inside an href that looks like this:
MyItemName
I want the 71312 only, and at the moment I am trying to do it using regex (but if you have a better approach I would be glad to try it):
string html,itemID;
using (var client = new WebClient())
{
html = client.DownloadString("http://www.mysite.com/search.php?search_text=" + myItemName);
}
string pattern = "" + myItemName + "";
Match m = Regex.Match(html, pattern, RegexOptions.IgnoreCase);
if (m.Success)
{
itemID = m.Groups[1].Value;
MessageBox.Show(itemID);
}
Example of the html:
more html body
<h1>Items - List</h1>
<p>MyItemNameTest, MyItemNameTestB, MYItemNameOther</p>
</div>
more html body
To show where your regex went wrong:
. and ? are special characters in regular expressions. . means "any character" and ? means "zero or one occurrences of the previous expression". Therefore your regex fails to match. Also, you need to use verbatim strings in C# (unless you want to escape every backslash):
@"" + myItemName + "";
will probably work.
That said, unless all the links you're examining follow exactly this format, you might run into problems. It's kind of a running gag here on SO that parsing HTML with regular expressions will earn you the wrath of Cthulhu.
Use:
Uri u = new Uri("http://www.mysite.com/myitem.php?id=12313");
string s = u.Query;
string id = HttpUtility.ParseQueryString(s).Get("id");
The variable id now holds the number. Figure out the rest of the function :)
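Wrapped up as a small method, the Uri approach might look like the sketch below. QueryReader and GetParameter are hypothetical names. It assumes an absolute URL, and it needs a reference to System.Web on .NET Framework (on newer .NET the same HttpUtility class is available in the System.Web namespace).

```csharp
using System;
using System.Web;

public static class QueryReader
{
    // Hypothetical helper: returns the named query-string parameter, or null if absent.
    public static string GetParameter(string url, string name)
    {
        // new Uri(string) expects an absolute URL and throws UriFormatException otherwise.
        Uri uri = new Uri(url);
        // uri.Query includes the leading '?', which ParseQueryString accepts.
        return HttpUtility.ParseQueryString(uri.Query).Get(name);
    }
}
```

QueryReader.GetParameter("http://www.mysite.com/myitem.php?id=12313", "id") should return 12313 as a string.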
