Using Regex expression with xml [duplicate] - c#

This question already has answers here:
How does one parse XML files? [closed]
(12 answers)
Closed 9 years ago.
I have an XML file:
"<?xml version=\"1.0\" encoding=\"utf-8\"?><response list=\"true\"><count>2802</count><post><id>4210</id><from_id>2176594</from_id><to_id>-11423648</to_id><date>1365088358</date><text>dsadsad #ADMIN</text>...
I want to extract the string between <from_id> and </from_id>.
I have a regex expression, <from_id>(.*?)</from_id>, but it returns the string with the tags included.
Can you help me?
Waiting for a response.
P.S.: Sorry for my poor English!

As others have already pointed out, you'd probably be safer, and your code cleaner, using an XML parser.
That said, you've already got a working regular expression. Just make sure to retrieve capture group #1. That will get you just what is inside the first pair of parentheses.
If you're using C#, instead of calling ToString() on the Match, look at its Groups property and read the element at index 1 (index 0 holds the entire match, tags included):
string pattern = "<from_id>(.*?)</from_id>";
string input = "<?xml version=\"1.0\" encoding=\"utf-8\"?><response list=\"true\"><count>2802</count><post><id>4210</id><from_id>2176594</from_id><to_id>-11423648</to_id><date>1365088358</date><text>dsadsad #ADMIN</text>";
Match match = Regex.Match(input, pattern);
if (match.Success)
{
    System.Console.WriteLine(match.Groups[1].Value);
}
See it working in this Ideone snippet.
If you wanted to get all matches of the pattern, you could use Regex.Matches() instead, and iterate over each Match in the MatchCollection in the same way.
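As a rough sketch of both suggestions above (collecting every match with Regex.Matches(), and using an XML parser instead), something like the following should work. The class name, method names, and the two-post sample document are made up for illustration, since the XML in the question is truncated; LINQ to XML (XDocument) is just one of several parsers you could pick.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class FromIdExtractor
{
    // Hypothetical, well-formed sample with two posts (not the asker's real data).
    public const string SampleXml =
        "<?xml version=\"1.0\" encoding=\"utf-8\"?>" +
        "<response list=\"true\"><count>2</count>" +
        "<post><id>4210</id><from_id>2176594</from_id></post>" +
        "<post><id>4211</id><from_id>-11423648</from_id></post>" +
        "</response>";

    // Regex approach: collect capture group 1 of every match.
    public static List<string> WithRegex(string input)
    {
        var ids = new List<string>();
        foreach (Match match in Regex.Matches(input, "<from_id>(.*?)</from_id>"))
        {
            ids.Add(match.Groups[1].Value);
        }
        return ids;
    }

    // Parser approach: let LINQ to XML find every <from_id> element.
    public static List<string> WithXmlParser(string input)
    {
        return XDocument.Parse(input)
            .Descendants("from_id")
            .Select(element => element.Value)
            .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", WithRegex(SampleXml)));
        Console.WriteLine(string.Join(",", WithXmlParser(SampleXml)));
    }
}
```

Both methods should return the same two IDs for this sample. The parser version keeps working if the elements gain attributes or whitespace, which is where the regex version tends to break.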

Related

I want to do a search in real time of objects with Regex in c# [duplicate]

This question already has answers here:
Can I use variables in pattern in Regex (C#)
(2 answers)
Closed 3 years ago.
Is it possible to place a string variable inside a Regex? If so.. how?
I've been playing with regex for 4 hours now and I need just one more thing to finish.
return (new Regex(@"\bA=(\d+[/]\d+)").Match(From).Groups[1].Value.Trim()).ToString();
This line basically gets any fractional number like 42/13 only if it's after "A=" from a string and extracts it.
So here's my question - Is it possible to do something like that:
string variable;
Regex(@"\b"variable"=(\d+[/]\d+)").Match(From).Groups[1].Value.Trim()).ToString();
The idea is that whatever is in variable becomes part of the regex; for example, if the variable contains D, the pattern now looks for D= instead of A=.
Thanks in advance.
This is string interpolation. To use it, prefix the string literal with $ (combined here with @ for a verbatim string). Example:
string variable = "hello";
Regex regex = new Regex($@"\b{Regex.Escape(variable)}=(\d+[/]\d+)");
You need to concatenate your strings as usual (+) and prepend @ to each string literal that contains unescaped backslashes. You also don't need to wrap / in a character class as [/]. Alternatively, as mentioned by Josh in his answer and Ron Beyer in his comment below your question, you can use interpolation.
@"\b" + variable + @"=(\d+/\d+)"
Additionally, you should use the method Regex.Escape() against your variable to ensure any special characters are escaped (this will prevent your pattern from failing or making incorrect matches) - sanitizing your variable.
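Putting the interpolation and Regex.Escape() advice together, a minimal sketch could look like this. The class name, method name, and sample input are invented for illustration; the $@ prefix needs C# 6 or later.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FractionFinder
{
    // Returns the fraction that follows "<name>=" in text, or "" if none is found.
    public static string Extract(string text, string name)
    {
        // Regex.Escape makes any regex metacharacters in name match literally.
        string pattern = $@"\b{Regex.Escape(name)}=(\d+/\d+)";
        return Regex.Match(text, pattern).Groups[1].Value;
    }

    public static void Main()
    {
        // Hypothetical input containing two labelled fractions.
        Console.WriteLine(Extract("A=42/13 D=7/8", "D")); // prints 7/8
    }
}
```

Note that a failed match still has a Groups[1] entry whose Value is the empty string, so the method does not throw when the label is absent.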

Not terminated set of [] in regex (C#) [duplicate]

This question already has answers here:
Regex Match all characters between two strings
(16 answers)
Closed 5 years ago.
I'm trying to parse a text looking for data inside this pattern:
{{([^]+)}}
i.e. any sequence of characters between {{ and }} .
But, when I try to build a Regex object:
Regex _regex = new Regex("{{([^]+)}}", RegexOptions.Compiled);
I got this error:
analysis of "{{([^]+)}}" - Set of [] not terminated....
whatever it means...
Does anyone have a hint?
The purpose of [^...] is to match any single character that is not in the listed set. After the ^ symbol you must therefore list the characters to exclude; for example, [^a]+ matches one or more characters other than the literal a.
The regex you are attempting to define is probably:
{{\s*([\w]+)\s*}}
Follow this link to try a working demo.
[^] is not a valid character class, because you need to specify at least one character that you wish to exclude.
To capture everything up to the closing }}, change the expression to this:
{{((?:[^}]|}[^}])*)}}
Demo.
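A small sketch of both points, under the assumption that you only need the text between {{ and }}: the first method shows that .NET rejects the original pattern (a ] placed right after [^ is read as a literal, so the set never closes), and the second uses the corrected expression. The class name, method names, and sample text are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BraceTemplate
{
    // Reports whether .NET accepts the pattern; the Regex constructor throws
    // ArgumentException (or a subclass of it) when the pattern cannot be parsed.
    public static bool IsValidPattern(string pattern)
    {
        try
        {
            new Regex(pattern);
            return true;
        }
        catch (ArgumentException)
        {
            return false;
        }
    }

    // Returns the text found between each {{ and the next }}.
    public static List<string> Placeholders(string text)
    {
        var found = new List<string>();
        foreach (Match match in Regex.Matches(text, "{{((?:[^}]|}[^}])*)}}"))
        {
            found.Add(match.Groups[1].Value);
        }
        return found;
    }

    public static void Main()
    {
        Console.WriteLine(IsValidPattern("{{([^]+)}}"));            // the pattern from the question
        Console.WriteLine(IsValidPattern("{{((?:[^}]|}[^}])*)}}")); // the corrected pattern
        Console.WriteLine(string.Join(",", Placeholders("Hello {{name}}, see {{a}b}}")));
    }
}
```

The second placeholder in the sample contains a single }, which the alternation }[^}] allows while still stopping at the first }}.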

regex expression to accept ENTER key in string [duplicate]

This question already has answers here:
How do I match any character across multiple lines in a regular expression?
(26 answers)
Closed 6 years ago.
Issue
I am having trouble creating a regex that accepts any string, including the ENTER key (line breaks). At the moment I have this:
^$|^.+$
I have looked around and people have said to add \n but this does not work.
An example of a string it should allow is as follows:
Hello this is a test string
and i want this to be accepted
Try setting the s flag on the regex engine. This will ensure that the . metacharacter will match newlines.
Here's a link to a working example.
Also, as a side note, instead of ^$|^.+$ you can condense the whole expression to ^.*$, which accepts the same inputs with a simpler pattern.
In C#, you need the RegexOptions.Singleline option. See this SO post for more information.
Here is a quick example. It simply matches the entire string, so it is not useful as validation by itself.
var regex = new Regex(@"^.*$",
    RegexOptions.IgnoreCase | RegexOptions.Singleline);
In your future validation code, you need to replace .* with whatever your validation will be.
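To see what RegexOptions.Singleline changes, here is a minimal sketch that runs the same ^.*$ pattern with and without the option. The class and method names are invented, and the sample text uses a plain \n as the line break, which is an assumption about what the ENTER key produces in your input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MultilineInput
{
    // Tests the whole input against ^.*$ using the given options.
    public static bool Accepts(string input, RegexOptions options)
    {
        return Regex.IsMatch(input, @"^.*$", options);
    }

    public static void Main()
    {
        string text = "Hello this is a test string\nand i want this to be accepted";

        // Without Singleline, '.' stops at the line break, so the match fails.
        Console.WriteLine(Accepts(text, RegexOptions.None));       // False

        // With Singleline, '.' also matches '\n', so the whole text is accepted.
        Console.WriteLine(Accepts(text, RegexOptions.Singleline)); // True
    }
}
```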

C# Regular Expression - find groups in text with a separator [duplicate]

This question already has an answer here:
Learning Regular Expressions [closed]
(1 answer)
Closed 6 years ago.
I have the following text
"a|mother" "b|father"
I want to use Regex to find groups of text that start with '"', end with '"', and contain two parts separated by '|' with no spaces around it. The results would be:
"a|mother"
"b|father"
I tried to use other posts to solve this, but with no luck. How can I match the |, and how can I require that the pattern has no spaces?
Something like this:
String source = "\"a|mother\" \"b|father\"";
var result = Regex
    .Matches(source, "\"[^\"]*[^ ]\\|[^ ][^\"]*\"")
    .OfType<Match>();
Console.Write(String.Join(Environment.NewLine, result));
Output is
"a|mother"
"b|father"
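If you also need the two halves of each pair separately, one possible variation is to use named groups. This is only a sketch with a stricter pattern than the answer above (it rejects whitespace anywhere inside the quotes, which is an assumption about what you want), and the class, method, and group names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PairParser
{
    // Matches "key|value" where neither part contains quotes, pipes, or whitespace.
    private const string Pattern = "\"(?<key>[^\"|\\s]+)\\|(?<value>[^\"|\\s]+)\"";

    public static List<KeyValuePair<string, string>> Parse(string source)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        foreach (Match match in Regex.Matches(source, Pattern))
        {
            pairs.Add(new KeyValuePair<string, string>(
                match.Groups["key"].Value,
                match.Groups["value"].Value));
        }
        return pairs;
    }

    public static void Main()
    {
        foreach (var pair in Parse("\"a|mother\" \"b|father\""))
        {
            Console.WriteLine(pair.Key + " -> " + pair.Value);
        }
    }
}
```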

Find tag with contents in HTML using Regex [duplicate]

This question already has an answer here:
Extract Content from Div Tag C# RegEx
(1 answer)
Closed 8 years ago.
I am trying to find the article tag and all its content in an HTML string using Regex.
I can successfully match open tag with attrs: <article[^>]*>
I've got issues with matching contents. (.*?) - this technique is not working for me.
Please help.
You cannot use regular expressions to parse HTML in general. However, in constrained scenarios (i.e. when the input follows a rigid structure), you might be able to get away with it. In your case, you can use the following regex, provided that:
The <article> tags are not self-closing
The <article> elements do not contain other <article> descendants
The strings <article and </article> do not appear as literals in your HTML.
Code:
var matches = Regex.Matches(html, @"<article.*?</article>", RegexOptions.Singleline);
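A runnable sketch of that call, assuming the three conditions listed above hold for your input. The class name, method name, and sample HTML are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ArticleFinder
{
    // Returns each <article ...>...</article> block, tags included.
    public static List<string> Articles(string html)
    {
        var blocks = new List<string>();
        // Singleline lets '.' cross line breaks; the lazy .*? stops at the first </article>.
        foreach (Match match in Regex.Matches(html, @"<article.*?</article>", RegexOptions.Singleline))
        {
            blocks.Add(match.Value);
        }
        return blocks;
    }

    public static void Main()
    {
        // Hypothetical input with two non-nested articles, one spanning several lines.
        string html =
            "<body><article class=\"a\">\n<p>One</p>\n</article>" +
            "<div></div><article>Two</article></body>";

        foreach (string block in Articles(html))
        {
            Console.WriteLine(block);
            Console.WriteLine("---");
        }
    }
}
```

If the articles can be nested or the markup is irregular, an HTML parser is the more robust choice, as the answer warns.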
