Parse out Value - c#

Does anyone have any ideas on how to parse out this value in the simplest way possible? It needs to be quick and lean. Someone suggested regex, but I haven't used them before. Can they be used to get what's inside the value?
name="org.apache.struts.taglib.html.TOKEN" value="THIS IS WHAT IS NEEDED"

var reVal = new Regex( "name=\"org\\.apache\\.struts\\.taglib\\.html\\.TOKEN\"\\s+value=\"(?<value>.*?)\"" );
string value = reVal.Match( input ).Groups["value"].Value;
And I will explain it as well. First we look for the name attribute, followed by whitespace and then value=". Then (?<value> starts a named group with the name "value". .*?\" lazily matches everything up to the first ". The second line then reads the value of that group.
You could start by reading the MSDN docs of the Regex class.

var tokenString = "name=\"org.apache.struts.taglib.html.TOKEN\" value=\"THIS IS WHAT IS NEEDED\"";
// Greedy (.*) runs to the last quote in the input, which works here because value is the final attribute.
Regex regex = new Regex("value=\"(.*)\"");
var match = regex.Match(tokenString);
if (match.Success)
{
    Console.WriteLine(match.Groups[1]);
}
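The two answers above can be combined into one small helper. This is only a sketch: the method name TryExtractToken and the surrounding <input> markup in Main are assumptions made for illustration. It uses a verbatim string (so quotes are doubled instead of backslash-escaped), escapes the dots in the attribute name, and uses [^"]* so the capture stops at the closing quote even when more attributes follow.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TokenExample
{
    // Hypothetical helper: finds the value attribute that follows the TOKEN name attribute.
    public static bool TryExtractToken(string input, out string token)
    {
        // In a verbatim string "" is a literal quote; \. is a literal dot;
        // [^""]* (regex [^"]*) consumes everything up to the closing quote.
        Match match = Regex.Match(
            input,
            @"name=""org\.apache\.struts\.taglib\.html\.TOKEN""\s+value=""(?<value>[^""]*)""");

        token = match.Success ? match.Groups["value"].Value : string.Empty;
        return match.Success;
    }

    public static void Main()
    {
        // Assumed sample markup; only the name/value pair comes from the question.
        string input = "<input type=\"hidden\" name=\"org.apache.struts.taglib.html.TOKEN\" value=\"THIS IS WHAT IS NEEDED\" class=\"x\">";

        string token;
        if (TryExtractToken(input, out token))
        {
            Console.WriteLine(token); // THIS IS WHAT IS NEEDED
        }
    }
}
```

Checking match.Success before reading the group avoids silently treating a failed match as an empty token.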

Related

How to find the third element value using Regex

All, I am currently trying to parse each element that has the format below, using regex and C#, to find any value inside the parentheses. For example, I would like to extract 2002_MAX_ALLOW_DATE. Note that not all the names will be alphanumeric.
I initially had the pattern: Regex regex = new Regex(@"(\w\d\d\d.[A-Z])\w+");
However, this only returns names that have this numeric prefix.
Following a reply, I tried its suggested pattern, and I am trying to format it so that I do not get a syntax error, ideally without changing the regex itself.
Can someone please help me find the name in the third position of the quoted values, for example in this,'46032','46032','2002_MAX_ALLOW_DATE'?
<button class="longlist-cb longlist-cb-yes" id="cb46032"
onclick="$ll.CATG.toggleCb(this,'46032','46032','2002_MAX_ALLOW_DATE')">
</button>
Please try this
Regex rex = new Regex("'[^']+','[^']+','(?<ThirdElement>[^']+)'");
String data = "'46032','46032','2002_MAX_ALLOW_DATE'";
Match match = rex.Match(data);
Console.WriteLine(match.Groups["ThirdElement"]); // Output: 2002_MAX_ALLOW_DATE
SECOND EDIT:
I've written some code that provides all the elements inside the onclick as capture groups:
Regex regex = new Regex("onclick=\"\\$ll.CATG.toggleCb\\((.*),\\s?(.*),\\s?(.*),\\s?(.*)\\)");
string x = "<button class=\"longlist-cb longlist-cb-yes\" id=\"cb46032\" onclick=\"$ll.CATG.toggleCb(this, '46032', '46032', '2002_MAX_ALLOW_DATE')\"></button>";
Match match = regex.Match(x);
if (match.Success)
{
    Console.WriteLine("match.Value returns: " + match.Value);
    foreach (Group y in match.Groups)
    {
        Console.WriteLine("the current capture group: " + y.Value);
    }
}
else
{
    Console.Write("No match");
}
Console.ReadKey();
This prints the whole match first (once as match.Value and once more as group 0), followed by the four arguments: this, '46032', '46032' and '2002_MAX_ALLOW_DATE'.
EDIT: After trying with VS, this worked for me: Regex regex = new Regex("onclick=\"\\$ll.CATG.toggleCb\\((.*),.*,.*,.*\\)");
ORIGINAL ANSWER:
If you were to use Regex regex = new Regex(@"onclick=""\$ll.CATG.toggleCb\(.*,.*,(.*),.*\)"); on your provided text, that should return '46032'. Note that inside a verbatim (@) string a double quote must be written as "", otherwise you get a syntax error.
You could alter this regex by moving the capturing ( and ) to a different .* to capture, say, the first element, like this: onclick=""\$ll.CATG.toggleCb\((.*),.*,.*,.*\) would capture this.
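Why not read just the value of the onclick attribute instead of the whole HTML of the button? Working on the full element makes the question more complex than it needs to be.
String.Split can solve your problem simply, even though you chose to use a regex.
// the_button_element is a placeholder for your element object; the result still includes the quotes and the closing parenthesis.
the_button_element.GetAttribute("onclick").Split(',')[3]
Or use a regex:
new Regex(@".*?,'(\w+)'\)$")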
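The String.Split idea can be sketched as a self-contained helper. GetArgument is a hypothetical name, and the sketch assumes the onclick attribute value is already available as a string and that no argument contains a comma of its own.

```csharp
using System;

public static class OnclickArgsExample
{
    // Hypothetical helper: returns the zero-based argument of a call such as f(a,'b','c'),
    // with surrounding whitespace and single quotes removed.
    public static string GetArgument(string onclick, int index)
    {
        int open = onclick.IndexOf('(');
        int close = onclick.LastIndexOf(')');
        if (open < 0 || close <= open)
        {
            throw new FormatException("No argument list found.");
        }

        // Take the text between the parentheses and split it on commas.
        string[] args = onclick.Substring(open + 1, close - open - 1).Split(',');
        return args[index].Trim().Trim('\'');
    }

    public static void Main()
    {
        string onclick = "$ll.CATG.toggleCb(this,'46032','46032','2002_MAX_ALLOW_DATE')";
        Console.WriteLine(GetArgument(onclick, 3)); // 2002_MAX_ALLOW_DATE
    }
}
```

Trimming after the split handles both the compact form from the question and the spaced form used in the test string of the earlier answer.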

C# Regex to Get file name without extension?

I want to use regex to get a filename without extension. I'm having trouble getting regex to return a value. I have this:
string path = @"C:\PERSONAL\TEST\TESTFILE.PDF";
var name = Regex.Match(path, @"(.+?)(\.[^\.]+$|$)").Value;
In this case, name always comes back as C:\PERSONAL\TEST\TESTFILE.PDF. What am I doing wrong? I think my search pattern is correct.
(I am aware that I could use Path.GetFileNameWithoutExtension(path); but I specifically want to try using regex.)
You need Groups[1].Value
string path = @"C:\PERSONAL\TEST\TESTFILE.PDF";
var match = Regex.Match(path, @"(.+?)(\.[^\.]+$|$)");
if(match.Success)
{
    var name = match.Groups[1].Value;
}
match.Value returns the entire matched substring
match.Groups[0] always has the same value as match.Value
match.Groups[1] returns the value of the first capture group
For example:
string path = @"C:\PERSONAL\TEST\TESTFILE.PDF";
var match = Regex.Match(path, @"(.+?)(\.[^\.]+$|$)");
if(match.Success)
{
    Console.WriteLine(match.Value);
    // returns the substring matched by the whole pattern
    // Output: C:\PERSONAL\TEST\TESTFILE.PDF
    Console.WriteLine(match.Groups[0].Value);
    // always the same as match.Value
    // Output: C:\PERSONAL\TEST\TESTFILE.PDF
    Console.WriteLine(match.Groups[1].Value);
    // returns the first capture group, which is (.+?) in this case
    // Output: C:\PERSONAL\TEST\TESTFILE
    Console.WriteLine(match.Groups[2].Value);
    // returns the second capture group, which is (\.[^\.]+$|$) in this case
    // Output: .PDF
}
Since the data is on the right side of the string, tell the regex parser to work from the end of the string to the beginning by using the option RightToLeft. This can reduce the processing time and also simplifies the pattern needed.
The pattern below, read from left to right, says: give me everything that is not a \ character (so the match stops at the last slash and goes no farther), followed by a period.
Regex.Match(@"C:\PERSONAL\TEST\TESTFILE.PDF",
            @"([^\\]+)\.",
            RegexOptions.RightToLeft)
     .Groups[1].Value
Prints out
TESTFILE
Try this:
.*(?=[.][^OS_FORBIDDEN_CHARACTERS]+$)
For Windows:
OS_FORBIDDEN_CHARACTERS = :\/\\\?"><\|
This is a slight modification of:
Regular expression get filename without extention from full filepath
If you are fine to match forbidden characters then simplest regex would be:
.*(?=[.].*$)
Can be a bit shorter and greedier:
var name = Regex.Replace(@"C:\PERS.ONAL\TEST\TEST.FILE.PDF", @".*\\(.*)\..*", "$1"); // "TEST.FILE"
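As a sketch that builds on the answers above, the helper below returns only the file name and also copes with paths that have no extension. The name NameWithoutExtension and the pattern are illustrative assumptions, not taken from any of the answers, and the pattern only treats \ as a directory separator.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FileNameExample
{
    // Hypothetical helper: file name without directory or final extension.
    public static string NameWithoutExtension(string path)
    {
        // ([^\\]+?)     lazily captures the name, never crossing a backslash
        // (\.[^.\\]+)?  optionally consumes the last extension
        // $             anchors the match to the end of the path
        Match match = Regex.Match(path, @"([^\\]+?)(\.[^.\\]+)?$");
        return match.Success ? match.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        Console.WriteLine(NameWithoutExtension(@"C:\PERSONAL\TEST\TESTFILE.PDF"));   // TESTFILE
        Console.WriteLine(NameWithoutExtension(@"C:\PERS.ONAL\TEST\TEST.FILE.PDF")); // TEST.FILE
        Console.WriteLine(NameWithoutExtension(@"C:\PERS.ONAL\TEST\README"));        // README
    }
}
```

Outside of a regex exercise, Path.GetFileNameWithoutExtension remains the simpler choice, as the question itself notes.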

Regular expressions multiple matches

I have this text and I want to get the two matches from it, but the problem is that I always get only one match. This is the sample code in C#:
string formattedTag = "{Tag 1}::[FORMAT] asdfa {Tag 2}::[FORMAT]";
var tagMatches = Regex.Matches(formattedTag, @"(\{.+\}\:\:\[.+\])");
I am expecting to get two matches here, "{Tag 1}::[FORMAT]" and "{Tag 2}::[FORMAT]",
but the result of this code is a single match containing the whole value of the variable formattedTag.
It must be something in the regex pattern, so can somebody help me figure it out?
I would appreciate any help. Thanks in advance!
You need to use the following regular expression:
(\{[^}]+\}\:\:\[[^]]+\])
You want to match any character except the closing bracket within the bracketed portions of the string. Otherwise the whole string is matched, because quantifiers such as .+ are greedy by default and try to consume the longest possible match.
string formattedTag = "{tag 1}::[admin] adfaf{tag 2}::[test.user]";
var tagMatches = Regex.Matches(formattedTag, @"\{(\w+\s*\d{1,2})\}::\[(.*?)\]");
foreach(Match item in tagMatches)
{
    Console.WriteLine(item.Groups[0]);
    Console.WriteLine(item.Groups[1] + "=" + item.Groups[2]);
}
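To see the difference between the greedy pattern and the two fixes side by side, here is a small sketch using the string from the question. CountMatches is a hypothetical helper, and the negated class is written as [^\]] for readability.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyExample
{
    // Hypothetical helper: number of matches a pattern finds in the input.
    public static int CountMatches(string input, string pattern)
    {
        return Regex.Matches(input, pattern).Count;
    }

    public static void Main()
    {
        string formattedTag = "{Tag 1}::[FORMAT] asdfa {Tag 2}::[FORMAT]";

        // Greedy .+ runs to the last } and the last ], so everything is one match.
        Console.WriteLine(CountMatches(formattedTag, @"\{.+\}::\[.+\]"));         // 1

        // Negated character classes cannot run past the closing bracket.
        Console.WriteLine(CountMatches(formattedTag, @"\{[^}]+\}::\[[^\]]+\]")); // 2

        // Lazy quantifiers stop at the first closing bracket that lets the match succeed.
        Console.WriteLine(CountMatches(formattedTag, @"\{.+?\}::\[.+?\]"));       // 2
    }
}
```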

c# - regex don't work (match does not preserve the string)

Regex regOrg = new Regex(@"org(?:aniser)?\s+(\d\d):(\d\d)\s?(\d\d)?\.?(\d\d)?", RegexOptions.IgnoreCase);
MatchCollection mcOrg = regOrg.Matches(str);
Match mvOrg = regOrg.Match(str);
dayOrg = mvOrg.Value[4].ToString();
monthOrg = mvOrg.Value[5].ToString();
hourOrg = mvOrg.Value[2].ToString();
minuteOrg = mvOrg.Value[3].ToString();
This regular expression parses strings such as
"organiser 23:59" / "organiser 25:59 31.12"
or
"org 23:59" / "org 23:59 31.12"
The day and month are optional parameters.
Accordingly, I want the output variables dayOrg, monthOrg, hourOrg and minuteOrg to hold this data, but I get this:
Query: org 23:59 31.12
The value mcOrg.Count: 1
The value dayOrg: 2
The value monthOrg: 3
The value hourOrg: g
The value minuteOrg: empty
What am I doing wrong? Tried a lot of options, but it's not working.
You're not accessing the groups correctly (you're accessing individual characters of the matched string). Note also that Groups[0] is the whole match and (?:aniser) is non-capturing, so the hour is group 1, not group 2.
dayOrg = mvOrg.Groups[3].Value;
monthOrg = mvOrg.Groups[4].Value;
hourOrg = mvOrg.Groups[1].Value;
minuteOrg = mvOrg.Groups[2].Value;
The reason you are getting that result is that you are reading Value[index] from the mvOrg Match.
As described on MSDN, the Value property of the Match class is the matched substring, so indexing it accesses single characters of that string instead of the groups. You need to use the Groups property of the Match class to get the actual groups found.
Because the day and month are optional, check the Success property of those groups before using their values.
I added names to the groups in your pattern, so now it looks like this:
Regex regOrg = new Regex(@"org(?:aniser)?\s+(?<hourOrg>\d{2}):(?<minuteOrg>\d{2})\s?(?<dayOrg>\d{2})?\.?(?<monthOrg>\d{2})?", RegexOptions.IgnoreCase);
and you can access the result like this:
Console.WriteLine(mvOrg.Groups["hourOrg"]);
Console.WriteLine(mvOrg.Groups["minuteOrg"]);
Console.WriteLine(mvOrg.Groups["dayOrg"]);
Console.WriteLine(mvOrg.Groups["monthOrg"]);
Using hard-coded indexes is not good practice, since any change to the regex means updating all the indexes.
Is this what you wanted?
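Putting the named groups together with a check for the optional date might look like the sketch below. The Describe helper and the shortened group names are hypothetical. The pattern is also a variation on the one above, not the original: it wraps the day and month in a single optional group, so that either both are present or neither is.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OrganiserExample
{
    // Variation on the pattern above: day and month form one optional unit.
    private static readonly Regex RegOrg = new Regex(
        @"org(?:aniser)?\s+(?<hour>\d{2}):(?<minute>\d{2})(?:\s+(?<day>\d{2})\.(?<month>\d{2}))?",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: summarises what was parsed from the input.
    public static string Describe(string input)
    {
        Match m = RegOrg.Match(input);
        if (!m.Success)
        {
            return "no match";
        }

        string result = m.Groups["hour"].Value + ":" + m.Groups["minute"].Value;

        // An optional group that did not take part in the match has Success == false.
        if (m.Groups["day"].Success)
        {
            result += " on " + m.Groups["day"].Value + "." + m.Groups["month"].Value;
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(Describe("org 23:59 31.12")); // 23:59 on 31.12
        Console.WriteLine(Describe("organiser 23:59")); // 23:59
    }
}
```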

C# - Regular Expression for NO numbers allowed

I want to validate first names and last names in any existing language.
So I want to validate that there are no numbers in a string.
Thanks
[\s\p{L}]
would be the correct character class for this. But of course names can contain many more characters than those (how about Tim O'Reilly or William Henry Gates III.?).
See also Falsehoods Programmers Believe About Names.
Don't even have to use regex:
string tmp = "foo";
var match = tmp.IndexOfAny("0123456789".ToCharArray()) != -1;
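Just negate the regex check if your validation is in an if statement. Test for a digit anywhere in the string; a pattern such as ^[0-9]+$ would only detect strings that consist entirely of digits.
if ( !Regex.IsMatch ( stringToCheck, "[0-9]" ) ) {
    // No digits found. TODO.
}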
I just tried this one and it should do the trick:
var regex = new Regex(@"[0-9]", RegexOptions.IgnoreCase);
var m = regex.Match(stringValue);
if (m.Success)
{
    // The string contains at least one digit. TODO
}
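The approaches above can be compared in one short sketch. The class and method names are hypothetical. One detail worth knowing: in .NET, \d matches any Unicode decimal digit by default (not only 0-9), which may matter when names come from all languages.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameValidationExample
{
    // True if the string contains any Unicode decimal digit.
    public static bool ContainsDigit(string s)
    {
        return Regex.IsMatch(s, @"\d");
    }

    // Stricter check based on the [\s\p{L}] class: only letters and whitespace.
    public static bool IsLettersAndSpaces(string s)
    {
        return Regex.IsMatch(s, @"^[\s\p{L}]+$");
    }

    public static void Main()
    {
        Console.WriteLine(ContainsDigit("John3"));               // True
        Console.WriteLine(ContainsDigit("Jos\u00e9"));           // False
        Console.WriteLine(IsLettersAndSpaces("Tim O'Reilly"));   // False (apostrophe)
    }
}
```

The last line shows why the stricter class rejects some real names, which is the point made in the first answer.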
