Regex regular expression in C#

We have implemented the invocation of the Brill tagger from our C# code. We just need to know the correct regular expression for removing everything from a string except a-z, A-Z, full stops and commas. We tried [^a-zA-Z\.\,] in an online regular expression tester and it gives the correct result, but when implemented in C# it does not work properly. We also tried several other combinations without getting the correct result.
This is the format in which we are writing:
strFileContent = Regex.Replace(strFileContent, @"[^a-zA-Z\.\,]", "");
but we are not getting the desired output. What is wrong?

Regex.Replace(yourString, @"[^a-z\.\,]", string.Empty, RegexOptions.IgnoreCase)
EDIT: I can't see anything wrong with what you are doing, my answer is exactly the same. I tested both in LINQPad and they both return the same result.
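For reference, here is a minimal self-contained sketch of the call using a verbatim (@"...") string literal. The class name, method names and sample input are made up for illustration. Note that the character class also strips spaces, digits and line breaks. If the tagger needs words to stay separated, that alone could explain the unexpected output, but this is only a guess, since the question does not show the actual result.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TextCleaner
{
    // Keeps only ASCII letters, full stops and commas; everything else
    // (including whitespace and digits) is removed.
    public static string KeepLettersDotsCommas(string input)
    {
        return Regex.Replace(input, @"[^a-zA-Z.,]", string.Empty);
    }

    // Hypothetical variant that also keeps whitespace so words stay separated.
    public static string KeepLettersDotsCommasAndSpaces(string input)
    {
        return Regex.Replace(input, @"[^a-zA-Z.,\s]", string.Empty);
    }

    public static void Main()
    {
        Console.WriteLine(KeepLettersDotsCommas("Hello, World! 123."));          // Hello,World.
        Console.WriteLine(KeepLettersDotsCommasAndSpaces("Hello, World! 123.")); // Hello, World .
    }
}
```

Inside a character class the dot and comma need no backslash, so [^a-zA-Z.,] is equivalent to the escaped form used above.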

Related

Using regular expression on string for use in C#

I'm trying to extract a url from a string.
{ns:"images",k:"5127",mid:"A04F21EB77CF61E10E43BA33CF1986CA44357448"
,md5:"e2987d19c953bd836ec8fd2e0aa8492",surl:"http://someURLIdontwant/"
,imgurl:"http://THISISTHEURLINEED.jpg",tid:"OIP.Me2987d199c953bd836ec8fd2e0aa8492H0"
,ow:"300", docid:"608010036892077154",oh:"225",tft:"49"}
So it is located after "imgurl:". I am no expert on Regex and all I could produce is:
imgurl:'(.*)',tid
which worked in an online regex tester, but apparently not the way I'm using it in C#.
webClient.DownloadFile(System.Text.RegularExpressions.Regex.Match
(stringWithText, "imgurl:'(.*)',tid").Groups[1].Value,"path\file.jpg");
Can it be done? Thanks
As @WiktorStribiżew already pointed out, the expression is almost correct. Use this instead:
Regex.Match(stringWithText, "imgurl:\"(.*)\",tid").Groups[1].Value
Example on dotNetFiddle
And as I mentioned earlier in a comment: you should parse the JSON data instead.
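A slightly narrower pattern is sketched below. It captures only up to the closing double quote, so it does not depend on tid being the next key. The class and method names are hypothetical, and the sample text is an abbreviated version of the input from the question. Note that the sample has unquoted keys, so a strict JSON parser is likely to reject it as written.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ImgUrlExtractor
{
    // Captures everything between the double quotes that follow imgurl:
    // [^"]* stops at the first closing quote, whatever key comes next.
    private static readonly Regex ImgUrl = new Regex(@"imgurl:""([^""]*)""");

    // Returns the URL, or null when the key is missing.
    public static string Extract(string text)
    {
        Match m = ImgUrl.Match(text);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Abbreviated sample based on the question's input.
        string text = "{surl:\"http://someURLIdontwant/\",imgurl:\"http://THISISTHEURLINEED.jpg\",tid:\"OIP.x\",ow:\"300\"}";
        Console.WriteLine(Extract(text)); // http://THISISTHEURLINEED.jpg
    }
}
```

Checking m.Success before reading the group avoids passing an empty string to DownloadFile when the key is absent.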

Regular expression of simple boolean expression with parenthesis

I'm trying to write a regular expression that accepts only the following pattern:
WordWithoutNumbers.WordWithoutNumbers='value'
and patterns with multiple sub-expressions like:
WordWithoutNumbers.WordWithoutNumbers='value' OR WordWithoutNumbers.WordWithoutNumbers='value2' AND WordWithoutNumbers.WordWithoutNumbers='value3'
WordWithoutNumbers must be at least two characters long and contain no digits.
For example, these are valid strings:
Hardware.Make=’Lenovo’
Hardware.Make=’Lenovo’ OR User.Sitecode=’PRC’
and those are not:
Hardware.Make=’Lenovo’ OR => because there is nothing after the OR operator
Hardware.Make=’Lenovo => ' missing
Hardware Make=’Lenovo => . missing
Hardware.Make’Lenovo' => = missing
I used RegexBuddy to write the following Regex string:
(?i)(\s)*[a-z][a-z]+(.[a-z][a-z]+)(\s)*=(\s)*'[a-z0-9]+'(\s)*((\s)*(AND|OR)(\s)*[a-z][a-z]+(.[a-z][a-z]+)(\s)*=(\s)*'[a-z0-9]+')*
When I tested it in RegexBuddy it worked fine, but when I use it inside my C# code I always get a 'false' result.
What am I doing wrong?
This is what I did in my C# code:
string expression = "Hardware.Make=’Lenovo’ OR User.Sitecode=’PRC’";
Regex expressionFormat = new Regex(@"(?i)(\s)*[a-z][a-z]+(.[a-z][a-z]+)(\s)*=(\s)*'[a-z0-9]+'(\s)*((\s)*(AND|OR)(\s)*[a-z][a-z]+(.[a-z][a-z]+)(\s)*=(\s)*'[a-z0-9]+')*");
bool result = expressionFormat.IsMatch(expression );
and the result is always false.
UPDATE: thanks to @nhahtdh for his comment: I had used ’ in my test input instead of '.
I also need to add parenthesis validation to this expression, for example:
((WordWithoutNumbers.WordWithoutNumbers='value' OR WordWithoutNumbers.WordWithoutNumbers='value2') AND WordWithoutNumbers.WordWithoutNumbers='value3') is valid, but
)WordWithoutNumbers.WordWithoutNumbers='value' OR WordWithoutNumbers.WordWithoutNumbers='value2') AND WordWithoutNumbers.WordWithoutNumbers='value3') is invalid.
Is it possible to implement this using Regex? Do you have an idea?
Thanks to @nhahtdh, who found my issue.
The problem was that ’ and ' are different code points, which is why my regular expression didn't match (it was an input problem).
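The sketch below pulls the points of this thread together, under some assumptions of mine that go beyond the original post. Typographic quotes are normalized to ' before matching. The pattern is anchored with ^ and $, because an unanchored IsMatch would also accept input such as "... OR" with nothing after the operator. The dot is escaped, and whitespace is required around AND/OR, which is stricter than the original pattern. The parenthesis check uses .NET balancing groups and verifies only that parentheses are balanced, not that the expression inside them is well formed. All names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ExpressionCheck
{
    // One term: Word.Word='value', each word at least two letters, dot escaped.
    private const string Term = @"[a-z]{2,}\.[a-z]{2,}\s*=\s*'[a-z0-9]+'";

    // Anchored so the whole input must match, not just a substring of it.
    private static readonly Regex Format = new Regex(
        @"^\s*" + Term + @"(?:\s+(?:AND|OR)\s+" + Term + @")*\s*$",
        RegexOptions.IgnoreCase);

    // Balancing groups: push on '(', pop on ')', fail if a '(' is left over.
    private static readonly Regex Balanced = new Regex(
        @"^(?:[^()]|(?<open>\()|(?<-open>\)))*(?(open)(?!))$");

    // Maps the typographic quotes U+2018 and U+2019 to the ASCII apostrophe.
    public static string NormalizeQuotes(string s)
    {
        return s.Replace('\u2018', '\'').Replace('\u2019', '\'');
    }

    public static bool IsValid(string s)
    {
        return Format.IsMatch(NormalizeQuotes(s));
    }

    public static bool HasBalancedParentheses(string s)
    {
        return Balanced.IsMatch(s);
    }

    public static void Main()
    {
        // \u2019 is the typographic quote that caused the original failure.
        Console.WriteLine(IsValid("Hardware.Make=\u2019Lenovo\u2019 OR User.Sitecode=\u2019PRC\u2019")); // True
        Console.WriteLine(IsValid("Hardware.Make='Lenovo' OR"));                 // False
        Console.WriteLine(HasBalancedParentheses("((aa.bb='c' OR aa.bb='d') AND aa.bb='e')")); // True
        Console.WriteLine(HasBalancedParentheses(")aa.bb='c' OR aa.bb='d') AND aa.bb='e')"));  // False
    }
}
```

Validating the grammar inside nested parentheses in a single regex is possible but quickly gets hard to read; a small hand-written parser may be easier to maintain for that part.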

Can I add a regular expression into a .Net Assertion?

I'm trying to pull the page source from a set of pages and run an assertion on the results. This is a test that checks that we are crawling specific pages on our site. Sometimes the results come back with a different case for the URL string, and I'd like to account for that in the assertion where I check the page source. This is probably the wrong way to do it, but I was wondering whether there is a way to use the .NET regex commands in the assertion text. I have this as an assertion:
Assert.IsTrue(driver.PageSource.Contains("/explore"));
But is there a way to be sure that I can capture explore, Explore or EXPLORE? I thought I could use (?i) here but that doesn't seem to work. I'm more used to Perl and its regex capabilities, but with C# and .NET I'm a little lost on where I can and can't use the inline regex commands.
Anthony's answer is valid: you don't really need regex. But if you do want to use it, you can use
Regex.IsMatch(driver.PageSource, "/explore", RegexOptions.IgnoreCase)
You don't need a regular expression to perform a case-insensitive check. Use IndexOf and compare that the result is greater than -1. IndexOf has overloads that allow you to specify if casing matters. Something like
bool containsExplore = driver.PageSource.IndexOf("/explore", StringComparison.InvariantCultureIgnoreCase) > -1;
Assert.IsTrue(containsExplore);
Try:
Regex.Match("string", "regexp", RegexOptions.IgnoreCase).Success
How about using
StringAssert.Matches(string, regex);
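In your case, assuming MSTest (whose StringAssert.Matches takes a Regex as its second argument), that would translate to
StringAssert.Matches(driver.PageSource, new Regex("/explore", RegexOptions.IgnoreCase));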
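The reason (?i) has no effect inside Contains is that Contains compares plain text and never interprets its argument as a pattern. A small sketch of the three approaches follows. The pageSource string stands in for driver.PageSource, the class and method names are hypothetical, and OrdinalIgnoreCase is my choice for comparing URL fragments rather than something taken from the answers.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaseInsensitiveChecks
{
    // Plain substring test: the argument is treated as literal text.
    public static bool ContainsLiteral(string source, string fragment)
    {
        return source.Contains(fragment);
    }

    // Case-insensitive substring test without any regex.
    public static bool ContainsIgnoreCase(string source, string fragment)
    {
        return source.IndexOf(fragment, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    // Regex test; the inline (?i) option works here because the
    // argument is parsed as a pattern.
    public static bool MatchesIgnoreCase(string source, string pattern)
    {
        return Regex.IsMatch(source, "(?i)" + pattern);
    }

    public static void Main()
    {
        string pageSource = "<a href=\"/Explore\">Explore</a>"; // stand-in for driver.PageSource
        Console.WriteLine(ContainsLiteral(pageSource, "(?i)/explore")); // False
        Console.WriteLine(ContainsIgnoreCase(pageSource, "/explore"));  // True
        Console.WriteLine(MatchesIgnoreCase(pageSource, "/explore"));   // True
    }
}
```

Either of the last two results can be passed straight to Assert.IsTrue.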

RegEx replace with calculations?

Is it possible somehow to do a RegEx-replace with a calculation in the result? (in VS2010)
Such as:
Grid\.Row\=\"{[0-9]+}\"
to
Grid.Row="eval(int(\1) + 1)"
You can use a MatchEvaluator to achieve this, like
String s = Regex.Replace("1239", @"\d", m => (Int32.Parse(m.ToString()) + 1).ToString());
Output: 23410
Edit:
I just noticed... if you mean "using the VS2010 find-replace feature" and not "using C#", then the answer is "no", I am afraid.
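Applied to the Grid.Row case from the question, the MatchEvaluator approach could look like the sketch below when run as C# code outside the editor. The class name, method name and sample markup are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GridRowShifter
{
    // Group 1 captures the row number inside Grid.Row="...".
    private static readonly Regex GridRow = new Regex(@"Grid\.Row=""([0-9]+)""");

    // Rewrites every Grid.Row="N" as Grid.Row="N+1".
    public static string IncrementRows(string xaml)
    {
        return GridRow.Replace(xaml, m =>
        {
            int row = int.Parse(m.Groups[1].Value);
            return "Grid.Row=\"" + (row + 1) + "\"";
        });
    }

    public static void Main()
    {
        string xaml = "<Button Grid.Row=\"0\" /><Label Grid.Row=\"9\" />";
        Console.WriteLine(IncrementRows(xaml));
        // <Button Grid.Row="1" /><Label Grid.Row="10" />
    }
}
```

The lambda runs once per match, so any calculation that can be written in C# can produce the replacement text.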
You could always use capturing to retrieve any values you need for your calculation and then perform a RegEx Replace with a new RegEx that's constructed from your equation and any values you captured.
If the equation doesn't use anything from the input text, one RegEx would be sufficient. You'd simply construct it by concatenating the static portions together with the computed value(s).
Unfortunately, C# and .NET do not provide an eval method or equivalent. However, it is possible to either use a library for expression parsing (a quick google gave me this .NET Math Expression Parser) or write your own (which is actually pretty easy, check out the Shunting-yard Algorithm and Postfix Notation). Simply capture the group then output the group value to the library/method you have written.
Edit: I see now that you want this inside VS2010 itself. This is unachievable unless you write your own VS extension. You could always write a program that searches and replaces your code, feed the code into it, and then replace the original code with its output.

What is C# equivalent of preg_match_all?

The gist is that I open a file, read all its data into a string, and match that string against a regex, which returns nothing. But the same regex in PHP returns values for the same text using preg_match_all. Does anyone have an idea?
The method in .NET that’s closest to preg_match_all() is the static Regex.Matches(String,String) call, or the equivalent Matches method on a compiled regular expression. It returns a MatchCollection that you can use to count the matches and to loop over each one.
Can you provide some short, self-contained code to show what’s not working?
There is a Regex.Matches method in C# that you can use.
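A minimal sketch of collecting all matches with Regex.Matches follows. The helper name and sample input are hypothetical. One thing worth checking when porting a pattern from PHP: the delimiters and trailing modifiers of a PHP pattern (for example /.../i) are not part of a .NET pattern, and the modifiers map to RegexOptions values instead. Whether that is the cause here cannot be told without seeing the code.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchAll
{
    // Returns the text of every match, in order of appearance.
    public static List<string> FindAll(string input, string pattern)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        List<string> numbers = FindAll("a1 b22 c333", @"[0-9]+");
        Console.WriteLine(numbers.Count);             // 3
        Console.WriteLine(string.Join(",", numbers)); // 1,22,333
    }
}
```

Captured groups are available per match through m.Groups, which roughly corresponds to the nested arrays that preg_match_all fills in.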
