searching a hash in a string with regex - c#

I want to get this bold part from this string:
some other code src='/pages/captcha?t=c&s=**51afb384edfc&h=513cc6f5349b**' `</td><td><input type=text name=captchaenter id=captchaenter size=3`
This is my regex that is not working:
Regex("src=\\'/pages/captcha\\?t=c&s=([\\d\\w&=]+)\\'", RegexOptions.IgnoreCase)
In tool for regex testing it's working.
How can this be fixed?

Your string-based regex is different from the regex you tested in the tool. In your code you have [\d\w\W]+, which matches any character and is greedy (there is no ? after the + to make it lazy). So it may match a very long string, possibly all the way up to the last closing quote.
In your tool you have [\d\w&=]+, which only matches digits, letters, underscores, & and =, so it stops when it hits the closing quote.
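To make the difference concrete, here is a minimal sketch comparing a greedy catch-all class, its lazy variant, and a restricted class. The input string with a second quoted attribute is made up for illustration, and the Capture helper is a hypothetical name, not part of the original question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyDemo
{
    // Hypothetical helper: returns capture group 1 of the first match,
    // or null when the pattern does not match at all.
    public static string Capture(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Made-up input: note the second quoted attribute after the one we want.
        string input = "src='/pages/captcha?t=c&s=abc&h=def' alt='x'";

        string greedy = @"src='/pages/captcha\?t=c&s=([\d\w\W]+)'";
        string lazy = @"src='/pages/captcha\?t=c&s=([\d\w\W]+?)'";
        string restricted = @"src='/pages/captcha\?t=c&s=([\w&=]+)'";

        Console.WriteLine(Capture(input, greedy));     // abc&h=def' alt='x
        Console.WriteLine(Capture(input, lazy));       // abc&h=def
        Console.WriteLine(Capture(input, restricted)); // abc&h=def
    }
}
```

The greedy version runs to the last quote in the input, while the lazy and restricted versions stop at the first one.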

The regexes aren't the same. The one in code has a character class ([\\d\\w\\W]+) that is different from the one in the tool ([\\d\\w&=]+).

Works perfectly fine with this code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string s = "src='/pages/captcha?t=c&s=51afb384edfc&h=513cc6f5349b' </td><td><input type=text name=captchaenter id=captchaenter size=3";
            Regex rgx = new Regex("src=\\'/pages/captcha\\?t=c&s=([\\d\\w\\W]+)\\'", RegexOptions.IgnoreCase);
            Match m = rgx.Match(s);
            Console.Write(m.Groups[1]);
        }
    }
}
It outputs:
51afb384edfc&h=513cc6f5349b

I despise regular expressions. I would do it similar to (but safer than) this:
private static string GetStuff(string source)
{
    var start = source.IndexOf("s=") + 2;
    var end = source.IndexOf('\'', start + 3);
    return source.Substring(start, end - start);
}
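One way the "safer" variant might look is sketched below. It assumes the value always follows the marker "&s=" and ends at the next single quote; the class and method names are hypothetical, and the method returns null instead of throwing when either delimiter is missing.

```csharp
using System;

public static class CaptchaParser
{
    // Hypothetical helper: returns the text between `marker` and the next
    // `terminator`, or null when either one cannot be found.
    public static string ExtractAfterMarker(string source, string marker, char terminator)
    {
        if (string.IsNullOrEmpty(source) || string.IsNullOrEmpty(marker))
            return null;

        int markerIndex = source.IndexOf(marker, StringComparison.Ordinal);
        if (markerIndex < 0)
            return null; // marker not present

        int start = markerIndex + marker.Length;
        int end = source.IndexOf(terminator, start);
        if (end < 0)
            return null; // no terminator after the marker

        return source.Substring(start, end - start);
    }

    public static void Main()
    {
        string s = "some other code src='/pages/captcha?t=c&s=51afb384edfc&h=513cc6f5349b' </td>";
        Console.WriteLine(ExtractAfterMarker(s, "&s=", '\'')); // 51afb384edfc&h=513cc6f5349b
    }
}
```

Searching for "&s=" rather than "s=" avoids accidentally matching an earlier "s=" elsewhere in the markup.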

Related

C# Regex for all chars and '-'

I've been trying all the combinations that I could find on Google, but nothing seems to give me a regex for all letters and only one special character.
Example of what I'm trying to achieve:
Something-something (accepted)
something something (accepted)
-Something (not accepted)
SomethingSomething- (not accepted)
something-something-somthing (accepted)
So as you can see, I need to allow all letters, with the special character - anywhere in between letters. Hopefully someone knows how to achieve this.
My code:
private void textBox8_TextChanged(object sender, EventArgs e)
{
    if (string.IsNullOrWhiteSpace(textBox8.Text) || !Regex.IsMatch(textBox8.Text, @"^[A-Za-z]\-+$"))
    {
        textBox8.BackColor = Color.Red;
    }
    else
    {
        textBox8.BackColor = Color.White;
    }
}
You can use this regex
^[A-Za-z]+(?:[- ][A-Za-z]+)*$
Regex breakdown:
^                     # Start of string
[A-Za-z]+             # Match one or more letters
(?:[- ][A-Za-z]+)*    # Match a hyphen or space followed by letters, zero or more times
$                     # End of string
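As a quick check, the pattern ^[A-Za-z]+(?:[- ][A-Za-z]+)*$ can be run against the examples from the question. This is a minimal sketch; the NameValidator class and IsValid method are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameValidator
{
    // Letters, optionally followed by groups of "hyphen or space + letters".
    private static readonly Regex Pattern =
        new Regex(@"^[A-Za-z]+(?:[- ][A-Za-z]+)*$");

    // Hypothetical helper: true when the whole input fits the pattern.
    public static bool IsValid(string input)
    {
        return !string.IsNullOrEmpty(input) && Pattern.IsMatch(input);
    }

    public static void Main()
    {
        string[] inputs =
        {
            "Something-something",
            "something something",
            "-Something",
            "SomethingSomething-",
            "something-something-somthing"
        };

        foreach (string input in inputs)
        {
            Console.WriteLine("'{0}' -> {1}", input, IsValid(input));
        }
    }
}
```

The first, second and fifth inputs are accepted; the two with a leading or trailing hyphen are rejected.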
If you need letters before and after the hyphen:
[A-Za-z]+\-[A-Za-z]+
If you just need a hyphen and letters somewhere in the string
[A-Za-z]?\-[A-Za-z]?
Does this pattern work for you?
^([A-Za-z]+-)+[A-Za-z]+$
Matches
Something-Something
Something-Something-Something
S-o-m-e-t-h-i-n-g
Does not match
-Something
Something-
Something-Something-
Something
Try this regex:
^[^-]([a-zA-Z-]*)[^-]$
Try this. You didn't specify if the dash was required or not required. I assume the dash was optional.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string[] inputs = {
                "Something-something",
                "-Something",
                "SomethingSomething-",
                "something-something-somthing",
                "1234"
            };
            string pattern = @"^([A-Za-z]+[-\s]?[A-Za-z]+)+$";
            foreach (string input in inputs)
            {
                Console.WriteLine("Input : '{0}', Match : '{1}'", input, Regex.IsMatch(input, pattern) ? "Yes" : "No");
            }
            Console.ReadLine();
        }
    }
}

how to build a regex with square brackets

I need to build a regex to find all strings between [" and ",
There are multiple occurrences of both of the above strings and I want all the content between them. Help please?
here is an example : http://pastebin.com/crFDit2N
You mean a string such as [" my beautiful string ", ?
Then it sounds like this simple regex:
\[".*?",
To get all the strings in C#, you can do something like this:
using System;
using System.Text.RegularExpressions;
using System.Collections.Specialized;
class Program {
    static void Main() {
        string s1 = @" ["" my beautiful string "", ["" my second string "", ";
        var resultList = new StringCollection();
        try {
            var myRegex = new Regex(@"\["".*?"",", RegexOptions.Multiline);
            Match matchResult = myRegex.Match(s1);
            while (matchResult.Success) {
                resultList.Add(matchResult.Groups[0].Value);
                Console.WriteLine(matchResult.Groups[0].Value);
                matchResult = matchResult.NextMatch();
            }
        } catch (ArgumentException ex) {
            // Syntax error in the regular expression
        }
        Console.WriteLine("\nPress Any Key to Exit.");
        Console.ReadKey();
    } // END Main
} // END Program
You can use this regex:
(?<=\[").*?(?=",)
It uses look-behind and look-ahead positive assertions to check that the match is preceded by [" and followed by ",.
@Szymon and @zx81:
Be careful: there can be a problem with your regexes (depending on xhammer's needs). If the string is, for example:
["anything you want["anything you want",anything you want
Your regex will catch ["anything you want["anything you want", and not ["anything you want",
To solve this problem, you can use [^\[] instead of the . in each regex.
The best way to see if a regex works for your needs is to test it in an online regex tester.
(PS: Even this solution isn't perfect if there can be a '[' inside the string, but I don't see how to solve that case with only one regex.)
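Here is a sketch of the look-around regex with [^\[] substituted for the dot, collecting every match with Regex.Matches. The BracketStrings class and Extract method are hypothetical names; the test inputs are the example strings from the answers above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BracketStrings
{
    // Regex (unescaped): (?<=\[")[^\[]*?(?=",)
    // Matches text preceded by [" and followed by ", without crossing a '['.
    private static readonly Regex Pattern =
        new Regex(@"(?<=\["")[^\[]*?(?="",)");

    // Hypothetical helper: returns every string found between [" and ",
    public static List<string> Extract(string input)
    {
        var results = new List<string>();
        foreach (Match m in Pattern.Matches(input))
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        string tricky = @"[""anything you want[""anything you want"",anything you want";
        foreach (string s in Extract(tricky))
        {
            Console.WriteLine("'{0}'", s); // 'anything you want' (a single match)
        }
    }
}
```

Because the class excludes '[', the match can no longer start at the first [" and swallow the second one.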

Check if string contains character and number

How do I check if a string contains the characters "RM" followed by any number or special character (-, _, etc.) and then followed by "T"?
Ex: thisIsaString ABRM21TC = yes, contains "RM" followed by a number and followed by "T"
Ex: thisIsaNotherString RM-T = yes, contain "RM" followed by a special character then followed by "T"
You're going to want to check the string using a regex (regular expression). See this MSDN page for info on how to do that:
http://msdn.microsoft.com/en-us/library/ms228595.aspx
Try this regexp.
[^RM]*RM[^RMT]+T[^RMT]*
Here is a sample program.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
namespace ConsoleApplication12
{
    class Program
    {
        static void Main(string[] args)
        {
            String rg = "[^RM]*RM[^RMT]+T[^RMT]*";
            string input = "111RM----T222";
            Match match = Regex.Match(input, rg, RegexOptions.IgnoreCase);
            Console.WriteLine(match.Success);
        }
    }
}
You can do it with a simple regular expression:
var match = Regex.Match(s, "RM([^T]+)T");
Check if the pattern is present by calling match.Success.
Get the captured value by calling match.Groups[1].
Here is a demo:
foreach (var s in new[] {"ABRM21TC", "RM-T", "RxM-T", "ABR21TC"} ) {
    var match = Regex.Match(s, "RM([^T]+)T");
    Console.WriteLine("'{0}' - {1} (Captures '{2}')", s, match.Success, match.Groups[1]);
}
It prints
'ABRM21TC' - True (Captures '21')
'RM-T' - True (Captures '-')
'RxM-T' - False (Captures '')
'ABR21TC' - False (Captures '')
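Note that RM([^T]+)T also accepts letters between "RM" and "T" (for example "RMxT"). If only digits and special characters should be allowed there, a stricter variant is sketched below. It assumes that "special character" means anything that is not an ASCII letter; the RmTCheck class and its method name are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RmTCheck
{
    // Assumption: anything that is not an ASCII letter counts as
    // "a number or special character". At least one is required.
    private static readonly Regex Pattern = new Regex(@"RM[^A-Za-z]+T");

    // Hypothetical helper: true when the input contains RM, then one or
    // more non-letter characters, then T.
    public static bool ContainsRmThenT(string input)
    {
        return input != null && Pattern.IsMatch(input);
    }

    public static void Main()
    {
        foreach (var s in new[] { "ABRM21TC", "RM-T", "RMxT", "RxM-T", "RMT" })
        {
            Console.WriteLine("'{0}' - {1}", s, ContainsRmThenT(s));
        }
    }
}
```

With these inputs, only "ABRM21TC" and "RM-T" pass; "RMxT" is rejected because of the letter, and "RMT" because nothing sits between "RM" and "T".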
Use regular expressions
http://www.webresourcesdepot.com/learn-test-regular-expressions-with-the-regulator/
The Regulator is an advanced, free regular expressions testing and learning tool. It allows you to build and verify a regular expression against any text input, file or web, and displays matching, splitting or replacement results within an easy to understand, hierarchical tree.
You should play around with more sample data, especially regarding special characters. You can use regexpal; I have added the two cases and an expression to get you started.

Regex pattern incorrect and combination

I have regex pattern like below:
Regex rx1 = new Regex(@"<div>/\*(.(?!\*/))*\*/(</div>|<br/></div>|<br></div>)");
Regex rx2 = new Regex(@"/\*[^>]+?\*/(<br/>|<br>)");
Regex rx3 = new Regex(@"/\*[^>]+?\*/");
Can anybody help me combine these regexes into a single pattern?
Your problem with rx1 is the (.(?!\*/))*\*/ part: it matches any character zero or more times as long as that character is not followed by */. The last character before the closing */ always fails that check, so the pattern can only match an empty comment body.
UPDATED Answer
@"(?'div'<div>)?/\*((?<!\*/).)*?\*/(?:<br/?>)?(?'-div'</div>)?(?(div)(?!))"
This will capture:
(?'div'<div>)?      an optional opening <div>, stored in capture group div
/\*                 the character sequence /*
((?<!\*/).)*?       zero or more characters, non-greedy, where each character is not preceded by the string */
\*/                 the character sequence */
(?:<br/?>)?         optionally <br> or <br/>
(?'-div'</div>)?    optionally </div>, which removes a capture from group div
(?(div)(?!))        match only if capture group div is empty (i.e. balanced <div> </div>)
I think you need this for combining the patterns:
(pattern1|pattern2|pattern3) means pattern1 or pattern2 or pattern3
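A minimal sketch of that alternation idea follows. The three sub-patterns are simplified stand-ins for rx1, rx2 and rx3 (the first one uses a lazy .*? rather than the original lookahead), so treat them as assumptions; CombinedPattern and FindAll are hypothetical names. Alternatives are tried left to right, so the most specific pattern goes first.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CombinedPattern
{
    // Simplified stand-ins for the three patterns, most specific first.
    private static readonly string[] Parts =
    {
        @"<div>/\*.*?\*/(?:<br/?>)?</div>", // comment wrapped in a div
        @"/\*[^>]+?\*/<br/?>",              // comment followed by <br> or <br/>
        @"/\*[^>]+?\*/"                     // bare comment
    };

    // Builds (?:p1)|(?:p2)|(?:p3); the non-capturing groups keep each
    // alternative isolated from its neighbours.
    private static readonly Regex Combined =
        new Regex("(?:" + string.Join(")|(?:", Parts) + ")");

    // Hypothetical helper: returns every match of the combined pattern.
    public static List<string> FindAll(string input)
    {
        var results = new List<string>();
        foreach (Match m in Combined.Matches(input))
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        foreach (string hit in FindAll("<div>/*a*/</div> /*b*/<br> /*c*/"))
        {
            Console.WriteLine(hit);
        }
    }
}
```

For the sample input this finds three matches, one per alternative: the div-wrapped comment, the comment followed by <br>, and the bare comment.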
Try the following. It's Frankenstein code, but it lets you manage each regex on its own instead of concatenating all three into one big regex (which is not wrong, but can make changes to the regex hard to manage):
CODE:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;
namespace BatchRegex
{
    class Program
    {
        static void Main(string[] args)
        {
            string[] target =
            {
                "<div>/*...*/</div> <div>/*...*/<br></div> <div>/*...*/<br></div>",
                "/*...*/<br></div> or /*...*/<br/></div>"
            };
            foreach (var tgt in target)
            {
                var rx1 = new Regex[]{new Regex(@"<div>/\*(.(?!\*/))*\*/(</div>|<br/></div>|<br></div>)", RegexOptions.Multiline),
                    new Regex(@"/\*[^>]+?\*/(<br/>|<br>)", RegexOptions.Multiline),
                    new Regex(@"/\*[^>]+?\*/", RegexOptions.Multiline)};
                foreach (var rgx in rx1)
                {
                    var rgxMatches = rgx.Matches(tgt).Cast<Match>();
                    Parallel.ForEach(rgxMatches, match =>
                    {
                        Console.WriteLine("Found {0} in target {1}.", match, tgt);
                    });
                }
            }
            Console.Write("Press any key to exit...");
            Console.ReadKey();
        }
    }
}

I want to strip off everything but numbers, $, comma(,)

I want to strip off everything but numbers, $, comma(,).
This only strips letters:
string Cadena;
Cadena = tbpatronpos6.Text;
Cadena = Regex.Replace(Cadena, "([^0-9]|\\$|,)", "");
tbpatronpos6.Text = Cadena;
Why doesn't my regex work, and how can I fix it?
I suspect this is what you want:
using System;
using System.Text.RegularExpressions;
class Test
{
    static void Main(string[] args)
    {
        string original = @"abc%^&123$\|a,sd";
        string replaced = Regex.Replace(original, @"[^0-9$,]", "");
        Console.WriteLine(replaced); // Prints 123$,
    }
}
The problem was your use of the alternation operator, basically - you just want the set negation for all of (digits, comma, dollar).
Note that you don't need to escape the dollar within a character group.
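To see why the alternation goes wrong, the sketch below runs both patterns on the same made-up input. The pattern ([^0-9]|\$|,) removes anything that is a non-digit, or a dollar sign, or a comma, which leaves only digits; the negated class removes only what is outside the allowed set. StripDemo and its method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StripDemo
{
    // The pattern from the question: non-digit OR '$' OR ','.
    public static string StripOriginal(string s)
    {
        return Regex.Replace(s, @"([^0-9]|\$|,)", "");
    }

    // The negated character class: anything except digits, '$' and ','.
    public static string StripFixed(string s)
    {
        return Regex.Replace(s, @"[^0-9$,]", "");
    }

    public static void Main()
    {
        string input = "abc$1,234.50xyz"; // made-up sample value

        Console.WriteLine(StripOriginal(input)); // 123450
        Console.WriteLine(StripFixed(input));    // $1,23450
    }
}
```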
Do you want something like this?
[^\\d\\$,]
