I am providing a text box where users can enter a regular expression to match filenames. I plan to detect any named capture groups they provide using the Regex method GetGroupNames().
I want to get the expression that they entered inside each named capture group.
As an example, they might enter a regular expression like this:
December (?<FileYear>\d{4}) Records\.xlsx
Is there a method or means to get the sub-expression \d{4} apart from manually parsing the regular expression string?
Here is an ugly brute force extension for parsing without using another Regex to detect the subexpression (or subpattern):
public static string GetSubExpression(this Regex pRegex, string pCaptureName)
{
string sRegex = pRegex.ToString();
string sGroupText = @"(?<" + pCaptureName + ">";
int iStartSearchAt = sRegex.IndexOf(sGroupText) + sGroupText.Length;
string sRemainder = sRegex.Substring(iStartSearchAt);
string sThis;
string sPrev = "";
int iOpenParenCount = 0;
int iEnd = 0;
for (int i = 0; i < sRemainder.Length; i++)
{
sThis = sRemainder.Substring(i, 1);
if (sThis == ")" && sPrev != @"\" && iOpenParenCount == 0)
{
iEnd = i;
break;
}
else if (sThis == ")" && sPrev != @"\")
{
iOpenParenCount--;
}
else if (sThis == "(" && sPrev != @"\")
{
iOpenParenCount++;
}
sPrev = sThis;
}
return sRemainder.Substring(0, iEnd);
}
The usage looks like this:
Regex reFromUser = new Regex(txtFromUser.Text);
string[] asGroupNames = reFromUser.GetGroupNames();
int iItsInt;
foreach (string sGroupName in asGroupNames)
{
if (!Int32.TryParse(sGroupName, out iItsInt)) //don't want numbered groups
{
string sSubExpression = reFromUser.GetSubExpression(sGroupName);
//Do what I need to do with the sub-expression
}
}
Now, if you would like to generate test or sample data, you can use the NuGet package called "Fare" in the following way after you get a sub-expression:
//Generate test data for it
Fare.Xeger X = new Fare.Xeger(sSubExpression);
string sSample = X.Generate();
This pattern (?<=\(\?<\w+\>)([^)]+) will give you the expression inside each named capture group. It uses a positive lookbehind to make sure the matched text is preceded by (?<...>.
string data = @"December (?<FileYear>\d{4}) Records\.xlsx";
string pattern = @"(?<=\(\?<\w+\>)([^)]+)";
Regex.Matches(data, pattern)
.OfType<Match>()
.Select(mt => mt.Groups[0].Value)
returns one item of
\d{4}
Data such as (?<FileMonth>[^\s]+)\s+(?<FileYear>\d{4}) Records\.xlsx would return two matches:
[^\s]+
\d{4}
Here is a solution that uses a regular expression to match the capturing groups in a regular expression. The idea is from the post "Using RegEx to balance match parenthesis":
\(\?\<(?<MyGroupName>\w+)\>
(?<MyExpression>
((?<BR>\()|(?<-BR>\))|[^()]*)+
)
\)
or more concisely...
\(\?\<(?<MyGroupName>\w+)\>(?<MyExpression>((?<BR>\()|(?<-BR>\))|[^()]*)+)\)
and to use it might look like this:
string sGetCaptures = @"\(\?\<(?<MyGroupName>\w+)\>(?<MyExpression>((?<BR>\()|(?<-BR>\))|[^()]*)+)\)";
MatchCollection MC = Regex.Matches(txtFromUser.Text, sGetCaptures );
foreach (Match M in MC)
{
string sGroupName = M.Groups["MyGroupName"].Value;
string sSubExpression = M.Groups["MyExpression"].Value;
//Do what I need to do with the sub-expression
MessageBox.Show(sGroupName + ":" + sSubExpression);
}
And for the example in the original question, the message box would return FileYear:\d{4}
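To see how the balancing-group pattern copes with nested parentheses, here is a minimal self-contained sketch. The class name NamedGroupExtractor and the method Extract are hypothetical names chosen for illustration; the pattern itself is the one shown above. It assumes the user's expression contains no escaped parentheses such as \( and no parentheses inside character classes, since those would be counted as real group delimiters.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class NamedGroupExtractor
{
    // Same balancing-group pattern as above, written as a verbatim string.
    private const string CapturePattern =
        @"\(\?\<(?<MyGroupName>\w+)\>(?<MyExpression>((?<BR>\()|(?<-BR>\))|[^()]*)+)\)";

    // Maps each named capture group in userPattern to its sub-expression.
    public static Dictionary<string, string> Extract(string userPattern)
    {
        var result = new Dictionary<string, string>();
        foreach (Match m in Regex.Matches(userPattern, CapturePattern))
        {
            result[m.Groups["MyGroupName"].Value] = m.Groups["MyExpression"].Value;
        }
        return result;
    }
}
```

For the input (?<FileMonth>(Jan|Feb)\w*) (?<FileYear>\d{4}) Records\.xlsx this should yield FileMonth mapped to (Jan|Feb)\w* and FileYear mapped to \d{4}, because the inner parentheses are pushed and popped on the BR stack before the closing parenthesis of the named group is accepted.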
The variable PostedId contains, for example, 7 strings.
For example, at index 0 I see: {"id":"1234567890"}
Inside the for loop, I want to parse the string at the current index and extract only the number.
So in the line:
objFacebookClient.Delete(PostedId[i]).ToString();
instead of passing PostedId[i], which is {"id":"1234567890"}, it should pass only the number: 1234567890.
private void button7_Click(object sender, EventArgs e)
{
var responsePost = "";
try
{
var objFacebookClient = new FacebookClient(AccessPageToken);
for (int i = 0; i < PostedId.Count; i++)
{
objFacebookClient.Delete(PostedId[i]).ToString();
}
}
catch (Exception ex)
{
responsePost = "Facebook Posting Error Message: " + ex.Message;
}
}
If I understand your question correctly, you want to get the digits out of your string and then parse them to an int.
If so, you can use the Char.IsDigit method to pick out the digits in the string and the Int32.TryParse method to parse them.
For example:
string s = "{\"id\":\"1234567890\"}";
char[] array = s.Where(c => Char.IsDigit(c)).ToArray();
string s1 = new string(array);
int i;
if (Int32.TryParse(s1, NumberStyles.Integer, CultureInfo.InvariantCulture, out i))
{
Console.WriteLine(i);
}
Output will be:
1234567890
Here is a demonstration.
Based on your comment under Soner Gönül's answer, you probably want to do something like this.
This is a modification of that answer: it just outputs the resulting string without parsing it to an int, which obviously will not work when the string includes "_".
Please make sure you include details like this in your original post in future questions.
string input = "{\"id\":\"12345_67890\"}";
char[] array = input.Where(c => Char.IsDigit(c) || c == '_').ToArray();
// Will contain: "12345_67890"
string result = new string(array);
Should be easy enough to do with String.Split or by using a regex to extract the number part of the id. I can give an example if needed.
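As a rough illustration of both suggestions, here is a minimal sketch. The class IdExtractor and its two methods are hypothetical names, and the code assumes the input always has the exact shape {"id":"..."} seen in the question; anything more varied is better handled by a real JSON parser.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, assuming input shaped like {"id":"1234567890"}.
public static class IdExtractor
{
    // Regex route: capture whatever sits between the quotes after "id":
    public static string WithRegex(string json)
    {
        Match m = Regex.Match(json, "\"id\"\\s*:\\s*\"([^\"]+)\"");
        return m.Success ? m.Groups[1].Value : null;
    }

    // Split route: splitting on the quote character yields
    // "{", "id", ":", "1234567890", "}", so the value is at index 3.
    public static string WithSplit(string json)
    {
        string[] parts = json.Split('"');
        return parts.Length > 3 ? parts[3] : null;
    }
}
```

Inside the loop from the question, the call would then look like objFacebookClient.Delete(IdExtractor.WithRegex(PostedId[i])). Both routes return the id as a string, so an id containing "_" is preserved as well.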
The format you have here looks like JSON.
You can use a JSON parser such as JSON.NET to parse your string.
Make class called for example 'FaceBookResponse':
public class FaceBookResponse{
// The id is a quoted string in the JSON and may contain "_", so keep it as a string
public string id { get; set; }
}
Then use JavaScriptSerializer (Reference System.Web.Extensions):
JavaScriptSerializer serializer = new JavaScriptSerializer();
In the for loop use this:
var obj = serializer.Deserialize<FaceBookResponse>(PostedId[i]);
objFacebookClient.Delete(obj.id).ToString();
You can use a regex for this.
5 is the minimum number of digits you want,
NumberOnly is the output and strText is the input text.
Regex _Regex = new Regex("\\d{5,}", RegexOptions.IgnoreCase);
MatchCollection matchList = _Regex.Matches(strText);
string NumberOnly = string.Empty;
if (matchList.Count > 0)
NumberOnly = matchList[0].Value;
//if strText = "Test12345Test6789"
//Output will be
//matchList[0] = 12345
//("6789" is not matched because it has fewer than 5 digits)
I might not have stated the question the way I intended. Please consider the scenario below.
Scenario:
I am implementing a Search/Replace functionality in my C# Win Form application. This feature will have the option to replace a substring that "starts with" or "ends with" a certain value. For example:
A string contains "123ABCD". Replacing "123" with "XYZ" should produce: "XYZABCD"
A string contains "ABCD123". Replacing "123" with "XYZ" should produce: "ABCDXYZ"
Both of these features are working fine. My problem is when the string contains "123ABCD123". Both operations return the wrong value when using "XYZ".
"starts with" produces "XYZABCDXYZ", instead of "XYZABCD"
"ends with" produces "XYZABCDXYZ" instead of "ABCDXYZ"
Can anyone give me an idea how to achieve that?
Thanks !!!
Code Snippet:
if (this.rbMatchFieldsStartedWith.Checked)
{
if (caseSencetive)
{
matched = currentCellValue.StartsWith(findWhat);
}
else
{
matched = currentCellValue.ToLower().StartsWith(findWhat.ToLower());
}
}
else if (this.rbMatchFieldsEndsWith.Checked)
{
if (caseSencetive)
{
matched = currentCellValue.EndsWith(findWhat);
}
else
{
matched = currentCellValue.ToLower().EndsWith(findWhat.ToLower());
}
}
if (matched)
{
if (replace)
{
if (this.rbMatchWholeField.Checked)
{
currentCell.Value = replaceWith;
}
else
{
currentCellValue = currentCellValue.Replace(findWhat, replaceWith);
currentCell.Value = currentCellValue;
}
this.QCGridView.RefreshEdit();
}
else
{
currentCell.Style.BackColor = Color.Aqua;
}
}
Implement the replacement method dependent on the search mode.
Replace the line
currentCellValue = currentCellValue.Replace(findWhat, replaceWith);
with
if (this.rbMatchFieldsStartedWith.Checked)
{
    // target string starts with findWhat, so remove findWhat and prepend replaceWith
    currentCellValue = replaceWith + currentCellValue.Substring(findWhat.Length);
}
else
{
    // target string ends with findWhat, so remove findWhat and append replaceWith
    currentCellValue = currentCellValue.Substring(0, currentCellValue.Length - findWhat.Length) + replaceWith;
}
currentCell.Value = currentCellValue;
This sounds like a good one for regular expressions.
It is supported by .NET, and also has a replacement syntax.
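A minimal sketch of that idea, assuming the search text should be treated literally rather than as a pattern. The class AnchoredReplace and its methods are hypothetical names; findWhat and replaceWith mirror the variables in the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper showing anchored replacement with Regex.Replace.
public static class AnchoredReplace
{
    public static string ReplaceStart(string value, string findWhat, string replaceWith, bool caseSensitive)
    {
        RegexOptions options = caseSensitive ? RegexOptions.None : RegexOptions.IgnoreCase;
        // \A matches only at the very start of the input.
        return Regex.Replace(value, @"\A" + Regex.Escape(findWhat), Literal(replaceWith), options);
    }

    public static string ReplaceEnd(string value, string findWhat, string replaceWith, bool caseSensitive)
    {
        RegexOptions options = caseSensitive ? RegexOptions.None : RegexOptions.IgnoreCase;
        // \z matches only at the very end of the input.
        return Regex.Replace(value, Regex.Escape(findWhat) + @"\z", Literal(replaceWith), options);
    }

    // "$" is special in replacement strings, so double it to keep it literal.
    private static string Literal(string replacement)
    {
        return replacement.Replace("$", "$$");
    }
}
```

With "123ABCD123", ReplaceStart(value, "123", "XYZ", true) gives "XYZABCD123" and ReplaceEnd(value, "123", "XYZ", true) gives "123ABCDXYZ", because the anchors restrict the match to one end of the string.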
I just want to try a replacement method without using regex.
(Regex could be the right way to do it, but it was fun to find an alternative)
void Main()
{
string test = "123ABCD123"; // String to change
string rep = "XYZ"; // Replacement string
string find = "123"; // String to find
bool searchStart = true; // Flag for checkbox startswith
bool searchEnd = true; // Flag for checkbox endswith
bool caseInsensitive = true; // Flag for case type replacement
string result = test;
int pos = -1;
int lastPos = -1;
if(caseInsensitive == true)
{
pos = test.IndexOf(find, StringComparison.InvariantCultureIgnoreCase);
lastPos = test.LastIndexOf(find, StringComparison.InvariantCultureIgnoreCase);
}
else
{
pos = test.IndexOf(find, StringComparison.Ordinal);
lastPos = test.LastIndexOf(find, StringComparison.Ordinal);
}
result = test;
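bool replacedStart = (pos == 0 && searchStart == true);
if(replacedStart)
{
result = rep + test.Substring(find.Length);
}
// Replace a trailing match too, unless it overlaps a leading match that was already replaced
if(lastPos > 0 && lastPos + find.Length == test.Length && searchEnd == true
&& (!replacedStart || lastPos >= find.Length))
{
// Cut the trailing match off the current result, so a start replacement of a different length is preserved
result = result.Substring(0, result.Length - find.Length) + rep;
}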
Console.WriteLine(result);
}
First of all, let's trace your scenario, assuming:
string to work on is 123ABCD123
starts with is checked.
aim is to replace "123" with "XYZ"
by just reading your code. We hit if (this.rbMatchFieldsStartedWith.Checked) and which evaluates to true. So we step in that block. We hit matched = currentCellValue.StartsWith(findWhat); and matched = true. We continue with if (matched) condition which also evaluates to true. After that if (replace) evaluates to true. Finally we make the last decision with if (this.rbMatchWholeField.Checked) which evaluates to false so we continue with else block:
currentCellValue = currentCellValue.Replace(findWhat, replaceWith);
currentCell.Value = currentCellValue;
First line in this block replaces all the occurrences of findWhat with replaceWith, namely all the occurrences of 123 with XYZ. Of course this is not the desired behaviour. Instead of Replace you must use a function that replaces just the first or the last occurrence of the string according to the input of course.
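Such a function is not built into String, so here is one possible sketch. OccurrenceReplace, ReplaceFirst, and ReplaceLast are hypothetical names; the comparison parameter lets the caller choose case-sensitive or case-insensitive matching.

```csharp
using System;

// Hypothetical helper: replaces a single occurrence instead of all of them.
public static class OccurrenceReplace
{
    public static string ReplaceFirst(string text, string find, string replacement, StringComparison comparison)
    {
        int pos = text.IndexOf(find, comparison);
        if (pos < 0) return text; // nothing to replace
        return text.Substring(0, pos) + replacement + text.Substring(pos + find.Length);
    }

    public static string ReplaceLast(string text, string find, string replacement, StringComparison comparison)
    {
        int pos = text.LastIndexOf(find, comparison);
        if (pos < 0) return text; // nothing to replace
        return text.Substring(0, pos) + replacement + text.Substring(pos + find.Length);
    }
}
```

Because the surrounding code has already checked StartsWith or EndsWith, the first occurrence is the leading one and the last occurrence is the trailing one. For "123ABCD123", ReplaceFirst with StringComparison.Ordinal gives "XYZABCD123" and ReplaceLast gives "123ABCDXYZ".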
String mystring="start i dont know hot text can it to have here important=value5; x=1; important=value2; z=3;";
Suppose I want to get the value of "important". I know how to do it with a substring, but the string contains two occurrences, so how do I get the first one and then the next?
If that is not possible, I would like to try this: save the first value, then delete everything from "start" up to value5, so that the next query finds value2.
How can I do either of these two things?
This is how I get the first value:
string word = "important=";
int c= mystring.IndexOf(word);
int c2 = word.Length;
for (int i = c+c2; i < mystring.Length; i++)
{
if (mystring[i].ToString() == ";")
{
break;
}
else
{
label1.Text += mystring[i].ToString(); // c#
// label1.setText(label1.getText()+mystring[i].ToString(); //java
}
}
If you want to extract all values you could use a regex:
string input = "start i dont know hot text can it to have here important=value5; x=1; important=value2; z=3;";
Regex regex = new Regex(@"important=(?<value>\w+)");
List<string> values = new List<string>();
MatchCollection matches = regex.Matches(input);
foreach (Match match in matches)
{
string value= match.Groups["value"].Value;
values.Add(value);
}
You can save the values in an array, instead of showing them with MessageBox.
string mystring = "start i dont know hot text can it to have here important=value5; x=1; important=value2; z=3;";
string temp = mystring;
string word = "important=";
while (temp.IndexOf(word) >= 0)
{
MessageBox.Show( temp.Substring(temp.IndexOf(word) + word.Length).Split(';')[0]);
temp = temp.Remove(temp.IndexOf(word), word.Length);
}
You can use 2 methods:
String.Remove()
and
String.Replace()
Use a regular expression, find all the matches, and reconstruct the string yourself.
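For the original goal of collecting every value that follows "important=", a regex-free alternative is to keep searching from where the previous value ended. The sketch below uses hypothetical names (ValueScanner, ValuesAfter) and assumes each value is terminated by ';' or by the end of the string.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: collects the text after each occurrence of 'word' up to the next ';'.
public static class ValueScanner
{
    public static List<string> ValuesAfter(string text, string word)
    {
        if (string.IsNullOrEmpty(word))
            throw new ArgumentException("word must not be empty", "word");

        var values = new List<string>();
        int start = text.IndexOf(word, StringComparison.Ordinal);
        while (start >= 0)
        {
            int valueStart = start + word.Length;
            int end = text.IndexOf(';', valueStart);
            if (end < 0) end = text.Length; // last value without a terminator
            values.Add(text.Substring(valueStart, end - valueStart));
            // Continue searching after the value that was just read.
            start = text.IndexOf(word, end, StringComparison.Ordinal);
        }
        return values;
    }
}
```

Called with the string from the question and "important=", this returns a list containing "value5" and "value2", without modifying the original string.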
Do any of you know of an easy/clean way to find a substring within a string while ignoring some specified characters? I think an example would explain things better:
string: "Hello, -this- is a string"
substring to find: "Hello this"
chars to ignore: "," and "-"
found the substring, result: "Hello, -this"
Using Regex is not a requirement for me, but I added the tag because it feels related.
Update:
To make the requirement clearer: I need the resulting substring with the ignored chars, not just an indication that the given substring exists.
Update 2:
Some of you are reading too much into the example, sorry. I'll give another scenario that should work:
string: "?A&3/3/C)412&"
substring to find: "A41"
chars to ignore: "&", "/", "3", "C", ")"
found the substring, result: "A&3/3/C)41"
And as a bonus (not required per se), it would be great if we did not have to assume that the substring to find is free of the ignored chars, e.g., given the last example we should be able to do:
substring to find: "A3C412&"
chars to ignore: "&", "/", "3", "C", ")"
found the substring, result: "A&3/3/C)412&"
Sorry if I wasn't clear before, or still I'm not :).
Update 3:
Thanks to everyone who helped! This is the implementation I'm working with for now:
http://www.pastebin.com/pYHbb43Z
And here are some tests:
http://www.pastebin.com/qh01GSx2
I'm using some custom extension methods I'm not including, but I believe they should be self-explanatory (I will add them if you like).
I've taken a lot of your ideas for the implementation and the tests, but I'm giving the answer to @PierrOz because he was one of the first and pointed me in the right direction.
Feel free to keep giving suggestions as alternative solutions or comments on the current state of the impl. if you like.
In your example you would do:
string input = "Hello, -this-, is a string";
string ignore = "[-,]*";
Regex r = new Regex(string.Format("H{0}e{0}l{0}l{0}o{0} {0}t{0}h{0}i{0}s{0}", ignore));
Match m = r.Match(input);
return m.Success ? m.Value : string.Empty;
Dynamically, you would build the [-,] part from all the characters to ignore and insert it between all the characters of your query.
Take care with '-' in the [] class: put it at the beginning or at the end.
So more generically, it would give something like:
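public string Test(string query, string input, char[] ignoreList)
{
    // Build a character class such as [-,]* from the characters to ignore
    string ignorePattern = "[";
    for (int i = 0; i < ignoreList.Length; i++)
    {
        if (ignoreList[i] == '-')
        {
            // Strings are immutable, so keep the result of Insert; the dash goes first to avoid forming a range
            ignorePattern = ignorePattern.Insert(1, "-");
        }
        else
        {
            ignorePattern += Regex.Escape(ignoreList[i].ToString());
        }
    }
    ignorePattern += "]*";
    // Put the ignore pattern after every (escaped) character of the query
    string pattern = "";
    for (int i = 0; i < query.Length; i++)
    {
        pattern += Regex.Escape(query[i].ToString()) + ignorePattern;
    }
    Regex r = new Regex(pattern);
    Match m = r.Match(input);
    return m.Success ? m.Value : string.Empty;
}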
Here's a non-regex string extension option:
public static class StringExtensions
{
public static bool SubstringSearch(this string s, string value, char[] ignoreChars, out string result)
{
if (String.IsNullOrEmpty(value))
throw new ArgumentException("Search value cannot be null or empty.", "value");
bool found = false;
int matches = 0;
int startIndex = -1;
int length = 0;
for (int i = 0; i < s.Length && !found; i++)
{
if (startIndex == -1)
{
if (s[i] == value[0])
{
startIndex = i;
++matches;
++length;
}
}
else
{
if (s[i] == value[matches])
{
++matches;
++length;
}
else if (ignoreChars != null && ignoreChars.Contains(s[i]))
{
++length;
}
else
{
startIndex = -1;
matches = 0;
length = 0;
}
}
found = (matches == value.Length);
}
if (found)
{
result = s.Substring(startIndex, length);
}
else
{
result = null;
}
return found;
}
}
EDIT: here's an updated solution addressing the points in your recent update. The idea is the same, except that if you have a single substring, it needs to insert the ignore pattern between each character. If the substring contains spaces, it will split on the spaces and insert the ignore pattern between those words. If you don't need the latter functionality (which was more in line with your original question), you can remove the Split and the if check that builds that pattern.
Note that this approach is not going to be the most efficient.
string input = @"foo ?A&3/3/C)412& bar A341C2";
string substring = "A41";
string[] ignoredChars = { "&", "/", "3", "C", ")" };
// builds up the ignored pattern and ensures a dash char is placed at the end to avoid unintended ranges
string ignoredPattern = String.Concat("[",
String.Join("", ignoredChars.Where(c => c != "-")
.Select(c => Regex.Escape(c)).ToArray()),
(ignoredChars.Contains("-") ? "-" : ""),
"]*?");
string[] substrings = substring.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
string pattern = "";
if (substrings.Length > 1)
{
pattern = String.Join(ignoredPattern, substrings);
}
else
{
pattern = String.Join(ignoredPattern, substring.Select(c => c.ToString()).ToArray());
}
foreach (Match match in Regex.Matches(input, pattern))
{
Console.WriteLine("Index: {0} -- Match: {1}", match.Index, match.Value);
}
Try this solution out:
string input = "Hello, -this- is a string";
string[] searchStrings = { "Hello", "this" };
string pattern = String.Join(@"\W+", searchStrings);
foreach (Match match in Regex.Matches(input, pattern))
{
Console.WriteLine(match.Value);
}
The \W+ will match one or more non-word characters (anything other than letters, digits, and the underscore). If you feel like specifying them yourself, you can replace it with a character class of the characters to ignore, such as [ ,.-]+ (always place the dash character at the start or end to avoid unintended range specifications). Also, if you need case to be ignored, use RegexOptions.IgnoreCase:
Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
If your substring is in the form of a complete string, such as "Hello this", you can easily get it into an array form for searchStrings in this way:
string[] searchStrings = substring.Split(new[] { ' ' },
StringSplitOptions.RemoveEmptyEntries);
This code will do what you want, although I suggest you modify it to fit your needs better:
string resultString = null;
try
{
resultString = Regex.Match(subjectString, "Hello[, -]*this", RegexOptions.IgnoreCase).Value;
}
catch (ArgumentException ex)
{
// Syntax error in the regular expression
}
You could do this with a single Regex but it would be quite tedious as after every character you would need to test for zero or more ignored characters. It is probably easier to strip all the ignored characters with Regex.Replace(subject, "[-,]", ""); then test if the substring is there.
Or the single Regex way
Regex.IsMatch(subject, "H[-,]*e[-,]*l[-,]*l[-,]*o[-,]* [-,]*t[-,]*h[-,]*i[-,]*s[-,]*")
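Stripping the ignored characters only tells you whether the substring exists, while the question asks for the matching slice of the original string. One way to get both, sketched here under the assumption that an ordinal, case-sensitive comparison is wanted, is to remember where each kept character came from. IgnoringSearch and Find are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: strip-then-search, mapped back onto the original string.
public static class IgnoringSearch
{
    // Returns the slice of 'input' that matches 'query' once ignored
    // characters are skipped, or null when there is no match.
    public static string Find(string input, string query, IEnumerable<char> ignored)
    {
        var skip = new HashSet<char>(ignored);
        var stripped = new StringBuilder();
        var originalIndex = new List<int>(); // originalIndex[k] = position in input of stripped[k]
        for (int i = 0; i < input.Length; i++)
        {
            if (!skip.Contains(input[i]))
            {
                stripped.Append(input[i]);
                originalIndex.Add(i);
            }
        }

        // Ignored characters are dropped from the query as well.
        var cleanQuery = new StringBuilder();
        foreach (char c in query)
        {
            if (!skip.Contains(c)) cleanQuery.Append(c);
        }
        if (cleanQuery.Length == 0) return null;

        int pos = stripped.ToString().IndexOf(cleanQuery.ToString(), StringComparison.Ordinal);
        if (pos < 0) return null;

        int first = originalIndex[pos];
        int last = originalIndex[pos + cleanQuery.Length - 1];
        return input.Substring(first, last - first + 1);
    }
}
```

Find("Hello, -this- is a string", "Hello this", ",-") returns "Hello, -this", and Find("?A&3/3/C)412&", "A41", "&/3C)") returns "A&3/3/C)41". Note that the slice always ends at the last matched non-ignored character, so trailing ignored characters are not included in the result.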
Here's a non-regex way to do it using string parsing.
private string GetSubstring()
{
string searchString = "Hello, -this- is a string";
string searchStringWithoutUnwantedChars = searchString.Replace(",", "").Replace("-", "");
string desiredString = string.Empty;
if(searchStringWithoutUnwantedChars.Contains("Hello this"))
desiredString = searchString.Substring(searchString.IndexOf("Hello"), searchString.IndexOf("this") + 4 - searchString.IndexOf("Hello"));
return desiredString;
}
You could do something like this, since almost all of these answers require rebuilding the string in some form.
myString is the string you want to look through
//Create a List<string> that contains the ignored characters
List<string> ignoredCharacters = new List<string>();
//Add all of the characters you wish to ignore in the method you choose
//Use a function here to get a return
public bool subStringExist(List<string> ignoredCharacters, string myString, string toMatch)
{
//Copy Your string to a temp
string tempString = myString;
bool match = false;
//Replace Everything that you don't want
foreach (string item in ignoredCharacters)
{
tempString = tempString.Replace(item, "");
}
//Check if your substring exist
if (tempString.Contains(toMatch))
{
match = true;
}
return match;
}
You could always use a combination of RegEx and string searching
public class RegExpression {
public static void Example(string input, string ignore, string find)
{
string output = string.Format("Input: {1}{0}Ignore: {2}{0}Find: {3}{0}{0}", Environment.NewLine, input, ignore, find);
if (SanitizeText(input, ignore).ToString().Contains(SanitizeText(find, ignore)))
Console.WriteLine(output + "was matched");
else
Console.WriteLine(output + "was NOT matched");
Console.WriteLine();
}
public static string SanitizeText(string input, string ignore)
{
Regex reg = new Regex("[^" + ignore + "]");
StringBuilder newInput = new StringBuilder();
foreach (Match m in reg.Matches(input))
{
newInput.Append(m.Value);
}
return newInput.ToString();
}
}
Usage would be like
RegExpression.Example("Hello, -this- is a string", "-,", "Hello this"); //Should match
RegExpression.Example("Hello, -this- is a string", "-,", "Hello this2"); //Should not match
RegExpression.Example("?A&3/3/C)412&", "&/3C\\)", "A41"); // Should match
RegExpression.Example("?A&3/3/C) 412&", "&/3C\\)", "A41"); // Should not match
RegExpression.Example("?A&3/3/C)412&", "&/3C\\)", "A3C412&"); // Should match
Output
Input: Hello, -this- is a string
Ignore: -,
Find: Hello this
was matched
Input: Hello, -this- is a string
Ignore: -,
Find: Hello this2
was NOT matched
Input: ?A&3/3/C)412&
Ignore: &/3C)
Find: A41
was matched
Input: ?A&3/3/C) 412&
Ignore: &/3C)
Find: A41
was NOT matched
Input: ?A&3/3/C)412&
Ignore: &/3C)
Find: A3C412&
was matched