I am searching for strings using wildcards, for example test0? (there is a space after the ?). The strings the search matches are:
test01
test02
test03
(and so on)
The replacement text should be for example:
test0? -
The wildcard above in test0? - represents the 1, 2, or 3...
So, the replacement strings should be:
test01 -
test02 -
test03 -
string pattern = WildcardToRegex(originalText);
fileName = Regex.Replace(originalText, pattern, replacementText);
public string WildcardToRegex(string pattern)
{
return "^" + System.Text.RegularExpressions.Regex.Escape(pattern).
Replace("\\*", ".*").Replace("\\?", ".") + "$";
}
The problem is saving the new string with the original character(s) plus the added characters. I could search the string and save the original with some string manipulation, but that seems like too much overhead. There has to be an easier way.
Thanks for any input.
EDIT:
Search for strings using the wildcard ?
Possible strings are:
test01 someText
test02 someotherText
test03 moreText
Using Regex, the search string pattern will be:
test0? -
So, each string should then read:
test01 - someText
test02 - someotherText
test03 - moreText
How do I keep the character that was matched by the regex wildcard '?'?
As my code stands, it will come out as test? - someText
That is wrong.
Thanks.
EDIT Num 2
First, thanks everyone for their answers and direction.
It did help and led me onto the right track, and now I can ask the exact question more precisely:
It has to do with substitution: inserting text after the text matched by the Regex.
The sample strings I gave may not always be in that format. I have been looking into substitution but just can't seem to get the syntax right. I am using VS 2008.
Any more suggestions?
Thanks
If you want to replace "test0? " with "test0? -", you would write:
string bar = Regex.Replace(foo, "^test0. ", "$0- ");
The key here is the $0 substitution, which will include the matched text.
So if I understand your question correctly, you just want your replacementText to be "$0- ".
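A minimal sketch of the $0 approach applied to the sample strings from the question. The names PrefixDashDemo and InsertDash are made up for illustration, and the pattern assumes the wildcard always stands for exactly one character followed by a space.

```csharp
using System.Text.RegularExpressions;

public static class PrefixDashDemo
{
    // Hypothetical helper: inserts "- " right after a "test0? " prefix.
    public static string InsertDash(string input)
    {
        // "." stands in for the "?" wildcard. "$0" re-emits the whole match,
        // so the character matched by "." is kept in the result.
        return Regex.Replace(input, "^test0. ", "$0- ");
    }
}
```

Under these assumptions, InsertDash("test01 someText") should produce "test01 - someText", and a string that does not start with the prefix is returned unchanged.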
If I understand the question correctly, couldn't you just use a match?
// Convert pattern to regex (I'm assuming this can be done with your "originalText")
Regex regex = new Regex(pattern);
// For each match, replace the found text with the original value + " -"
// (input is the string being searched)
string result = regex.Replace(input, m => m.Groups[0].Value + " -");
So I'm not 100% clear on what you're doing, but I'll give it a try.
I'm going with the assumption that you want to use "file wildcards" (?/*) to search for a set of matching values (while retaining the values matched by the placeholders themselves), then replace each with the new value (re-inserting those placeholder values). Given that, and with probably a lot of overkill (since your requirement is kind of weird), here's what I came up with:
// Helper function to turn the file search pattern in to a
// regex pattern.
private Regex BuildRegexFromPattern(String input)
{
String pattern = String.Concat(input.ToCharArray().Select(i => {
String c = i.ToString();
return c == "?" ? "(.)"
: c == "*" ? "(.*)"
: c == " " ? "\\s"
: Regex.Escape(c);
}));
return new Regex(pattern);
}
// perform the actual replacement
private IEnumerable<String> ReplaceUsingPattern(IEnumerable<String> items, String searchPattern, String replacementPattern)
{
Regex searchRe = BuildRegexFromPattern(searchPattern);
return items.Where(s => searchRe.IsMatch(s)).Select (s => {
Match match = searchRe.Match(s);
Int32 m = 1;
return String.Concat(replacementPattern.ToCharArray().Select(i => {
String c = i.ToString();
if (m > match.Groups.Count)
{
throw new InvalidOperationException("Replacement placeholders exceeds locator placeholders.");
}
return c == "?" ? match.Groups[m++].Value
: c == "*" ? match.Groups[m++].Value
: c;
}));
});
}
Then, in practice:
String[] samples = new String[]{
"foo01", "foo02 ", "foo 03",
"bar0?", "bar0? ", "bar03 -",
"test01 ", "test02 ", "test03 "
};
String searchTemplate = "test0? ";
String replaceTemplate = "test0? -";
var results = ReplaceUsingPattern(samples, searchTemplate, replaceTemplate);
Which, from the samples list above, gives me:
matched:    modified to:
test01      test01 -
test02      test02 -
test03      test03 -
However, if you really want to save yourself headaches, you should be using replacement references; there's no need to reinvent the wheel. With replacement references, the above could have been written as:
Regex searchRe = new Regex(@"test0(.)\s");
var results = samples.Select(x => searchRe.Replace(x, "test0$1 -"));
You can capture any piece of the matched string and place it anywhere in the replacement, using the symbol $ followed by the index of the captured group (indices start at 1).
You capture a piece by wrapping it in parentheses "()".
Example:
If I have several strings with testXYZ, being XYZ a 3-digit number, and I need to replace it, say, with testZYX, inverting the 3 digits, I would do:
string result = Regex.Replace(source, "test([0-9])([0-9])([0-9])", "test$3$2$1");
So, in your case, it can be done:
string result = Regex.Replace(source, "test0([0-9]) ", "test0$1 - ");
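Tying this back to the wildcard conversion in the question, one possible sketch is to turn each wildcard into a capture group so that $1, $2, ... can be used in the replacement. WildcardDemo, ToCapturingRegex and InsertDash are hypothetical names, and the sketch assumes the pattern should only be anchored at the start of the string.

```csharp
using System.Text.RegularExpressions;

public static class WildcardDemo
{
    // Hypothetical helper: like WildcardToRegex from the question, but every
    // wildcard becomes a capture group and only the start is anchored.
    public static string ToCapturingRegex(string wildcard)
    {
        // Regex.Escape turns "?" into "\?" and "*" into "\*" (and escapes
        // spaces), so the escaped forms are what gets swapped for groups.
        return "^" + Regex.Escape(wildcard)
            .Replace(@"\*", "(.*)")
            .Replace(@"\?", "(.)");
    }

    public static string InsertDash(string input)
    {
        string pattern = ToCapturingRegex("test0? ");
        // $1 re-inserts whatever character the "?" matched.
        return Regex.Replace(input, pattern, "test0$1 - ");
    }
}
```

With the sample data, InsertDash("test02 someotherText") should give "test02 - someotherText".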
Related
I currently have a string which looks like this when it is returned :
//This is the url string
// the-great-debate---toilet-paper-over-or-under-the-roll
string name = string.Format("{0}",url);
name = Regex.Replace(name, "-", " ");
After I perform the Regex operation above, it becomes:
the great debate toilet paper over or under the roll
However, like I mentioned in the question, I want to be able to apply regex to the url string so that I have the following output:-
the great debate - toilet paper over or under the roll
I would really appreciate any assistance.
[EDIT] However, not all the strings look like this. Some of them have only single hyphens, so the above method works for them:
world-water-day-2016
and it changes to
world water day 2016
but for this one:
the-great-debate---toilet-paper-over-or-under-the-roll
I need a way to check whether the string has 3 consecutive hyphens, and then replace those 3 hyphens with [space][hyphen][space], and then replace all the remaining single hyphens between the words with a space.
First of all, there is always a very naive solution to this kind of problem: you replace your specific matches in context with some chars that are not usually used in the current environment and after replacing generic substrings you may replace the temporary substrings with the necessary exception.
var name = url.Replace("---", "[ \uFFFD ]").Replace("-", " ").Replace("[ \uFFFD ]", " - ");
You may also use a regex based replacement that matches either a 3-hyphen substring capturing it, or just match a single hyphen, and then check if Group 1 matched inside a match evaluator (the third parameter to Regex.Replace can be a Match evaluator method).
It will look like
var name = Regex.Replace(url, @"(---)|-", m => m.Groups[1].Success ? " - " : " ");
See the C# demo.
So, when (---) part matches, the 3 hyphens are put into Group 1 and the .Success property is set to true. Thus, m => m.Groups[1].Success ? " - " : " " replaces 3 hyphens with space+-+space and 1 hyphen (that may be actually 1 of the 2 consecutive hyphens) with a space.
Here's a solution using LINQ rather than Regex:
var str = "the-great-debate---toilet-paper-over-or-under-the-roll";
var result = str.Split(new string[] {"---"}, StringSplitOptions.None)
.Select(s => s.Replace("-", " "))
.Aggregate((c,n) => $"{c} - {n}");
// result = "the great debate - toilet paper over or under the roll"
Split the string up based on the ---, then remove hyphens from each substring, then join them back together.
The easy way:
name = Regex.Replace(name, @"\b-|-\b", " ");
The show-off way:
name = Regex.Replace(name, @"(\b)?-(?(1)|\b)", " ");
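The word-boundary version works because only hyphens that touch a word character are replaced: in a run of three hyphens the outer two each sit next to a letter, while the middle one has hyphens on both sides and is left alone. A small sketch, where HyphenDemo and Humanize are names invented for this example:

```csharp
using System.Text.RegularExpressions;

public static class HyphenDemo
{
    // Hypothetical helper: replaces every hyphen that has a word character
    // directly before or after it with a space.
    public static string Humanize(string slug)
    {
        // The verbatim string (@"...") matters: in a regular C# string
        // literal, "\b" would be a backspace character, not a word boundary.
        return Regex.Replace(slug, @"\b-|-\b", " ");
    }
}
```

For the two inputs in the question this should yield "the great debate - toilet paper over or under the roll" and "world water day 2016".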
Slightly similar to this question, I want to replace argv contents:
string argv = "-help=none\n-URL=(default)\n-password=look\n-uname=Khanna\n-p=100";
to this:
"-help=none\n-URL=(default)\n-password=********\n-uname=Khanna\n-p=100"
I have tried very basic string find and search operations (using IndexOf, Substring, etc.). I am looking for a more elegant solution to replace this part of the string:
-password=AnyPassword
to:
-password=*******
And keep other part of string intact. I am looking if String.Replace or Regex replace may help.
What I've tried (not much of error-checks):
var pwd_index = argv.IndexOf("-password=");
string converted;
if (pwd_index >= 0)
{
    var leftPart = argv.Substring(0, pwd_index);
    var pwdStr = argv.Substring(pwd_index);
    var rightPart = pwdStr.Substring(pwdStr.IndexOf("\n") + 1);
    converted = leftPart + "-password=********\n" + rightPart;
}
else
    converted = argv;
Console.WriteLine(converted);
Solution
Similar to Rubens Farias' solution but a little bit more elegant:
string argv = "-help=none\n-URL=(default)\n-password=\n-uname=Khanna\n-p=100";
string result = Regex.Replace(argv, @"(password=)[^\n]*", "$1********");
It matches password= literally, stores it in capture group $1, and then keeps matching until a \n is reached.
This yields a constant number of *'s, though. But revealing how many characters a password has might already convey too much information to hackers anyway.
Working example: https://dotnetfiddle.net/xOFCyG
Regular expression breakdown
( // Store the following match in capture group $1.
password= // Match "password=" literally.
)
[ // Match one from a set of characters.
^ // Negate a set of characters (i.e., match anything not
// contained in the following set).
\n // The character set: consists only of the new line character.
]
* // Match the preceding element (the character set) 0 to n times.
This code replaces the password value by several "*" characters:
string argv = "-help=none\n-URL=(default)\n-password=look\n-uname=Khanna\n-p=100";
string result = Regex.Replace(argv, @"(password=)([\s\S]*?\n)",
    match => match.Groups[1].Value + new String('*', match.Groups[2].Value.Length - 1) + "\n");
You can also remove the new String() part and replace it with a string constant.
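The pattern above needs a \n after the password, so it would not match if the password were the last argument. A possible variation, sketched under the assumption that the switch is always spelled -password= and that values never contain a newline, uses a lookbehind so that only the value itself is matched. MaskDemo and MaskPassword are made-up names.

```csharp
using System.Text.RegularExpressions;

public static class MaskDemo
{
    // Hypothetical helper: masks the value after "-password=" with one '*'
    // per character, whether or not a newline follows it.
    public static string MaskPassword(string argv)
    {
        // (?<=-password=) asserts the prefix without consuming it, so the
        // match (and therefore the replacement) covers only the value.
        return Regex.Replace(argv, @"(?<=-password=)[^\n]*",
            m => new string('*', m.Value.Length));
    }
}
```

If leaking the password length is a concern, the lambda can be swapped for a constant such as "********".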
In the middle of a long string, I am looking for "No. 1234. "
The number (1234) in my example above can be any length whole number. It also has to match on the space at the end.
So I am looking for examples:
1) This is a test No. 42. Hello Nice People
2) I have no idea wtf No. 1234412344124. I am doing.
I have figured out a way to match on this pattern with the following regex:
(No. [\d]{1,}. )
What I cannot figure out, though, is how to do one simple thing when finding a match: Replace that last period with a darn comma!
So, with the two examples up above, I want to transform them into:
1) This is a test No. 42, Hello Nice People
2) I have no idea wtf No. 1234412344124, I am doing.
(Notice the commas now after the numbers)
How might one do this in C# and RegEx? Thank you!
EDIT:
Another way of looking at this is...
I can do this easily and have for years:
str = Replace(str, "Find this", "Replace it with this")
However, how can I do that by combining regex and the unknown portion of the string in the middle to replace the last period (not to be confused with the last character since the last character still needs to be a space)
This is a test No. 42. Hello Nice People
This is a test No. (some unknown length number). Hello Nice People
becomes
This is a test No. 42, Hello Nice People
This is a test No. (some unknown length number), Hello Nice People
(Notice the comma)
So you are essentially trying to match two adjacent groups, "\d+" and ". " then replace the second with ", ".
var r = new Regex(@"(\d+)(\. )");
var input = "This is a test No. 42. Hello Nice People";
var output = r.Replace(input, "$1, ");
Use the parentheses to match two groups; then, in the replacement, keep the first group and put ", " in place of the second.
Edit: derp, escape that period.
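The pattern above changes any "digits, period, space" sequence, not only the ones that follow "No. ". If that matters, a slightly stricter sketch can include the "No. " prefix in the captured group. CommaDemo and PeriodToComma are illustrative names, and the sketch assumes exactly one space between "No." and the number.

```csharp
using System.Text.RegularExpressions;

public static class CommaDemo
{
    // Hypothetical helper: turns "No. 1234. " into "No. 1234, ".
    public static string PeriodToComma(string input)
    {
        // Group 1 keeps "No. " plus the digits; the trailing ". " is matched
        // outside the group and replaced by ", ".
        return Regex.Replace(input, @"(No\. \d+)\. ", "$1, ");
    }
}
```

With this, "This is a test No. 42. Hello Nice People" should become "This is a test No. 42, Hello Nice People", while a sentence such as "Version 3. Next" is left untouched.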
Edit - #1:
neilh's way is much better!
OK, I know the code looks ugly. I don't know how to edit the last character of a match directly in a regex.
string[] stringhe = new string[5] {
"This is a test No. 42, Hello Nice People",
"I have no idea wtf No. 1234412344124. I am doing.",
"Very long No. 74385748957348957893458934; Hello World",
"Nope No. 48394839!!!",
"Nope"
};
Regex reg = new Regex(@"No.\s*([0-9]+)");
Match match;
int idx = 0;
StringBuilder builder;
foreach(string stringa in stringhe)
{
match = reg.Match(stringa);
if (match.Success)
{
Console.WriteLine("No. Stringa #" + idx + ": " + stringhe[idx]);
int indexEnd = match.Groups[1].Index + match.Groups[1].Length;
builder = new StringBuilder(stringa);
builder[indexEnd] = '.';
stringhe[idx] = builder.ToString();
Console.WriteLine("New String: " + stringhe[idx]);
}
++idx;
}
Console.ReadKey(true);
If you want to edit the character after the number only if it's a ',':
int indexEnd = match.Groups[1].Index + match.Groups[1].Length;
if (stringa[indexEnd] == ',')
{
builder = new StringBuilder(stringa);
builder[indexEnd] = '.';
stringhe[idx] = builder.ToString();
Console.WriteLine("New String: " + stringhe[idx]);
}
Or, better, we can edit the Regex so that it only matches when the number is followed by a comma:
No.\s*([0-9]+),
I'm not the best at Regex, but this should do what you want.
No.\s+([0-9]+)
If you expect zero or more whitespace characters between No. and {NUMBER}, this Regex should do the work:
No.\s*([0-9]+)
An example of how the C# code can look:
string[] stringhe = new string[4] {
"This is a test No. 42, Hello Nice People",
"I have no idea wtf No. 1234412344124. I am doing.",
"Very long No. 74385748957348957893458934; Hello World",
"Nope No. 48394839!!!"
};
Regex reg = new Regex(@"No.\s+([0-9]+)");
Match match;
int idx = 0;
foreach(string stringa in stringhe)
{
match = reg.Match(stringa);
if (match.Success)
{
Console.WriteLine("No. Stringa #" + idx + ": " + match.Groups[1].Value);
}
++idx;
}
Here is the code :
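private string Format(string input)
{
    // Match "No. ", the digits, and the period that follows them
    Match m = new Regex(@"No\. [0-9]+\.").Match(input);
    if (!m.Success)
        return input; // nothing to replace
    int targetIndex = m.Index + m.Length - 1; // index of the trailing period
    return input.Remove(targetIndex, 1).Insert(targetIndex, ",");
}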
It is a very basic question, but I am not sure why it is not working. I have code where 'And' can be written in any of the ways 'And', 'and', etc., and I want to replace it with ','.
I tried this:
and.Replace("and".ToUpper(),",");
But this is not working. Is there any other way to do this, or a way to make it work?
You should check out the Regex class
http://msdn.microsoft.com/en-us/library/xwewhkd1.aspx
using System.Text.RegularExpressions;
Regex re = new Regex(@"\band\b", RegexOptions.IgnoreCase);
string and = "This is my input string with and string in between.";
and = re.Replace(and, ",");
words = words.Replace("AND", ",")
.Replace("and", ",");
Or use RegEx.
The Replace method returns a new string containing the replacement. It does not modify the original string. You should try something along the lines of
and = and.Replace("and",",");
You can do this for all variations of "and" you may encounter, or as other answers have suggested, you could use a regex.
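To make both points concrete (the result has to be assigned, and a regex can cover every casing at once), here is a small sketch. AndDemo and ReplaceAnd are names invented for this example, and the \b anchors are an assumption that only the standalone word should be replaced.

```csharp
using System.Text.RegularExpressions;

public static class AndDemo
{
    // Hypothetical helper: replaces the whole word "and", in any casing,
    // with a comma and returns the new string.
    public static string ReplaceAnd(string input)
    {
        // Strings are immutable, so the input is never changed in place;
        // the caller has to use the returned value.
        return Regex.Replace(input, @"\band\b", ",", RegexOptions.IgnoreCase);
    }
}
```

A caller would write something like text = AndDemo.ReplaceAnd(text); calling the method without using its result leaves text as it was.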
You should take care when some words contain "and", for example "this is sand and sea". The word "sand" must not be affected by the replacement.
string and = "this is sand and sea";
// Here you should probably add the delimiters that may occur near your "and".
// This substitution is not universal and will miss something like " and, ".
string[] delimiters = new string[] { " " };
// It results in: "this is sand , sea"
and = string.Join(" ",
and.Split(delimiters,
StringSplitOptions.RemoveEmptyEntries)
.Select(s => s.Length == 3 && s.ToUpper().Equals("AND")
? ","
: s));
I would also add something like this:
and = and.Replace(" , ", ", ");
So, the output:
this is sand, sea
try this way to use the static Regex.Replace() method:
and = System.Text.RegularExpressions.Regex.Replace(and,"(?i)and",",");
The "(?i)" causes the following text search to be case-insensitive.
http://msdn.microsoft.com/en-us/library/yd1hzczs.aspx
http://msdn.microsoft.com/en-us/library/xwewhkd1(v=vs.100).aspx
How do I change
XXX#YYY.ZZZ into XXX_YYY_ZZZ?
One way I know is to use the string.Replace(char, char) method,
but I want to replace both "#" and ".", and that method replaces just one character.
One more case: what if I have XX.X#YYY.ZZZ?
I still want the output to look like XX.X_YYY_ZZZ.
Is this possible? Any suggestions are welcome, thanks.
So, if I'm understanding correctly, you want to replace # with _, and . with _, but only if . comes after #? If there is a guaranteed # (assuming you're dealing with e-mail addresses?):
string e = "XX.X#YYY.ZZZ";
e = e.Substring(0, e.IndexOf('#')) + "_" + e.Substring(e.IndexOf('#')+1).Replace('.', '_');
Here's a complete regex solution that covers both your cases. The key to your second case is to match dots after the # symbol by using a positive look-behind.
string[] inputs = { "XXX#YYY.ZZZ", "XX.X#YYY.ZZZ" };
string pattern = @"#|(?<=#.*?)\.";
foreach (var input in inputs)
{
string result = Regex.Replace(input, pattern, "_");
Console.WriteLine("Original: " + input);
Console.WriteLine("Modified: " + result);
Console.WriteLine();
}
That said, this is simple enough to accomplish with a couple of string Replace calls. Efficiency is something you will need to test, depending on the text size and the number of replacements the code will make.
You can use the Regex.Replace method:
http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.replace(v=VS.90).aspx
You can use the following extension method to do your replacement without creating too many temporary strings (as occurs with Substring and Replace) or incurring regex overhead. It skips to the # symbol, and then iterates through the remaining characters to perform the replacement.
public static string CustomReplace(this string s)
{
var sb = new StringBuilder(s);
for (int i = Math.Max(0, s.IndexOf('#')); i < sb.Length; i++)
if (sb[i] == '#' || sb[i] == '.')
sb[i] = '_';
return sb.ToString();
}
you can chain replace
var newstring = "XX.X#YYY.ZZZ".Replace("#","_").Replace(".","_");
Create an array with characters you want to have replaced, loop through array and do the replace based off the index.
Assuming the data format is like XX.X#YYY.ZZZ, here is another alternative using String.Split(char separator):
string[] tmp = "XX.X#YYY.ZZZ".Split('#');
string newstr = tmp[0] + "_" + tmp[1].Replace(".", "_");