Regex function not working for one illegal character - c#

I have a problem with illegal characters in file names under Windows.
I have the following function, which should replace any illegal character with an underscore.
But, for some reason, when my string to be replaced is something like "ABC_test\/:*?"<>|_Jan2016_ABC", my function does not replace the backslash character and the final string is "ABC_test\_________Jan2016_ABC".
Could you please show me what I am doing wrong? I had expected that no illegal characters would remain after my function was applied.
My function is:
public static String ReplaceIllegalPathCharacters(String path, String replacement = "_")
{
string pattern = "[\\~#%&*{}//:<>?|\"-]";
Regex regEx = new Regex(pattern);
string final = Regex.Replace(regEx.Replace(path, replacement), @"\s+", " ");
return final;
}
Regards,

You need to double-escape your backslashes - once for C#, and once for RegEx:
string pattern = "[\\\\~#%&*{}//:<>?|\"-]";
Code I used to test:
void Main()
{
var stringToReplace = "ABC_test\\/:*?\"<>|_Jan2016_ABC";
string pattern = "[\\\\~#%&*{}//:<>?|\"-]";
Regex regEx = new Regex(pattern);
var final = regEx.Replace(stringToReplace, "_");
Console.WriteLine(final);
}
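A verbatim string literal (prefixed with @) removes the C#-level escaping, so only the regex-level escape remains. The following is a minimal sketch rather than the original poster's code: the names FileNameCleaner and ReplaceIllegal are made up for illustration, and the duplicated slash has been dropped from the character class, which does not change what it matches.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FileNameCleaner // hypothetical name, for illustration only
{
    // In a verbatim string, \\ reaches the regex engine as an escaped backslash,
    // and "" stands for a single double-quote character.
    private static readonly Regex Illegal = new Regex(@"[\\~#%&*{}/:<>?|""-]");

    public static string ReplaceIllegal(string name, string replacement = "_")
    {
        return Illegal.Replace(name, replacement);
    }

    public static void Main()
    {
        string input = "ABC_test\\/:*?\"<>|_Jan2016_ABC";
        // Every character from the class, including the backslash, becomes "_".
        Console.WriteLine(ReplaceIllegal(input));
    }
}
```

Both spellings, "[\\\\...]" in a regular string and @"[\\...]" in a verbatim string, hand the same pattern to the regex engine; the verbatim form is usually easier to read.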

Related

How to use regex to extract a string with multiple curly braces?

I have this sample string
`{1:}{2:}{3:}{4:\r\n-}{5:}`
and I want to extract out only {4:\r\n-}
This is my code but it is not working.
var str = "{1:}{2:}{3:}{4:\r\n-}{5:}";
var regex = new Regex(@"{4:*.*-}");
var match = regex.Match(str);
You need to escape the special regex characters (in this case the opening and closing braces and the backslashes) in the search string. This would capture just that part:
var regex = new Regex("\\{4:\\r\\n-\\}");
... or, if you wanted anything up to and including the hyphen before the closing brace (which is what it looks like you might be trying to do):
var regex = new Regex("\\{4:[^-]*-\\}");
You just need to escape the \r and \n sequences in your regular expression. The Regex.Escape() method does this for you: it returns the pattern with regex metacharacters (including backslashes and the opening brace) converted to their escaped form.
Working example: https://dotnetfiddle.net/6GLZrl
using System;
using System.Text.RegularExpressions;
public class Program
{
public static void Main()
{
string str = @"{1:}{2:}{3:}{4:\r\n-}{5:}";
string regex = @"{4:\r\n-}"; //Original regex
Match m = Regex.Match(str, Regex.Escape(regex));
if (m.Success)
{
Console.WriteLine("Found '{0}' at position {1}.", m.Value, m.Index);
}
else
{
Console.WriteLine("No match found");
}
}
}
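If the content of the group is not known in advance, a negated character class can stand in for it. This is a sketch under the assumption that the group body never contains a closing brace; the names BraceGroupDemo and ExtractGroup are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BraceGroupDemo // hypothetical name, for illustration only
{
    // Returns the "{n:...}" group, or null when the input has no such group.
    public static string ExtractGroup(string input, int n)
    {
        // \{ and \} are literal braces; [^}]* matches everything up to the
        // next closing brace, line breaks included.
        Match m = Regex.Match(input, @"\{" + n + @":[^}]*\}");
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        string str = "{1:}{2:}{3:}{4:\r\n-}{5:}";
        Console.WriteLine(ExtractGroup(str, 4) != null ? "Found group 4" : "No match found");
    }
}
```

Because [^}] also matches carriage returns and line feeds, the same pattern works whether the string holds real line breaks or the literal characters \r\n.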

How can I replace "XX,XXX" with "XX XXX"?

I need to replace string like "XX,XXX" with "XX XXX". The string "XX,XXX" is in another string, e.g:
"-1299-5,"XXX,XXXX",trft,4,0,10800"
The string is fetched from a text file. I want to split the string by ",". But the comma in the substring led to the wrong result.
The X represents a character. I think a regex can help; can anyone give me the right expression?
This expression,
(.*"[^,]*),([^,]*".*)
with a replacement of $1 $2 might work.
Example
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string pattern = @"(.*""[^,]*),([^,]*"".*)";
string substitution = "$1 $2"; // .NET replacement strings use $n, not \n
string input = @"-1299-5,""XXX,XXXX"",trft,4,0,10800";
RegexOptions options = RegexOptions.Multiline;
Regex regex = new Regex(pattern, options);
string result = regex.Replace(input, substitution);
Console.WriteLine(result);
}
}
Simply use String.Replace to replace the character in your string.
var test = "XXX,XXXX";
var filtered = test.Replace(',', ' ');
Console.WriteLine(filtered);
Output :
XXX XXXX
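Since the goal is to split the whole line on commas, another option is to rewrite only the quoted fields before splitting. The sketch below assumes quoted fields never contain an escaped quote; the names QuotedCommaDemo and ReplaceCommasInQuotes are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotedCommaDemo // hypothetical name, for illustration only
{
    public static string ReplaceCommasInQuotes(string line)
    {
        // "[^"]*" matches one quoted field; the evaluator rewrites only that
        // field, so commas outside quotes are left alone.
        return Regex.Replace(line, "\"[^\"]*\"", m => m.Value.Replace(',', ' '));
    }

    public static void Main()
    {
        string input = "-1299-5,\"XXX,XXXX\",trft,4,0,10800";
        string cleaned = ReplaceCommasInQuotes(input);
        Console.WriteLine(cleaned);
        Console.WriteLine(cleaned.Split(',').Length); // number of fields after splitting
    }
}
```

Unlike the single-comma pattern above, this handles any number of commas inside a quoted field and any number of quoted fields per line.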

How to remove multiple first characters using regex?

I have the string: string A = "... :-ggw..-:p";
Using the regex: string B = Regex.Replace(A, @"^\.+|:|-|", "").Trim();
My output is ggw..p.
What I want is ggw..-:p.
Thanks
You may use a character class with your symbols and whitespace shorthand character class:
string B = Regex.Replace(A, @"^[.:\s-]+", "");
See the regex demo
Details
^ - start of string
[.:\s-]+ - one or more characters defined in the character class.
Note that there is no need to escape . inside [...]. The - does not have to be escaped since it is at the end of the character class.
A regex isn't necessary if you only want to trim specific characters from the start of a string. System.String.TrimStart() will do the job:
var source = "... :-ggw..-:p";
var charsToTrim = " .:-".ToCharArray();
var result = source.TrimStart(charsToTrim);
Console.WriteLine(result);
// Result is 'ggw..-:p'

Replacing all occurrences of alphanumeric characters in a string

I'm trying to replace all alphanumeric characters in my string with the character "-" using regex. So if the input is "Dune" I should get "----". Currently, though, I'm getting just a single "-":
string s = "^[a-zA-Z0-9]*$";
Regex rgx = new Regex(s);
string s = "dune";
string result = rgx.Replace(s, "-");
Console.WriteLine(result);
Console.Read();
Right now I know it's looking for the string "dune" rather than the letters "d" "u" "n" "e", but I can't find another class that would work.
Your regex matches the whole string in a single match, so only one replacement happens. Remove the * and the start/end-of-string anchors. It should be
string s = "[a-zA-Z0-9]";
This matches one character at a time anywhere in the string, rather than the whole string at once. You could also look at the shorthand \w, which matches any word character (letters and digits, but also the underscore):
string s = @"\w";
Try
string pattern = "[a-zA-Z0-9]";
Regex rgx = new Regex(pattern);
string s = "dune";
string result = rgx.Replace(s, "-");
Console.WriteLine(result);
Console.Read();
Why do you have one String s for your regular expression and another String s for your string? I would change this to eliminate confusion/error here.
Also, to replace each alphanumeric character, you need to remove the start-of-string and end-of-string anchors (^ and $) and the * quantifier, which means "zero or more times, matching as much as possible".
Regex rgx = new Regex("[a-zA-Z0-9]");
string s = "dune";
string result = rgx.Replace(s, "-");
Console.WriteLine(result); //=> "----"
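The explicit class and the \w shorthand mentioned above are not interchangeable, because \w also matches the underscore (and, in .NET, non-ASCII letters and digits by default). A small sketch of the difference, with hypothetical names MaskDemo, MaskAscii and MaskWord:

```csharp
using System;
using System.Text.RegularExpressions;

public static class MaskDemo // hypothetical name, for illustration only
{
    // Replaces only ASCII letters and digits.
    public static string MaskAscii(string s) => Regex.Replace(s, "[a-zA-Z0-9]", "-");

    // Replaces every word character, which includes the underscore.
    public static string MaskWord(string s) => Regex.Replace(s, @"\w", "-");

    public static void Main()
    {
        Console.WriteLine(MaskAscii("dune_2")); // ----_-
        Console.WriteLine(MaskWord("dune_2"));  // ------
    }
}
```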

Replace any string between quotes

Problem:
Cannot find a consistent way to replace a random string between quotes with a specific string I want. Any help would be greatly appreciated.
Example:
String str1 = "test=\"-1\"";
should become
String str2 = "test=\"31\"";
but also work for
String str3 = "test=\"foobar\"";
basically I want to turn this
String str4 = "test=\"antyhingCanGoHere\"";
into this
String str4 = "test=\"31\"";
Have tried:
Case insensitive Regex without using RegexOptions enumeration
How do you do case-insensitive string replacement using regular expressions?
Replace any character in between AnyText: and <usernameredacted@example.com> with an empty string using Regex?
Replace string in between occurrences
Replace a String between two Strings
Current code:
Regex RemoveName = new Regex("(?VARIABLE=\").*(?=\")", RegexOptions.IgnoreCase);
String convertSeccons = RemoveName.Replace(ruleFixed, "31");
Returns error:
System.ArgumentException was caught
Message=parsing "(?VARIABLE=").*(?=")" - Unrecognized grouping construct.
Source=System
StackTrace:
at System.Text.RegularExpressions.RegexParser.ScanGroupOpen()
at System.Text.RegularExpressions.RegexParser.ScanRegex()
at System.Text.RegularExpressions.RegexParser.Parse(String re, RegexOptions op)
at System.Text.RegularExpressions.Regex..ctor(String pattern, RegexOptions options, Boolean useCache)
at System.Text.RegularExpressions.Regex..ctor(String pattern, RegexOptions options)
at application.application.insertGroupID(String rule) in C:\Users\winserv8\Documents\Visual Studio 2010\Projects\application\application\MainFormLauncher.cs:line 298
at application.application.xmlqueryDB(String xmlSaveLocation, TextWriter tw, String ruleName) in C:\Users\winserv8\Documents\Visual Studio 2010\Projects\application\application\MainFormLauncher.cs:line 250
InnerException:
Found the answer:
string s = Regex.Replace(ruleFixed, "VARIABLE=\"(.*)\"", "VARIABLE=\"31\"");
ruleFixed = s;
I found this code sample at "Replace any character in between AnyText: and <usernameredacted@example.com> with an empty string using Regex?", which is one of the links I posted above. I had skipped over this syntax because I thought it wouldn't handle what I needed.
var str1 = "test=\"foobar\"";
var str2 = str1.Substring(0, str1.IndexOf("\"") + 1) + "31\"";
If needed add check for IndexOf != -1
I don't know if I understood you correctly, but if you want to replace all characters inside the quotes, why not use a simple regular expression:
String str = "test=\"-\"1\"";
Regex regExpr = new Regex("\".*\"", RegexOptions.IgnoreCase);
String result = regExpr.Replace(str , "\"31\"");
Console.WriteLine(result);
prints:
test="31"
Note: You can take advantage of plain old XAttribute
String ruleFixed = "test=\"-\"1\"";
var splited = ruleFixed.Split('=');
var attribute = new XAttribute(splited[0], splited[1]);
attribute.Value = "31";
Console.WriteLine(attribute);//prints test="31"
var parts = given.Split('=');
return string.Format("{0}=\"{1}\"", parts[0], replacement);
In the case that your string has other things in it besides just the key/value pair of key="value", then you need to make the value-match part not match quote marks, or it will match all the way from the first value to the last quote mark in the string.
If that is true, then try this:
Regex.Replace(ruleFixed, "(?<=VARIABLE\\s*=\\s*\")[^\"]*(?=\")", "31");
This uses a positive look-behind to match the VARIABLE=" part (with optional white space around the equals sign, so VARIABLE = " works as well) and a positive look-ahead to match the closing ", without including either in the final match, so you replace only the value you want.
If not, then your solution will work, but is not optimal because you have to repeat the value and the quote marks in the replace text.
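The difference only shows up once the string holds more than one quoted value. The sketch below contrasts the two patterns on a made-up input with a second key/value pair; the names QuotedValueDemo, ReplaceGreedy and ReplaceValue are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotedValueDemo // hypothetical name, for illustration only
{
    // Greedy version: .* runs to the LAST quote in the string,
    // so anything after the first value is swallowed by the match.
    public static string ReplaceGreedy(string rule) =>
        Regex.Replace(rule, "VARIABLE=\"(.*)\"", "VARIABLE=\"31\"");

    // Lookaround version: [^"]* stops at the first closing quote, and the
    // lookarounds keep VARIABLE=" and the closing quote out of the match.
    public static string ReplaceValue(string rule, string value) =>
        Regex.Replace(rule, "(?<=VARIABLE\\s*=\\s*\")[^\"]*(?=\")", value);

    public static void Main()
    {
        string rule = "VARIABLE=\"old\" other=\"keep\""; // assumed sample input
        Console.WriteLine(ReplaceGreedy(rule));      // VARIABLE="31"
        Console.WriteLine(ReplaceValue(rule, "31")); // VARIABLE="31" other="keep"
    }
}
```

Note that the replacement argument is itself a substitution pattern, so a value containing $ would need to be escaped before being passed in.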
Assuming that the string within the quotes does not contain quotes itself, you can use this general pattern in order to find a position between a prefix and a suffix:
(?<=prefix)find(?=suffix)
In your case
(?<=\w+=").*?(?=")
Here we are using the prefix \w+=" where \w+ denotes word characters (the variable) and =" are the equal sign and the quote.
We want to find anything .*? until we encounter the next quote.
The suffix is simply the quote ".
string result = Regex.Replace(input, "(?<=\\w+=\").*?(?=\")", replacement);
Try this:
[^"\r\n]*(?:""[\r\n]*)*
var pattern = "\"(.*)?\"";
var regex = new Regex(pattern, RegexOptions.IgnoreCase);
var replacement = regex.Replace("test=\"hereissomething\"", "\"31\"");
String str1 = "test=\"-1\"";
string[] parts = str1.Split(new[] {'"'}, 3);
string str2 = parts.Length == 3 ? string.Join("\"", parts.First(), "31", parts.Last()) : str1;
String str1 = "test=\"-1\"";
string res = Regex.Replace(str1, "(^+\").+(\"+)", "$1" + "31" + "$2");
I'm pretty bad at regex, but you could write a simple extension method using string functions to do this.
public static class StringExtensions
{
public static string ReplaceBetweenQuotes(this string str, string replacement)
{
if (str.Count(c => c.Equals('"')) == 2)
{
int start = str.IndexOf('"') + 1;
str = str.Replace(str.Substring(start, str.LastIndexOf('"') - start), replacement);
}
return str;
}
}
Usage:
String str3 = "test=\"foobar\"";
str3 = str3.ReplaceBetweenQuotes("31");
returns: "test=\"31\""
