Slightly similar to this question, I want to replace argv contents:
string argv = "-help=none\n-URL=(default)\n-password=look\n-uname=Khanna\n-p=100";
to this:
"-help=none\n-URL=(default)\n-password=********\n-uname=Khanna\n-p=100"
I have tried very basic string find and search operations (using IndexOf, Substring, etc.). I am looking for a more elegant solution to replace this part of the string:
-password=AnyPassword
to:
-password=*******
The rest of the string should stay intact. I am wondering whether String.Replace or Regex.Replace could help.
What I've tried (without much error checking):
// The key uses a single dash, matching the sample argv above.
var pwd_index = argv.IndexOf("-password=");
string converted;
if (pwd_index >= 0)
{
    var leftPart = argv.Substring(0, pwd_index);
    var pwdStr = argv.Substring(pwd_index);
    var rightPart = pwdStr.Substring(pwdStr.IndexOf("\n") + 1);
    converted = leftPart + "-password=********\n" + rightPart;
}
else
    converted = argv;
Console.WriteLine(converted);
Solution
Similar to Rubens Farias' solution but a little bit more elegant:
string argv = "-help=none\n-URL=(default)\n-password=\n-uname=Khanna\n-p=100";
string result = Regex.Replace(argv, @"(password=)[^\n]*", "$1********");
It matches password= literally, stores it in capture group $1 and then keeps matching until a \n is reached.
This yields a constant number of *'s, though. But revealing how many characters a password has might already convey too much information to hackers anyway.
Working example: https://dotnetfiddle.net/xOFCyG
Regular expression breakdown
( // Store the following match in capture group $1.
password= // Match "password=" literally.
)
[ // Match one from a set of characters.
^ // Negate a set of characters (i.e., match anything not
// contained in the following set).
\n // The character set: consists only of the new line character.
]
* // Match the preceding element (the character set) 0 to n times.
This code replaces the password value by several "*" characters:
string argv = "-help=none\n-URL=(default)\n-password=look\n-uname=Khanna\n-p=100";
string result = Regex.Replace(argv, @"(password=)([\s\S]*?\n)",
    match => match.Groups[1].Value + new String('*', match.Groups[2].Value.Length - 1) + "\n");
You can also remove the new String() part and replace it with a string constant.
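The two answers above can be combined into one helper. The following is only a sketch under some assumptions: the class and method names (PasswordMasker, Mask) and the keepLength parameter are hypothetical, and a lookbehind is swapped in for the capture group so that a password at the very end of the string (with no trailing \n) is masked as well.

```csharp
using System.Text.RegularExpressions;

public static class PasswordMasker
{
    // Hypothetical helper, not taken from the answers above.
    // The lookbehind (?<=-password=) anchors the match right after the key,
    // so only the value itself (everything up to the next \n) is matched.
    private static readonly Regex PasswordValue =
        new Regex(@"(?<=-password=)[^\n]*");

    // keepLength = false: constant-width mask (hides the password length).
    // keepLength = true:  one '*' per password character.
    public static string Mask(string argv, bool keepLength = false)
    {
        return PasswordValue.Replace(argv, m =>
            keepLength ? new string('*', m.Length) : "********");
    }
}
```

Usage: PasswordMasker.Mask(argv) gives the constant-width mask, and PasswordMasker.Mask(argv, keepLength: true) gives one star per character.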
Related
My requirement
If the string contains a single slash (/ or \), it should be replaced with a double slash.
Note :- string is randomly generated so, I have no control.
e.g. I have string
string str = @"*?i//y\^Pk#t9`n2";
When I tried as
str = str.Replace(@"\", @"\\").Replace(@"/", @"//");
it replaced // with ////, but I need to replace only a single slash (\) with a double slash (\\).
Above code actual result is
*?i////y\^Pk#t9`n2
expected result is
*?i//y\\^Pk#t9`n2
Note :- If the string contains a double slash in sequence, like "//" or "\\", then there is no need to modify the string, but a single slash (/ or \) needs to be replaced with a double slash.
I tried to find another approach and found the following existing Stack Overflow questions and answers:
Replace single backslash with double backslash
Replace "\\" with "\" in a string in C#
How to change backslash to double backslash?
Question :-
How do I check whether a string contains a single slash, and how do I replace it?
What best practices should be followed when doing string manipulation like this?
Edit :-
I have a randomly generated string that comes from the user, like:
string str = @"*?i//y\^Pk#t9`n2";
Sometimes that string contains a single slash as above (\). If we consider the above string without the verbatim prefix (@), it is not a valid string in C#; it gives a compile-time error. To make the above string valid I need to replace "\" with "\\".
How can I achieve this?
Please try this. First I replaced all double slashes with single slashes, and then vice versa:
var str = @"*?i//y\^Pk#t9`n2";
var tempStr = str.Replace(@"\\", @"\").Replace(@"//", @"/");
var result = tempStr.Replace(@"\", @"\\").Replace(@"/", @"//");
I had to do two Regex.Replace calls and use lookarounds to achieve this. The final solution was
Regex.Replace(Regex.Replace(str, @"(?<!\/)\/(?!\/)", @"//"), @"(?<!\\)\\(?!\\)", @"\\")
If you've never dealt with regex before, it can be a beast. Essentially I am looking for all backslashes and forward slashes (escaped as \\ and \/), and once I match one, I use a negative lookbehind and a negative lookahead so that it does not match if the same character sits directly before or after it.
Negative Look Behind:
(?<!\/)
Negative Look Ahead:
(?!\/)
I then repeat this twice, once for forward slashes and once for backslashes.
The best solution might be to roll your own algorithm. Step through the string character by character looking for a slash, and when you find one, check the next and the previous character. If either of them is the same slash, do not insert a duplicate, because that means the slash is not alone.
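The hand-rolled approach described above could look roughly like the following. This is a minimal sketch, not a definitive implementation: the names SlashDoubler and DoubleLoneSlashes are hypothetical, and it assumes that any run of two or more identical slashes should be left untouched.

```csharp
using System.Text;

public static class SlashDoubler
{
    // Hypothetical helper illustrating the character-by-character approach.
    public static string DoubleLoneSlashes(string input)
    {
        // Worst case: every character is a lone slash and gets doubled.
        var sb = new StringBuilder(input.Length * 2);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            sb.Append(c);
            if (c != '/' && c != '\\') continue;

            // A slash is "alone" when neither neighbour is the same slash.
            bool sameBefore = i > 0 && input[i - 1] == c;
            bool sameAfter = i + 1 < input.Length && input[i + 1] == c;
            if (!sameBefore && !sameAfter) sb.Append(c);
        }
        return sb.ToString();
    }
}
```

Called on the string from the question, SlashDoubler.DoubleLoneSlashes(str) leaves the existing // alone and doubles only the lone backslash.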
This replaces all of the individual occurrences of a character and also fills up an odd number of occurrences:
// Requires "using System;" and "using System.Buffers;" and must live in a static class.
public static string ReplaceSingle(this string s, char needle)
{
    var valueSpan = s.AsSpan();
    var length = valueSpan.Length * 2; // worst case: every character is a lone needle
    char[]? resultArray = null;
    Span<char> resultSpan = length <= 256
        ? stackalloc char[length]
        : (resultArray = ArrayPool<char>.Shared.Rent(length));
    var written = 0;
    for (int index = 0; index < valueSpan.Length; index++)
    {
        var value = valueSpan[index];
        resultSpan[written++] = value;
        if (value == needle)
        {
            // Always emit a pair. If the next character is also the needle,
            // consume it so that an existing pair is not doubled again.
            resultSpan[written++] = needle;
            if (index + 1 < valueSpan.Length && valueSpan[index + 1] == needle) index++;
        }
    }
    var result = new string(resultSpan[..written]);
    resultSpan.Clear();
    if (resultArray is not null) ArrayPool<char>.Shared.Return(resultArray);
    return result;
}
For instance, if you have / it will turn to //, but // will remain. However /// will turn to //// and so on.
It also uses ArrayPool and stackalloc, which are aimed at better performance.
Usage:
string value = "This/ is a //Test ///!";
string result = value.ReplaceSingle('/');
I want to replace a string if it is a part of another string from both ends.
Say for example a string +35343+3566. I want to replace +35 with 0 only if it is surrounded with characters from both sides. So desired outcome would be +35343066.
Normally I'd use line.Replace("+35", "0") and perhaps if-else to meet a condition
string a = "+35343+3566";
string b = a.Replace("+35", "0");
I would want b = +35343066 and not b = 0343066.
You can use regex for this. For example:
var replaced = Regex.Replace("+35343+3566", "(?<=.)(\\+35)(?=.)", "0");
// replaced will contain +35343066
So what this pattern is saying is that +35 (\\+35) must have one character behind (?<=.) and one character ahead (?=.)
You can do this with a Regular Expression, as follows:
string a = "+35343+3566";
var regex = new Regex(@"(.)\+35(.)"); // look for "+35" between any 2 characters, while remembering the characters that were found in ${1} and ${2}
string b = regex.Replace(a, "${1}0${2}"); // replace all occurrences with "0" surrounded by both characters that were found
See Fiddle: https://dotnetfiddle.net/OdCKsy
Or slightly simpler, if it turns out that only the prefix character matters:
string a = "+35343+3566";
var regex = new Regex(@"(.)\+35"); // look for a character followed by "+35", while remembering the character that was found in ${1}
string b = regex.Replace(a, "${1}0"); // replace all occurrences with the character that was found followed by a 0
See Fiddle: https://dotnetfiddle.net/9jEHMN
I'm doing a massive upload of information from a .csv file and I need to replace this non-ASCII character "�" with a normal space, " ".
The character "�" corresponds to "\uFFFD" in C, C++, and Java, and it seems to be called REPLACEMENT CHARACTER. There are others, such as the space-like characters U+FEFF, U+205F, U+200B, U+180E, and U+202F, in the official C# documentation.
I'm trying do the replace this way:
public string Errors = "";
public void test()
{
    string validCharacters = "^[0-9A-Za-z().:%-/ ]+$";
    string textFromCsvCell = "This is my text from csv file"; // Not all of the spaces are normal spaces " "
    string cleaned = textFromCsvCell.Replace("\uFFFD", " ");
    if (Regex.IsMatch(cleaned, validCharacters))
    {
        // All code for insert
    }
    else
    {
        Errors = cleaned;
        // print Errors
    }
}
The test method shows me this text:
"This is my�text from csv file"
I tried some other solutions too:
Trying solution 1: Using Trim
Regex.Replace(value.Trim(), @"[^\S\r\n]+", " ");
Try solution 2: Using Replace
System.Text.RegularExpressions.Regex.Replace(str, @"\s+", " ");
Try solution 3: Using Trim
String.Trim(new char[]{'\uFEFF', '\u200B'});
Try solution 4: Add [\S\r\n] to validCharacters
string validCharacters = "^[\S\r\n0-9A-Za-z().:%-/ ]+$";
Nothing works.
How can I replace it?
Sources:
Unicode Character 'REPLACEMENT CHARACTER' (U+FFFD)
Trying to replace all white space with a single space
Strip the byte order mark from string in C#
Remove extra whitespaces, but keep new lines using a regular expression in C#
EDITED
This is the original string:
"SYSTEM OF MONITORING CONTINUES OF GLUCOSE"
in 0x... notation
SYSTEM OF0xA0MONITORING CONTINUES OF GLUCOSE
Solution
Go to the Unicode code converter. Look at the conversions and do the replace.
In my case, I do a simple replace:
string value = "SYSTEM OF MONITORING CONTINUES OF GLUCOSE";
//value contains non-breaking whitespace
//value is "SYSTEM OF�MONITORING CONTINUES OF GLUCOSE"
string cleaned = "";
string pattern = @"[^\u0000-\u007F]+";
string replacement = " ";
Regex rgx = new Regex(pattern);
cleaned = rgx.Replace(value, replacement);
if (Regex.IsMatch(cleaned, "^[0-9A-Za-z().:<>%-/ ]+$")) {
    //all code for insert
} else {
    //Error messages
}
This expression represents all possible whitespace characters: space, tab, form feed, line feed and carriage return, plus the Unicode space characters:
[ \f\n\r\t\v\u00a0\u1680\u180e\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200a\u2028\u2029\u202f\u205f\u3000]
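As an illustration, the character class above can be used directly in Regex.Replace to turn every unusual space into a plain one. This is a sketch under stated assumptions: the names SpaceNormalizer and ToPlainSpaces are hypothetical, the plain space is left out of the class because replacing it with itself changes nothing, and \u2000 through \u200a is written as a range.

```csharp
using System.Text.RegularExpressions;

public static class SpaceNormalizer
{
    // Character class taken from the list above (minus the plain space).
    // Each match is exactly one whitespace character.
    private static readonly Regex AnySpace = new Regex(
        @"[\f\n\r\t\v\u00a0\u1680\u180e\u2000-\u200a\u2028\u2029\u202f\u205f\u3000]");

    // Replaces every matched character with a normal space (U+0020).
    // Note that this also flattens line breaks and tabs.
    public static string ToPlainSpaces(string value)
    {
        return AnySpace.Replace(value, " ");
    }
}
```

If line breaks have to survive, \n and \r would need to be dropped from the class.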
References
Regular expressions (MDN)
Using String.Replace:
Use a simple String.Replace().
I've assumed that the only characters you want to remove are the ones you've mentioned in the question: � and you want to replace them by a normal space.
string text = "imp�ortant";
string cleaned = text.Replace('\u00ef', ' ')
.Replace('\u00bf', ' ')
.Replace('\u00bd', ' ');
// Returns 'imp ortant'
Or using Regex.Replace:
string cleaned = Regex.Replace(text, "[\u00ef\u00bf\u00bd]", " ");
// Returns 'imp ortant'
Try it out: Dotnet Fiddle
Define a range of ASCII characters, and replace anything that is not within that range.
We want to find only the non-ASCII characters, so we match on anything outside the ASCII range and replace it.
Regex.Replace("This is my te\uFFFDxt from csv file", @"[^\u0000-\u007F]+", " ")
The above pattern matches anything that is not (^) in the set ([ ]) of the range \u0000-\u007F, which covers the ASCII characters (everything past \u007F is non-ASCII), and replaces it with a space.
Result
This is my te xt from csv file
You can adjust the range provided \u0000-\u007F as needed to expand the range of allowed characters to suit your needs.
If you just want ASCII then try the following:
var ascii = new ASCIIEncoding();
byte[] encodedBytes = ascii.GetBytes(text);
var cleaned = ascii.GetString(encodedBytes).Replace("?", " ");
I have no experience using regular expressions, and although I should spend some time training in them, I have a need for a simple one.
I want to find a match of P*.txt in a given string (meaning anything that starts with a P, followed by anything, and ending in ".txt").
eg:
string myString = "P671221.txt";
Regex reg = new Regex("P*.txt"); //<--- what goes here?
if (reg.IsMatch(myString))
{
    Console.WriteLine("Match!");
}
This example doesn't work because it will return a match for ".txt" or "x.txt" etc. How do I do this?
myString.StartsWith("P") && myString.EndsWith(".txt")
EDIT: Removed my regex
Updated:
string start + (p) + any characters + .txt + string end
^(?i:p).*\.txt$
A more precise alternative would be:
string start + (p) + [specific characters] + .txt + string end
( currently specified are: "a-z", "0-9", space, & underscore )
^(?i:p)(?i:[a-z0-9 _])*\.txt$
Live Demo
Original Solution
( quotes were included, as I overlooked that quotes are part of the code but not
the string )
preceding quotes + (p) + any characters + .txt + following quotes
(?<=")(?i:p).*\.txt(?=")
Live Demo
P[\d]+\.txt will work. If you have a fixed number of digits, you can use P[\d]{6}\.txt; just replace the 6 with the number of digits you need.
If the value between the leading P and the .txt extension can be alphanumeric, use P[\w]+\.txt.
string myString = "P671221.txt";
Regex reg = new Regex("P(.*?)\\.txt"); //--> if anything goes after P
if (reg.IsMatch(myString))
Console.WriteLine("Match!");
This should meet the requirements that you have presented.
[Pp].*.(?:txt)+$
The best option to get files that start with P & end with .txt with regex is:
^P\w+\.txt$
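To see how the anchors and the escaped dot change the result, here is a small sketch that wraps the anchored ^P.*\.txt$ variant from the answers above. The names FileNamePattern and IsMatch are hypothetical; the pattern is case-sensitive and, unlike ^P\w+\.txt$, accepts any characters (including none) between the P and the extension.

```csharp
using System.Text.RegularExpressions;

public static class FileNamePattern
{
    // ^ and $ anchor the pattern to the whole string, and \. matches a
    // literal dot instead of "any character".
    private static readonly Regex PTxt = new Regex(@"^P.*\.txt$");

    public static bool IsMatch(string name)
    {
        return PTxt.IsMatch(name);
    }
}
```

With this pattern "P671221.txt" matches, while ".txt" and "x.txt" (the false positives from the question) do not.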
I have to extract the 3 usable fields from a string. There is no common delimiter; there can be both blank spaces and tabs.
First, what I am doing is replacing all double blanks and tabs with '**'.
Given String :
cont = Gallipelle 04/04/2012 16.03.03 5678
I am using:
cont.Replace(" ", "**").Replace(" ", "**").Replace(" ", "**").Replace("**", "").Trim()
The answer becomes:
****** Gallipelle******04/04/2012 16.03.03************************ 5678*****
Is the approach correct? How do I extract the fields from here? I just need all the extracted values as strings.
Just use String.Split:
var fields = cont.Split(new[] { " ", "\t" },
StringSplitOptions.RemoveEmptyEntries);
Adding StringSplitOptions.RemoveEmptyEntries makes sure that if there are multiple consecutive tabs and/or spaces they will "count as one" when extracting the results.
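One thing to watch with a plain whitespace split is that the date and the time come back as two separate tokens. The following sketch rejoins them into one field; the names FieldExtractor and Extract are hypothetical, and it assumes the line always consists of a name without spaces, a date, a time and a number.

```csharp
using System;

public static class FieldExtractor
{
    // Hypothetical helper: returns { name, "date time", number }.
    public static string[] Extract(string cont)
    {
        // Split on spaces and tabs; empty entries from repeated
        // delimiters are dropped.
        string[] tokens = cont.Split(new[] { ' ', '\t' },
                                     StringSplitOptions.RemoveEmptyEntries);

        // Assumption: exactly four tokens (name, date, time, number).
        if (tokens.Length != 4)
            throw new FormatException("Expected name, date, time and number.");

        return new[] { tokens[0], tokens[1] + " " + tokens[2], tokens[3] };
    }
}
```

Throwing on an unexpected token count is a design choice for the sketch; returning null or an empty array would work just as well.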
An alternate option would be to use a regular expression.
You can use regex groups to find out three values name, date, number.
A group is defined as (?<group_name><regex_expr>)
So you could write
Regex regex = new Regex("(?<name>(\\S*))(\\s*)(?<date>((\\S*)\\s(\\S*)))(\\s*)(?<number>(\\d*))");
Match match = regex.Match(yourString);
if (match.Success)
{
string name = match.Groups["name"].Value;
string date = match.Groups["date"].Value;
string number = match.Groups["number"].Value;
}
\s* matches sequence of whitespaces which includes tabs.
\S* matches sequence of non-whitespace characters.
\d* matches sequence of digits.
(new Regex("\\s+")).Split(yourstring)
http://msdn.microsoft.com/en-us/library/8yttk7sy.aspx
var myText = "cont = Gallipelle 04/04/2012 16.03.03 5678";
var splitString = myText.Split(' ');
// splitString[2] == Gallipelle
// splitString[3] == 04/04/2012
// splitString[4] == 16.03.03
// splitString[5] == 5678
No, there is no need to replace it with any other delimiter. You can use String's Split function and give a space as the delimiter character, e.g. in VB.Net:
Dim value As String() = cont.Split(CChar(" "))
This will give you a string array whose values you can access: value(0), value(1) and value(2).