I have this pattern to validate whether a string is correct or not.
public string isStringConstant(string vPart)
{
string pattern = @"^\""[\\(t|r|n|0|b|f|""|a|v|')a-zA-Z0-9,\.\*/;\~!\{\}@#\$%\^:&\(\)\+\[\]<>_\?=\-`]*\""$";
Regex obj = new Regex(pattern);
if (obj.IsMatch(vPart))
{
return "StringConstant";
}
return "INVALID";
}
It works well but it also validates the following string which it should not.
"harisnabeel\"
The input string source is a text file.
What am I doing wrong with that pattern?
Look at this part: \\(t|r|n|0|b|f|""|a|v|')
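You are trying to shortcut having to write out \t, \r, \n, etc. But inside a character class, ( | ) are literal characters rather than grouping and alternation, so the leading \\ simply says that \ is a valid character anywhere in your string. That is why a \ right before the closing quote is accepted. Rewrite your "or this or that" portion longhand, outside the character class, and you should be fine. I don't have time to test this personally, but a bit of experimentation with that will solve your problem.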
If we follow the title (Regex to validate String like C# does), the full regex should be:
// https://learn.microsoft.com/it-it/dotnet/csharp/language-reference/language-specification/lexical-structure#string-literals
// A character that follows a backslash character (\) in
// a regular_string_literal_character must be one of the following
// characters: ', ", \, 0, a, b, f, n, r, t, u, U, x, v.
// Otherwise, a compile-time error occurs.
var rx = new Regex(@"^""(\\(?:['""\\0abfnrtv]|u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8}|x[0-9A-Fa-f]{1,4})|[^\\""]+)*""$");
All the "" are doubled because the regex is written as a verbatim string literal (@"..."), so each pair becomes a single ".
Note the ending [^\\""], which becomes [^\\"] because the pattern is a verbatim string literal: it matches any character that isn't a backslash or a " (the \ is handled separately as the start of an escape, and an unescaped " ends the string). Note also the doubled \\, because a single \ would otherwise start a regex escape sequence, and the handling of \x, \u and \U.
Example of use:
var res = rx.Match(@"""Hello\nWorld\\\""\x123G""");
The "pieces" of the string:
var pieces = res.Groups[1].Captures.Cast<Capture>().ToArray();
In this case: Hello, \n, World, \\, \", \x123, G
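To tie this back to the original question, here is a minimal sketch that wraps the pattern above in two helpers and checks the problematic input. The class and method names (StringLiteralCheck, IsStringConstant, Pieces) are made up for illustration; only the regex comes from the answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class StringLiteralCheck
{
    // Same pattern as above: an opening quote, then any mix of valid escape
    // sequences or runs of ordinary characters, then a closing quote.
    private static readonly Regex Rx = new Regex(
        @"^""(\\(?:['""\\0abfnrtv]|u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8}|x[0-9A-Fa-f]{1,4})|[^\\""]+)*""$");

    // True when the text is a well-formed regular string literal.
    public static bool IsStringConstant(string text) => Rx.IsMatch(text);

    // Returns each escape sequence or run of plain characters that was matched
    // (an empty array when the text does not match at all).
    public static string[] Pieces(string text) =>
        Rx.Match(text).Groups[1].Captures.Cast<Capture>().Select(c => c.Value).ToArray();

    public static void Main()
    {
        Console.WriteLine(IsStringConstant(@"""harisnabeel"""));   // True
        Console.WriteLine(IsStringConstant(@"""harisnabeel\"""));  // False: \" is an escape, so the closing quote is missing
        Console.WriteLine(string.Join(" | ", Pieces(@"""Hello\nWorld""")));  // Hello | \n | World
    }
}
```

The inputs are written as verbatim literals so that the backslashes reach the regex unchanged, just as they would when the text is read from a file.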
Related
I am trying to replace \" in a string with ". How can I do that?
I've tried using Replace but I could not find a way to do it.
Ex:
string line = "This is a \"sample\" "
string replaced = "This is a "sample" ".
Thanks.
Because quotes are used to start and end strings (they are a type of control character), you can't have a bare quote in the middle of a string: it would terminate the string.
string replaced = "This is a "sample" ";
/*
You can see from the syntax highlighting (red) that the string is being
detected as <This is a > and <sample> is black meaning it is detected as
code (and will cause a syntax error)
*/
In order to put a quote in the middle of the string we escape it (escaping means to treat it as a character literal instead of a control character) using the escape character, which in C# is backslash.
string line = "This is a \"sample\"";
Console.WriteLine(line);
// Output: This is a "sample"
string literalLine = @"This is a ""sample""";
Console.WriteLine(literalLine);
// Output: This is a "sample"
The @ symbol in C# means I want this to be a verbatim string (ignore escape sequences). However, quotes still start and end strings, so in order to print a quote in a verbatim string you write two of them: "" (that's how the language is designed).
Case 1: If the value within the variable line is actually This is a \"sample\", then you could do line.Replace("\\\"", "\"").
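As a rough illustration of Case 1, the sketch below builds a string whose value really does contain backslashes (as it would if read from a file) and then applies the same Replace call. The class and method names are hypothetical.

```csharp
using System;

public static class UnescapeQuotes
{
    // Replaces the two-character sequence backslash + quote with a lone quote.
    public static string Unescape(string line) => line.Replace("\\\"", "\"");

    public static void Main()
    {
        // Verbatim literal, so the backslashes are part of the value itself.
        string line = @"This is a \""sample\""";
        Console.WriteLine(line);            // This is a \"sample\"
        Console.WriteLine(Unescape(line));  // This is a "sample"
    }
}
```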
If not:
\" is an escape sequence. It shows up as \" in the code, but once compiled the string contains " rather than the original \".
The reason for escaping quotes is that the compiler cannot otherwise tell whether a quote ends the string or belongs inside it. Let's look at your example:
"This is a "sample" "
Is this This is a as one string, followed by an unknown token sample, followed by another string? Or is This is a "sample" all within one string? We can guess by looking at the context, but the compiler cannot. Hence, we use an escape sequence to tell the compiler "I used this double quote as a character, not as the opening or closing of a string literal."
See Also: https://en.wikipedia.org/wiki/Escape_sequences_in_C
You may try something like this:
String str = "This is a \"sample\" ";
Console.WriteLine("Original string: {0}", str);
Console.WriteLine("Replaced: {0}", str.Replace('\"', '"')); // note: '\"' and '"' are the same char, so this call changes nothing
Desired output : This is a sample
Given string : "This is a \"sample\""
The problem: you have escape characters protecting the double quotes from being interpreted. The \ escape character is an instruction to use a quotation mark literally instead of treating it as the end of the string. This means the actual string value is This is a "sample" when served as output.
The answer removing the \ may work, but it makes for smelly code because removing an escape character in this way can make it unclear what you intend and prevents you from escaping any character.
Removing the " might work, though it prevents use of any quotes and some IDEs might leave the escape character behind to ruin your day.
We want one specific target, the quotes around "sample".
string sample = "This is a \"sample\"";
List<string> sampleArray = sample.Split(' ').ToList(); // samplearray is now split into ["This", "is", "a", "\"sample\""]
var x = sampleArray.FirstOrDefault(t => t == "\"sample\""); //isolate our needed value
if (x != null) //prevent a null reference in case something went wrong and samplearray wasnt as expected
{
var index = sampleArray.IndexOf(x); //get the location of the value we just picked
x = x.Replace("\"", string.Empty); //replace chars
sampleArray[index] = x; //assign new value to the list
}
return String.Join(" ", sampleArray); //return the string joined together with spaces.
Try this:
string line = "This is a \"sample\" ";
string replaced = line.Replace(@"\", "");
How can I make this C# Regex to not include the first character before the URL in the matching results:
((?!\").)https?:\/\/twitter\.com\/(?:#!\/)?(\w+)\/status(?:es)?\/(\d+)
This will match:
Xhttps://twitter.com/oppomobileindia/status/798397636780953600
Notice the first X letter.
I want it to match the URLs that start without double quotes. Also not include the first character before the https for those URLs that do not start with double quotes.
An actual example that I use in my code:
var str = "<div id=\"content\">
<p>https://twitter.com/oppomobileindia/status/798397636780953600</p>
<p>\"https://twitter.com/oppomobileindia/status/11111111111111111111</p></div>";
var pattern = @"(?<!""')https?://twitter\.com/(?:#!/)?(\w+)/status(?:es)?/(\d+)";
var rgx = new Regex(pattern);
var results = rgx.Replace(str, "XXX");
In the above example, only the first URL should be replaced, because the second one has a double quotation mark before the URL. The replacement should also cover exactly the matched URL, without the character that precedes it.
Use a (?<!") negative lookbehind:
var re = @"(?<!"")https?://twitter\.com/(?:#!/)?(\w+)/status(?:es)?/(\d+)";
The (?<!") means that there cannot be a " immediately before the current location.
In C#, you do not need to escape / inside the pattern since regex delimiters are not used when defining the regex.
Note on the C# syntax: if you want to define a " inside a verbatim string literal, double it. In a regular string literal, escape the " and \:
var re = "(?<!\")https?://twitter\\.com/(?:#!/)?(\\w+)/status(?:es)?/(\\d+)";
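A small sketch of the lookbehind in action, using the two URLs from the question. The class and method names are invented for the example, and the HTML is shortened to the two paragraphs that matter.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TwitterLinkReplace
{
    // (?<!") rejects a match whose first character is directly preceded by a quote.
    private static readonly Regex Rx = new Regex(
        @"(?<!"")https?://twitter\.com/(?:#!/)?(\w+)/status(?:es)?/(\d+)");

    public static string Mask(string html) => Rx.Replace(html, "XXX");

    public static void Main()
    {
        string html =
            "<p>https://twitter.com/oppomobileindia/status/798397636780953600</p>" +
            "<p>\"https://twitter.com/oppomobileindia/status/11111111111111111111</p>";

        // Prints: <p>XXX</p><p>"https://twitter.com/oppomobileindia/status/11111111111111111111</p>
        Console.WriteLine(Mask(html));
    }
}
```

Because a lookbehind consumes no characters, the match starts at the h of https, so nothing before the URL is replaced.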
I have a long string (a path) with double backslashes, and I want to replace it with single backslashes:
string a = "a\\b\\c\\d";
string b = a.Replace(@"\\", @"\");
This code does nothing...
b remains "a\\b\\c\\d"
I also tried different combinations of backslashes instead of using @, but no luck.
Because you declared a without using @, the string a does not contain any double backslashes in your example. In fact, in your example, a holds the value a\b\c\d, so Replace does not find anything to replace. Try:
string a = @"a\\b\\c\\d";
string b = a.Replace(@"\\", @"\");
In C#, you can't have a string literal like "a\b\c\d", because the \ has a special meaning: it creates an escape sequence together with a following letter (or combination of digits).
\b actually represents a backspace, and \c and \d are invalid escape sequences (the compiler will complain about an "Unrecognized escape sequence").
So how do you create a string with a simple \? You have to use a backslash to escape the backslash: \\ (it's the escape sequence that represents a single backslash).
That means that the string "a\\b\\c\\d" actually represents a\b\c\d (it doesn't represent a\\b\\c\\d, so no double backslashes). You'll see it yourself if you try to print this string.
C# also has a feature called verbatim string literals (strings that start with @), which allows you to write @"a\b\c\d" instead of "a\\b\\c\\d".
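A quick way to check this for yourself is sketched below; the class name is arbitrary, and the two literals are the ones discussed above.

```csharp
using System;

public static class BackslashDemo
{
    public static void Main()
    {
        string regular = "a\\b\\c\\d";   // regular literal: each \\ is one backslash
        string verbatim = @"a\b\c\d";    // verbatim literal: backslashes taken as written

        Console.WriteLine(regular);                  // a\b\c\d
        Console.WriteLine(regular.Length);           // 7
        Console.WriteLine(regular == verbatim);      // True
        Console.WriteLine(regular.Contains(@"\\"));  // False: no double backslash to replace
    }
}
```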
You're wrong. "\\" yields \ (this is known as escaping).
string a = "a\\b\\c\\d";
System.Console.WriteLine(a); // prints a\b\c\d
string b = a.Replace(@"\\", @"\");
System.Console.WriteLine(b); // prints a\b\c\d
You don't even need string b = a.Replace(@"\\", @"\");
This works.
You don't even need string b = a.Replace(@"\", @"\");
But if we generate a DOS command through C# code, e.g. to delete a file, this will help.
I did this in a code in a UWP application.
foreach (var item in Attendances)
{
string a = item.ImagePath;
string b = a.Replace(@"\\", "/");
string c = a.Replace("\\", "/");
Console.WriteLine(b);
Console.WriteLine(a);
item.ImagePath = c;
}
The one without the @ symbol is the one that actually worked. This is C# 8 and C# 9.
In C# we can check a string with the string.Contains() method, e.g.:
string test = "Microsoft";
if (test.Contains("i"))
test = test.Replace("i","a");
This is fine. But what if the character I want to replace is the " symbol?
I want to achieve this:
"<html><head>
I want to remove the " symbol present in check so that the result would be:
<html><head>
The " character can also be replaced, just like any other:
test = test.Replace("\"","");
Also, note that you don't have to test if the character exists : your test.Contains("i") could be removed since the .Replace() method won't do anything (no replace, no error thrown) if the character doesn't exist inside the string.
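As a minimal sketch of both points (the helper name StripQuotes is made up for the example), the same call handles a string with a quote and one without:

```csharp
using System;

public static class QuoteStripper
{
    // Removes every double-quote character; returns the text unchanged if there is none.
    public static string StripQuotes(string text) => text.Replace("\"", "");

    public static void Main()
    {
        Console.WriteLine(StripQuotes("\"<html><head>"));  // <html><head>
        Console.WriteLine(StripQuotes("<html><head>"));    // <html><head> (nothing to replace, no error)
    }
}
```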
To include a quote symbol in a string, you need to escape it using a backslash. In your example, you want to use something like this:
if (test.Contains("\""))
There are two ways to include a '"' character in a string literal. All the answers so far have used the c-style way:
var quotation = "Parting is such sweet sorrow";
var howSweetIsIt = quotation + " that I shall say \"good-night\" till it be morrow.";
In some contexts (especially for users experienced with Visual Basic), the verbatim string literal may be easier to read. A verbatim string literal begins with an @ sign, and the only character that requires escaping is the quotation mark; all other characters are included verbatim (hence the name). Significantly, the method of escaping the quotation mark is different: rather than preceding it with a backslash, it must be doubled:
var howSweetIsIt = quotation + @" that I shall say ""good-night"" till it be morrow.";
string SymbolString = "Micro\"so\"ft";
The string above uses the escape character \ to insert " between the characters.
string Result = SymbolString.Replace("\"", string.Empty);
This Replace call replaces each " character with an empty string.
Is this what you are trying to achieve?
if (check.Contains("\""))
output = check.Replace("\"", "");
output = check.Replace("\"", "");
Just remember to use "\"" for the quote sign as the backslash is an escape character.
if (str.Contains("\""))
{
str = str.Replace("\"", "");
}
Sorry but I can't find this on Google. I guess it maybe is not accepting my search string when I do a search.
Can someone tell me what this means in C#?
var a = @"abc";
What's the meaning of the @?
It is a verbatim string literal, which basically means it will take any character except " literally, including new lines. To write out a ", use "".
The advantage of @-quoting is that escape sequences are not processed,
which makes it easy to write, for example, a fully qualified file
name:
@"c:\Docs\Source\a.txt" // rather than "c:\\Docs\\Source\\a.txt"
It means it's a verbatim string.
Without it, any string containing a \ will treat the next character as a special character, such as \n for a new line. With an @ in front, it will treat the \ literally.
In the example you've given, there is no difference in the output.
This says that the characters inside the double quotation marks should be interpreted exactly as they are.
You can see that the backslash is treated as a character and not an
escape sequence when the @ is used. The C# compiler also allows you to
use real newlines in verbatim literals. You must encode quotation
marks with double quotes.
string fileLocation = "C:\\CSharpProjects";
string fileLocation = @"C:\CSharpProjects";
Look at here for examples.
C# supports two forms of string literals: regular string literals and verbatim string literals.
A regular string literal consists of zero or more characters enclosed
in double quotes, as in "hello", and may include both simple escape
sequences (such as \t for the tab character) and hexadecimal and
Unicode escape sequences.
A verbatim string literal consists of an @ character followed by a
double-quote character, zero or more characters, and a closing
double-quote character. A simple example is @"hello". In a verbatim
string literal, the characters between the delimiters are interpreted
verbatim, the only exception being a quote-escape-sequence. In
particular, simple escape sequences and hexadecimal and Unicode
escape sequences are not processed in verbatim string literals. A
verbatim string literal may span multiple lines.
Code Example
string a = "hello, world"; // hello, world
string b = @"hello, world"; // hello, world
string c = "hello \t world"; // hello world
string d = @"hello \t world"; // hello \t world
string e = "Joe said \"Hello\" to me"; // Joe said "Hello" to me
string f = @"Joe said ""Hello"" to me"; // Joe said "Hello" to me
string g = "\\\\server\\share\\file.txt"; // \\server\share\file.txt
string h = @"\\server\share\file.txt"; // \\server\share\file.txt
string i = "one\r\ntwo\r\nthree";
string j = @"one
two
three";
Reference link: MSDN