I am reading a C# source file.
When I encounter a string, I want to get it's value.
For instance, in the following example:
public class MyClass
{
public MyClass()
{
string fileName = "C:\\Temp\\A Weird\"FileName";
}
}
I would like to retrieve
C:\Temp\A Weird"FileName
Is there an existing procedure to do that?
Coding a solution with all the possible cases should be quite tricky (#, escape sequences. ...).
I am convinced such procedure exists...
I would like to have the dual function too (to inject a string into a C# source file)
Thanks in advance.
Philippe
P.S:
I gave an example with a filename, but I look for a solution working for all kinds of strings.
I'm pretty sure you can use CodeDOM to read a C# code file and parse its elements. It generates a code tree, and then you can look for nodes representing strings.
http://www.codeproject.com/Articles/2502/C-CodeDOM-parser
Other CodeDom parsers:
http://www.codeproject.com/Articles/14383/An-Expression-Parser-for-CodeDom
NRefactory: https://github.com/icsharpcode/NRefactory and http://www.codeproject.com/Articles/408663/Using-NRefactory-for-analyzing-Csharp-code
There is a way of extracting these strings using a regular expression:
("(\\"|[^"])*")
This particular one works on your simple example and gives the filename (complete with leading and trailing quote characters); whether it would work on more complex ones I can't easily tell unfortunately.
For clarity, (\\"|[^"]) matches any character apart from ", except where it has a leading \ character.
Just use ".*" Regex to match all string values, then remove trailing inverted commas and unescape it.
this will allow \" and "" characters inside your string
so both "C:\\Temp\\A Weird\"FileName" and "Hello ""World""" will match
Related
Let's say I want to assign a text (which contains many double quotes) into variable. However, the only way seems to manually escape:
string t = "Lorem \"Ipsum\" dummy......
//or//
string t = #"Lorem ""Ipsum"" dummy.....
Is there any way to avoid manual escaping, and instead use something universal (which I dont know in C#) keywoard/method to do that automatically? In PHP, it's untoldly simple, by just using single quote:
$t = 'Lorem "Ipsum" dummy .......
btw, please don't bomb me with critiques "Why do you need to use that" or etc. I need answer to the question what I ask.
I know this answer may not be satisfying, but C# sytnax simply won't allow you to do such thing (at the time of writing this answer).
I think the best solution is to use resources. Adding/removing and using strings from resources is super easy:
internal class Program
{
private static void Main(string[] args)
{
string myStringVariable = Strings.MyString;
Console.WriteLine(myStringVariable);
}
}
The Strings is the name of the resources file without the extension (resx):
MyString is the name of your string in the resources file:
I may be wrong, but I conjecture this is the simplest solution.
No. In C# syntax, the only way to define string literals is the use of the double quote " with optional modifiers # and/or $ in front. The single quote is the character literal delimiter, and cannot be used in the way PHP would allow - in any version, including the current 8.0.
Note that the PHP approach suffers from the need to escape ' as well, which is, especially in the English language, frequently used as the apostrophe.
To back that up, the EBNF of the string literal in current C# is still this:
regular_string_literal '"' { regular_string_literal_character } '"'
The only change in the compiler in version 8.0 was that now, the order of the prefix modifiers $ (interpolated) and # (verbatim) can be either #$ or $#; it used to matter annoyingly in earlier versions.
Alternatives:
Save it to a file and use File.ReadAllText for the assignment, or embed it as a managed ressource, then the compiler will provide a variable in the namespace of your choice with the verbatim text as its runtime value.
Or use single quotes (or any other special character of your choice), and go
var t = #"Text with 'many quotes' inside".Replace("'", #"""");
where the Replace part could be modeled as an extension to the String class for brevity.
I am using HighCharts and am generating script from C# and there's an unfortunate thing where they use inline functions for formatters and events. Unfortunately, I can't output JSON like that from any serializer I know of. In other words, they want something like this:
"labels":{"formatter": function() { return Highcharts.numberFormat(this.value, 0); }}
And with my serializers available to me, I can only get here:
"labels":{"formatter":"function() { return Highcharts.numberFormat(this.value, 0); }"}
These are used for click events as well as formatters, and I absolutely need them.
So I'm thinking regex, but it's been years and years and also I was never a regex wizard.
What kind of Regex replace can I use on the final serialized string to replace any quoted value that starts with function() with the unquoted version of itself? Also, the function itself may have " in it, in which case the quoted string might have \" in it, which would need to also be replaced back down to ".
I'm assuming I can use a variant of the first answer here:
Finding quoted strings with escaped quotes in C# using a regular expression
but I can't seem to make it happen. Please help me for the love of god.
I've put more sweat into this, and I've come up with
serialized = Regex.Replace(serialized, #"""function\(\)[^""\\]*(?:\\.[^""\\]*)*""", "function()$1");
However, my end result is always:
formatter:function()$1
This tells me I'm matching the proper stuff, but my capture isn't working right. Now I feel like I'm probably being an idiot with some C# specific regex situation.
Update: Yes, I was being an idiot. I didn't have a capture around what I really wanted.
`enter code here` serialized = Regex.Replace(serialized, #"""function\(\)([^""\\]*(?:\\.[^""\\]*)*)""", "function()$1");
that gets my match, but in a case like this:
"formatter":"function() { alert(\"hi!\"); return Highcharts.numberFormat(this.value, 0); }"
it returns:
"formatter":function() { alert(\"hi!\"); return Highcharts.numberFormat(this.value, 0); }
and I need to get those nasty backslashes out of there. Now I think I'm truly stuck.
Regexp for match
"function\(\) (?<code>.*)"
Replace expression
function() ${code}
Try this : http://regexr.com?30jpf
What it does :
Finds double quotes JUST before a function declaration and immediately after it.
Regex :
(")(?=function()).+(?<=\})(")
Replace groups 1 & 3 with nothing :
3 capturing groups:
group 1: (")
group 2: ()
group 3: (")
string serialized = JsonSerializer.Serialize(chartDefinition);
serialized = Regex.Replace(serialized, #"""function\(\)([^""\\]*(?:\\.[^""\\]*)*)""", "function()$1").Replace("\\\"", "\"");
I need to escape chars in a string I have, which content is C:\\blablabla\blabla\\bla\\SQL.exe to C:\blablabla\blabla\bla\SQL.exe so I could throw a process based on this SQL.exe file.
I tried with Mystring.Replace("\\", #"\"); and Mystring.Replace(#"\\", #"\"); but none worked.
How could I do this?
EDITED: Corrected type in string.
I very strongly suspect that you are looking this input string in the Visual Studio debugger and fooling yourself that there are actually 2 \ whereas in reality there aren't. That's the reason why attempting to replace \\ with \ doesn't do anything because in the original string there is no occurrence of \\. And since you are looking the output once again in the debugger, you are once again fooling yourself that there are 2 \.
Visual Studio debugger has this tendency to escape strings. Log it to a file or print to the console and you will see that there is a single \ in your input string and you don't need to replace anything.
It looks like you're trying to replace double backslash (#"\\") in a string with single backslash (#"\"). If so try the following
Mystring = Mystring.Replace(#"\\", #"\");
Note: Are you sure that the string even contains double backslashes? Certain environments will print out a single backslash as a double (debugger for example). Your comment mentioned my approach didn't work. That's a flag that there's not actually a double backslash in your string (else it would work).
The # character specifies a string as a verbatim literal string, but that is when constructing a string. If you use Mystring.Replace("\\", #"\") then nothing will be replaced, essentially, as the two strings are the same.
If you want a string without the escape characters, then either define it with:
string path = #"C:\Some\Directory\And\File.txt";
Or you can replace the \\ with / like so:
path = path.Replace('\\', '/');
It is worth noting, as mentioned by Darin Dimitrov, that the string containing two \ characters is likely just the display of the string (i.e. when using the debugger) and not the actual value of the string.
i think OP is asking how to escape \\ in File Path, if that in the case, as OP is not mentioning where he's trying to use this. so i'm putting a guess.
Then You use Path.Combine() method to get the FileName path.
Path.Combine() Documentation
where are you looking at this output? because it could be the string is what you expect, but viewing the value through the debugger, output window, etc. is escaping the slash
Use something like:
myStr = myStr.Replace(#"\\", #"\");
Make sure you assign the result of Replace method to myStr. Otherwise it goes into void ;)
Try adding "|DataDirectory|\MyFile.xyz" where you need it. It works with connection strings it might work with something else (I haven't really tried to apply it to something else).
I didn't understand what you want, if you just want do get the file name (escape directory chars) you can try:
string fileName = Path.GetFileName(YourString)
Noloman.... when you concatenate are you perhaps missing a "\" when concatenating the directory.. I am assuming that you are trying to join directory + some sub directory.. #noloman keep in mind that in C# "c:\Temp" is written like this "c:\Temp" or #"c:\Temp" one is Literal the other is how to represent a "\" in the legacy way of coding because the "\" is an escape Char and when dealing with directorys we represent all paths and sub paths with "\"
so perhaps by you replacing the "\" you are truly messing up your own expected process
Mystring = Mystring.Replace(#"\\", #"\");
should work for you unless you are truly meaning to do
Mystring = Mystring.Replace(#"\", "\"); which if you believe that you are expecting a "\" to be used to build the directory.. then of course it will not work.. because you have just in essense replaced the backslash with a return char.. I hope that this makes sense to you..
System.IO.Directory.GetCurrentDirectory(); you are using is also an Issue.. SQL Server is not that application thats running the code.. it's your .NET application so you need to either put the location of the SQL Server into a variable, app.config, web.config ect... please edit your question and paste the code that you are using to do what it is that you want to do inregards to the SQL Server Code.. you would probably want to look at the Are you wanting to do something like Process.Start(....) meaning the file name..?
I am getting a string in the following format in the query string:
Arnstung%20Chew(20)
I want to convert it to just Arnstung Chew.
How do I do it?
Also how do I make sure that the user is not passing a script or anything harmful in the query string?
string str = "Arnstung Chew (20)";
string replacedString = str.Substring(0, str.IndexOf("(") -1 ).Trim();
string safeString = System.Web.HttpUtility.HtmlEncode(replacedString);
It's impossible to provide a comprehensive answer without knowing what variations might appear on your input text. For example, will there always be two words separated by a space followed by a number in parentheses? Or might there be other variations as well?
I have a lot of parsing code on my Black Belt Coder site, including a sscanf() replacement for .NET that may potentially be useful in your case.
Is there a heredoc notation for strings in C#, preferably one where I don't have to escape anything (including double quotes, which are a quirk in verbatim strings)?
As others have said, there isn't.
Personally I would avoid creating them in the first place though - I would use an embedded resource instead. They're pretty easy to work with, and if you have a utility method to load a named embedded resource from the calling assembly as a string (probably assuming UTF-8 encoding) it means that:
If your embedded document is something like SQL, XSLT, HTML etc you'll get syntax highlighting because it really will be a SQL (etc) file
You don't need to worry about any escaping
You don't need to worry about either indenting your document or making your C# code look ugly
You can use the file in a "normal" way if that's relevant (e.g. view it as an HTML page)
Your data is separated from your code
Well even though it doesn't support HEREDOC's, you can still do stuff like the following using Verbatim strings:
string miniTemplate = #"
Hello ""{0}"",
Your friend {1} sent you this message:
{2}
That's all!";
string populatedTemplate = String.Format(miniTemplate, "Fred", "Jack", "HelloWorld!");
System.Console.WriteLine(populatedTemplate);
Snagged from:
http://blog.luckyus.net/2009/02/03/heredoc-in-c-sharp/
No, there is no "HEREDOC" style string literal in C#.
C# has only two types of string literals:
Regular literal, with many escape sequences necessary
Verbatim literal, #-quoted: doublequotes need to be escaped by doubling
References
csharpindepth.com - General Articles - Strings
MSDN - C# Programmer's Reference - Strings
String literals are of type string and can be written in two forms, quoted and #-quoted.
November 2022 update:
Starting with C# 11 this is now possible using Raw string literals:
var longMessage = """
This is a long message.
Some "quoted text" here.
""";