Escape Character Associativity C# 6 String Interpolation - c#

Given:
double price = 5.05;
Console.Write($"{{Price = {price:C}}}");
and the desired output: {Price = $5.05}
Is there any way to associate the last two curly braces as an escaped '}' so the interpolation works as intended? As it stands, the first two are escaped (I assume?), and the output is: {Price = C}
Console.Write($"{{Price = {price:C} }}");
works as expected, but with the extra space. And I can concatenate the tail brace, which I consider a poor man's solution. Is there a colloquial rich man's solution? Thanks.

This arises because of an "oddity" in the behavior of string.Format, and our desire to have a precise 1-to-1 mapping between interpolations and inserts in the generated format string. In short, the language behavior precisely models the behavior of string.Format.
In an interpolation (the thing inside the curly braces), the expression ends either at a colon (which starts a format string), or a close curly brace. In the latter case a doubled curly brace has no special meaning because it isn't inside a literal part of the string. So three curly braces in a row would be interpreted as a close to the interpolation, followed by a literal (escaped by doubling) close curly brace. But after the colon the format string is given for that interpolation, and that format string is any string, and it is terminated by a close curly brace. If you want a close curly brace inside your format string, you simply double it up. Which is what you have unintentionally done.
CoolBots gave the best way of handling this https://stackoverflow.com/a/42993667/241658
Read the "Escaping Braces" section of https://msdn.microsoft.com/en-us/library/txafckwd(v=vs.110).aspx for an explanation of precisely this issue.

Curious workaround:
var p = price.ToString("C");
Console.Write($"{{Price = {p}}}");
For some reason, $"{{Price = {p}}}" and $"{{Price = {p:C}}}" have different associativity outcomes, which feels like a compiler bug. I'll ask around! Note that it is consistent with how string.Format applies the same rule, so it might be intentionally propagating an earlier framework oddity.

You can interpolate instead of concatenate - pass it as a string literal:
double price = 5.05;
Console.Write($"{{Price = {price:C}{"}"}");

Well you can try with less used escape characters. Maybe \b will work as it doesn't print anything and it had no function for a really long time. Something like:
double price = 5.05;
Console.Write($"{{Price = {price:C}\b}}");
If that doesn't work for you, you can try with special UNICODE characters like U+200B or U+FEFF:
double price = 5.05;
Console.Write($"{{Price = {price:C}\u200B}}");
Escape characters: https://blogs.msdn.microsoft.com/csharpfaq/2004/03/12/what-character-escape-sequences-are-available/
UNICODE space characters: https://www.cs.tut.fi/~jkorpela/chars/spaces.html

When there are problems with the C# 6 syntax, why not use the traditional string.Format() instead?
double price = 5.05;
Console.WriteLine(string.Format("{{Price = {0}}}", price.ToString("C")));

Related

Interpolated string formatting issue

I have stumbled upon one issue with interpolated strings several times now.
Consider the following case:
double number = 123.4567;
var str = $"{{{number:F2}}}"; // I want to get "{123.46}"
Console.WriteLine(str); // Will print "{F2}"
A little surprising at first, but once you realize how the curly brackets are paired it makes sense. Two consecutive curly brackets are an escape sequence for a single curly bracket in the interpolated string. So the opening bracket of the interpolated expression is paired with the last curly bracket in the string.
    ____pair_____
    |           |
$"{{{number:F2}}}";
Now you could do the following to break the escape sequence:
var str = $"{{{number:F2} }}"; // This will be "{123.46 }"
Notice the space character this approach adds to the output. (Not ideal)
My question:
Let's say I want to use a single interpolated string to get exactly the output "{123.46}".
Is this at all possible without doing something hackish like the following?
var s = $"{{{number:F2}{'}'}";
This is the expected behavior of string interpolation. It is described in the Microsoft documentation; the text below is quoted from it.
Opening and closing braces are interpreted as starting and ending a format item. Consequently, you must use an escape sequence to display a literal opening brace or closing brace. Specify two opening braces ("{{") in the fixed text to display one opening brace ("{"), or two closing braces ("}}") to display one closing brace ("}"). Braces in a format item are interpreted sequentially in the order they are encountered. Interpreting nested braces is not supported.
The way escaped braces are interpreted can lead to unexpected results. For example, consider the format item "{{{0:D}}}", which is intended to display an opening brace, a numeric value formatted as a decimal number, and a closing brace. However, the format item is actually interpreted in the following manner:
1. The first two opening braces ("{{") are escaped and yield one opening brace.
2. The next three characters ("{0:") are interpreted as the start of a format item.
3. The next character ("D") would be interpreted as the Decimal standard numeric format specifier, but the next two escaped braces ("}}") yield a single brace. Because the resulting string ("D}") is not a standard numeric format specifier, the resulting string is interpreted as a custom format string that means display the literal string "D}".
4. The last brace ("}") is interpreted as the end of the format item.
5. The final result that is displayed is the literal string, "{D}". The numeric value that was to be formatted is not displayed.
One way to write your code to avoid misinterpreting escaped braces and format items is to format the braces and format item separately. That is, in the first format operation display a literal opening brace, in the next operation display the result of the format item, then in the final operation display a literal closing brace. The following example illustrates this approach.
int value = 6324;
string output = string.Format("{0}{1:D}{2}",
"{", value, "}");
Console.WriteLine(output);
// The example displays the following output:
// {6324}
Assuming that it is not required to use a named format string, you can use:
var s = $"{{{number:#.#0}}}";
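The "format first, add braces afterwards" idea from the answers above can be put into a small runnable sketch. The class and method names (BraceWrapDemo, Braced, BracedWithFormat) are made up for illustration, and the invariant culture is an assumption added only to keep the decimal separator predictable.

```csharp
using System;
using System.Globalization;

public static class BraceWrapDemo
{
    // Hypothetical helper: format the number first, then add the braces,
    // so no format specifier ever sits directly before a closing brace.
    public static string Braced(double value, string format)
    {
        string text = value.ToString(format, CultureInfo.InvariantCulture);
        // "{{" is a literal '{', "{text}" is the hole, "}}" is a literal '}'.
        return $"{{{text}}}";
    }

    // Same idea with composite formatting: the braces are passed as arguments,
    // as in the documentation example quoted above.
    public static string BracedWithFormat(double value)
    {
        return string.Format(CultureInfo.InvariantCulture,
                             "{0}{1:F2}{2}", "{", value, "}");
    }

    public static void Main()
    {
        double number = 123.4567;
        Console.WriteLine(Braced(number, "F2"));      // {123.46}
        Console.WriteLine(BracedWithFormat(number));  // {123.46}
    }
}
```

Both variants keep the format specifier away from the escaped closing brace, which is the only place where the pairing problem shows up.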

C# Regular Expression always returns FALSE

regexPattern="\w{6}(AAAAA|BBBBB|CCCCC)"
I need the strings below to return TRUE. So ANY 6 letters followed by AAAAA or BBBBB or CCCCC:
TXCDTLAAAAA000
TXCDTLBBBBB111
TXCDTLCCCCC222
but giving the pattern above I always get a FALSE in return. How do I fix this pattern to work right?
So Basically this code is working:
if (Regex.IsMatch("123456BBBBB", @"\w{6}(AAAAA|BBBBB|CCCCC)"))
{
//true
}
so I am fixing the code now
Thank you!
You didn't mention which host language you are using, but the backslash is usually an escape character in a double-quoted string, so in most common languages you may need a double backslash:
regexPattern="\\w{6}(AAAAA|BBBBB|CCCCC)"
Or use another way to express the pattern that doesn't require escape characters. For example, in Python you can prefix the raw string:
regexPattern = r"\w{6}(AAAAA|BBBBB|CCCCC)"
Python doesn't treat \w as an escape sequence anyway, but the raw prefix helps with other sequences.
With C#, use @ (a verbatim string) to accomplish it:
var regexPattern = @"\w{6}(AAAAA|BBBBB|CCCCC)";
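As a quick check, here is a minimal sketch that runs the verbatim-string pattern against the three sample inputs from the question. The class name PatternDemo and the fourth, non-matching input are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternDemo
{
    // Verbatim string: the backslash reaches the regex engine unchanged.
    public const string Pattern = @"\w{6}(AAAAA|BBBBB|CCCCC)";

    public static bool Matches(string input) => Regex.IsMatch(input, Pattern);

    public static void Main()
    {
        string[] samples =
        {
            "TXCDTLAAAAA000",
            "TXCDTLBBBBB111",
            "TXCDTLCCCCC222",
            "TXCDTLDDDDD333" // hypothetical counter-example, not from the question
        };

        foreach (string sample in samples)
        {
            // Prints True for the first three samples and False for the last.
            Console.WriteLine($"{sample}: {Matches(sample)}");
        }
    }
}
```

Note that \w also matches digits and underscores, and that the pattern is not anchored, so it matches anywhere in the input. If only letters at the start of the string should qualify, something like ^[A-Za-z]{6} would be closer to the stated requirement.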

C# .NET Regex remove all quotes of quotes excluding one instance in a sentence

I have description field which is:
16" Alloy Upgrade
In CSV format it appears like this:
"16"" Alloy Upgrade "
What would be the best use of regex to maintain the original format? As I'm learning, I would appreciate it being broken down for my understanding.
I'm already using Regex to split some text separating 2 fields which are: code, description. I'm using this:
,(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))
My thoughts are to remove the quotes, then remove the delimiter excluding use in sentences.
Thanks in advance.
If you don't want to/can't use a standard CSV parser (which I'd recommend), you can strip all non-doubled quotes using a regex like this:
Regex.Replace(text, @"(?<!"")""(?!"")", string.Empty)
That regex will match every " character not preceded or followed by another ".
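To see what that replacement does to the field from the question, here is a minimal sketch. The class and method names are hypothetical, and the second step (collapsing the doubled quote) is an addition that the answer itself does not spell out.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuoteStripDemo
{
    // The lookbehind (?<!") and lookahead (?!") leave only quotes that stand alone,
    // so the enclosing quotes are removed and doubled quotes are kept.
    public static string StripLoneQuotes(string text) =>
        Regex.Replace(text, @"(?<!"")""(?!"")", string.Empty);

    // Assumed follow-up step: turn each doubled quote back into a single one.
    public static string Unescape(string text) =>
        StripLoneQuotes(text).Replace("\"\"", "\"");

    public static void Main()
    {
        string field = "\"16\"\" Alloy Upgrade \"";

        // Wrapped in [] so the trailing space stays visible.
        Console.WriteLine("[" + StripLoneQuotes(field) + "]"); // [16"" Alloy Upgrade ]
        Console.WriteLine("[" + Unescape(field) + "]");        // [16" Alloy Upgrade ]
    }
}
```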
I wouldn't use a regex, since regexes are usually confusing and it is often unclear what they do (like the one in your question, for example). Instead, this method should do the trick:
public string CleanField(string input)
{
    // A quoted CSV field needs at least an opening and a closing quote.
    if (input.Length >= 2 && input.StartsWith("\"") && input.EndsWith("\""))
    {
        // Drop the surrounding quotes, then collapse each doubled quote to one.
        string output = input.Substring(1, input.Length - 2);
        output = output.Replace("\"\"", "\"");
        return output;
    }
    else
    {
        // If it doesn't start and end with quotes, it doesn't look escaped, so just hand it back.
        return input;
    }
}
It may need tweaking, but in essence it checks whether the string starts and ends with a quote (which it should if it is an escaped field). If so, it takes the inner part (with Substring) and then replaces each doubled quote with a single quote. The code is a bit ugly due to all the escaping, but there is no avoiding that.
The nice thing is this can be used easily with a bit of Linq to take an existing array and convert it.
processedFieldArray = inputfieldArray.Select(CleanField).ToArray();
I'm using arrays here purely because your linked page seems to use them where you are wanting this solution.

String.Format with curly braces

Our low level logging library has to cope with all sorts of log messages sent to it.
Some of these messages include curly braces (as part of the text), and some contain parameters to be formatted as part of the string using String.Format
For example, this string could be an input to the Logger class:
"Parameter: {Hostname} Value: {0}"
With the correct variable sent to use for the formatter.
To do this properly, I must escape the curly braces that are not part of the formatting (by doubling them up).
I thought of doing it with a regex, but this is not as simple as it may seem, since I have no idea how to match only the strings inside curly braces that are NOT used by String.Format for formatting purposes.
Another issue is that the Logger class should be as performance efficient as possible, starting to handle regular expressions as part of its operation may hinder performance.
Is there any proper and known best practice for this?
Doing it in just one regex:
string input = "Parameter: {Hostname} Value: {0}";
input = Regex.Replace(input, @"\{([^0-9{}]+)\}", @"{{$1}}");
Console.WriteLine(input);
Outputs:
Parameter: {{Hostname}} Value: {0}
This of course only works as long as there aren't any parameters that contain numbers but should still be escaped with {{ }}
I think that you should look into your logger's interface. Compare with how Console.WriteLine works:
Console.WriteLine(String) outputs exactly the string given, no formatting, nothing special with { and }.
Console.WriteLine(String, Object[]) outputs using formatting. { and } are special characters that the caller must escape as {{ and }}.
I think it's a flawed design to have to differentiate between different curly brace occurrences in the code to find out what was meant. Put the burden of escaping a literal { (one that should appear in the output) as {{ on the caller.
I would double all the curly braces and then look for the ones to restore with a regex like {{\d+}}, so that they go back to their original form ({{0}} => {0}) in your string.
So for each line I would do something like this:
string s = input.Replace("{", "{{").Replace("}", "}}");
return Regex.Replace(s, @"\{\{(?<val>\d+)\}\}",
m => { return "{" + m.Groups["val"] + "}"; });
So that's a technical answer to the original question, but @Anders Abel is perfectly right. It would be worth considering the design again...
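The double-then-restore approach can be sketched end to end as follows. The names LogTemplate and EscapeLiteralBraces are hypothetical, and the pattern is widened slightly beyond the answer's {{\d+}} so that items with an alignment or format specifier, such as {1,-8} or {0:x4}, are restored too; that widening is an assumption about what a logger would want.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LogTemplate
{
    // Matches a doubled format item such as {{0}}, {{1,-8}} or {{0:x4}}.
    private static readonly Regex DoubledItem =
        new Regex(@"\{\{(\d+(?:,-?\d+)?(?::[^{}]+)?)\}\}", RegexOptions.Compiled);

    public static string EscapeLiteralBraces(string template)
    {
        // Step 1: escape every brace by doubling it.
        string doubled = template.Replace("{", "{{").Replace("}", "}}");
        // Step 2: turn numeric items back into real format items.
        return DoubledItem.Replace(doubled, "{$1}");
    }

    public static void Main()
    {
        string safe = EscapeLiteralBraces("Parameter: {Hostname} Value: {0}");
        Console.WriteLine(safe);                    // Parameter: {{Hostname}} Value: {0}
        Console.WriteLine(string.Format(safe, 42)); // Parameter: {Hostname} Value: 42
    }
}
```

The sketch still cannot tell a literal {5} in the message text from a format item, and a template that already contains escaped braces gets escaped a second time, which is part of why redesigning the interface is the cleaner fix.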
To allow the caller to have formatted strings and cope with formatting specifiers, e.g.
Logger.Log("{0:dd/mm/yyy} {0:hh:mm:ss} {hostname} Some error {1:x4} happened on {123Component}!", DateTime.UtcNow, 257)
You'd need a regex like:
string input = "{0:dd/mm/yyy} {0:hh:mm:ss} {hostname} Some error {1:x4} happened on {123Component}!";
Regex reg = new Regex(@"(\{[^[0-9}]+?[^}]*\}|\{(?![0-9]+:)[^}]+?\})");
string output = reg.Replace(input, "{$1}");
Console.WriteLine(output);
This outputs:
"{0:dd/mm/yyy} {0:hh:mm:ss} {{hostname}} Some error {1:x4} happened on {{123Component}}!"
But to reiterate, I'd agree with Anders Abel that you ought to redesign to avoid the need for the log library to do this.

Finding C#-style unescaped strings using regular expressions

I'm trying to write a regular expression that finds C#-style unescaped strings, such as
string x = @"hello
world";
The problem I'm having is how to write a rule that handles double quotes within the string correctly, like in this example
string x = @"before quote ""junk"" after quote";
This should be an easy one, right?
Try this one:
@".*?(""|[^"])"([^"]|$)
The first parentheses mean 'If there is a " before the finishing quote, there had better be two of them'; the second parentheses mean 'After the finishing quote, there should be either a non-quote character or the end of the line'.
How 'bout the regex @\"([^\"]|\"\")*\"(?=[^\"])
Due to greedy matching, the final lookahead clause is likely not to be needed in your regex engine, although it is more specific.
If I remember correctly, you have to use \"" here: the doubled quote escapes it for C# and the backslash escapes it for the regex.
Try this:
@"[^"]*?(""[^"]*?)*";
It looks for the starting characters @" and the ending characters "; (you can leave the semicolon out if you need to). In between, it can have any characters except quotes, or if there are quotes they have to be doubled.
@"(?:""|[^"])*"(?!")
is the right regex for this job. It matches the @, a quote, then either two quotes in a row or any non-quote character, repeating this up to the next quote (that isn't doubled).
^@"(""|[^"])*"$ is the regex you want, looking for first an at-sign and a double-quote, then a sequence of any characters (except double-quotes) or double double-quotes, and finally a double-quote.
As a string literal in C#, you'd have to write it string regex = "^@\"(\"\"|[^\"])*\"$"; or string regex = @"^@""(""""|[^""])*""$";. Choose your poison.
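For completeness, here is a minimal sketch that applies the @"(?:""|[^"])*"(?!") pattern from the answers above to a small piece of source text. The class name VerbatimStringFinder and the sample source are made up for illustration, and the sketch makes no attempt to skip comments or regular string literals that happen to contain @".

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class VerbatimStringFinder
{
    // The regex  @"(?:""|[^"])*"(?!")  written as a verbatim C# literal,
    // so every quote in the pattern is doubled once more.
    private static readonly Regex VerbatimLiteral =
        new Regex(@"@""(?:""""|[^""])*""(?!"")");

    public static string[] Find(string source) =>
        VerbatimLiteral.Matches(source).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        // Two verbatim literals: one with doubled quotes, one spanning two lines.
        string source =
            "string x = @\"before quote \"\"junk\"\" after quote\";\n" +
            "string y = @\"hello\nworld\";";

        foreach (string literal in Find(source))
        {
            Console.WriteLine(literal);
        }
    }
}
```

Because [^"] also matches line breaks, the multi-line literal is found without any extra regex options.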
