What does a '$' string prefix mean when not used for interpolation? - c#

I've just come across this code in an article on .NET Core configuration, and the dictionary keys look very foreign to me:
static IReadOnlyDictionary<string, string> DefaultConfigurationStrings{get;} = new Dictionary<string, string>()
{
["Profile:UserName"] = Environment.UserName,
[$"AppConfiguration:ConnectionString"] = DefaultConnectionString,
[$"AppConfiguration:MainWindow:Height"] = "400",
[$"AppConfiguration:MainWindow:Width"] = "600",
[$"AppConfiguration:MainWindow:Top"] = "0",
[$"AppConfiguration:MainWindow:Left"] = "0",
};
Why the $ and why the [...]? Something to do with not using comma-separated key-value pairs?

The brackets are the new index initializers. It's just another way to initialize a dictionary that's a bit more readable than the collection initializers that have been there before.
As for the strings, when not using interpolations, it's just a simple string, nothing more. Presumably the compiler won't even generate the string.Format call for such strings. You can remove the $ just as well (it's the same with verbatim string literals when not using any of the different features).
One possible reason for the $"" strings might be, though, that it's a hint to either a source code preprocessor or static analysis tool that the string is somehow special. I've seen something similar used with verbatim string literals to prevent ReSharper from suggesting to translate the string for i18n. That's something you'd look for in your build environment, though.

Related

Parsing this special format file

I have a file that is formatted this way --
{2000}000000012199{3100}123456789*{3320}110009558*{3400}9876
54321*{3600}CTR{4200}D2343984*JOHN DOE*1232 STREET*DALLAS TX
78302**{5000}D9210293*JANE DOE*1234 STREET*SUITE 201*DALLAS
TX 73920**
Basically, the number in curly brackets denotes field, followed by the value for that field. For example, {2000} is the field for "Amount", and the value for it is 121.99 (implied decimal). {3100} is the field for "AccountNumber" and the value for it is 123456789*.
I am trying to figure out a way to split the file into "records" and each record would contain the record type (the value in the curly brackets) and record value, but I don't see how.
How do I do this without a loop going through each character in the input?
A different way to look at it.... The { character is a record delimiter, and the } character is a field delimiter. You can just use Split().
var input = #"{2000}000000012199{3100}123456789*{3320}110009558*{3400}987654321*{3600}CTR{4200}D2343984*JOHN DOE*1232 STREET*DALLAS TX78302**{5000}D9210293*JANE DOE*1234 STREET*SUITE 201*DALLASTX 73920**";
var rows = input.Split( new [] {"{"} , StringSplitOptions.RemoveEmptyEntries);
foreach (var row in rows)
{
var fields = row.Split(new [] { "}"}, StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine("{0} = {1}", fields[0], fields[1]);
}
Output:
2000 = 000000012199
3100 = 123456789*
3320 = 110009558*
3400 = 987654321*
3600 = CTR
4200 = D2343984*JOHN DOE*1232 STREET*DALLAS TX78302**
5000 = D9210293*JANE DOE*1234 STREET*SUITE 201*DALLASTX 73920**
Fiddle
This regular expression should get you going:
Match a literal {
Match 1 or more digts ("a number")
Match a literal }
Match all characters that are not an opening {
\{\d+\}[^{]+
It assumes that the values itself cannot contain an opening curly brace. If that's the case, you need to be more clever, e.g. #"\{\d+\}(?:\\{|[^{])+" (there are likely better ways)
Create a Regex instance and have it match against the text. Each "field" will be a separate match
var text = #"{123}abc{456}xyz";
var regex = new Regex(#"\{\d+\}[^{]+", RegexOptions.Compiled);
foreach (var match in regex.Matches(text)) {
Console.WriteLine(match.Groups[0].Value);
}
This doesn't fully answer the question, but it was getting too long to be a comment, so I'm leaving it here in Community Wiki mode. It does, at least, present a better strategy that may lead to a solution:
The main thing to understand here is it's rare — like, REALLY rare — to genuinely encounter a whole new kind of a file format for which an existing parser doesn't already exist. Even custom applications with custom file types will still typically build the basic structure of their file around a generic format like JSON or XML, or sometimes an industry-specific format like HL7 or MARC.
The strategy you should follow, then, is to first determine exactly what you're dealing with. Look at the software that generates the file; is there an existing SDK, reference, or package for the format? Or look at the industry surrounding this data; is there a special set of formats related to that industry?
Once you know this, you will almost always find an existing parser ready and waiting, and it's usually as easy as adding a NuGet package. These parsers are genuinely faster, need less code, and will be less susceptible to bugs (because most will have already been found by someone else). It's just an all-around better way to address the issue.
Now what I see in the question isn't something I recognize, so it's just possible you genuinely do have a custom format for which you'll need to write a parser from scratch... but even so, it doesn't seem like we're to that point yet.
Here is how to do it in linq without slow regex
string x = "{2000}000000012199{3100}123456789*{3320}110009558*{3400}987654321*{3600}CTR{4200}D2343984*JOHN DOE*1232 STREET*DALLAS TX78302**{5000}D9210293*JANE DOE*1234 STREET*SUITE 201*DALLASTX 73920**";
var result =
x.Split('{',StringSplitOptions.RemoveEmptyEntries)
.Aggregate(new List<Tuple<string, string>>(),
(l, z) => { var az = z.Split('}');
l.Add(new Tuple<string, string>(az[0], az[1]));
return l;})
LinqPad output:

Any way to use string (without escaping manually) that contains double quotes

Let's say I want to assign a text (which contains many double quotes) into variable. However, the only way seems to manually escape:
string t = "Lorem \"Ipsum\" dummy......
//or//
string t = #"Lorem ""Ipsum"" dummy.....
Is there any way to avoid manual escaping, and instead use something universal (which I dont know in C#) keywoard/method to do that automatically? In PHP, it's untoldly simple, by just using single quote:
$t = 'Lorem "Ipsum" dummy .......
btw, please don't bomb me with critiques "Why do you need to use that" or etc. I need answer to the question what I ask.
I know this answer may not be satisfying, but C# sytnax simply won't allow you to do such thing (at the time of writing this answer).
I think the best solution is to use resources. Adding/removing and using strings from resources is super easy:
internal class Program
{
private static void Main(string[] args)
{
string myStringVariable = Strings.MyString;
Console.WriteLine(myStringVariable);
}
}
The Strings is the name of the resources file without the extension (resx):
MyString is the name of your string in the resources file:
I may be wrong, but I conjecture this is the simplest solution.
No. In C# syntax, the only way to define string literals is the use of the double quote " with optional modifiers # and/or $ in front. The single quote is the character literal delimiter, and cannot be used in the way PHP would allow - in any version, including the current 8.0.
Note that the PHP approach suffers from the need to escape ' as well, which is, especially in the English language, frequently used as the apostrophe.
To back that up, the EBNF of the string literal in current C# is still this:
regular_string_literal '"' { regular_string_literal_character } '"'
The only change in the compiler in version 8.0 was that now, the order of the prefix modifiers $ (interpolated) and # (verbatim) can be either #$ or $#; it used to matter annoyingly in earlier versions.
Alternatives:
Save it to a file and use File.ReadAllText for the assignment, or embed it as a managed ressource, then the compiler will provide a variable in the namespace of your choice with the verbatim text as its runtime value.
Or use single quotes (or any other special character of your choice), and go
var t = #"Text with 'many quotes' inside".Replace("'", #"""");
where the Replace part could be modeled as an extension to the String class for brevity.

How to localize a string in unity where different languages may have different grammars

I'm translating a Unity game and some of the lines go like
Unlock at XXXX
where "XXXX" is replaced at runtime by an arbitrary substring. Easy enough to replace the wildcards, but to translate the quote, I can't simply concatenate a + b, as some languages will have the value before or inside the string. I figured I needed to, effectively, de-replace it, ie isolate and keep the substring and translate whatever's around it.
Problem is that while I can easily do the second part, I can't think of any avenues for the first. I know to get the character index of what I'm looking for, but the value takes up an arbitrary number of characters, and I can't use whitespace since some languages don't use it. Can't use digit detection since not all of the values are going to be numbers. I tried asking Google, but I couldn't translate "find whatever replaces a wildcard" into something keyword-searchable.
In short, what I'm looking for is a way to find the "XXXX" (the easy part) and then find whatever replaces it in the string (the less-easy part).
Thanks in advance.
I eventually found a workaround, thanks to everybody's kind advice. I stored the substring and referred to it in a special translation method that does take in a value. Thanks for your kind help, everybody.
public static string TranslateWithValue (string text, string value, int language) {
string sauce = text.Replace (value, "XXXX");
sauce = Translate (sauce, language);
sauce = sauce.Replace ("XXXX", value);
return sauce;
}
Usually, I use string.Format in such cases. In your case, I'd declare 2 localizeable strings:
string unlockFormat = "Unlock at {0}";
string unlockValue = "next level";
When you need the unlock condition displayed, you can combine the strings like that:
string unlockCondition = string.Format(unlockFormat, unlockValue);
which will produce the string "Unlock at next level".
Both unlockFormat and unlockValue can be translated, and the translator can move {0} wherever needed.

Data structure for storing keywords and synonyms in C#?

I'm working on a project in C# that I need to store somewhere between 10 to 15 keywords and their synonyms for.
The first way I thought of to store these was using a 2d list something like List> so that it would look like:
keyword1 synonym1 synonym2
keyword2 synonym1
keyword3 synonym1 synonym2
etc.
What I started to think about was if i'm getting an input string and splitting it to search each word to see if its a keyword or a synonym of a keyword in the list will a 2d list be fine for this or will searching it be too slow?
Hopefully my question makes sense I can clarify anything if it's not clear just ask. Thanks!
will searching [the list] be too slow?
When you are talking about 10..15 keywords, it is hard to come up with an algorithm inefficient enough to make end-users notice the slowness. There's simply not enough data to slow down a modern CPU.
One approach would be to build a Dictionary<string,string> that maps every synonym to its "canonical" keyword. This would include the canonical version itself:
var keywords = new Dictionary<string,string> {
["keyword1"] = "keyword1"
, ["synonym1"] = "keyword1"
, ["synonym2"] = "keyword1"
, ["keyword2"] = "keyword2"
, ["synonym3"] = "keyword2"
, ["keyword3"] = "keyword3"
};
Note how both keywords and synonyms appear as keys, while only keywords appear as values. This lets you look up a keyword or synonym, and get back a guaranteed keyword.
I would probably use a Dictionary. Where the key is your synonym and the value is your key word. So you do a look up in the Dictionary for any word and get the actual key word you want. For example:
private Dictionary<string, string> synonymKeywordDict = new Dictionary<string, string>();
public SearchResult Search(IEnumerable<string> searchTerms)
{
var keywords = searchTerms.Select(x => synonymKeywordDict[x]).Distinct().ToList();
//keywords now contains your key words after being translated from any synonyms
}
Just in case I'm not clear enough the Dictionary would be loaded like so.
private void LoadDictionary()
{
//So our lookup doesn't fail on the key word itself.
synonymKeywordDict.Add("computer", "computer");
//Then all our synonyms
synonymKeywordDict.Add("desktop", "computer");
synonymKeywordDict.Add("PC", "computer");
}

.Net 4.0 JSON Serialization: Double quotes are changed to \"

I'm using System.Web.Script.Serialization.JavaScriptSerializer() to serialize dictionary object into JSON string. I need to send this JSON string to API sitting in the cloud. However, when we serialize it, serializer replaces all the double quotes with \"
For example -
Ideal json_string = {"k":"json", "data":"yeehaw"}
Serializer messed up json_string = {\"k\":\"json\",\"data\":\"yeehaw\" }
Any idea why it is doing so? And I also used external packages like json.net but it still doesn't fix the issues.
Code -
Dictionary<string, string> json_value = new Dictionary<string, string>();
json_value.Add("k", "json");
json_value.Add("data", "yeehaw");
var jsonSerializer = new System.Web.Script.Serialization.JavaScriptSerializer();
string json_string = jsonSerializer.Serialize(json_value);
I'm going to hazard the guess that you're looking in the IDE at a breakpoint. In which case, there is no problem here. What you are seeing is perfectly valid JSON; simply the IDE is using the escaped string notation to display it to you. The contents of the string, however, are your "ideal" string. It uses the escaped version for various reasons:
so that you can correctly see and identify non-text characters like tab, carriage-return, new-line, etc
so that strings with lots of newlines can be displayed in a horizontal-based view
so that it can be clear that it is a string, i.e. "foo with \" a quote in" (the outer-quotes tell you it is a string; if the inner quote wasn't escaped it would be confusing)
so that you can copy/paste the value into the editor or immediate-window (etc) without having to add escaping yourself
Make sure you're not double serializating the object. It happened to me some days ago.
What you're seeing is a escape character
Your JSON is a String and when you want to have " in a string you must use one of the following:
string alias = #"My alias is ""Tx3""";
or
string alias = "My alias is \"Tx3\"";
Update
Just to clarify. What I wanted say here is that your JSON is perfectly valid. You're seeing the special characters in the IDE and that is perfectly normal like Jon & Marc are pointing in their answers and comments. Problem lies somewhere else than those \ characters.

Categories