String Concat using constants - performance - c#

Assume I have the following string constants:
const string constString1 = "Const String 1";
const string constString2 = "Const String 2";
const string constString3 = "Const String 3";
const string constString4 = "Const String 4";
Now I can append the strings in two ways:
Option1:
string resultString = constString1 + constString2 + constString3 + constString4;
Option2:
string resultString = string.Format("{0}{1}{2}{3}",constString1,constString2,constString3,constString4);
Internally string.Format uses StringBuilder.AppendFormat. Now given the fact that I am appending constant strings, which of the options (option1 or option 2) is better with respect to performance and/or memory?

The first one will be done by the compiler (at least the Microsoft C# Compiler) (in the same way that the compiler does 1+2), the second one must be done at runtime. So clearly the first one is faster.
As an added benefit, in the first one the string is internalized, in the second one it isn't.
And String.Format is quite slow :-) (read this
http://msmvps.com/blogs/jon_skeet/archive/2008/10/06/formatting-strings.aspx). NOT "slow enough to be a problem", UNLESS all your program do all the day is format strings (MILLIONS of them, not TENS). Then you could probably to it faster Appending them to a StringBuilder.

The first variant will be best, but only when you are using constant strings.
There are two compilator optimizations (from the C# compiler, not the JIT compiler) that are in effect here. Lets take one example of a program
const string A = "Hello ";
const string B = "World";
...
string test = A + B;
First optimization is constant propagation that will change your code basically into this:
string test = "Hello " + "World";
Then a concatenation of literal strings (as they are now, due to the first optimization) optimization will kick in and change it to
string test = "Hello World";
So if you write any variants of the program shown above, the actual IL will be the same (or at least very similar) due to the optimizations done by the C# compiler.

Related

Why does interpolating a const string result in a compiler error?

Why does string interpolation in c# does not work with const strings? For example:
private const string WEB_API_ROOT = "/private/WebApi/";
private const string WEB_API_PROJECT = $"{WEB_API_ROOT}project.json";
From my point of view, everything is known at compile time. Or is that a feature that will be added later?
Compiler message:
The expression being assigned to 'DynamicWebApiBuilder.WEB_API_PROJECT' must be constant.
Thanks a lot!
Interpolated strings are simply converted to calls to string.Format. So your above line actually reads
private const string WEB_API_PROJECT = string.Format("{0}project.json", WEB_API_ROOT);
And this is not compile time constant as a method call is included.
On the other hand, string concatenation (of simple, constant string literals) can be done by the compiler, so this will work:
private const string WEB_API_ROOT = "/private/WebApi/";
private const string WEB_API_PROJECT = WEB_API_ROOT + "project.json";
or switch from const to static readonly:
private static readonly string WEB_API_PROJECT = $"{WEB_API_ROOT}project.json";
so the string is initialized (and string.Format called) at the first access to any member of the declaring type.
An additional explanation why string interpolation expressions are not considered constants is that they are not constant, even if all their inputs are constants. Specifically, they vary based on the current culture. Try executing the following code:
CultureInfo.CurrentCulture = CultureInfo.InvariantCulture;
Console.WriteLine($"{3.14}");
CultureInfo.CurrentCulture = new CultureInfo("cs-CZ");
Console.WriteLine($"{3.14}");
Its output is:
3.14
3,14
Note that the output is different, even though the string interpolation expression is the same in both cases. So, with const string pi = $"{3.14}", it wouldn't be clear what code should the compiler generate.
UPDATE: In C# 10/.Net 6, string interpolations that only contains const strings can be const. So the code in the question is not an error anymore.
There is a discussion in Roslyn project at roslyn that finalize the following conclusion:
Read the excerpt:
It's not a bug, it was explicitly designed to function like this. You not liking it doesn't make it a bug.
String.Format isn't needed for concatenating strings, but that's not
what you're doing.
You're interpolating them, and String.Format is
needed for that based on the spec and implementation of how
interpolation works in C#.
If you want to concatenate strings, go
right ahead and use the same syntax that has worked since C# 1.0.
Changing the implementation to behave differently based on usage would
produce unexpected results:
const string FOO = "FOO";
const string BAR = "BAR";
string foobar = $"{FOO}{BAR}";
const string FOOBAR = $"{FOO}{BAR}"; // illegal today
Debug.Assert(foobar == FOOBAR); // might not always be true
Even the statement:
private static readonly string WEB_API_PROJECT = $"{WEB_API_ROOT}project.json";
The compiler raise an error:
"The name 'WEB_API_ROOT' does not exist in the current context".
The variable 'WEB_API_ROOT' should be defined in the same context
So, for the question of OP: Why does string interpolation is not working with const strings?
Answer: It's by C# 6 specs. for more details read .NET Compiler Platform ("Roslyn") -String Interpolation for C#
C# 10 (currently in preview at time of writing) will include the ability to use const interpolated strings, as long as the usage does not involve scenarios where culture may affect the outcome (such as this example).
So if the interpolation is simply concatenating strings together, it will work at compile time.
const string Language = "C#";
const string Platform = ".NET";
const string Version = "10.0";
const string FullProductName = $"{Platform} - Language: {Language} Version: {Version}";
This is now allowed in VS 2019 version 16.9, if the preview language version is selected (set <LanguageVersion>preview</LanguageVersion> in your project file).
https://github.com/dotnet/csharplang/issues/2951#issuecomment-736722760
in C# 9.0 or earlier we can not allow to use const with interpolated strings.
if you want to merge constant strings together, you will have to use concatenation and not interpolation.
const string WEB_API_ROOT = "/private/WebApi/";
const string WEB_API_PROJECT = WEB_API_ROOT + "project.json";
but from C# 10.0 Allow const interpolated strings as features and enhancements to the C# language.
C# 10.0 feature available in .NET 6.0 framework, in that we can able to use it.
see below code, currently C# 10.0 (Preview 5)
const string WEB_API_ROOT = "/private/WebApi/";
const string WEB_API_PROJECT = $"{WEB_API_ROOT}project.json";
you can also checkout docs from official site What's new in C# 10.0
A constant used with string.Format would, by its nature, be intended to work with a specific number of arguments which each have a predetermined meaning.
In other words, if you create this constant:
const string FooFormat = "Foo named '{0}' was created on {1}.";
Then in order to use it you must have two arguments which are probably supposed to be a string and a DateTime.
So even before string interpolation we were in a sense using the constant as a function. In other words, instead of separating the constant, it might have made more sense to put it in a function instead, like this:
string FormatFooDescription(string fooName, DateTime createdDate) =>
string.Format("Foo named '{0}' was created on {1}.", fooName, createdDate);
It's still the same thing, except that the constant (string literal) is now located with the function and arguments that use it. They might as well be together, because the format string is useless for any other purpose. What's more, now you can see the intent of the arguments that are applied to the format string.
When we look at it that way, the similar use of string interpolation becomes obvious:
string FormatFooDescription(string fooName, DateTime createdDate) =>
$"Foo named '{fooName}' was created on {createdDate}.";
What if we have multiple format strings and we want to choose a particular one at runtime?
Instead of selecting which string to use, we could select a function:
delegate string FooDescriptionFunction(string fooName, DateTime createdDate);
Then we could declare implementations like this:
static FooDescriptionFunction FormatFoo { get; } = (fooName, createdDate) =>
$"Foo named '{fooName}' was created on {createdDate}.";
Or, better yet:
delegate string FooDescriptionFunction(Foo foo);
static FooDescriptionFunction FormatFoo { get; } = (foo) =>
$"Foo named '{foo.Name}' was created on {foo.CreatedDate}.";
}

C++ AVR append to const char *

In C# you'd have a string, to append to the string I'll do the following:
//C#
string str="";
str += "Hello";
str += " world!"
//So str is now 'Hello world!'
But in C++ for AVR I use a const char *. How could I append to it?
const char * str="";
str += "Hello world!"; //This doesn't work, I get some weird data.
str = str + "Hello world!"; //This doesn't work either
NOTE: I'm working in Atmel Studio 6 programming an avr so I think the functionality used in C++ by most people is unavailable to use because I get build failures as soon as I try some examples I've seen online. I don't have the String data type either.
you really should dig into some C Tutorial or book and read the chapter about strings.
const char * str=""; creates a pointer to an empty string in the (constant) data segment.
str += "Hello world!":
string processing dos not work like this in C
the memory the pointer points to is constant you should not be able to modify it
adding something to a pointer will change the location the pointer points to (and not the data)
since you are on an AVR you should avoid dynamic memory.
defining an empty string constant does not make sense.
little example:
#define MAX_LEN 100
char someBuf[MAX_LEN] = ""; // create buffer of length 100 preinitilized with empty string
const char c_helloWorld[] = "Hello world!"; // defining string constant
strcat(someBuf, c_helloWorld); // this adds content of c_helloWorld at the end of somebuf
strcat(someBuf, c_helloWorld); // this adds content of c_helloWorld at the end of somebuf
// someBuf now contains "Hello world!Hello world!"
Additional excurse/explanation:
since the avr has harvard arcitecture it cannot (at least not without circumstances) read the program memory. So if you use string literals (like "Hello world!") they require doubled space by default. one instance of them is in the flash memory and in startup code they will be copied to SRAM. depending of your AVR this may matter! you can work around this and only store them in program memory by declaring Pointer using PROGMEM attribute (or something similar) but now you need to explicitly read them from flash at runtime by yourself.
From what I know, strings in C# are immutable, so the line
str += " world!"
actually creates a new string whose value is that of the original string, with " world" appended, and then makes str refer to that new string. There are no longer any references to the old string, so it gets garbage collected eventually.
But C-style strings are mutable, and are meant to be modified in place unless you explicitly copy them. So in fact if you have a const char*, you cannot modify the string at all, since const T* means the T data pointed to by the pointer can't be modified. Instead, you have to make a new string,
// In C, omit the static_cast<char*>; this is only necessary in C++.
char* new_str = static_cast<char*>(malloc(strlen(str)
+ strlen("Hello world!")
+ 1));
strcpy(new_str, str);
strcat(new_str, "Hello world!");
str = new_str;
// remember to free(str) at some point!
This is cumbersome and not very expressive, so if you are using C++ the obvious solution is to use std::string instead. Unlike the C# string, the C++ string has value semantics and is not immutable, but it can be appended to in a straightforward fashion, unlike the C string:
std::string str = "";
str += "Hello world!";
Again, if you mark the original string const, you won't be able to append to it without creating a new string.

c# How to insert a variable into a string?

I would like to know the different ways of inserting a variable into a string, in C#.
I am currently trying to insert values into a json string that I am building:
Random rnd = new Random();
int ID = rnd.Next(1, 999);
string body = #"{""currency"":""country"",""gold"":1,""detail"":""detailid-979095986"",""tId"":""help here""}";
How could I add the "ID" to the string body?
In a typical string inserting scenario, I'd do one of these:
string body = string.Format("My ID is {0}", ID);
string body = "My ID is " + ID;
However, your string is apparently JSON serialized data. I'd expect that I'd want to parse that into a class in order to work with it.
var myObj = JsonConvert.DeserializeObject<MyClass>(someString);
myObj.TID = ID;
// maybe do other things with it, then if I need JSON again...
string body = JsonConvert.SerializeObject(myObj);
One reason to take this approach is to make sure that any data I put in still makes the JSON valid. For example, if my ID were, instead of an int, a string with characters that needed escaping, directly inserting "\"\n\"" would not be the right thing to do.
String interpolation is the easiest way these days:
int myIntValue = 123;
string myStringValue = "one two three";
string interpolatedString = $"my int is: {myIntValue}. My string is: {myStringValue}.";
Output would be "my int is: 123. My string is: one two three.".
You can experiment with this sample yourself, over here.
The $ special character identifies a string literal as an interpolated
string. An interpolated string is a string literal that might contain
interpolation expressions. When an interpolated string is resolved to
a result string, items with interpolation expressions are replaced by
the string representations of the expression results. This feature is available starting with C# 6.
You could try this:
string body = #"{""currency"":""country"",""gold"":1,""detail"":""detailid-979095986"",""tId"":""" + ID + #"""}";
You can also use string.Concat:
string body = string.Concat(#"{""currency"":""country"",""gold"":1,""detail"":""detailid-979095986"",""tId"":""", ID, #"""}");
There are a number of ways to inject values into strings, however it's easy to lose sight of encodings, and cause major breakage.
If you just want to inject a value into another string, you can use:
string concatenation
string building
string formatting
Concatenation:
The simplest and most common way to build strings is by simply concatenating them together with the + operator:
var foo = 5;
var bar = "example-" + foo;
Concatenation can be difficult to read which makes it easy to introduce bugs, but for most simple tasks is the right tool for the job.
In this case, it's a poor choice:
string body = #"{""currency"":""country"",""gold"":1,""detail"":""detailid-979095986"",""tId"":""" + ID.ToString() + #"""}";
String Building
The StringBuilder class is useful for building large strings particularly when built iteratively.
var sb = new StringBuilder();
for (var i = 0; i < 1000; i++) {
sb.Append(i.ToString());
sb.Append(" ");
}
var output = sb.ToString();
It can still be difficult to read and hard to debug, but for cases where you're joining lots of strings together, it's super efficient
In this case, it's a poor choice:
StringBuilder sb = new StringBuilder();
sb.Append(#"{""currency"":""country"",""gold"":1,""detail"":""detailid-979095986"",""tId"":""");
sb.Append(ID.ToString());
sb.Append(#"""}");
string body = sb.ToString();
String formatting
The string.Format method makes templating data into a string super easy and efficient. If you plan on reusing the same string over and over, using a format string makes it much easier to read and debug code, particularly when there are lots of replacements:
var foo = 5;
var bar = string.Format("example-{0}", foo);
Format strings can also automatically apply culturally accurate formatting to particular data types, so that a DateTime is appropriately displayed, or so that a number has the appropriate number of trailing zeros.
In this case, it's a poor choice:
string string.Format(#"{""currency"":""country"",""gold"":1,""detail"":""detailid-979095986"",""tId"":""{0}""}", ID);
The right choice
You're not dumping data into any old string. That's JSON encoded data. If you just concatenate/build/format in any old value, you can break your string. For example, if the ID variable contained a " character, you'd break the entire JSON dataset.
Additionally, the length of the string and necessary quotes make it super difficult to read, which makes it difficult to maintain. Good luck when you get around to needing to add another formatted value, it's going to be a pain to change any existing value or add in new dynamic ones.
Instead of writing a JSON literal, write an object and encode it to JSON:
var bodyData =
new
{
currency = "country",
gold = 1,
detail = "detailid-979095986",
tId = ID //here's where you set the ID
};
var jss = new JavaScriptSerializer();
var body = jss.Serialize(bodyData);
This code is much easier to modify when the data changes, and will actually encode your data correctly. You don't need to worry about all those annoying double quote characters any more either.
You can use the
String.Format(#"{""currency"":""country"",""gold"":1,""detail"":""detailid-979095986"",""tId"":""{0}""}", ID)
Since this is params object[], you can use as many {n} as you want.
Instead of using on string, you could concatenate strings together using +, which would allow you to insert text between the generated strings.
string body = #"***" + ID + #"***";

Difference in String concatenation performance

I know you should use a StringBuilder when concatenating strings but I was just wondering if there is a difference in concatenating string variables and string literals. So, is there a difference in performance in building s1, s2, and s3?
string foo = "foo";
string bar = "bar";
string s1 = "foo" + "bar";
string s2 = foo + "bar";
string s3 = foo + bar;
In the case you present, it's actually better to use the concatenation operator on the string class. This is because it can pre-compute the lengths of the strings and allocate the buffer once and do a fast copy of the memory into the new string buffer.
And this is the general rule for concatenating strings. When you have a set number of items that you want to concatenate together (be it 2, or 2000, etc) it's better to just concatenate them all with the concatenation operator like so:
string result = s1 + s2 + ... + sn;
It should be noted in your specific case for s1:
string s1 = "foo" + "bar";
The compiler sees that it can optimize the concatenation of string literals here and transforms the above into this:
string s1 = "foobar";
Note, this is only for the concatenation of two string literals together. So if you were to do this:
string s2 = foo + "a" + bar;
Then it does nothing special (but it still makes a call to Concat and precomputes the length). However, in this case:
string s2 = foo + "a" + "nother" + bar;
The compiler will translate that into:
string s2 = foo + "another" + bar;
If the number of strings that you are concatenating is variable (as in, a loop which you don't know beforehand how many elements there are in it), then the StringBuilder is the most efficient way of concatenating those strings, as you will always have to reallcate the buffer to account for the new string entries being added (of which you don't know how many are left).
The compiler can concatenate literals at compile time, so "foo" + "bar" get compiled to "foobar" directly, and there's no need to do anything at runtime.
Other than that, I doubt there's any significant difference.
Your "knowledge" is incorrect. You should sometimes use a StringBuilder when concatenating strings. In particular, you should do it when you can't perform the concatenation all in one experession.
In this case, the code is compiled as:
string foo = "foo";
string bar = "bar";
string s1 = "foobar";
string s2 = String.Concat(foo, "bar");
string s3 = String.Concat(foo, bar);
Using a StringBuilder would make any of this less efficient - in particular it would push the concatenation for s1 from compile time to execution time. For s2 and s3 it would force the creation of an extra object (the StringBuilder) as well as probably allocating a string which is unnecessarily large.
I have an article which goes into more detail on this.
There is no difference between s2 and s3. The compiler will take care of s1 for you, and concatenate it during compile time.
I'd say that this should decide compiler. Because all your string-building can be optimized as values is already known.
I guess StringBuilder pre-allocates space for appending more strings. As You know + is binary operator so there is no way to build concatenation of more than two strings at a time. Thus if you want to do s4 = s1 + s2 + s3 it will require building intermediate string (s1+s2) and only after that s4.

Automatic casting to string in C# and VB.NET

I could do this in C#..
int number = 2;
string str = "Hello " + number + " world";
..and str ends up as "Hello 2 world".
In VB.NET i could do this..
Dim number As Integer = 2
Dim str As String = "Hello " + number + " world"
..but I get an InvalidCastException "Conversion from string "Hello " to type 'Double' is not valid."
I am aware that I should use .ToString() in both cases, but whats going on here with the code as it is?
In VB I believe the string concatenation operator is & rather than + so try this:
Dim number As Integer = 2
Dim str As String = "Hello " & number & " world"
Basically when VB sees + I suspect it tries do numeric addition or use the addition operator defined in a type (or no doubt other more complicated things, based on options...) Note that System.String doesn't define an addition operator - it's all hidden in the compiler by calls to String.Concat. (This allows much more efficient concatenation of multiple strings.)
Visual Basic makes a distinction between the + and & operators. The & will make the conversion to a string if an expression is not a string.
&Operator (Visual Basic)
The + operator uses more complex evaluation logic to determine what to make the final cast into (for example it's affected by things like Option Strict configuration)
+Operator (Visual Basic)
I'd suggest to stay away from raw string concatenation, if possible.
Good alternatives are using string.format:
str = String.Format("Hello {0} workd", Number)
Or using the System.Text.StringBuilder class, which is also more efficient on larger string concatenations.
Both automatically cast their parameters to string.
The VB plus (+) operator is ambiguous.
If you don't have Option Explicit on, if my memory serves me right, it is possible to do this:
Dim str = 1 + "2"
and gets str as integer = 3.
If you explicitly want a string concatenation, use the ampersand operator
Dim str = "Hello " & number & " world"
And it'll happily convert number to string for you.
I think this behavior is left in for backward compatibility.
When you program in VB, always use an ampersand to concatenate strings.

Categories