In a web app, I am splitting strings and assigning the pieces to link names or to collections of strings. Is there a significant performance benefit to using StringBuilder in a web application?
EDIT: There are 2 functions: one splits a link into 5-10 strings and then repackages them into another string. The other appends one string at a time to a link every time the link is clicked.
How many strings will you be concatenating? Do you know for sure how many there will be, or does it depend on how many records are in the database etc?
See my article on this subject for more details and guidelines - but basically, being in a web app makes no difference to how expensive string concatenation is vs using a StringBuilder.
EDIT: I'm afraid it's still not entirely clear from the question exactly what you're doing. If you've got a fixed set of strings to concatenate, and you can do it all in one go, then it's faster and probably more readable to do it using concatenation. For instance:
string faster = first + " " + second + " " + third + "; " + fourth;
string slower = new StringBuilder().Append(first)
                                   .Append(" ")
                                   .Append(second)
                                   .Append(" ")
                                   .Append(third)
                                   .Append("; ")
                                   .Append(fourth)
                                   .ToString();
Another alternative is to use a format string of course. This may well be the slowest, but most readable:
string readable = string.Format("{0} {1} {2}; {3}",
                                first, second, third, fourth);
The part of your question mentioning "adding a link each time" suggests using a StringBuilder for that aspect though - anything which naturally leads to a loop is more efficient (for moderate to large numbers) using StringBuilder.
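As a minimal sketch of that loop case, assuming the link is assembled from a list of path segments (the `BuildLink` helper and the sample segments are hypothetical, not taken from the question):

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class LinkBuilderSketch
{
    // Hypothetical helper: joins an unknown number of segments into one link.
    public static string BuildLink(IEnumerable<string> segments)
    {
        var sb = new StringBuilder();
        foreach (string segment in segments)
        {
            // Append writes into the builder's buffer; no intermediate string per step.
            sb.Append('/').Append(segment);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        var segments = new List<string> { "products", "books", "42" };
        Console.WriteLine(BuildLink(segments)); // /products/books/42

        // With a fixed separator, string.Join produces the same text without an explicit loop.
        Console.WriteLine("/" + string.Join("/", segments)); // /products/books/42
    }
}
```

For the fixed 5-10 pieces mentioned in the question, plain concatenation or `string.Join` is probably enough; the builder only starts to matter as the number of appends grows.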
You should take a look at this excellent article by Jon Skeet about concatenating strings.
Yes, concatenating regular strings is expensive (really, appending one string onto the end of another). A string is an immutable object, so each time a string is "changed", .NET creates a new string with the new value and leaves the old one to be collected.
EDIT:
StringBuilder should be used with caution and evaluated like any other approach. Sometimes concatenating two strings together will be more efficient, so it should be evaluated on a case-by-case basis.
Atwood has an interesting article related to this.
Why would the performance be any different in a web application or a winforms application?
Using StringBuilder is a matter of good practice because of memory and object allocation; the rules apply no matter what kind of application you are building.
If you're making the string in a loop with a high number of iterations, then it's a good idea to use StringBuilder. Otherwise, string concatenation is your best bet.
FIRSTLY, Are you still writing this application? If yes then STOP Performance tuning!
SECONDLY, Prioritise Correctness over Speed. Readability is way more important in the long run for obvious reasons.
THIRDLY, we don't know the exact situation and code you are writing. We can't really advise you on whether this micro-optimisation is important to the performance of your code or not. MEASURE the difference. I highly recommend Red Gate's ANTS Profiler.
Related
So a professor in university just told me that using concatenation on strings in C# (i.e. when you use the plus sign operator) creates memory fragmentation, and that I should use string.Format instead.
Now, I've searched a lot on Stack Overflow and found a lot of threads about performance, in which string concatenation wins hands down. (Some of them include this, this and this)
I can't find anyone who talks about memory fragmentation, though. I opened .NET's string.Format using ILSpy, and apparently it uses the same string builder as the string.Concat method does (which, if I understand correctly, is what the + sign is overloaded to). In fact, it uses the code in string.Concat!
I found this article from 2007, but I doubt it's accurate today (or ever!). Apparently the compiler is smart enough to avoid that today, because I can't seem to reproduce the issue. Adding strings with string.Format and with plus signs both end up using the same code internally. As said before, string.Format uses the same code string.Concat uses.
So now I'm starting to doubt his claim. Is it true?
So a professor in university just told me that using concatenation on strings in C# (i.e. when you use the plus sign operator) creates memory fragmentation, and that I should use string.Format instead.
No, what you should do instead is do user research, set user-focussed real-world performance metrics, and measure the performance of your program against those metrics. When, and only when you find a performance problem, you should use the appropriate profiling tools to determine the cause of the performance issue. If the cause is "memory fragmentation" then address that by identifying the causes of the "fragmentation" and trying experiments to determine what techniques mitigate the effect.
Performance is not achieved by "tips and tricks" like "avoid string concatenation". Performance is achieved by applying engineering discipline to realistic problems.
To address your more specific problem: I have never heard the advice to eschew concatenation in favor of formatting for performance reasons. The advice usually given is to eschew iterated concatenation in favor of builders. Iterated concatenation is quadratic in time and space and creates collection pressure. Builders allocate unnecessary memory but are linear in typical scenarios. Neither creates fragmentation of the managed heap; iterated concatenation tends to produce contiguous blocks of garbage.
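To make the "quadratic versus linear" point concrete, here is a rough sketch that only counts characters copied under a simplified model. The class and method names are made up for illustration, and the builder model deliberately ignores the extra copying a real builder does when it grows and when `ToString` is called:

```csharp
using System;

public static class CopyCostSketch
{
    // Appending one character n times with s = s + "a":
    // step i allocates a new string of length i and copies i characters into it.
    public static long CharsCopiedByConcatenation(int n)
    {
        long total = 0;
        for (int i = 1; i <= n; i++)
        {
            total += i;
        }
        return total; // equals n * (n + 1) / 2
    }

    // Simplified builder model (an assumption): each character is written once.
    public static long CharsCopiedByBuilder(int n)
    {
        return n;
    }

    public static void Main()
    {
        int n = 100000;
        Console.WriteLine(CharsCopiedByConcatenation(n)); // 5000050000
        Console.WriteLine(CharsCopiedByBuilder(n));       // 100000
    }
}
```

Every intermediate string in the concatenation case also becomes garbage right away, which is where the collection pressure comes from.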
The number of times I've had a performance problem that came down to unnecessary fragmentation of a managed heap is exactly one; in an early version of Roslyn we had a pattern where we would allocate a small long lived object, then a small short lived object, then a small long lived object... several hundred thousand times in a row, and the resulting maximally fragmented heap caused user-impacting performance problems on collections; we determined this by careful measurement of the performance in the relevant scenarios, not by ad hoc analysis of the code from our comfortable chairs.
The usual advice is not to avoid fragmentation, but rather to avoid pressure. We found during the design of Roslyn that pressure was far more impactful on GC performance than fragmentation, once our aforementioned allocation pattern problem was fixed.
My advice to you is to either press your professor for an explanation, or to find a professor who has a more disciplined approach to performance metrics.
Now, all that said, you should use formatting instead of concatenation, but not for performance reasons. Rather, for code readability, localizability, and similar stylistic concerns. A format string can be made into a resource, it can be localized, and so on.
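A small sketch of the localizability argument: the two templates below are hypothetical stand-ins for entries in a resource file, and the point is only that a translator can reorder the placeholders without the calling code changing.

```csharp
using System;

public static class FormatResourceSketch
{
    // Hypothetical templates; in a real app these would live in a resource file.
    public const string Greeting = "Hello {0}, today is {1}.";
    public const string GreetingReordered = "Today is {1}. Hello, {0}!";

    public static string Greet(string template, string user, string day)
    {
        // The call site stays the same regardless of the order used in the template.
        return string.Format(template, user, day);
    }

    public static void Main()
    {
        Console.WriteLine(Greet(Greeting, "Ada", "Monday"));          // Hello Ada, today is Monday.
        Console.WriteLine(Greet(GreetingReordered, "Ada", "Monday")); // Today is Monday. Hello, Ada!
    }
}
```

With concatenation, the word order is baked into the code, so each language would need its own expression.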
Finally, I caution you that if you are putting strings together in order to build something like a SQL query or a block of HTML to be served to a user, then you want to use none of these techniques. These applications of string building have serious security impacts when you get them wrong. Use libraries and tools specifically designed for construction of those objects, rather than rolling your own with strings.
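To illustrate the failure mode only (this is not a recommended way to build pages), here is a sketch comparing raw string building with the same fragment passed through `System.Net.WebUtility.HtmlEncode`. The method names and the sample input are hypothetical; in a real application a templating engine or parameterized query API would do this work for you.

```csharp
using System;
using System.Net;

public static class HtmlEncodingSketch
{
    // Raw string building: whatever the user typed becomes markup.
    public static string NaiveGreeting(string username)
    {
        return "<p>Hello, " + username + "</p>";
    }

    // The same fragment with the user input encoded, so it is rendered as text.
    public static string EncodedGreeting(string username)
    {
        return "<p>Hello, " + WebUtility.HtmlEncode(username) + "</p>";
    }

    public static void Main()
    {
        string input = "<b>Bob</b>";
        Console.WriteLine(NaiveGreeting(input));   // <p>Hello, <b>Bob</b></p>
        Console.WriteLine(EncodedGreeting(input)); // <p>Hello, &lt;b&gt;Bob&lt;/b&gt;</p>
    }
}
```

None of the techniques discussed above (concatenation, formatting, builders) would have made the first version safe; that is why purpose-built tools are the better choice here.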
The problem with string concatenation is that strings are immutable. string1 + string2 does not concatenate string2 onto string1; it creates a whole new string. Using a StringBuilder (or string.Format) does not have this problem. Internally, the StringBuilder holds a char[] buffer, which it over-allocates (newer .NET versions chain several such buffers together). Appending something to a StringBuilder does not create any new objects unless it runs out of room in the buffer (in which case it allocates additional space).
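A short sketch of that behaviour, using only the public `Length` and `Capacity` properties. The exact capacity numbers are an implementation detail, so none are shown as expected output, and the `Repeat` helper is hypothetical:

```csharp
using System;
using System.Text;

public static class BuilderCapacitySketch
{
    // Hypothetical helper: pre-sizing the builder is optional and only
    // avoids growth steps when the final length is known in advance.
    public static string Repeat(string piece, int count)
    {
        var sb = new StringBuilder(piece.Length * count);
        for (int i = 0; i < count; i++)
        {
            sb.Append(piece);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        var sb = new StringBuilder();
        Console.WriteLine($"Length={sb.Length}, Capacity={sb.Capacity}");

        sb.Append("hello");
        // Length grows with the content; Capacity stays at or above Length.
        Console.WriteLine($"Length={sb.Length}, Capacity={sb.Capacity}");

        Console.WriteLine(Repeat("ab", 3)); // ababab
    }
}
```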
I ran a quick benchmark. I think it proves the point :)
using System.Diagnostics;
using System.Text;

// Time 100,000 appends with StringBuilder.
StringBuilder sb = new StringBuilder();
string st;
Stopwatch sw;
sw = Stopwatch.StartNew();
for (int i = 0; i < 100000; i++)
{
    sb.Append("a");
}
st = sb.ToString();
sw.Stop();
Debug.WriteLine($"Elapsed: {sw.Elapsed}");

// Time the same 100,000 appends with the + operator.
st = "";
sw = Stopwatch.StartNew();
for (int i = 0; i < 100000; i++)
{
    st = st + "a";
}
sw.Stop();
Debug.WriteLine($"Elapsed: {sw.Elapsed}");
The console output:
Elapsed: 00:00:00.0011883 (StringBuilder.Append())
Elapsed: 00:00:01.7791839 (+ operator)
I just started working with C# for the first time, and while looking through the tutorial I found nothing on the difference between the concatenation (Console.WriteLine("Hello" + user), where user is a string variable) and the placeholder (Console.WriteLine("Hello {0}", user), where user is a string variable) methods for output. Is there a difference, or is it simply a matter of which way you find easier?
It's not really specific to C#; lots of languages support both styles. The latter form is usually thought of as 'safer', but I can't quote any specific reason why. It is useful if the item needs to appear in more than one place, or if you want to save the format string as a constant. Take a look at this thread for more info: When is it better to use String.Format vs string concatenation?.
Using string formatters, as opposed to string concatenation, is almost entirely about readability. What they actually do, and even how they perform, is close enough to the same.
For such a simple case both look all right, but when you have a complex string with lots of values mixed in format strings can end up looking a lot nicer:
Here's a better example:
string output = "Hello " + username + ". I have spent " + executionTime + " seconds trying to figure out that the answer to life is: " + answer;
vs
string output = string.Format("Hello {0}. I have spent {1} seconds trying to figure out that the answer to life is: {2}",
    username, executionTime, answer);
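For comparison, here is a sketch that puts the two styles side by side with string interpolation (C# 6 and later), which was not part of the original answer. The method names and the shortened message text are made up for illustration; all three methods return the same string.

```csharp
using System;

public static class GreetingStylesSketch
{
    public static string WithConcat(string username, int executionTime, int answer)
    {
        return "Hello " + username + ". I have spent " + executionTime
            + " seconds on the answer: " + answer;
    }

    public static string WithFormat(string username, int executionTime, int answer)
    {
        return string.Format("Hello {0}. I have spent {1} seconds on the answer: {2}",
            username, executionTime, answer);
    }

    // Interpolation keeps the values inline while still reading like a template.
    public static string WithInterpolation(string username, int executionTime, int answer)
    {
        return $"Hello {username}. I have spent {executionTime} seconds on the answer: {answer}";
    }

    public static void Main()
    {
        Console.WriteLine(WithConcat("Ann", 3, 42));
        Console.WriteLine(WithFormat("Ann", 3, 42));
        Console.WriteLine(WithInterpolation("Ann", 3, 42));
    }
}
```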
As Matt said, using placeholders is considered a safer approach than simple concatenation, but I am not sure of the reasons (I need to look into it). One thing is certain, though: placeholders are a slightly more costly operation than concatenation in terms of performance. Check the blog entry "Formatting Strings" by Jon Skeet.
That said, performance will be affected significantly only if you are using placeholders thousands of times or so.
Why is doing the following so bad?
String val = null;
String someOtherValue = "hello";
val += someOtherValue;
It must be pretty bad, but why is that? I had this line in my program and it slowed everything down immensely!
I'm assuming it's because it keeps re-creating the string? Is this the only reason though?
That exact code is perfectly fine; the compiler will optimize it away.
Doing that in a loop can be slow, since that creates a separate (immutable) string object for each concatenation.
Instead, use StringBuilder.
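A minimal sketch of both cases, assuming the slowdown came from running the append many times. The `SingleAppend` and `LoopAppend` names are hypothetical:

```csharp
using System;
using System.Text;

public static class AppendSketch
{
    // The snippet from the question, run once: concatenation treats null as an empty string.
    public static string SingleAppend()
    {
        string val = null;
        string someOtherValue = "hello";
        val += someOtherValue;
        return val;
    }

    // The same append repeated many times, using a builder instead of +=.
    public static string LoopAppend(int count)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            sb.Append("hello");
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(SingleAppend());         // hello
        Console.WriteLine(LoopAppend(3));          // hellohellohello
        Console.WriteLine(LoopAppend(1000).Length); // 5000
    }
}
```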
Yes, the reason is that strings are immutable in C#, which means they can't be changed. The framework is forced to allocate a new string every time you do the +=
Try using StringBuilder instead.
The difference is very noticeable in long loops
I don't think there's a silver bullet here, where "this is better than that". It depends on the scenario in which you need to concatenate the strings. Performance versus readability is also an issue here. Sometimes it's better to write more readable code and compromise a little on performance.
Quoting the article by James Michael Hare:
The fact is, the concatenation operator (+) has been optimized for
speed and looks the cleanest for joining together a known set of
strings in the simplest manner possible.
StringBuilder, on the other hand, excels when you need to build a
string of indeterminate length. Use it in those times when you are
looping till you hit a stop condition and building a result and it
won’t steer you wrong.
String.Format seems to be the loser from the stats, but consider
which of these is more readable. Yes, ignore the fact that you could
do this with ToString() on a DateTime.
Have a look at the article; it's worth reading.
There are several places where you can use the indexed placeholder syntax in C#, e.g.
// Assume some object is available with 2 string properties
Console.WriteLine("Hello {0}, today is {1}", obj.Username, obj.DayOfWeek);
Is that more efficient than using the string concatenation operator to build the string? e.g.
Console.WriteLine("Hello " + obj.Username + ", today is " + obj.DayOfWeek);
Obviously the {0} ... {n} syntax is cleaner if you're doing something complicated, but which code is more efficient (lower memory footprint and/or execution time)?
Well the first version has to parse the string and interpret it, before doing the actual string concatenation. So one would expect the first method to be slower, and potentially more memory-intensive, no?
But unless you're doing vast amounts of string processing, it's unlikely to be an issue.
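If you want numbers for your own machine rather than an expectation, a rough measurement sketch could look like the following. The helper names and iteration count are arbitrary choices, and a simple Stopwatch loop like this is only indicative, so no timings are shown:

```csharp
using System;
using System.Diagnostics;

public static class FormatVsConcatSketch
{
    public static string ViaFormat(string user, string day)
    {
        return string.Format("Hello {0}, today is {1}", user, day);
    }

    public static string ViaConcat(string user, string day)
    {
        return "Hello " + user + ", today is " + day;
    }

    // Runs the given builder function repeatedly and returns the elapsed time.
    public static TimeSpan Time(Func<string> build, int iterations)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            build();
        }
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        const int iterations = 1000000;

        // Warm up both paths so JIT compilation is not part of the measurement.
        Time(() => ViaFormat("Ann", "Monday"), 1000);
        Time(() => ViaConcat("Ann", "Monday"), 1000);

        Console.WriteLine($"Format: {Time(() => ViaFormat("Ann", "Monday"), iterations)}");
        Console.WriteLine($"Concat: {Time(() => ViaConcat("Ann", "Monday"), iterations)}");
    }
}
```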
Don't think about it, use the formatting one. If you worry about memory/execution time with such a method, you have other problems.
I'm curious. The scenario is a web app/site with e.g. 100's of concurrent connections and many (20?) page loads per second.
If the app needs to serve a formatted string
string.Format("Hello, {0}", username);
Will the "Hello, {0}" be interned? Or would it only be interned with
string hello = "Hello, {0}";
string.Format(hello, username);
Which, with regard to interning, would give better performance: the above or,
StringBuilder builder = new StringBuilder();
builder.Append("Hello, ");
builder.Append(username);
or even
string hello = "Hello, {0}";
StringBuilder builder = new StringBuilder();
builder.Append("Hello, ");
builder.Append(username);
So my main questions are:
1) Will a string.Format literal be interned?
2) Is it worth assigning the literal to a variable before using it with a StringBuilder, for a quick lookup? Or
3) Is the lookup itself quite heavy (if #1 above is a no)?
I realise this would probably result in minuscule gains, but as I said I am curious.
There is a static method, String.IsInterned(str). You could do some testing and find out!
http://msdn.microsoft.com/en-us/library/system.string.isinterned.aspx
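A sketch of such a test, using only `string.IsInterned`, `string.Intern` and `object.ReferenceEquals`. The `BuildAtRuntime` helper is hypothetical, and whether a given literal shows up as interned can depend on the runtime and compilation settings, so expected output is shown only for the lines that do not depend on that:

```csharp
using System;
using System.Text;

public static class InternSketch
{
    // Builds the text at run time so the result is a fresh object, not the compiled literal.
    public static string BuildAtRuntime()
    {
        return new StringBuilder().Append("Hello, ").Append("{0}").ToString();
    }

    public static void Main()
    {
        string literal = "Hello, {0}";
        string built = BuildAtRuntime();

        Console.WriteLine(built == literal);                       // True (same characters)
        Console.WriteLine(object.ReferenceEquals(built, literal)); // False (different objects)

        // IsInterned returns the pooled instance, or null if no equal string is in the pool.
        Console.WriteLine(string.IsInterned(literal) != null);

        // Intern adds the string to the pool if needed and returns the pooled instance.
        string pooled = string.Intern(built);
        Console.WriteLine(object.ReferenceEquals(pooled, string.IsInterned(built))); // True
    }
}
```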
String.Format actually uses a StringBuilder internally, so there is no reason to use a StringBuilder directly in your code for this. As far as interning of the literal is concerned, the two code versions are the same, as the C# compiler will create a temporary variable to store the literal.
Finally, the effect of interning in a web page is negligible. Page rendering is essentially a heavy-duty string manipulation operation so the difference interning makes is negligible. You can achieve much greater performance benefits in a much easier way by using page and control caching.
Quick answer: run 100k iterations and find out.
You can't beat
return "Hello, " + username;
if your scenario is really that simple.