How can I optimize this C# code?

I convert my DataTable to a JSON string using the following method:
public string GetJSONString(DataTable Dt)
{
string[] StrDc = new string[Dt.Columns.Count];
string HeadStr = string.Empty;
for (int i = 0; i < Dt.Columns.Count; i++)
{
StrDc[i] = Dt.Columns[i].Caption;
HeadStr += "\"" + StrDc[i] + "\" : \"" + StrDc[i] + i.ToString() + "¾" + "\",";
}
HeadStr = HeadStr.Substring(0, HeadStr.Length - 1);
StringBuilder Sb = new StringBuilder();
Sb.Append("{\"" + Dt.TableName + "\" : [");
for (int i = 0; i < Dt.Rows.Count; i++)
{
string TempStr = HeadStr;
Sb.Append("{");
for (int j = 0; j < Dt.Columns.Count; j++)
{
if (Dt.Rows[i][j].ToString().Contains("'") == true)
{
Dt.Rows[i][j] = Dt.Rows[i][j].ToString().Replace("'", "");
}
TempStr = TempStr.Replace(Dt.Columns[j] + j.ToString() + "¾", Dt.Rows[i][j].ToString());
}
Sb.Append(TempStr + "},");
}
Sb = new StringBuilder(Sb.ToString().Substring(0, Sb.ToString().Length - 1));
Sb.Append("]}");
return Sb.ToString();
}
Is this good enough, or is there still room to optimize it so that it executes faster? Any suggestions?

Before asking if you can optimise it to make it execute faster, the first question you need to ask yourself is, does it run fast enough for me? Premature optimisation is the curse of all of us (I know I've done it!). You could spend hours trying to micro-optimise this code, which might take it from taking, for example, 20ms to execute down to 15ms. Yes that'd be a reduction of 25%, but would 5ms really be worth 2 hours of your time? More importantly, would it provide enough of a benefit to your end users to warrant it?
Have you considered using the JsonSerializer from "Newtonsoft"? This may well be "quick enough", is fairly widely used and is thus more likely to be correct overall than anything I, or you, can write first time round.
Purely from a readability perspective (that may also allow the C# compiler / CLR to improve things for you) you could consider changing long bits of string concatenation such as:
HeadStr += "\"" + StrDc[i] + "\" : \"" + StrDc[i] + i.ToString() + "¾" + "\",";
To:
HeadStr += string.Format("\"{0}\" : \"{0}{1}¾\",", StrDc[i], i);
But for any changes you do make: Measure, Rinse, Repeat =)

There may well be ways of getting it to execute faster - but do you have any indication that you need it to execute faster? Do you have a good reason to believe this is a significant bottleneck in your code? If so, benchmark the code with some real data and profile the routine to work out where the time is going.

You could tidy up some bits:
Use string.Format() to avoid long x + y + z sequences. This may or may not make things faster (it would be marginal either way).
You usually don't need .ToString() when concatenating.
You could also pass in the StringBuilder to be populated, so that the caller has the opportunity to bundle several such operations into a single StringBuilder.
These suggestions are focused more on tidiness than performance, which I think should be the real focus unless this code is presenting as a bottleneck in your profiling.

Why do you think it needs optimization? Is it really slow on some DataTables?
I'd just serialize the DataTable with something like the Newtonsoft JSON serializer (Json.NET), if it's serializable at all.

Refactor your code, use a tool like ReSharper, JustCode etc to tidy it up a bit. Extract methods and use individual tests ( Test Driven Development-ish ) to find bottlenecks in your code and then tweak those.
But your first step should be: Refactor!

The problem with the code isn't speed, but that it's not cleaned up. I've done some clean-up, but you could probably do even more:
public string GetJSONString2(DataTable table)
{
StringBuilder headStrBuilder = new StringBuilder(table.Columns.Count * 5); //pre-allocate some space, default capacity is 16 chars
for (int i = 0; i < table.Columns.Count; i++)
{
headStrBuilder.AppendFormat("\"{0}\" : \"{0}{1}¾\",", table.Columns[i].Caption, i);
}
headStrBuilder.Remove(headStrBuilder.Length - 1, 1); // trim away last ,
StringBuilder sb = new StringBuilder(table.Rows.Count * 5); //pre-allocate some space
sb.Append("{\"");
sb.Append(table.TableName);
sb.Append("\" : [");
for (int i = 0; i < table.Rows.Count; i++)
{
string tempStr = headStrBuilder.ToString();
sb.Append("{");
for (int j = 0; j < table.Columns.Count; j++)
{
table.Rows[i][j] = table.Rows[i][j].ToString().Replace("'", "");
tempStr = tempStr.Replace(table.Columns[j] + j.ToString() + "¾", table.Rows[i][j].ToString());
}
sb.Append(tempStr + "},");
}
sb.Remove(sb.Length - 1, 1); // trim last ,
sb.Append("]}");
return sb.ToString();
}
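Going one step further, the placeholder-and-Replace pass could be dropped entirely by writing each cell straight into the StringBuilder. The following is only a sketch under some assumptions, not the original poster's method: the class name DataTableJsonSketch and the helper AppendJsonString are made up for illustration, every value is emitted as a JSON string, and quotes and backslashes are escaped rather than apostrophes being stripped as in the original.

```csharp
using System;
using System.Data;
using System.Text;

public static class DataTableJsonSketch
{
    // Hypothetical helper: writes value as a quoted JSON string,
    // escaping the characters that would otherwise break the literal.
    private static void AppendJsonString(StringBuilder sb, string value)
    {
        sb.Append('"');
        foreach (char c in value)
        {
            switch (c)
            {
                case '"': sb.Append("\\\""); break;
                case '\\': sb.Append("\\\\"); break;
                case '\n': sb.Append("\\n"); break;
                case '\r': sb.Append("\\r"); break;
                case '\t': sb.Append("\\t"); break;
                default:
                    if (c < ' ') sb.AppendFormat("\\u{0:x4}", (int)c);
                    else sb.Append(c);
                    break;
            }
        }
        sb.Append('"');
    }

    // Builds {"TableName":[{"Col":"value",...},...]} in a single pass.
    public static string ToJson(DataTable table)
    {
        var sb = new StringBuilder();
        sb.Append('{');
        AppendJsonString(sb, table.TableName);
        sb.Append(":[");
        for (int i = 0; i < table.Rows.Count; i++)
        {
            if (i > 0) sb.Append(',');   // separator before every row but the first
            sb.Append('{');
            for (int j = 0; j < table.Columns.Count; j++)
            {
                if (j > 0) sb.Append(',');
                AppendJsonString(sb, table.Columns[j].Caption);
                sb.Append(':');
                // DBNull converts to an empty string here.
                AppendJsonString(sb, Convert.ToString(table.Rows[i][j]) ?? string.Empty);
            }
            sb.Append('}');
        }
        sb.Append("]}");
        return sb.ToString();
    }
}
```

Writing the separator before each element except the first also removes the need to trim a trailing comma afterwards, which matters when the table has no rows.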

I would suggest a different solution if you are using .NET 3.0 or 3.5. Instead of doing this:
Convert the DataTable into XML.
Use XmlSerializer to convert the XML to your domain object.
Use JavaScriptSerializer (System.Web.Extensions.dll) to serialize the domain object to a JSON string.

Related

C# Extension method slower than chained Replace unless in tight loop. Why?

I have an extension method to remove certain characters from a string (a phone number) which is performing much slower than I think it should vs chained Replace calls. The weird bit, is that in a loop it overtakes the Replace thing if the loop runs for around 3000 iterations, and after that it's faster. Lower than that and chaining Replace is faster. It's like there's a fixed overhead to my code which Replace doesn't have. What could this be!?
Quick look. When only testing 10 numbers, mine takes about 0.3ms, while Replace takes only 0.01ms. A massive difference! But when running 5 million, mine takes around 1700ms while Replace takes about 2500ms.
Phone numbers will only have 0-9, +, -, (, )
Here's the relevant code:
Building test cases, I'm playing with testNums.
int testNums = 5_000_000;
Console.WriteLine("Building " + testNums + " tests");
Random rand = new Random();
string[] tests = new string[testNums];
char[] letters =
{
'0','1','2','3','4','5','6','7','8','9',
'+','-','(',')'
};
for(int t = 0; t < tests.Length; t++)
{
int length = rand.Next(5, 20);
char[] word = new char[length];
for(int c = 0; c < word.Length; c++)
{
word[c] = letters[rand.Next(letters.Length)];
}
tests[t] = new string(word);
}
Console.WriteLine("Tests built");
string[] stripped = new string[tests.Length];
Using my extension method:
Stopwatch stopwatch = Stopwatch.StartNew();
for (int i = 0; i < stripped.Length; i++)
{
stripped[i] = tests[i].CleanNumberString();
}
stopwatch.Stop();
Console.WriteLine("Clean: " + stopwatch.Elapsed.TotalMilliseconds + "ms");
Using chained Replace:
stripped = new string[tests.Length];
stopwatch = Stopwatch.StartNew();
for (int i = 0; i < stripped.Length; i++)
{
stripped[i] = tests[i].Replace(" ", string.Empty)
.Replace("-", string.Empty)
.Replace("(", string.Empty)
.Replace(")", string.Empty)
.Replace("+", string.Empty);
}
stopwatch.Stop();
Console.WriteLine("Replace: " + stopwatch.Elapsed.TotalMilliseconds + "ms");
Extension method in question:
public static string CleanNumberString(this string s)
{
Span<char> letters = stackalloc char[s.Length];
int count = 0;
for (int i = 0; i < s.Length; i++)
{
if (s[i] >= '0' && s[i] <= '9')
letters[count++] = s[i];
}
return new string(letters.Slice(0, count));
}
What I've tried:
I've run them around the other way. Makes a tiny difference, but not enough.
Make it a normal static method, which was significantly slower than extension. As a ref parameter was slightly slower, and in parameter was about the same as extension method.
Aggressive Inlining. Doesn't make any real difference. I'm in release mode, so I suspect the compiler inlines it anyway. Either way, not much change.
I have also looked at memory allocations, and that's as I expect. Mine allocates only one string per iteration on the managed heap (the new string at the end), while Replace allocates a new object for each Replace call. So the memory used by the Replace version is much higher. But it's still faster!
Is it calling native C code and doing something crafty there? Is the higher memory usage triggering the GC and slowing it down? (That still doesn't explain the insanely fast time on only one or two iterations.)
Any ideas?
(Yes, I know not to bother optimising things like this, it's just bugging me because I don't know why it's doing this)
After doing some benchmarks, I think I can safely assert that your initial statement is wrong, for the exact reason you mentioned in your deleted answer: the loading time of the method is the only thing that misled you.
Here's the full benchmark on a simplified version of the problem:
static void Main(string[] args)
{
// Build string of n consecutive "ab"
int n = 1000;
Console.WriteLine("N: " + n);
char[] c = new char[n];
for (int i = 0; i < n; i+=2)
c[i] = 'a';
for (int i = 1; i < n; i += 2)
c[i] = 'b';
string s = new string(c);
Stopwatch stopwatch;
// Make sure everything is loaded
s.CleanNumberString();
s.Replace("a", "");
s.UnsafeRemove();
// Tests to remove all 'a' from the string
// Unsafe remove
stopwatch = Stopwatch.StartNew();
string a1 = s.UnsafeRemove();
stopwatch.Stop();
Console.WriteLine("Unsafe remove:\t" + stopwatch.Elapsed.TotalMilliseconds + "ms");
// Extension method
stopwatch = Stopwatch.StartNew();
string a2 = s.CleanNumberString();
stopwatch.Stop();
Console.WriteLine("Clean method:\t" + stopwatch.Elapsed.TotalMilliseconds + "ms");
// String replace
stopwatch = Stopwatch.StartNew();
string a3 = s.Replace("a", "");
stopwatch.Stop();
Console.WriteLine("String.Replace:\t" + stopwatch.Elapsed.TotalMilliseconds + "ms");
// Make sure the returned strings are identical
Console.WriteLine(a1.Equals(a2) && a2.Equals(a3));
Console.ReadKey();
}
public static string CleanNumberString(this string s)
{
char[] letters = new char[s.Length];
int count = 0;
for (int i = 0; i < s.Length; i++)
if (s[i] == 'b')
letters[count++] = 'b';
return new string(letters.SubArray(0, count));
}
public static T[] SubArray<T>(this T[] data, int index, int length)
{
T[] result = new T[length];
Array.Copy(data, index, result, 0, length);
return result;
}
// Taken from https://stackoverflow.com/a/2183442/6923568
public static unsafe string UnsafeRemove(this string s)
{
int len = s.Length;
char* newChars = stackalloc char[len];
char* currentChar = newChars;
for (int i = 0; i < len; ++i)
{
char c = s[i];
switch (c)
{
case 'a':
continue;
default:
*currentChar++ = c;
break;
}
}
return new string(newChars, 0, (int)(currentChar - newChars));
}
When run with different values of n, it is clear that your extension method (or at least my somewhat equivalent version of it) has logic that makes it faster than String.Replace(). In fact, it performs better on both small and big strings:
N: 100
Unsafe remove: 0,0024ms
Clean method: 0,0015ms
String.Replace: 0,0021ms
True
N: 100000
Unsafe remove: 0,3889ms
Clean method: 0,5308ms
String.Replace: 1,3993ms
True
I highly suspect optimizations for the replacement of strings (not to be compared to removal) in String.Replace() to be the culprit here. I also added a method from this answer to have another comparison on removal of characters. That one's times behave similarly to your method but gets faster on higher values (80k+ on my tests) of n.
With all that being said, since your question is based on an assumption that we found was false, if you need more explanation on why the opposite is true (i.e. "Why is String.Replace() slower than my method"), plenty of in-depth benchmarks about string manipulation already do so.
I ran the clean method a couple more times. Interestingly, it is a lot faster than Replace; only the first run was slower. Sorry that I can't explain why it's slower the first time, but once I ran the method more than once the results were as expected.
Building 100 tests
Tests built
Replace: 0.0528ms
Clean: 0.4526ms
Clean: 0.0413ms
Clean: 0.0294ms
Replace: 0.0679ms
Replace: 0.0523ms
(Run on .NET Core 2.1.)
So I've found with help from daehee Kim and Mat below that it's only the first iteration, but it's for the whole first loop. Every loop after there is ok.
I use the following line to force the JIT to do its thing and initialise this method:
RuntimeHelpers.PrepareMethod(typeof(CleanExtension).GetMethod("CleanNumberString", BindingFlags.Public | BindingFlags.Static).MethodHandle);
I find the JIT usually takes about 2-3ms to do its thing here (including Reflection time of about 0.1ms). Note that you should probably not be doing this because you're now getting the Reflection cost as well, and the JIT will be called right after this anyway, but it's probably a good idea for benchmarks to fairly compare.
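A simpler way to keep JIT time out of a measurement, without Reflection, is to call the code once before starting the Stopwatch. The helper below is a minimal sketch of that idea; the class and method names are invented for illustration, and a dedicated benchmarking library would handle many more effects (GC, tiered compilation, timer resolution) than this does.

```csharp
using System;
using System.Diagnostics;

public static class WarmupBenchmarkSketch
{
    // Runs the action once untimed, then returns the average time
    // per call in milliseconds over the given number of iterations.
    public static double AverageMilliseconds(Action action, int iterations)
    {
        action(); // warm-up: the first call pays for JIT compilation

        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        stopwatch.Stop();

        return stopwatch.Elapsed.TotalMilliseconds / iterations;
    }
}
```

It could be called as WarmupBenchmarkSketch.AverageMilliseconds(() => tests[0].CleanNumberString(), 5000), and in the same way for the chained Replace version, so that both are compared after their first call.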
The more you know!
My benchmark for a loop of 5000 iterations, repeated 5000 times with random strings and averaged is:
Clean: 0.41078ms
Replace: 1.4974ms

Extracting repeated for loop functionality into own method

Within one of my classes I have a number of methods which all have similar purposes; these methods convert different objects into valid JSON representation. Each method does slightly different things as the objects being fed into the method are different and therefore their JSON output will also be subtly different.
Within these methods a for loop exists, the purpose of this loop is to check whether or not the field being converted into JSON is the last one in the object, if it is not then a , will be placed after the converted JSON string, as is normal within JSON.
Below is an example of one of these for loops:
for (int i = 0; i < numberOfSections; i++)
{
if (i == numberOfSections - 1)
{
output += SectionToJson(root.Sections[i]);
}
else
{
output += SectionToJson(root.Sections[i]);
output += ",";
}
}
One thing to note here is that the call to the method (SectionToJson here) is different within each method. Therefore I have three different for loops doing almost the same thing but with different method calls inside their clauses.
I want to know whether or not there is a way that I can remove these ugly for loops from my three different methods and instead place their functionality inside a single method which can then be called from the three methods instead. However since the internal method call is different within each method, it makes it more difficult to place inside a single method.
I considered using the Func delegate to pass the required method through as a parameter to the new method, but this would not work as the parameters for the three internal methods are all different, and therefore I would need three different overrides of a single method. Which kinda defeats the point in removing the for loops in the first place.
Is there any other approach that I haven't considered that would help me achieve my goal here? I'm also trying to keep my list of parameters down, and would rather not go over three parameters in the new method. Preferably two.
The other two for loops in question are below.
for (int i = 0; i < numberOfQuestionsInBank; i++)
{
if (i == numberOfQuestionsInBank - 1)
{
output += QuestionPropertyToJson(questionBank.Properties[i]);
}
else
{
output += QuestionPropertyToJson(questionBank.Properties[i]);
output += ",";
}
}
for (int i = 0; i < numberOfSections; i++)
{
if (i == numberOfSections - 1)
{
requiredSections += "\"" + (i+1) + "\"";
}
else
{
requiredSections += "\"" + (i+1) + "\"";
requiredSections += ",";
}
}
Well, there is another way: use string.Join, which does exactly this.
And of course with linq you can make it look quite nice:
string.Join(",", root.Sections.Select(SectionToJson))
string.Join accepts collections of strings, so you can concentrate on converting each element to a string and let it do the concatenation for you.
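To address the worry about the three inner methods taking different parameter types, a generic helper is one option. This is a sketch only: JoinSketch, JoinAsJson and RequiredSections are illustrative names, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinSketch
{
    // Converts every item with the supplied function and joins the
    // results with commas; no comma is written after the last item.
    public static string JoinAsJson<T>(IEnumerable<T> items, Func<T, string> toJson)
    {
        return string.Join(",", items.Select(toJson));
    }

    // The third loop from the question: "1","2",...,"n".
    public static string RequiredSections(int numberOfSections)
    {
        return string.Join(",",
            Enumerable.Range(1, numberOfSections).Select(n => "\"" + n + "\""));
    }
}
```

With this, the first two loops would become JoinSketch.JoinAsJson(root.Sections, SectionToJson) and JoinSketch.JoinAsJson(questionBank.Properties, QuestionPropertyToJson), which stays within the two-parameter limit the question asks for.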
The real-world solution to your problem is to use a JSON serializer, but of course this is an exercise so we can treat it a little differently.
What we need to do is look at the code and find the parts that are most alike. Once we locate that part of the code, we need to make it even more alike. Once they are identical, we can remove the duplication. They both remain 'til they're both the same.
So first of all, let's change both functions slightly to move the part that is different from the part that is almost the same. You can see that the if-statement in both is starting to look very similar.
for (int i = 0; i < numberOfQuestionsInBank; i++)
{
output += QuestionPropertyToJson(questionBank.Properties[i]);
if (i != numberOfQuestionsInBank - 1)
{
output += ",";
}
}
And
for (int i = 0; i < numberOfSections; i++)
{
requiredSections += "\"" + (i + 1) + "\"";
if (i != numberOfSections - 1)
{
requiredSections += ",";
}
}
Now let's consider what would happen if the naming was the same. Note: I'm not saying that you should make your naming the same everywhere as this would make your code less expressive - but we can do this exercise in our heads...
for (int i = 0; i < recordCount; i++)
{
output += QuestionPropertyToJson(questionBank.Properties[i]);
if (i != recordCount - 1)
{
output += ",";
}
}
And
for (int i = 0; i < recordCount; i++)
{
output += "\"" + (i + 1) + "\"";
if (i != recordCount - 1)
{
output += ",";
}
}
The section:
if (i != recordCount - 1)
{
output += ",";
}
Is now identical in both... we could create a function for that. There are a few ways to do that, this is just one of them:
public string ConditionalComma(int recordCount, int i)
{
if (i != recordCount - 1)
{
return ",";
}
return string.Empty;
}
That means our methods now look like this (I'll keep the in-head naming):
for (int i = 0; i < recordCount; i++)
{
output += QuestionPropertyToJson(questionBank.Properties[i]) + ConditionalComma(recordCount, i);
}
And
for (int i = 0; i < recordCount; i++)
{
output += "\"" + (i + 1) + "\"" + ConditionalComma(recordCount, i);
}
So we have managed to extract the differences and remove the duplication in a sensible way.
That's probably far enough for this exercise, but feel free to ask questions.

Performance issues with nested loops and string concatenations

Can someone please explain why this code is taking so long to run (i.e. >24 hours):
The number of rows is 5000, whilst the number of columns is 2000 (i.e. Approximately 10m loops).
Is there a better way to do this????
for (int i = 0; i < m.rows; i++)
{
for (int j = 0; j < m.cols; j++)
{
textToWrite += m[i, j].ToString() + ",";
}
//remove the final comma.
textToWrite = textToWrite.Substring(0,textToWrite.Length-2);
textToWrite += Environment.NewLine;
}
Yes, the += operator is not very efficient. Use StringBuilder instead.
In the .NET framework a string is immutable, which means it cannot be modified in place. This means the += operator has to create a new string every time, which means allocating memory, copying the value of the existing string and writing it to the new location. It's ok for one or two concatenations, but as soon as you put it in a loop you need to use an alternative.
http://support.microsoft.com/kb/306822
You'll see a massive performance improvement by using the following code:
var textToWriteBuilder = new StringBuilder();
for (int i = 0; i < m.rows; i++)
{
for (int j = 0; j < m.cols; j++)
{
textToWriteBuilder.Append(m[i, j].ToString() + ",");
}
// Remove the trailing comma. StringBuilder has no Substring method,
// so shorten the builder by one character instead.
if (m.cols > 0) textToWriteBuilder.Length--;
textToWriteBuilder.Append(Environment.NewLine);
}
string textToWrite = textToWriteBuilder.ToString();
Because you are creating tons of strings.
You should use StringBuilder for this.
StringBuilder sb = new StringBuilder();
for (int i = 0; i < m.rows; i++)
{
bool first = true;
for (int j = 0; j < m.cols; j++)
{
if (first)
{
first = false;
}
else
{
sb.Append(",");
}
sb.Append(m[i, j]);
}
sb.AppendLine();
}
string output = sb.ToString();
Your code is taking so long because you're appending strings, creating thousands of new temporary strings as you go. The memory manager needs to find memory for these strings (which increase in memory requirements, as they get longer) and the operation copies the characters you have so far (the number of which increases with every iteration) to the newest string.
The alternative is to use a single StringBuilder, on which you call Append() to append more efficiently and, finally, ToString() when you're done to get the finalized string that you want to use.
The biggest issue I see with this is the fact you're using textToWrite as a string.
Strings are immutable, so each time the string is changed new memory must be reserved and the previous version copied into it.
A far more efficient approach is to use the StringBuilder class which is designed for exactly this type of scenario. For example:
StringBuilder sb = new StringBuilder();
for (int i = 0; i < m.rows; i++)
{
for (int j = 0; j < m.cols; j++)
{
sb.Append(m[i, j].ToString());
if(j < m.cols - 1) // don't add a comma on the last element
{
sb.Append(",");
}
}
sb.AppendLine();
}
Supposing that textToWrite is a String, you should use StringBuilder instead. String is immutable, so it is very inefficient to build one up by adding small parts.
Ideally you would initialize StringBuilder with a reasonable size (see doc).
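As a rough illustration of pre-sizing, the sketch below estimates the capacity from the matrix dimensions. The type of m is not shown in the question, so an int[,] is assumed here, and the figure of 8 characters per cell is an arbitrary guess; MatrixCsvSketch and ToCsv are illustrative names.

```csharp
using System;
using System.Text;

public static class MatrixCsvSketch
{
    public static string ToCsv(int[,] m, int estimatedCharsPerCell = 8)
    {
        int rows = m.GetLength(0);
        int cols = m.GetLength(1);

        // Capacity guess so the builder rarely has to grow; computed as a
        // long and clamped so the value passed to the constructor stays valid.
        long estimate = (long)rows * cols * estimatedCharsPerCell;
        var sb = new StringBuilder((int)Math.Min(estimate, 1 << 28));

        for (int i = 0; i < rows; i++)
        {
            for (int j = 0; j < cols; j++)
            {
                if (j > 0) sb.Append(','); // comma between cells, none at the end
                sb.Append(m[i, j]);
            }
            sb.Append(Environment.NewLine);
        }
        return sb.ToString();
    }
}
```

If the result is only going to be written to a file, writing each cell to a StreamWriter instead would avoid holding the whole text in memory at all.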
Use a StringBuilder instead of several million concatenations.
If you concatenate 2 strings, this means the system allocates new memory to contain both of them, and then copies both in. A zillion large memory allocations and copy actions become slow very fast.
What StringBuilder does is reduce this immensely by allocating 'in advance', thus only having to grow the buffer a few times and just copying it in, eliminating the by far slowest factor of your loop.
Assume the matrix is of size MxM and has N elements. You are building the string in a way that takes O(N^2) (or O(M^4)) in the number of iterations. Each operation must copy what's already there. The issue is not some constant-factor overhead like temporary strings.
Use StringBuilder.
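To put a number on the quadratic growth: if every piece has the same length c, the k-th concatenation produces a string of k·c characters, so N concatenations write c·N(N+1)/2 characters in total. The sketch below just evaluates that formula; the uniform piece length is a simplifying assumption, the 5 characters per cell is a guess, and the names are made up for illustration.

```csharp
public static class ConcatCostSketch
{
    // Total characters written when a string is built by appending
    // `pieces` pieces of `pieceLength` characters with +=, one at a time.
    public static long CharsCopied(long pieces, long pieceLength)
    {
        // Sum over k = 1..pieces of (k * pieceLength).
        return pieceLength * pieces * (pieces + 1) / 2;
    }
}

// ConcatCostSketch.CharsCopied(10_000_000, 5) returns 250000025000000,
// i.e. roughly 2.5 * 10^14 characters for the 5000 x 2000 matrix,
// compared with about 5 * 10^7 characters in the final string.
```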
String concatenation is more efficient for small number of concatenated strings. For a dynamic number of strings, use StringBuilder.
The reason that it takes so long to run is that you are using string concatenation to create a string. For each iteration it will copy the entire string to a new string, so in the end you will have copied strings that add up to several million times the final string.
Use a StringBuilder to create the string:
StringBuilder textToWrite = new StringBuilder();
for (int i = 0; i < m.rows; i++)
{
for (int j = 0; j < m.cols; j++)
{
if (j > 0) textToWrite.Append(',');
textToWrite.Append(m[i, j]);
}
textToWrite.AppendLine();
}

StringBuilder performance in C#?

I have a StringBuilder object where I am adding some strings like follows:
I want to know which one is better approach here, first one is this:
StringBuilder sb = new StringBuilder();
sb.Append("Hello" + "How" + "are" + "you");
and the second one is:
StringBuilder sb = new StringBuilder();
sb.Append("Hello").Append("How").Append("are").Append("you");
In your current example, the string literals:
"Hello" + "How" + "are" + "you"
Will be compiled into one constant string literal by the compiler, so it is technically faster than:
sb.Append("Hello").Append("How").Append("are").Append("you");
However, were you to use string variables:
sb.Append(s1 + s2 + s3 + s4);
Then the latter would be faster as the former could potentially create a series of strings (because of the concatenation) before passing the final string into the Append method, whereas the latter would avoid the extra string creations (but trades off extra method calls and internal buffer resizing).
Update: For further clarity, in this exact situation where there are only 4 items being concatenated, the compiler will emit a call to String.Concat(string, string, string, string), which knowing the length and number of strings will be more efficient than StringBuilder.
The first will be more efficient. The compiler will convert it to the following single call:
StringBuilder sb = new StringBuilder();
sb.Append("HelloHowareyou");
Measuring the performance
The best way to know which is faster is to measure it. I'll get straight to the point: here are the results (smaller times means faster):
sb.Append("Hello" + "How" + "are" + "you") : 11.428s
sb.Append("Hello").Append("How").Append("are").Append("you"): 15.314s
sb.Append(a + b + c + d) : 21.970s
sb.Append(a).Append(b).Append(c).Append(d) : 15.529s
The number given is the number of seconds to perform the operation 100 million times in a tight loop.
Conclusions
The fastest is using string literals and +.
But if you have variables, using Append is faster than +. The first version is slower because of an extra call to String.Concat.
In case you want to test this yourself, here's the program I used to get the above timings:
using System;
using System.Text;
public class Program
{
public static void Main()
{
DateTime start, end;
int numberOfIterations = 100000000;
start = DateTime.UtcNow;
for (int i = 0; i < numberOfIterations; ++i)
{
StringBuilder sb = new StringBuilder();
sb.Append("Hello" + "How" + "are" + "you");
}
end = DateTime.UtcNow;
DisplayResult("sb.Append(\"Hello\" + \"How\" + \"are\" + \"you\")", start, end);
start = DateTime.UtcNow;
for (int i = 0; i < numberOfIterations; ++i)
{
StringBuilder sb = new StringBuilder();
sb.Append("Hello").Append("How").Append("are").Append("you");
}
end = DateTime.UtcNow;
DisplayResult("sb.Append(\"Hello\").Append(\"How\").Append(\"are\").Append(\"you\")", start, end);
string a = "Hello";
string b = "How";
string c = "are";
string d = "you";
start = DateTime.UtcNow;
for (int i = 0; i < numberOfIterations; ++i)
{
StringBuilder sb = new StringBuilder();
sb.Append(a + b + c + d);
}
end = DateTime.UtcNow;
DisplayResult("sb.Append(a + b + c + d)", start, end);
start = DateTime.UtcNow;
for (int i = 0; i < numberOfIterations; ++i)
{
StringBuilder sb = new StringBuilder();
sb.Append(a).Append(b).Append(c).Append(d);
}
end = DateTime.UtcNow;
DisplayResult("sb.Append(a).Append(b).Append(c).Append(d)", start, end);
Console.ReadLine();
}
private static void DisplayResult(string name, DateTime start, DateTime end)
{
Console.WriteLine("{0,-60}: {1,6:0.000}s", name, (end - start).TotalSeconds);
}
}
String constants will be concatenated at compile time by the compiler. If you are concatenating no more than four string expressions, the compiler will emit a call to String.Concat
s + t + u + v ==> String.Concat(s, t, u, v)
This performs faster than StringBuilder, as StringBuilder might have to resize its internal buffer, while Concat can calculate the total resulting length in advance. If you know the maximum length of the resulting string in advance, however, you can initialize the StringBuilder by specifying an initial working buffer size
var sb = new StringBuilder(initialBufferSize);
StringBuilder is often used in a loop and other dynamic scenarios and performs faster than s += t in such cases.
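The two approaches described above can be put side by side. This is an illustrative sketch with invented names; it shows only that the results match and where the length is known up front, not which one is faster on a given runtime.

```csharp
using System.Text;

public static class ConcatSketch
{
    // Equivalent to s + t + u + v: Concat can compute the total
    // length first and allocate the result string exactly once.
    public static string WithConcat(string s, string t, string u, string v)
    {
        return string.Concat(s, t, u, v);
    }

    // StringBuilder given the final length as its initial capacity,
    // so its internal buffer does not need to grow while appending.
    public static string WithPresizedBuilder(string s, string t, string u, string v)
    {
        var sb = new StringBuilder(s.Length + t.Length + u.Length + v.Length);
        sb.Append(s).Append(t).Append(u).Append(v);
        return sb.ToString();
    }
}
```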
In the first case the compiler will construct a single string, so you'll only call Append once. However, I doubt this will make much of a difference. What did your measurements show?
The second one is the better approach. Strings are immutable meaning that when you use sb.Append("Hello" + "How" + "Are" + "You") you are creating multiple copies of the string
e.g.
"Hello"
then
"HelloHow"
then
"HelloHowAre"
etc.
The second piece of code is much more performant
edit: Of course this doesn't take into consideration compiler optimisations, but it's best to use the class as intended
Ok as people have pointed out since these are literals the compiler takes care of optimising these operations away - but my point is that doing string concatenation is something that StringBuilder tries to avoid
For instance, looping several times as such:
var someString = "";
foreach (var s in someListOfStrings)
{
someString += s;
}
Is not as good as doing:
var sb = new StringBuilder();
foreach(var s in someListOfStrings)
{
sb.Append(s);
}
sb.ToString();
As this will likely be much quicker since, as I said before, strings are immutable
I assumed the OP was talking about using concatenation in general since
sb.Append("Hello" + "How");
Seems completely pointless when
sb.Append("HelloHow");
Would be more logical...?
It seems to me that in the OPs mind, the placeholder text would eventually become a shedload of variables...

.NET String performance question

Is it better, from a performance standpoint, to use "Example1"? I'm assuming that "Example2" would create a new string on the heap in each iteration while "Example1" would not...
Example1:
StringBuilder letsCount = new StringBuilder("Let's count! ");
string sep = ", ";
for(int i=0; i< 100; i++)
{
letsCount.Append(i + sep);
}
Example2:
StringBuilder letsCount = new StringBuilder("Let's count! ");
for(int i=0; i< 100; i++)
{
letsCount.Append(i + ", ");
}
The .NET CLR is much smarter than that. It "interns" string literals so that there is only one instance.
It's also worth noting that if you were truly concerned about string concatenation, you would want to turn the single Append call into two append calls. The reality, however, is that the overhead of two calls probably outweighs any minor concatenation cost. In either case, it's probably nearly immeasurable except in very controlled conditions.
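For reference, the two-call variant mentioned above would look roughly like this. It is a sketch with made-up names (CountSketch, LetsCount), and as noted, any difference is unlikely to be measurable in practice.

```csharp
using System.Text;

public static class CountSketch
{
    public static string LetsCount(int n)
    {
        StringBuilder letsCount = new StringBuilder("Let's count! ");
        for (int i = 0; i < n; i++)
        {
            // Two Append calls: no temporary "i + sep" string is created.
            letsCount.Append(i).Append(", ");
        }
        return letsCount.ToString();
    }
}
```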
They are identical.
Another way to do it, which may be faster (measure it to be sure), would be
string letsCount = "Let's count! ";
string[] numbers = new string[100];
for(int i=0; i< 100; i++)
{
numbers[i]=i+", ";
}
string result = letsCount + String.Concat(numbers);
See here
