What's the most efficient way to concatenate strings?
Rico Mariani, the .NET Performance guru, had an article on this very subject. It's not as simple as one might suspect. The basic advice is this:
If your pattern looks like:
x = f1(...) + f2(...) + f3(...) + f4(...)
that's one concat and it's zippy, StringBuilder probably won't help.
If your pattern looks like:
if (...) x += f1(...)
if (...) x += f2(...)
if (...) x += f3(...)
if (...) x += f4(...)
then you probably want StringBuilder.
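A minimal sketch of the two patterns above. The class and method names (`ConcatPatterns`, `BuildInOneExpression`, `BuildConditionally`) are made up for illustration and do not come from the article.

```csharp
using System;
using System.Text;

public static class ConcatPatterns
{
    // One expression: the C# compiler emits a single string.Concat call for a + b + c + d.
    public static string BuildInOneExpression(string a, string b, string c, string d)
    {
        return a + b + c + d;
    }

    // Conditional appends: each += on a string would allocate a new string,
    // so the pieces are collected in a StringBuilder instead.
    public static string BuildConditionally(bool useA, bool useB, bool useC, bool useD)
    {
        var sb = new StringBuilder();
        if (useA) sb.Append("a");
        if (useB) sb.Append("b");
        if (useC) sb.Append("c");
        if (useD) sb.Append("d");
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(BuildInOneExpression("w", "x", "y", "z"));    // wxyz
        Console.WriteLine(BuildConditionally(true, false, true, true)); // acd
    }
}
```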
Yet another article to support this claim comes from Eric Lippert where he describes the optimizations performed on one line + concatenations in a detailed manner.
The StringBuilder.Append() method is much better than using the + operator. But I've found that, when executing 1000 concatenations or fewer, String.Join() is even more efficient than StringBuilder.
StringBuilder sb = new StringBuilder();
sb.Append(someString);
The only problem with String.Join is that you have to concatenate the strings with a common delimiter.
Edit: as @ryanversaw pointed out, you can make the delimiter string.Empty.
string key = String.Join("_", new String[]
{ "Customers_Contacts", customerID, database, SessionID });
There are 6 types of string concatenations:
Using the plus (+) symbol.
Using string.Concat().
Using string.Join().
Using string.Format().
Using string.Append().
Using StringBuilder.
In one experiment, string.Concat() was the fastest approach when there were fewer than roughly 1000 words; with more than 1000 words, StringBuilder should be used.
For more information, check this site.
string.Join() vs string.Concat()
The string.Concat method here is equivalent to the string.Join method invocation with an empty separator. Appending an empty string is fast, but not doing so is even faster, so the string.Concat method would be superior here.
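As a small illustration of that equivalence, the sketch below builds the same string both ways. The helper names `ViaJoin` and `ViaConcat` are hypothetical, and the example only shows that the results match, not which call is faster.

```csharp
using System;

public static class JoinVersusConcat
{
    // string.Join with an empty separator.
    public static string ViaJoin(string[] parts)
    {
        return string.Join(string.Empty, parts);
    }

    // string.Concat over the same array, with no separator involved at all.
    public static string ViaConcat(string[] parts)
    {
        return string.Concat(parts);
    }

    public static void Main()
    {
        string[] parts = { "Customers", "_", "42" };
        Console.WriteLine(ViaJoin(parts));                     // Customers_42
        Console.WriteLine(ViaJoin(parts) == ViaConcat(parts)); // True
    }
}
```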
From Chinh Do - StringBuilder is not always faster:
Rules of Thumb
When concatenating three dynamic string values or less, use traditional string concatenation.
When concatenating more than three dynamic string values, use StringBuilder.
When building a big string from several string literals, use either the @ string literal or the inline + operator.
Most of the time StringBuilder is your best bet, but there are cases as shown in that post that you should at least think about each situation.
If you're operating in a loop, StringBuilder is probably the way to go; it saves you the overhead of creating new strings regularly. In code that'll only run once, though, String.Concat is probably fine.
However, Rico Mariani (.NET optimization guru) made up a quiz in which he stated at the end that, in most cases, he recommends String.Format.
Here is the fastest method I've evolved over a decade for my large-scale NLP app. I have variations for IEnumerable<T> and other input types, with and without separators of different types (Char, String), but here I show the simple case of concatenating all strings in an array into a single string, with no separator. Latest version here is developed and unit-tested on C# 7 and .NET 4.7.
There are two keys to higher performance; the first is to pre-compute the exact total size required. This step is trivial when the input is an array as shown here. For handling IEnumerable<T> instead, it is worth first gathering the strings into a temporary array for computing that total (The array is required to avoid calling ToString() more than once per element since technically, given the possibility of side-effects, doing so could change the expected semantics of a 'string join' operation).
Next, given the total allocation size of the final string, the biggest boost in performance is gained by building the result string in-place. Doing this requires the (perhaps controversial) technique of temporarily suspending the immutability of a new String which is initially allocated full of zeros. Any such controversy aside, however...
...note that this is the only bulk-concatenation solution on this page which entirely avoids an extra round of allocation and copying by the String constructor.
Complete code:
/// <summary>
/// Concatenate the strings in 'rg', none of which may be null, into a single String.
/// </summary>
public static unsafe String StringJoin(this String[] rg)
{
int i;
if (rg == null || (i = rg.Length) == 0)
return String.Empty;
if (i == 1)
return rg[0];
String s, t;
int cch = 0;
do
cch += rg[--i].Length;
while (i > 0);
if (cch == 0)
return String.Empty;
i = rg.Length;
fixed (Char* _p = (s = new String(default(Char), cch)))
{
Char* pDst = _p + cch;
do
if ((t = rg[--i]).Length > 0)
fixed (Char* pSrc = t)
memcpy(pDst -= t.Length, pSrc, (UIntPtr)(t.Length << 1));
while (pDst > _p);
}
return s;
}
[DllImport("MSVCR120_CLR0400", CallingConvention = CallingConvention.Cdecl)]
static extern unsafe void* memcpy(void* dest, void* src, UIntPtr cb);
I should mention that this code has a slight modification from what I use myself. In the original, I call the cpblk IL instruction from C# to do the actual copying. For simplicity and portability in the code here, I replaced that with P/Invoke memcpy instead, as you can see. For highest performance on x64 (but maybe not x86) you may want to use the cpblk method instead.
From this MSDN article:
There is some overhead associated with
creating a StringBuilder object, both
in time and memory. On a machine with
fast memory, a StringBuilder becomes
worthwhile if you're doing about five
operations. As a rule of thumb, I
would say 10 or more string operations
is a justification for the overhead on
any machine, even a slower one.
So if you trust MSDN, go with StringBuilder if you have to do more than 10 string operations/concatenations; otherwise, simple string concatenation with '+' is fine.
Try these two pieces of code and you will find the answer.
static void Main(string[] args)
{
StringBuilder s = new StringBuilder();
for (int i = 0; i < 10000000; i++)
{
s.Append( i.ToString());
}
Console.Write("End");
Console.Read();
}
Vs
static void Main(string[] args)
{
string s = "";
for (int i = 0; i < 10000000; i++)
{
s += i.ToString();
}
Console.Write("End");
Console.Read();
}
You will find that the first version finishes quickly and uses a reasonable amount of memory.
The second version may use an acceptable amount of memory, but it will take longer... much longer.
So if your application serves many users and you need speed, use the first. For a short-lived, single-user app, either will do, and the second may feel more "natural" to developers.
Cheers.
It's also important to point out that you should use the + operator if you are concatenating string literals.
When you concatenate string literals or string constants by using the + operator, the compiler creates a single string. No run time concatenation occurs.
How to: Concatenate Multiple Strings (C# Programming Guide)
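One way to see this for yourself, as a sketch: a `const` field only accepts a compile-time constant, so the fact that the declaration below compiles shows that the concatenation is done by the compiler. The names `LiteralConcat` and `Folded` are illustrative.

```csharp
using System;

public static class LiteralConcat
{
    // Both operands are string literals, so the whole expression is a compile-time constant.
    // A run-time concatenation (for example, one involving a variable) would not be allowed here.
    public const string Folded = "Hello, " + "world";

    public static void Main()
    {
        Console.WriteLine(Folded); // Hello, world
    }
}
```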
Adding to the other answers, please keep in mind that StringBuilder can be told an initial amount of memory to allocate.
The capacity parameter defines the maximum number of characters that can be stored in the memory allocated by the current instance. Its value is assigned to the Capacity property. If the number of characters to be stored in the current instance exceeds this capacity value, the StringBuilder object allocates additional memory to store them.
If capacity is zero, the implementation-specific default capacity is used.
Repeatedly appending to a StringBuilder that hasn't been pre-allocated can result in a lot of unnecessary allocations just like repeatedly concatenating regular strings.
If you know how long the final string will be, can trivially calculate it, or can make an educated guess about the common case (allocating too much isn't necessarily a bad thing), you should be providing this information to the constructor or the Capacity property. Especially when running performance tests to compare StringBuilder with other methods like String.Concat, which do the same thing internally. Any test you see online which doesn't include StringBuilder pre-allocation in its comparisons is wrong.
If you can't make any kind of guess about the size, you're probably writing a utility function which should have its own optional argument for controlling pre-allocation.
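A minimal sketch of pre-allocation, assuming the pieces are already available in an array so that the exact final length can be summed up front. `PreallocatedBuilder` and `JoinAll` are hypothetical names.

```csharp
using System;
using System.Text;

public static class PreallocatedBuilder
{
    public static string JoinAll(string[] parts)
    {
        // First pass: compute the exact number of characters needed.
        int total = 0;
        foreach (string part in parts)
        {
            if (part != null) total += part.Length;
        }

        // Passing the total as the capacity lets the builder allocate its buffer once.
        // A capacity of 0 falls back to the implementation-specific default.
        var sb = new StringBuilder(total);
        foreach (string part in parts)
        {
            sb.Append(part); // Append(null) appends nothing
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(JoinAll(new[] { "Mr.", " ", "David", " ", "Callan" })); // Mr. David Callan
    }
}
```

When only an estimate is available, the same constructor can be given the estimate instead; the builder still grows if the estimate turns out to be too small.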
Following may be one more alternate solution to concatenate multiple strings.
String str1 = "sometext";
string str2 = "some other text";
string afterConcate = $"{str1}{str2}";
This uses string interpolation.
Another solution:
Inside the loop, collect the pieces in a List<string> instead of appending to a string.
List<string> lst= new List<string>();
for(int i=0; i<100000; i++){
...........
lst.Add(...);
}
return String.Join("", lst.ToArray());
It is very fast.
The most efficient is to use StringBuilder, like so:
StringBuilder sb = new StringBuilder();
sb.Append("string1");
sb.Append("string2");
...etc...
String strResult = sb.ToString();
@jonezy: String.Concat is fine if you have a couple of small things. But if you're concatenating megabytes of data, your program will likely tank.
System.String is immutable. When we modify the value of a string variable, new memory is allocated for the new value and the previous allocation is released. System.StringBuilder was designed as a mutable string, on which a variety of operations can be performed without allocating a separate memory location for each modified string.
I've tested all the methods on this page and, in the end, developed my own solution, which is the fastest and uses the least memory.
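The difference can be observed through a second reference to the same object. The sketch below uses made-up names (`ImmutabilityDemo`, `AliasAfterStringAppend`, `AliasAfterBuilderAppend`) and is only meant to illustrate the point above.

```csharp
using System;
using System.Text;

public static class ImmutabilityDemo
{
    // Returns the value still seen through the old reference after +=.
    public static string AliasAfterStringAppend()
    {
        string original = "abc";
        string alias = original; // second reference to the same string object
        original += "def";       // allocates a new string; "abc" itself is untouched
        return alias;
    }

    // Returns the value seen through a second reference to the same builder.
    public static string AliasAfterBuilderAppend()
    {
        var builder = new StringBuilder("abc");
        StringBuilder alias = builder;
        builder.Append("def");   // mutates the one shared buffer
        return alias.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(AliasAfterStringAppend());  // abc
        Console.WriteLine(AliasAfterBuilderAppend()); // abcdef
    }
}
```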
Note: tested in Framework 4.8
[MemoryDiagnoser]
public class StringConcatSimple
{
private string
title = "Mr.", firstName = "David", middleName = "Patrick", lastName = "Callan";
[Benchmark]
public string FastConcat()
{
return FastConcat(
title, " ",
firstName, " ",
middleName, " ",
lastName);
}
[Benchmark]
public string StringBuilder()
{
var stringBuilder =
new StringBuilder();
return stringBuilder
.Append(title).Append(' ')
.Append(firstName).Append(' ')
.Append(middleName).Append(' ')
.Append(lastName).ToString();
}
[Benchmark]
public string StringBuilderExact24()
{
var stringBuilder =
new StringBuilder(24);
return stringBuilder
.Append(title).Append(' ')
.Append(firstName).Append(' ')
.Append(middleName).Append(' ')
.Append(lastName).ToString();
}
[Benchmark]
public string StringBuilderEstimate100()
{
var stringBuilder =
new StringBuilder(100);
return stringBuilder
.Append(title).Append(' ')
.Append(firstName).Append(' ')
.Append(middleName).Append(' ')
.Append(lastName).ToString();
}
[Benchmark]
public string StringPlus()
{
return title + ' ' + firstName + ' ' +
middleName + ' ' + lastName;
}
[Benchmark]
public string StringFormat()
{
return string.Format("{0} {1} {2} {3}",
title, firstName, middleName, lastName);
}
[Benchmark]
public string StringInterpolation()
{
return
$"{title} {firstName} {middleName} {lastName}";
}
[Benchmark]
public string StringJoin()
{
return string.Join(" ", title, firstName,
middleName, lastName);
}
[Benchmark]
public string StringConcat()
{
return string.
Concat(new String[]
{ title, " ", firstName, " ",
middleName, " ", lastName });
}
}
Yes, it uses unsafe code:
public static unsafe string FastConcat(string str1, string str2, string str3, string str4, string str5, string str6, string str7)
{
var capacity = 0;
var str1Length = 0;
var str2Length = 0;
var str3Length = 0;
var str4Length = 0;
var str5Length = 0;
var str6Length = 0;
var str7Length = 0;
if (str1 != null)
{
str1Length = str1.Length;
capacity = str1Length;
}
if (str2 != null)
{
str2Length = str2.Length;
capacity += str2Length;
}
if (str3 != null)
{
str3Length = str3.Length;
capacity += str3Length;
}
if (str4 != null)
{
str4Length = str4.Length;
capacity += str4Length;
}
if (str5 != null)
{
str5Length = str5.Length;
capacity += str5Length;
}
if (str6 != null)
{
str6Length = str6.Length;
capacity += str6Length;
}
if (str7 != null)
{
str7Length = str7.Length;
capacity += str7Length;
}
string result = new string(' ', capacity);
fixed (char* dest = result)
{
var x = dest;
if (str1Length > 0)
{
fixed (char* src = str1)
{
Unsafe.CopyBlock(x, src, (uint)str1Length * 2);
x += str1Length;
}
}
if (str2Length > 0)
{
fixed (char* src = str2)
{
Unsafe.CopyBlock(x, src, (uint)str2Length * 2);
x += str2Length;
}
}
if (str3Length > 0)
{
fixed (char* src = str3)
{
Unsafe.CopyBlock(x, src, (uint)str3Length * 2);
x += str3Length;
}
}
if (str4Length > 0)
{
fixed (char* src = str4)
{
Unsafe.CopyBlock(x, src, (uint)str4Length * 2);
x += str4Length;
}
}
if (str5Length > 0)
{
fixed (char* src = str5)
{
Unsafe.CopyBlock(x, src, (uint)str5Length * 2);
x += str5Length;
}
}
if (str6Length > 0)
{
fixed (char* src = str6)
{
Unsafe.CopyBlock(x, src, (uint)str6Length * 2);
x += str6Length;
}
}
if (str7Length > 0)
{
fixed (char* src = str7)
{
Unsafe.CopyBlock(x, src, (uint)str7Length * 2);
}
}
}
return result;
}
You can edit the method and adapt it to your case. For example you can make it something like
public static unsafe string FastConcat(string str1, string str2, string str3 = null, string str4 = null, string str5 = null, string str6 = null, string str7 = null)
For just two strings, you definitely do not want to use StringBuilder. There is some threshold above which the StringBuilder overhead is less than the overhead of allocating multiple strings.
So, for more than 2-3 strings, use DannySmurf's code. Otherwise, just use the + operator.
It really depends on your usage pattern.
A detailed benchmark between string.Join, string.Concat and string.Format can be found here: String.Format Isn't Suitable for Intensive Logging
(This is actually the same answer I gave to this question)
It would depend on the code.
StringBuilder is more efficient generally, but if you're only concatenating a few strings and doing it all in one line, code optimizations will likely take care of it for you. It's important to think about how the code looks too: for larger sets StringBuilder will make it easier to read, for small ones StringBuilder will just add needless clutter.
Just out of curiosity (not really expecting a measurable result): which of the following snippets performs better?
private void ReplaceChar(ref string replaceMe) {
if (replaceMe.Contains('a')) {
replaceMe=replaceMe.Replace('a', 'b');
}
}
private void ReplaceString(ref string replaceMe) {
if (replaceMe.Contains("a")) {
replaceMe=replaceMe.Replace("a", "b");
}
}
In the first example I use char, while in the second I use strings in Contains() and Replace().
Would the first one perform better because char consumes less memory, or does the second perform better because the compiler does not have to cast in this operation?
(Or is this all nonsense, because the CLR generates the same code for both variations?)
If you have two horses and want to know which is faster...
String replaceMe = new String('a', 10000000) +
new String('b', 10000000) +
new String('a', 10000000);
Stopwatch sw = new Stopwatch();
sw.Start();
// String replacement
if (replaceMe.Contains("a")) {
replaceMe = replaceMe.Replace("a", "b");
}
// Char replacement
//if (replaceMe.Contains('a')) {
// replaceMe = replaceMe.Replace('a', 'b');
//}
sw.Stop();
Console.Write(sw.ElapsedMilliseconds);
I got 60 ms for the Char replacement and 500 ms for the String one (Core i5 3.2GHz, 64-bit, .NET 4.6). So
replaceMe = replaceMe.Replace('a', 'b')
is about 8 times faster.
We can't know for sure without testing the code, since most of the replacing is done inside the CLR and it is heavily optimized.
What we can say is this: replacing a char has some performance benefits, since the code is simpler and the outcome is more predictable. For example, replacing a char will always yield the same number of characters as the original.
In the end, the performance of the replacement itself doesn't matter too much. In a tight loop, the allocation and garbage collection of the 'old' string will have a bigger impact than the replacement itself.
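The point about predictable output length can be sketched as follows. `ReplaceDemo` and its two methods are hypothetical helpers, and the snippet illustrates only the length behaviour, not the timing.

```csharp
using System;

public static class ReplaceDemo
{
    // Char overload: every 'a' becomes exactly one 'b', so the length never changes.
    public static string ReplaceChar(string s)
    {
        return s.Replace('a', 'b');
    }

    // String overload: the replacement may be longer or shorter than the match,
    // so the result length is not known in advance.
    public static string ReplaceString(string s)
    {
        return s.Replace("a", "bb");
    }

    public static void Main()
    {
        Console.WriteLine(ReplaceChar("banana"));   // bbnbnb (6 characters, same as the input)
        Console.WriteLine(ReplaceString("banana")); // bbbnbbnbb (9 characters)
    }
}
```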
In my project I am looping across a dataview result.
string html = string.Empty;
DataView dV = data.DefaultView;
for (int i = 0; i < dV.Count; i++)
{
    DataRowView rv = dV[i];
    html += rv.Row["X"].ToString();
}
The number of rows in dV will always be 3 or 4.
Is it better to use the string concat += operator or StringBuilder for this case, and why?
I would use StringBuilder here, just because it describes what you're doing.
For a simple concatenation of 3 or 4 strings, it probably won't make any significant difference, and string concatenation may even be slightly faster - but if you're wrong and there are lots of rows, StringBuilder will start getting much more efficient, and it's always more descriptive of what you're doing.
Alternatively, use something like:
string html = string.Join("", dV.Cast<DataRowView>()
.Select(rv => rv.Row["X"]));
Note that you don't have any sort of separator between the strings at the moment. Are you sure that's what you want? (Also note that your code doesn't make a lot of sense at the moment - you're not using i in the loop. Why?)
I have an article about string concatenation which goes into more detail about why it's worth using StringBuilder and when.
EDIT: For those who doubt that string concatenation can be faster, here's a test - with deliberately "nasty" data, but just to prove it's possible:
using System;
using System.Diagnostics;
using System.Text;
class Test
{
static readonly string[] Bits = {
"small string",
"string which is a bit longer",
"stirng which is longer again to force yet another copy with any luck"
};
static readonly int ExpectedLength = string.Join("", Bits).Length;
static void Main()
{
Time(StringBuilderTest);
Time(ConcatenateTest);
}
static void Time(Action action)
{
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
// Make sure it's JITted
action();
Stopwatch sw = Stopwatch.StartNew();
for (int i = 0; i < 10000000; i++)
{
action();
}
sw.Stop();
Console.WriteLine("{0}: {1} millis", action.Method.Name,
(long) sw.Elapsed.TotalMilliseconds);
}
static void ConcatenateTest()
{
string x = "";
foreach (string bit in Bits)
{
x += bit;
}
// Force a validation to prevent dodgy optimizations
if (x.Length != ExpectedLength)
{
throw new Exception("Eek!");
}
}
static void StringBuilderTest()
{
StringBuilder builder = new StringBuilder();
foreach (string bit in Bits)
{
builder.Append(bit);
}
string x = builder.ToString();
// Force a validation to prevent dodgy optimizations
if (x.Length != ExpectedLength)
{
throw new Exception("Eek!");
}
}
}
Results on my machine (compiled with /o+ /debug-):
StringBuilderTest: 2245 millis
ConcatenateTest: 989 millis
I've run this several times, including reversing the order of the tests, and the results are consistent.
StringBuilder is recommended, but why not measure it yourself and then decide what is best for you?
// Version 1: string concatenation
var stopWatch = new Stopwatch();
stopWatch.Start();
string html = string.Empty;
DataView dV = data.DefaultView;
for (int i = 0; i < dV.Count; i++)
{
    html += dV[i].Row["X"].ToString();
}
stopWatch.Stop();
Console.Write(stopWatch.ElapsedMilliseconds);
// Version 2: StringBuilder (run separately from version 1)
var stopWatch = new Stopwatch();
stopWatch.Start();
StringBuilder html = new StringBuilder();
DataView dV = data.DefaultView;
for (int i = 0; i < dV.Count; i++)
{
    html.Append(dV[i].Row["X"].ToString());
}
var finalHtml = html.ToString();
stopWatch.Stop();
Console.Write(stopWatch.ElapsedMilliseconds);
From the Documentation:
The String class is preferable for a concatenation operation if a
fixed number of String objects are concatenated. In that case, the
individual concatenation operations might even be combined into a
single operation by the compiler.
A StringBuilder object is preferable for a concatenation operation if
an arbitrary number of strings are concatenated; for example, if a
loop concatenates a random number of strings of user input.
So in your case I would say that String is better.
EDIT:
This is a never-ending discussion. In any case, I would recommend checking how many operations you perform on average and testing the performance of each approach to compare the results.
Check this nice link regarding this issue including some performance test code.
StringBuilder for sure. Strings are immutable, remember!
EDIT: For 3-4 rows, concatenation will be a preferred choice as Jon Skeet has said in his answer
StringBuilder is recommended. It is mutable. It should place much less stress on the memory allocator :-)
A string instance is immutable. You cannot change it after it was
created.
Any operation that appears to change the string instead returns a new instance.
StringBuilder is what you are looking for. In general, if a function already exists for a job, try to use it instead of writing a procedure that does much the same thing.
I've written a class for processing strings and I have the following problem: the string passed in can come with spaces at the beginning and at the end of the string.
I need to trim the spaces from the strings and convert them to lower case letters. My code so far:
var searchStr = wordToSearchReplacemntsFor.ToLower();
searchStr = searchStr.Trim();
I couldn't find any function to help me in StringBuilder. The problem is that this class is supposed to process a lot of strings as quickly as possible. So I don't want to be creating 2 new strings for each string the class processes.
If this isn't possible, I'll go deeper into the processing algorithm.
Try method chaining.
Ex:
var s = " YoUr StRiNg".Trim().ToLower();
Cyberdrew has the right idea. With string being immutable, you'll be allocating memory during both of those calls regardless. One thing I'd like to suggest, if you're going to call string.Trim().ToLower() in many locations in your code, is to simplify your calls with extension methods. For example:
public static class MyExtensions
{
public static string TrimAndLower(this String str)
{
return str.Trim().ToLower();
}
}
Here's my attempt. But before I would check this in, I would ask two very important questions.
Are sequential "String.Trim" and "String.ToLower" calls really impacting the performance of my app? Would anyone notice if this algorithm was twice as slow or twice as fast? The only way to know is to measure the performance of my code and compare against pre-set performance goals. Otherwise, micro-optimizations will generate micro-performance gains.
Just because I wrote an implementation that appears faster, doesn't mean that it really is. The compiler and run-time may have optimizations around common operations that I don't know about. I should compare the running time of my code to what already exists.
static public string TrimAndLower(string str)
{
if (str == null)
{
return null;
}
int i = 0;
int j = str.Length - 1;
StringBuilder sb;
while (i < str.Length)
{
if (Char.IsWhiteSpace(str[i])) // or say "if (str[i] == ' ')" if you only care about spaces
{
i++;
}
else
{
break;
}
}
while (j > i)
{
if (Char.IsWhiteSpace(str[j])) // or say "if (str[j] == ' ')" if you only care about spaces
{
j--;
}
else
{
break;
}
}
if (i > j)
{
return "";
}
sb = new StringBuilder(j - i + 1);
while (i <= j)
{
// I originally checked IsUpper before calling ToLower; that is probably not needed
sb.Append(Char.ToLower(str[i]));
i++;
}
return sb.ToString();
}
If the strings use only ASCII characters, you can look at the C# ToLower Optimization. You could also try a lookup table if you know the character set ahead of time
So first of all, trim first and lowercase second, so that ToLower() has to iterate over a smaller string.
Other than that, I think your best algorithm would look like this:
Iterate over the string once, and check
whether there's any upper case characters
whether there's whitespace in beginning and end (and count how many chars you're talking about)
if none of the above, return the original string
if upper case but no whitespace: do ToLower and return
if whitespace:
allocate a new string with the right size (original length - number of white chars)
fill it in while doing the ToLower
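The steps above can be sketched roughly as follows. This is one possible reading, not the answerer's code: the names `TrimLower` and `TrimAndLower` are made up, it scans the trimmed range separately rather than in a single pass, and it lowers each char with `char.ToLowerInvariant`, which is an assumption that may differ from the culture-sensitive `ToLower()` in the question.

```csharp
using System;

public static class TrimLower
{
    public static string TrimAndLower(string s)
    {
        if (string.IsNullOrEmpty(s)) return s;

        // Find the first and last non-whitespace characters.
        int start = 0;
        int end = s.Length - 1;
        while (start <= end && char.IsWhiteSpace(s[start])) start++;
        while (end >= start && char.IsWhiteSpace(s[end])) end--;
        if (start > end) return string.Empty; // whitespace only

        // Check whether any character in the trimmed range would change when lowered.
        bool needsLower = false;
        for (int i = start; i <= end; i++)
        {
            if (char.ToLowerInvariant(s[i]) != s[i]) { needsLower = true; break; }
        }

        int length = end - start + 1;
        if (!needsLower)
        {
            // Nothing to lower: return the original instance, or a single substring if trimmed.
            return length == s.Length ? s : s.Substring(start, length);
        }

        // Trim and lower in one copy into a buffer of the right size.
        var buffer = new char[length];
        for (int i = 0; i < length; i++)
        {
            buffer[i] = char.ToLowerInvariant(s[start + i]);
        }
        return new string(buffer);
    }

    public static void Main()
    {
        Console.WriteLine("[" + TrimAndLower("  YoUr StRiNg ") + "]"); // [your string]
    }
}
```

Note that `new string(buffer)` still copies the buffer, so the worst case is one temporary array plus the result, which is comparable to Trim() followed by ToLower(). The gain, if any, comes from the early-return paths, so it is worth measuring on real input before adopting it.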
You can try this:
public static void Main (string[] args) {
var str = "fr, En, gB";
Console.WriteLine(str.Replace(" ","").ToLower());
}
I am trying to create a utility method to perform mail merge-like functionality on a template file. Since strings are immutable I'm unsure if I've written it properly - can somebody take a glance and give me feedback?
public static string LoadTemplateFile(string fileName,
NameValueCollection mergeFields)
{
string result = System.IO.File.ReadAllText(fileName);
if (mergeFields != null)
{
for (int index = 0; index < mergeFields.Count; index++)
{
result = result.Replace(mergeFields.Keys[index],
mergeFields[index]);
}
}
return result;
}
You'd probably do better to use a StringBuilder instead of a string.
public static string LoadTemplateFile(
string fileName, NameValueCollection mergeFields)
{
System.Text.StringBuilder result = new System.Text.StringBuilder(
System.IO.File.ReadAllText(fileName));
if (mergeFields != null)
{
for (int index = 0; index < mergeFields.Count; index++)
{
result.Replace(mergeFields.Keys[index],
mergeFields[index]);
}
}
return result.ToString();
}
It looks like you are attempting to
Read a file from disk
Do a search / replace based on a provided name / value map
If that's the case then yes this will work just fine.
The only real feedback I have is that depending on the number of replacement name / value pairs, you're going to be creating a lot of temporary strings. This is probably fine for small files but once you start loading relatively large files into your application you may see an appreciable difference.
A better approach would be to use a StringBuilder and do the Replace calls on that object. It would reduce the unnecessary creation of temporary strings.
Use StringBuilder instead of string. That is my only advice; it's way faster.