Related
What's the most efficient way to concatenate strings?
Rico Mariani, the .NET Performance guru, had an article on this very subject. It's not as simple as one might suspect. The basic advice is this:
If your pattern looks like:
x = f1(...) + f2(...) + f3(...) + f4(...)
that's one concat and it's zippy, StringBuilder probably won't help.
If your pattern looks like:
if (...) x += f1(...)
if (...) x += f2(...)
if (...) x += f3(...)
if (...) x += f4(...)
then you probably want StringBuilder.
Yet another article to support this claim comes from Eric Lippert where he describes the optimizations performed on one line + concatenations in a detailed manner.
The StringBuilder.Append() method is much better than using the + operator. But I've found that, when executing 1000 concatenations or less, String.Join() is even more efficient than StringBuilder.
StringBuilder sb = new StringBuilder();
sb.Append(someString);
The only problem with String.Join is that you have to concatenate the strings with a common delimiter.
Edit: as #ryanversaw pointed out, you can make the delimiter string.Empty.
string key = String.Join("_", new String[]
{ "Customers_Contacts", customerID, database, SessionID });
There are 6 types of string concatenations:
Using the plus (+) symbol.
Using string.Concat().
Using string.Join().
Using string.Format().
Using string.Append().
Using StringBuilder.
In an experiment, it has been proved that string.Concat() is the best way to approach if the words are less than 1000(approximately) and if the words are more than 1000 then StringBuilder should be used.
For more information, check this site.
string.Join() vs string.Concat()
The string.Concat method here is equivalent to the string.Join method invocation with an empty separator. Appending an empty string is fast, but not doing so is even faster, so the string.Concat method would be superior here.
From Chinh Do - StringBuilder is not always faster:
Rules of Thumb
When concatenating three dynamic string values or less, use traditional string concatenation.
When concatenating more than three dynamic string values, use StringBuilder.
When building a big string from several string literals, use either the # string literal or the inline + operator.
Most of the time StringBuilder is your best bet, but there are cases as shown in that post that you should at least think about each situation.
If you're operating in a loop, StringBuilder is probably the way to go; it saves you the overhead of creating new strings regularly. In code that'll only run once, though, String.Concat is probably fine.
However, Rico Mariani (.NET optimization guru) made up a quiz in which he stated at the end that, in most cases, he recommends String.Format.
Here is the fastest method I've evolved over a decade for my large-scale NLP app. I have variations for IEnumerable<T> and other input types, with and without separators of different types (Char, String), but here I show the simple case of concatenating all strings in an array into a single string, with no separator. Latest version here is developed and unit-tested on C# 7 and .NET 4.7.
There are two keys to higher performance; the first is to pre-compute the exact total size required. This step is trivial when the input is an array as shown here. For handling IEnumerable<T> instead, it is worth first gathering the strings into a temporary array for computing that total (The array is required to avoid calling ToString() more than once per element since technically, given the possibility of side-effects, doing so could change the expected semantics of a 'string join' operation).
Next, given the total allocation size of the final string, the biggest boost in performance is gained by building the result string in-place. Doing this requires the (perhaps controversial) technique of temporarily suspending the immutability of a new String which is initially allocated full of zeros. Any such controversy aside, however...
...note that this is the only bulk-concatenation solution on this page which entirely avoids an extra round of allocation and copying by the String constructor.
Complete code:
/// <summary>
/// Concatenate the strings in 'rg', none of which may be null, into a single String.
/// </summary>
public static unsafe String StringJoin(this String[] rg)
{
int i;
if (rg == null || (i = rg.Length) == 0)
return String.Empty;
if (i == 1)
return rg[0];
String s, t;
int cch = 0;
do
cch += rg[--i].Length;
while (i > 0);
if (cch == 0)
return String.Empty;
i = rg.Length;
fixed (Char* _p = (s = new String(default(Char), cch)))
{
Char* pDst = _p + cch;
do
if ((t = rg[--i]).Length > 0)
fixed (Char* pSrc = t)
memcpy(pDst -= t.Length, pSrc, (UIntPtr)(t.Length << 1));
while (pDst > _p);
}
return s;
}
[DllImport("MSVCR120_CLR0400", CallingConvention = CallingConvention.Cdecl)]
static extern unsafe void* memcpy(void* dest, void* src, UIntPtr cb);
I should mention that this code has a slight modification from what I use myself. In the original, I call the cpblk IL instruction from C# to do the actual copying. For simplicity and portability in the code here, I replaced that with P/Invoke memcpy instead, as you can see. For highest performance on x64 (but maybe not x86) you may want to use the cpblk method instead.
From this MSDN article:
There is some overhead associated with
creating a StringBuilder object, both
in time and memory. On a machine with
fast memory, a StringBuilder becomes
worthwhile if you're doing about five
operations. As a rule of thumb, I
would say 10 or more string operations
is a justification for the overhead on
any machine, even a slower one.
So if you trust MSDN go with StringBuilder if you have to do more than 10 strings operations/concatenations - otherwise simple string concat with '+' is fine.
Try this 2 pieces of code and you will find the solution.
static void Main(string[] args)
{
StringBuilder s = new StringBuilder();
for (int i = 0; i < 10000000; i++)
{
s.Append( i.ToString());
}
Console.Write("End");
Console.Read();
}
Vs
static void Main(string[] args)
{
string s = "";
for (int i = 0; i < 10000000; i++)
{
s += i.ToString();
}
Console.Write("End");
Console.Read();
}
You will find that 1st code will end really quick and the memory will be in a good amount.
The second code maybe the memory will be ok, but it will take longer... much longer.
So if you have an application for a lot of users and you need speed, use the 1st. If you have an app for a short term one user app, maybe you can use both or the 2nd will be more "natural" for developers.
Cheers.
It's also important to point it out that you should use the + operator if you are concatenating string literals.
When you concatenate string literals or string constants by using the + operator, the compiler creates a single string. No run time concatenation occurs.
How to: Concatenate Multiple Strings (C# Programming Guide)
Adding to the other answers, please keep in mind that StringBuilder can be told an initial amount of memory to allocate.
The capacity parameter defines the maximum number of characters that can be stored in the memory allocated by the current instance. Its value is assigned to the Capacity property. If the number of characters to be stored in the current instance exceeds this capacity value, the StringBuilder object allocates additional memory to store them.
If capacity is zero, the implementation-specific default capacity is used.
Repeatedly appending to a StringBuilder that hasn't been pre-allocated can result in a lot of unnecessary allocations just like repeatedly concatenating regular strings.
If you know how long the final string will be, can trivially calculate it, or can make an educated guess about the common case (allocating too much isn't necessarily a bad thing), you should be providing this information to the constructor or the Capacity property. Especially when running performance tests to compare StringBuilder with other methods like String.Concat, which do the same thing internally. Any test you see online which doesn't include StringBuilder pre-allocation in its comparisons is wrong.
If you can't make any kind of guess about the size, you're probably writing a utility function which should have its own optional argument for controlling pre-allocation.
Following may be one more alternate solution to concatenate multiple strings.
String str1 = "sometext";
string str2 = "some other text";
string afterConcate = $"{str1}{str2}";
string interpolation
Another solution:
inside the loop, use List instead of string.
List<string> lst= new List<string>();
for(int i=0; i<100000; i++){
...........
lst.Add(...);
}
return String.Join("", lst.ToArray());;
it is very very fast.
The most efficient is to use StringBuilder, like so:
StringBuilder sb = new StringBuilder();
sb.Append("string1");
sb.Append("string2");
...etc...
String strResult = sb.ToString();
#jonezy: String.Concat is fine if you have a couple of small things. But if you're concatenating megabytes of data, your program will likely tank.
System.String is immutable. When we modify the value of a string variable then a new memory is allocated to the new value and the previous memory allocation released. System.StringBuilder was designed to have concept of a mutable string where a variety of operations can be performed without allocation separate memory location for the modified string.
I've tested all the methods in this page and at the end I've developed my solution that is the fastest and less memory expensive.
Note: tested in Framework 4.8
[MemoryDiagnoser]
public class StringConcatSimple
{
private string
title = "Mr.", firstName = "David", middleName = "Patrick", lastName = "Callan";
[Benchmark]
public string FastConcat()
{
return FastConcat(
title, " ",
firstName, " ",
middleName, " ",
lastName);
}
[Benchmark]
public string StringBuilder()
{
var stringBuilder =
new StringBuilder();
return stringBuilder
.Append(title).Append(' ')
.Append(firstName).Append(' ')
.Append(middleName).Append(' ')
.Append(lastName).ToString();
}
[Benchmark]
public string StringBuilderExact24()
{
var stringBuilder =
new StringBuilder(24);
return stringBuilder
.Append(title).Append(' ')
.Append(firstName).Append(' ')
.Append(middleName).Append(' ')
.Append(lastName).ToString();
}
[Benchmark]
public string StringBuilderEstimate100()
{
var stringBuilder =
new StringBuilder(100);
return stringBuilder
.Append(title).Append(' ')
.Append(firstName).Append(' ')
.Append(middleName).Append(' ')
.Append(lastName).ToString();
}
[Benchmark]
public string StringPlus()
{
return title + ' ' + firstName + ' ' +
middleName + ' ' + lastName;
}
[Benchmark]
public string StringFormat()
{
return string.Format("{0} {1} {2} {3}",
title, firstName, middleName, lastName);
}
[Benchmark]
public string StringInterpolation()
{
return
$"{title} {firstName} {middleName} {lastName}";
}
[Benchmark]
public string StringJoin()
{
return string.Join(" ", title, firstName,
middleName, lastName);
}
[Benchmark]
public string StringConcat()
{
return string.
Concat(new String[]
{ title, " ", firstName, " ",
middleName, " ", lastName });
}
}
Yes, it use unsafe
public static unsafe string FastConcat(string str1, string str2, string str3, string str4, string str5, string str6, string str7)
{
var capacity = 0;
var str1Length = 0;
var str2Length = 0;
var str3Length = 0;
var str4Length = 0;
var str5Length = 0;
var str6Length = 0;
var str7Length = 0;
if (str1 != null)
{
str1Length = str1.Length;
capacity = str1Length;
}
if (str2 != null)
{
str2Length = str2.Length;
capacity += str2Length;
}
if (str3 != null)
{
str3Length = str3.Length;
capacity += str3Length;
}
if (str4 != null)
{
str4Length = str4.Length;
capacity += str4Length;
}
if (str5 != null)
{
str5Length = str5.Length;
capacity += str5Length;
}
if (str6 != null)
{
str6Length = str6.Length;
capacity += str6Length;
}
if (str7 != null)
{
str7Length = str7.Length;
capacity += str7Length;
}
string result = new string(' ', capacity);
fixed (char* dest = result)
{
var x = dest;
if (str1Length > 0)
{
fixed (char* src = str1)
{
Unsafe.CopyBlock(x, src, (uint)str1Length * 2);
x += str1Length;
}
}
if (str2Length > 0)
{
fixed (char* src = str2)
{
Unsafe.CopyBlock(x, src, (uint)str2Length * 2);
x += str2Length;
}
}
if (str3Length > 0)
{
fixed (char* src = str3)
{
Unsafe.CopyBlock(x, src, (uint)str3Length * 2);
x += str3Length;
}
}
if (str4Length > 0)
{
fixed (char* src = str4)
{
Unsafe.CopyBlock(x, src, (uint)str4Length * 2);
x += str4Length;
}
}
if (str5Length > 0)
{
fixed (char* src = str5)
{
Unsafe.CopyBlock(x, src, (uint)str5Length * 2);
x += str5Length;
}
}
if (str6Length > 0)
{
fixed (char* src = str6)
{
Unsafe.CopyBlock(x, src, (uint)str6Length * 2);
x += str6Length;
}
}
if (str7Length > 0)
{
fixed (char* src = str7)
{
Unsafe.CopyBlock(x, src, (uint)str7Length * 2);
}
}
}
return result;
}
You can edit the method and adapt it to your case. For example you can make it something like
public static unsafe string FastConcat(string str1, string str2, string str3 = null, string str4 = null, string str5 = null, string str6 = null, string str7 = null)
For just two strings, you definitely do not want to use StringBuilder. There is some threshold above which the StringBuilder overhead is less than the overhead of allocating multiple strings.
So, for more that 2-3 strings, use DannySmurf's code. Otherwise, just use the + operator.
It really depends on your usage pattern.
A detailed benchmark between string.Join, string,Concat and string.Format can be found here: String.Format Isn't Suitable for Intensive Logging
(This is actually the same answer I gave to this question)
It would depend on the code.
StringBuilder is more efficient generally, but if you're only concatenating a few strings and doing it all in one line, code optimizations will likely take care of it for you. It's important to think about how the code looks too: for larger sets StringBuilder will make it easier to read, for small ones StringBuilder will just add needless clutter.
I want an efficient way of grouping strings whilst keeping duplicates and order.
Something like this
1100110002200 -> 101020
I tried this previously
_case.GroupBy(c => c).Select(g => g.Key)
but I got 102
But this gives me what I want, I just want to optimize it, so I wouldn't have to scour the entire list each time
static List<char> group(string _case)
{
var groups = new List<char>();
for (int i = 0; i < _case.Length; i++)
{
if (groups.LastOrDefault() != _case[i])
groups.Add(_case[i]);
}
return groups;
}
While I like the elegant solution of rshepp, it turns out that the very basic code can run even 5 times faster than that.
public static string Simplify2(string str)
{
if (string.IsNullOrEmpty(str)) { return str; }
StringBuilder sb = new StringBuilder();
char last = str[0];
sb.Append(last);
foreach (char c in str)
{
if (last != c)
{
sb.Append(c);
last = c;
}
}
return sb.ToString();
}
You could create a method that loops each character and checks the previous character for equality. If they aren't the same, append/yield return the character. This is pretty easy to do with Linq.
public static string Simplify(string str)
{
return string.Concat(str.Where((c, i) => i == 0 || c != str[i - 1]));
}
Usage:
string simplified = Simplify("1100110002200");
// 101020
In my testing, my method and yours are roughly equal in speed, mine being insignificantly slower after 10 million executions (4260ms vs 4241ms).
However, my method returns the result as a string whereas yours doesn't. If you need to convert your result back to a string (which is likely) then my method is indeed much faster/more efficient (4260ms vs 6569ms).
This question already has answers here:
How can I generate random alphanumeric strings?
(36 answers)
Closed 2 years ago.
My ASP.NET application requires me to generate a huge number of random strings such that each contain at least 1 alphabetic and numeric character and should be alphanumeric on the whole.
For this my logic is to generate the code again if the random string is numeric:
public static string GenerateCode(int length)
{
if (length < 2 || length > 32)
{
throw new RSGException("Length cannot be less than 2 or greater than 32.");
}
string newcode = Guid.NewGuid().ToString("n").Substring(0, length).ToUpper();
return newcode;
}
public static string GenerateNonNumericCode(int length)
{
string newcode = string.Empty;
try
{
newcode = GenerateCode(length);
}
catch (Exception)
{
throw;
}
while (IsNumeric(newcode))
{
return GenerateNonNumericCode(length);
}
return newcode;
}
public static bool IsNumeric(string str)
{
bool isNumeric = false;
try
{
long number = Convert.ToInt64(str);
isNumeric = true;
}
catch (Exception)
{
isNumeric = false;
}
return isNumeric;
}
While debugging, it is working properly but when I ask it to create 10,000 random strings, its not able to handle it properly. When I export that data to Excel, I find at least 20 strings on an average that are numeric.
Is it a problem with my code or C#? - Mine.
If anyone's looking for code,
public static string GenerateCode(int length)
{
if (length < 2)
{
throw new A1Exception("Length cannot be less than 2.");
}
var chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
var random = new Random();
var result = new string(
Enumerable.Repeat(chars, length)
.Select(s => s[random.Next(s.Length)])
.ToArray());
return result;
}
public static string GenerateAlphaNumericCode(int length)
{
string newcode = string.Empty;
try
{
newcode = GenerateCode(length);
while (!IsAlphaNumeric(newcode))
{
newcode = GenerateCode(length);
}
}
catch (Exception)
{
throw;
}
return newcode;
}
public static bool IsAlphaNumeric(string str)
{
bool isAlphaNumeric = false;
Regex reg = new Regex("[0-9A-Z]+");
isAlphaNumeric = reg.IsMatch(str);
return isAlphaNumeric;
}
Thanks to all for your ideas.
If you want to stick with the Guid as the generator, you could always validate using a Regex
This will only return true if at least one alpha is present
Regex reg = new Regex("[a-zA-Z]+");
Then just use the IsMatch method to see if your string is valid
That way you don't need the (IMHO rather ugly) try..catch around the Convert.
Update : I see your subsequent comment about actually making your code slower. Are you instantiating the Regex object only once, or every time that the test is being done? If the latter then this will be rather inefficient, and you should consider using a "lazy-loaded" property on your class, e.g.
private Regex reg;
private Regex AlphaRegex
{
get
{
if (reg == null) reg = new Regex("[a-zA-Z]+");
return reg;
}
}
Then just use AlphaRegex.IsMatch() in your method. I would expect this to make a difference.
use name space then using System.Linq; use normal string
check whether the string consist at lest one character or number.
using System.Linq;
string StrCheck = "abcd123";
check the string has characters ---> StrCheck.Any(char.IsLetter)
check the string has numbers ---> StrCheck.Any(char.IsDigit)
if (StrCheck.Any(char.IsLetter) && StrCheck.Any(char.IsDigit))
{
//statement goes here.....
}
sorry for the late reply ...
I didn't quite understand what you want in the string except letters (abc etc) - lets say numbers.
You can generate a random character as following:
Random r = new Random();
r.Next('a', 'z'); //For lowercase
r.Next('A', 'Z'); //For capitals
//or you can convert lowercase to capital:
char c = 'k' + ('A' - 'a');
If you want to create a string:
var s = new StringBuilder();
for(int i = 0; i < length; ++i)
s.Append((char)r.Next('a', 'z' + 1)); //Changed to char
return s.ToString();
Note: I don't know ASP.NET so much, so I just act like it's C#.
To answer your question strictly, using your existing code: there is a problem with your recursion logic, which can be avoided by not using recursion (there is absolutely no reason to use recursion in GenerateNonNumericCode). Do the following instead:
public static string GenerateNonNumericCode(int length)
{
string newcode = GenerateCode(length);
while (IsNumeric(newcode))
{
newcode = GenerateCode(length);
}
return newcode;
}
Other General Notes
Your code is very inefficient--throwing exceptions is expensive, so using try/catch in a loop is therefore slow and pointless. As others have suggested, regex makes more sense (System.Text.RegularExpressions namespace).
Is it a problem with my code or C#?
When in doubt, the problem is almost never C#.
So, I would change the code to this:
static Random r = new Random();
public static string GenerateNonNumericCodeFaster(int length) {
var firstLength = r.Next(0, length - 1);
var secondLength = length - 1 - firstLength;
return GenerateCode(firstLength)
+ (char) r.Next((int)'A', (int)'G')
+ GenerateCode(secondLength);
}
You can keep your GenerateCode function as is. Everything else you toss out. The idea here of course is, rather than testing if the string contains an alphabetic character, you just explicitly PUT one in. In my tests, using this code could generate 10,000 8 character strings in 0.0172963 seconds compared to your code which takes around 52 seconds. So, yeah, this is about 3000 times faster :)
I were asked to do an StringToInt / Int.parse function on the white board in an job interview last week and did not perform very good but I came up with some sort of solution. Later when back home I made one in Visual Studion and I wonder if there are any better solution than mine below.
Have not bothred with any more error handling except checking that the string only contains digits.
private int StrToInt(string tmpString)
{
int tmpResult = 0;
System.Text.Encoding ascii = System.Text.Encoding.ASCII;
byte[] tmpByte = ascii.GetBytes(tmpString);
for (int i = 0; i <= tmpString.Length-1; i++)
{
// Check whatever the Character is an valid digit
if (tmpByte[i] > 47 && tmpByte[i] <= 58)
// Here I'm using the lenght-1 of the string to set the power and multiply this to the value
tmpResult += (tmpByte[i] - 48) * ((int)Math.Pow(10, (tmpString.Length-i)-1));
else
throw new Exception("Non valid character in string");
}
return tmpResult;
}
I'll take a contrarian approach.
public int? ToInt(this string mightBeInt)
{
int convertedInt;
if (int.TryParse(mightBeInt, out convertedInt))
{
return convertedInt;
}
return null;
}
After being told that this wasn't the point of the question, I'd argue that the question tests C coding skills, not C#. I'd further argue that treating strings as arrays of characters is a very bad habit in .NET, because strings are unicode, and in any application that might be globalized, making any assumption at all about character representations will get you in trouble, sooner or later. Further, the framework already provides a conversion method, and it will be more efficient and reliable than anything a developer would toss off in such a hurry. It's always a bad idea to re-invent framework functionality.
Then I would point out that by writing an extension method, I've created a very useful extension to the string class, something that I would actually use in production code.
If that argument loses me the job, I probably wouldn't want to work there anyway.
EDIT: As a couple of people have pointed out, I missed the "out" keyword in TryParse. Fixed.
Converting to a byte array is unnecessary, because a string is already an array of chars. Also, magic numbers such as 48 should be avoided in favor of readable constants such as '0'. Here's how I'd do it:
int result = 0;
for (int i = str.Length - 1, factor = 1; i >= 0; i--, factor *= 10)
result += (str[i] - '0') * factor;
For each character (starting from the end), add its numeric value times the correct power of 10 to the result. The power of 10 is calculated by multiplying it with 10 repeatedly, instead of unnecessarily using Math.Pow.
I think your solution is reasonably ok, but instead of doing math.pow, I would do:
tmpResult = 10 * tmpResult + (tmpByte[i] - 48);
Also, check the length against the length of tmpByte rather than tmpString. Not that it normally should matter, but it is rather odd to loop over one array while checking the length of another.
And, you could replace the for loop with a foreach statement.
If you want a simple non-framework using implementation, how 'bout this:
"1234".Aggregate(0, (s,c)=> c-'0'+10*s)
...and a note that you'd better be sure that the string consists solely of decimal digits before using this method.
Alternately, use an int? as the aggregate value to deal with error handling:
"12x34".Aggregate((int?)0, (s,c)=> c>='0'&&c<='9' ? c-'0'+10*s : null)
...this time with the note that empty strings evaluate to 0, which may not be most appropriate behavior - and no range checking or negative numbers are supported; both of which aren't hard to add but require unpretty looking wordy code :-).
Obviously, in practice you'd just use the built-in parsing methods. I actually use the following extension method and a bunch of nearly identical siblings in real projects:
public static int? ParseAsInt32(this string s, NumberStyles style, IFormatProvider provider) {
int val;
if (int.TryParse(s, style, provider, out val)) return val;
else return null;
}
Though this could be expressed slightly shorter using the ternary ? : operator doing so would mean relying on side-effects within an expression, which isn't a boon to readability in my experience.
Just because i like Linq:
string t = "1234";
var result = t.Select((c, i) => (c - '0') * Math.Pow(10, t.Length - i - 1)).Sum();
I agree with Cyclon Cat, they probably want someone who will utilize existing functionality.
But I would write the method a little bit different.
public int? ToInt(this string mightBeInt)
{
int number = 0;
if (Int32.TryParse(mightBeInt, out number))
return number;
return null;
}
Int32.TryParse does not allow properties to be given as out parameter.
I was asked this question over 9000 times on interviews :) This version is capable of handling negative numbers and handles other conditions very well:
public static int ToInt(string s)
{
bool isNegative = false, gotAnyDigit = false;
int result = 0;
foreach (var ch in s ?? "")
{
if(ch == '-' && !(gotAnyDigit || isNegative))
{
isNegative = true;
}
else if(char.IsDigit(ch))
{
result = result*10 + (ch - '0');
gotAnyDigit = true;
}
else
{
throw new ArgumentException("Not a number");
}
}
if (!gotAnyDigit)
throw new ArgumentException("Not a number");
return isNegative ? -result : result;
}
and a couple of lazy tests:
[TestFixture]
public class Tests
{
[Test]
public void CommonCases()
{
foreach (var sample in new[]
{
new {e = 123, s = "123"},
new {e = 110, s = "000110"},
new {e = -011000, s = "-011000"},
new {e = 0, s = "0"},
new {e = 1, s = "1"},
new {e = -2, s = "-2"},
new {e = -12223, s = "-12223"},
new {e = int.MaxValue, s = int.MaxValue.ToString()},
new {e = int.MinValue, s = int.MinValue.ToString()}
})
{
Assert.AreEqual(sample.e, Impl.ToInt(sample.s));
}
}
[Test]
public void BadCases()
{
var samples = new[] { "1231a", null, "", "a", "-a", "-", "12-23", "--1" };
var errCount = 0;
foreach (var sample in samples)
{
try
{
Impl.ToInt(sample);
}
catch(ArgumentException)
{
errCount++;
}
}
Assert.AreEqual(samples.Length, errCount);
}
}
Suppose I have a stringbuilder in C# that does this:
StringBuilder sb = new StringBuilder();
string cat = "cat";
sb.Append("the ").Append(cat).(" in the hat");
string s = sb.ToString();
would that be as efficient or any more efficient as having:
string cat = "cat";
string s = String.Format("The {0} in the hat", cat);
If so, why?
EDIT
After some interesting answers, I realised I probably should have been a little clearer in what I was asking. I wasn't so much asking for which was quicker at concatenating a string, but which is quicker at injecting one string into another.
In both cases above I want to inject one or more strings into the middle of a predefined template string.
Sorry for the confusion
NOTE: This answer was written when .NET 2.0 was the current version. This may no longer apply to later versions.
String.Format uses a StringBuilder internally:
public static string Format(IFormatProvider provider, string format, params object[] args)
{
if ((format == null) || (args == null))
{
throw new ArgumentNullException((format == null) ? "format" : "args");
}
StringBuilder builder = new StringBuilder(format.Length + (args.Length * 8));
builder.AppendFormat(provider, format, args);
return builder.ToString();
}
The above code is a snippet from mscorlib, so the question becomes "is StringBuilder.Append() faster than StringBuilder.AppendFormat()"?
Without benchmarking I'd probably say that the code sample above would run more quickly using .Append(). But it's a guess, try benchmarking and/or profiling the two to get a proper comparison.
This chap, Jerry Dixon, did some benchmarking:
http://jdixon.dotnetdevelopersjournal.com/string_concatenation_stringbuilder_and_stringformat.htm
Updated:
Sadly the link above has since died. However there's still a copy on the Way Back Machine:
http://web.archive.org/web/20090417100252/http://jdixon.dotnetdevelopersjournal.com/string_concatenation_stringbuilder_and_stringformat.htm
At the end of the day it depends whether your string formatting is going to be called repetitively, i.e. you're doing some serious text processing over 100's of megabytes of text, or whether it's being called when a user clicks a button now and again. Unless you're doing some huge batch processing job I'd stick with String.Format, it aids code readability. If you suspect a perf bottleneck then stick a profiler on your code and see where it really is.
From the MSDN documentation:
The performance of a concatenation operation for a String or StringBuilder object depends on how often a memory allocation occurs. A String concatenation operation always allocates memory, whereas a StringBuilder concatenation operation only allocates memory if the StringBuilder object buffer is too small to accommodate the new data. Consequently, the String class is preferable for a concatenation operation if a fixed number of String objects are concatenated. In that case, the individual concatenation operations might even be combined into a single operation by the compiler. A StringBuilder object is preferable for a concatenation operation if an arbitrary number of strings are concatenated; for example, if a loop concatenates a random number of strings of user input.
I ran some quick performance benchmarks, and for 100,000 operations averaged over 10 runs, the first method (String Builder) takes almost half the time of the second (String Format).
So, if this is infrequent, it doesn't matter. But if it is a common operation, then you may want to use the first method.
I would expect String.Format to be slower - it has to parse the string and then concatenate it.
Couple of notes:
Format is the way to go for user-visible strings in professional applications; this avoids localization bugs
If you know the length of the resultant string beforehand, use the StringBuilder(Int32) constructor to predefine the capacity
I think in most cases like this clarity, and not efficiency, should be your biggest concern. Unless you're crushing together tons of strings, or building something for a lower powered mobile device, this probably won't make much of a dent in your run speed.
I've found that, in cases where I'm building strings in a fairly linear fashion, either doing straight concatenations or using StringBuilder is your best option. I suggest this in cases where the majority of the string that you're building is dynamic. Since very little of the text is static, the most important thing is that it's clear where each piece of dynamic text is being put in case it needs updated in the future.
On the other hand, if you're talking about a big chunk of static text with two or three variables in it, even if it's a little less efficient, I think the clarity you gain from string.Format makes it worth it. I used this earlier this week when having to place one bit of dynamic text in the center of a 4 page document. It'll be easier to update that big chunk of text if its in one piece than having to update three pieces that you concatenate together.
If only because string.Format doesn't exactly do what you might think, here is a rerun of the tests 6 years later on Net45.
Concat is still fastest but really it's less than 30% difference. StringBuilder and Format differ by barely 5-10%. I got variations of 20% running the tests a few times.
Milliseconds, a million iterations:
Concatenation: 367
New stringBuilder for each key: 452
Cached StringBuilder: 419
string.Format: 475
The lesson I take away is that the performance difference is trivial and so it shouldn't stop you writing the simplest readable code you can. Which for my money is often but not always a + b + c.
const int iterations=1000000;
var keyprefix= this.GetType().FullName;
var maxkeylength=keyprefix + 1 + 1+ Math.Log10(iterations);
Console.WriteLine("KeyPrefix \"{0}\", Max Key Length {1}",keyprefix, maxkeylength);
var concatkeys= new string[iterations];
var stringbuilderkeys= new string[iterations];
var cachedsbkeys= new string[iterations];
var formatkeys= new string[iterations];
var stopwatch= new System.Diagnostics.Stopwatch();
Console.WriteLine("Concatenation:");
stopwatch.Start();
for(int i=0; i<iterations; i++){
var key1= keyprefix+":" + i.ToString();
concatkeys[i]=key1;
}
Console.WriteLine(stopwatch.ElapsedMilliseconds);
Console.WriteLine("New stringBuilder for each key:");
stopwatch.Restart();
for(int i=0; i<iterations; i++){
var key2= new StringBuilder(keyprefix).Append(":").Append(i.ToString()).ToString();
stringbuilderkeys[i]= key2;
}
Console.WriteLine(stopwatch.ElapsedMilliseconds);
Console.WriteLine("Cached StringBuilder:");
var cachedSB= new StringBuilder(maxkeylength);
stopwatch.Restart();
for(int i=0; i<iterations; i++){
var key2b= cachedSB.Clear().Append(keyprefix).Append(":").Append(i.ToString()).ToString();
cachedsbkeys[i]= key2b;
}
Console.WriteLine(stopwatch.ElapsedMilliseconds);
Console.WriteLine("string.Format");
stopwatch.Restart();
for(int i=0; i<iterations; i++){
var key3= string.Format("{0}:{1}", keyprefix,i.ToString());
formatkeys[i]= key3;
}
Console.WriteLine(stopwatch.ElapsedMilliseconds);
var referToTheComputedValuesSoCompilerCantOptimiseTheLoopsAway= concatkeys.Union(stringbuilderkeys).Union(cachedsbkeys).Union(formatkeys).LastOrDefault(x=>x[1]=='-');
Console.WriteLine(referToTheComputedValuesSoCompilerCantOptimiseTheLoopsAway);
String.Format uses StringBuilder internally, so logically that leads to the idea that it would be a little less performant due to more overhead. However, a simple string concatenation is the fastest method of injecting one string between two others, by a significant degree. This evidence was demonstrated by Rico Mariani in his very first Performance Quiz, years ago. Simple fact is that concatenations, when the number of string parts is known (without limitation — you could concatenate a thousand parts, as long as you know it's always 1000 parts), are always faster than StringBuilder or String.Format. They can be performed with a single memory allocation and a series of memory copies. Here is the proof.
And here is the actual code for some String.Concat methods, which ultimately call FillStringChecked, which uses pointers to copy memory (extracted via Reflector):
public static string Concat(params string[] values)
{
int totalLength = 0;
if (values == null)
{
throw new ArgumentNullException("values");
}
string[] strArray = new string[values.Length];
for (int i = 0; i < values.Length; i++)
{
string str = values[i];
strArray[i] = (str == null) ? Empty : str;
totalLength += strArray[i].Length;
if (totalLength < 0)
{
throw new OutOfMemoryException();
}
}
return ConcatArray(strArray, totalLength);
}
public static string Concat(string str0, string str1, string str2, string str3)
{
if (((str0 == null) && (str1 == null)) && ((str2 == null) && (str3 == null)))
{
return Empty;
}
if (str0 == null)
{
str0 = Empty;
}
if (str1 == null)
{
str1 = Empty;
}
if (str2 == null)
{
str2 = Empty;
}
if (str3 == null)
{
str3 = Empty;
}
int length = ((str0.Length + str1.Length) + str2.Length) + str3.Length;
string dest = FastAllocateString(length);
FillStringChecked(dest, 0, str0);
FillStringChecked(dest, str0.Length, str1);
FillStringChecked(dest, str0.Length + str1.Length, str2);
FillStringChecked(dest, (str0.Length + str1.Length) + str2.Length, str3);
return dest;
}
private static string ConcatArray(string[] values, int totalLength)
{
string dest = FastAllocateString(totalLength);
int destPos = 0;
for (int i = 0; i < values.Length; i++)
{
FillStringChecked(dest, destPos, values[i]);
destPos += values[i].Length;
}
return dest;
}
private static unsafe void FillStringChecked(string dest, int destPos, string src)
{
int length = src.Length;
if (length > (dest.Length - destPos))
{
throw new IndexOutOfRangeException();
}
fixed (char* chRef = &dest.m_firstChar)
{
fixed (char* chRef2 = &src.m_firstChar)
{
wstrcpy(chRef + destPos, chRef2, length);
}
}
}
So, then:
string what = "cat";
string inthehat = "The " + what + " in the hat!";
Enjoy!
Oh also, the fastest would be:
string cat = "cat";
string s = "The " + cat + " in the hat";
It really depends. For small strings with few concatenations, it's actually faster just to append the strings.
String s = "String A" + "String B";
But for larger string (very very large strings), it's then more efficient to use StringBuilder.
In both cases above I want to inject one or more strings into the middle of a predefined template string.
In which case, I would suggest String.Format is the quickest because it is design for that exact purpose.
It really depends on your usage pattern.
A detailed benchmark between string.Join, string,Concat and string.Format can be found here: String.Format Isn't Suitable for Intensive Logging
I would suggest not, since String.Format was not designed for concatenation, it was design for formatting the output of various inputs such as a date.
String s = String.Format("Today is {0:dd-MMM-yyyy}.", DateTime.Today);