I have the following intentionally trivial function:
void ReplaceSome(ref string text)
{
StringBuilder sb = new StringBuilder(text);
sb[5] = 'a';
text = sb.ToString();
}
It appears to be inefficient to convert this to a StringBuilder to index into and replace some of the characters only to copy it back to the ref'd param. Is it possible to index directly into the text param as an L-Value?
Or how else can I improve this?
C# strings are "immutable," which means that they can't be modified. If you have a string, and you want a similar but different string, you must create a new string. Using a StringBuilder as you do above is probably as easy a method as any.
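If the StringBuilder round trip feels heavy for a single character, a character array is another option. This is only a sketch: `ReplaceCharAt` is a hypothetical helper name, not a framework method, and it still allocates a new string because the original cannot be modified.

```csharp
using System;

static class StringEdit
{
    // Hypothetical helper: returns a copy of text with the character at index replaced.
    public static string ReplaceCharAt(string text, int index, char replacement)
    {
        char[] chars = text.ToCharArray(); // mutable copy of the characters
        chars[index] = replacement;        // throws IndexOutOfRangeException for a bad index
        return new string(chars);          // build the new immutable string
    }

    static void Main()
    {
        string text = "hello world";
        text = ReplaceCharAt(text, 5, 'a');
        Console.WriteLine(text); // helloaworld
    }
}
```

The caller still has to assign the result back, exactly as with the `ref` parameter version.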
Armed with Reflector and the decompiled IL: judging purely by the amount of code executed, the StringBuilder approach looks the most efficient, e.g. when tracing the IL calls that StringBuilder makes internally against the IL calls for String::Remove and String::Insert.
I couldn't be bothered testing the memory overhead of each approach, but would imagine it would be in line with reflector results - the StringBuilder approach would be the best.
I think the fact the StringBuilder has a set memory size using the constructor
StringBuilder sb = new StringBuilder(text);
would help overall too.
Like others have mentioned, it would come down to readability vs efficiency...
text = text.Substring(0, 5) + "a" + text.Substring(6);
Not dramatically different than your StringBuilder solution, but slightly more concise than the Remove(), Insert() answer.
I don't know if this is more efficient, but it works. Either way you'll have to recreate the string after each change since they're immutable.
string test = "hello world";
Console.WriteLine(test);
test = test.Remove(5, 1);
test = test.Insert(5, "z");
Console.WriteLine(test);
Or if you want it more concise:
string test = "hello world".Remove(5, 1).Insert(5, "z");
Related
I'm currently using C#, but I believe the question applies to more languages.
I have a method that takes a string value and throws an exception if it's too big. I want to unit test that the correct exception is thrown.
int vlen = Encoding.UTF8.GetByteCount(value);
if (vlen < 0 || 0x0FFFFFFF < vlen)
throw new ArgumentException("Valid UTF8 encodded value length is up to 256MB!", "value");
What is the best way to generate such a string? Should I just have a file of that size? Should I create such a file every time running unit tests?
string has a constructor that lets you specify a character to repeat and a length:
string longString = new string('a',0x0FFFFFFF + 1);
You can simply use a StringBuilder:
StringBuilder builder = new StringBuilder();
builder.Append('a', 0x10000000);
string s = builder.ToString();
//Console.WriteLine(s.Length);
YourMethodToTest(s);
This takes no measurable time on my machine and I'm sure there won't be a serious performance issue on your machine either.
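If you can change the method under test, another option is to make the limit a parameter so the test doesn't need a 256MB string at all. The sketch below assumes that refactoring is acceptable; `ValueValidator`, `Validate`, `maxBytes` and the exception message are hypothetical names and text, not taken from the question.

```csharp
using System;
using System.Text;

static class ValueValidator
{
    public const int DefaultMaxBytes = 0x0FFFFFFF;

    // Hypothetical refactoring: the limit is an optional parameter,
    // so tests can pass a small value instead of building a huge string.
    public static void Validate(string value, int maxBytes = DefaultMaxBytes)
    {
        int vlen = Encoding.UTF8.GetByteCount(value);
        if (vlen > maxBytes)
            throw new ArgumentException("Value is too long when encoded as UTF-8.", nameof(value));
    }
}

static class Program
{
    static void Main()
    {
        string tooLong = new string('a', 17); // 17 ASCII chars = 17 UTF-8 bytes
        try
        {
            ValueValidator.Validate(tooLong, maxBytes: 16);
            Console.WriteLine("No exception (unexpected)");
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine("Rejected parameter: " + ex.ParamName);
        }
    }
}
```

A test against the real 256MB limit can still be kept as a separate, slower test using one of the string constructions above.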
What I would usually use in Unit Tests is a package called AutoFixture.
With that you can do the following to generate a large string:
string.Join(string.Empty, Fixture.CreateMany<char>(length))
Looking for better algorithm / technique for replacing strings in a string variable. I have to loop through an unknown number of database records and for each one, I need to replace some text in a string variable. Right now it looks like this, but there has to be a better way:
using (eds ctx = new eds())
{
string targetText = "This is a sample string with words that will get replaced based on data pulled from the database";
List<parameter> lstParameters = ctx.ciParameters.ToList();
foreach (var parameter in lstParameters)
{
string searchKey = parameter.searchKey;
string newValue = parameter.value;
targetText = targetText.Replace(searchKey, newValue);
}
}
From my understanding this is not good because I'm overwriting the targetText variable over and over in the loop. However, I'm not sure how to structure the find and replace...
Appreciate any feedback.
there has to be a better way
Strings are immutable - you can't "change" them - all you can do is create a new string and replace the variable value (which is not as bad as you think). You could try using a StringBuilder as others suggest, but it's not 100% guaranteed to improve your performance.
You could change your algorithm to loop through the "words" in targetText, see if there's a match in parameters , take the "replacement" value and build up a new string, but I suspect the extra lookups will cost more than recreating the string value multiple times.
In any case, two important principles of performance improvement should be considered:
Start with the slowest part of your app first - you may see some improvement but if it does not improve the overall performance significantly then it doesn't matter that much
The only way to know if a particular change will improve your performance (and by how much) is to try it both ways and measure it.
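To try it both ways, something like the following rough timing harness could be used. It is a sketch, not a rigorous benchmark: the key/value pairs stand in for the database rows, and `ReplaceBenchmark`, `WithString` and `WithBuilder` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Text;

static class ReplaceBenchmark
{
    // Repeated string.Replace: each call allocates a new string.
    public static string WithString(string text, IEnumerable<KeyValuePair<string, string>> pairs)
    {
        foreach (var pair in pairs)
            text = text.Replace(pair.Key, pair.Value);
        return text;
    }

    // StringBuilder.Replace: replaces inside the builder, one ToString at the end.
    public static string WithBuilder(string text, IEnumerable<KeyValuePair<string, string>> pairs)
    {
        var builder = new StringBuilder(text);
        foreach (var pair in pairs)
            builder.Replace(pair.Key, pair.Value);
        return builder.ToString();
    }

    static void Main()
    {
        // Stand-in for the rows pulled from the database.
        var pairs = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("sample", "short"),
            new KeyValuePair<string, string>("words", "tokens"),
        };
        string text = "This is a sample string with words that will get replaced";
        const int iterations = 100000;

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) WithString(text, pairs);
        sw.Stop();
        Console.WriteLine("string.Replace:        " + sw.ElapsedMilliseconds + " ms");

        sw.Restart();
        for (int i = 0; i < iterations; i++) WithBuilder(text, pairs);
        sw.Stop();
        Console.WriteLine("StringBuilder.Replace: " + sw.ElapsedMilliseconds + " ms");
    }
}
```

The numbers depend heavily on string length and the number of replacements, so it's worth running this with data shaped like your own.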
StringBuilder will generally have less memory overhead and better performance, especially on large strings; see "String.Replace() vs. StringBuilder.Replace()".
using (eds ctx = new eds())
{
string targetText = "This is a sample string with words that will get replaced based on data pulled from the database";
var builder = new StringBuilder(targetText);
List<parameter> lstParameters = ctx.ciParameters.ToList();
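foreach (var parameter in lstParameters)
{
string searchKey = parameter.searchKey;
string newValue = parameter.value;
// StringBuilder.Replace modifies the builder and returns the builder itself, not a string.
builder.Replace(searchKey, newValue);
}
// Create the final string once, after all replacements.
targetText = builder.ToString();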
}
Actually, there is a better answer, assuming you're doing a large number of replacements. You can use a StringBuilder. As you know, strings are immutable. So as you said, you're creating strings over and over again in your loop.
If you convert your string to a StringBuilder:
// Adjust the capacity based on how much bigger you think the string will get due to replacements.
// The more accurate your estimate, the better this will perform.
StringBuilder sb = new StringBuilder(targetText, targetText.Length * 2);
foreach (var parameter in lstParameters)
{
sb.Replace(parameter.searchKey, parameter.value);
}
string targetString = sb.ToString();
Now a caveat, if your list only has 2-3 items in it, this might not be any better. The answer to this question provides a nice analysis of the performance improvement you can expect to see.
Say I have a StringBuilder object
var sb = new StringBuilder();
And an arbitrary array of strings
var s = new []{"a","b","c"};
Is this the 'quickest' way to insert them into the stringbuilder instance?
sb.Append(string.Join(string.Empty, s));
Or does StringBuilder have a function I have overlooked?
Edit: Sorry, I don't know how many items sb will contain, or how many items may be in each String[].
If by "quickest" you mean most performant, then you're better off using:
for(int i = 0; i < myArrayLen; i++)
sb.Append(myArray[i]);
string.Concat(...) should be faster than string.Join("", ...). Also, this depends on what else you're doing with your StringBuilder. If you're only performing a few concatenations then it can be faster not to use it.
More context always helps!
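For reference, the two suggestions so far look like this side by side. This is a minimal sketch; `AppendDemo` and `AppendAll` are hypothetical names.

```csharp
using System;
using System.Text;

static class AppendDemo
{
    // Appends each element directly; no intermediate joined string is created.
    public static StringBuilder AppendAll(StringBuilder sb, string[] parts)
    {
        foreach (string part in parts)
            sb.Append(part);
        return sb;
    }

    static void Main()
    {
        var sb = new StringBuilder();
        string[] s = { "a", "b", "c" };

        AppendAll(sb, s);            // loop version
        sb.Append(string.Concat(s)); // one-liner; builds a temporary "abc" first

        Console.WriteLine(sb.ToString()); // abcabc
    }
}
```

The loop avoids the temporary string that `string.Concat` or `string.Join` has to build, which is why it is usually suggested when the array is large.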
Believe it or not, string.Concat is faster than StringBuilder when there are only four or five strings.
This article discusses the question very well.
To do this in one line without a loop, you could do this:
sb.Append(String.Join(Environment.NewLine, s));
This will also work where s is any kind of IEnumerable<string>.
I believe you already have the correct answer:
sb.Append(string.Join(string.Empty, s))
I'm still learning C#, and there is one thing I can't seem to find the answer to.
If I have a string that looks like this "abcdefg012345", and I want to make it look like "ab-cde-fg-012345"
I thought of something like this:
string S1 = "abcdefg012345";
string S2 = S1.Insert(2, "-");
string S3 = S2.Insert(6, "-");
string S4 = S3.Insert.....
...
..
Now I was wondering whether it would be possible to get this all into one line somehow, without having to create all those strings.
I assume this would be possible somehow?
Whether or not you can make this a one-liner (you can), it will always cause multiple strings to be created, due to the immutability of String in .NET.
If you want to do this somewhat efficiently, without creating multiple strings, you could use a StringBuilder. An extension method could also be useful to make it easier to use.
public static class StringExtensions
{
public static string MultiInsert(this string str, string insertChar, params int[] positions)
{
StringBuilder sb = new StringBuilder(str.Length + (positions.Length*insertChar.Length));
var posLookup = new HashSet<int>(positions);
for(int i=0;i<str.Length;i++)
{
sb.Append(str[i]);
if(posLookup.Contains(i))
sb.Append(insertChar);
}
return sb.ToString();
}
}
Note that this example initialises StringBuilder to the correct length up-front, therefore avoiding the need to grow the StringBuilder.
Usage: "abcdefg012345".MultiInsert("-",2,5); // yields "abc-def-g012345"
Live example: http://rextester.com/EZPQ89741
string S1 = "abcdefg012345".Insert(2, "-").Insert(6, "-")..... ;
If the positions for the inserted strings are constant you could consider using string.Format() method. For example:
string strTarget = String.Format("abc{0}def{0}g012345","-");
string s = "abcdefg012345";
foreach (var index in new[] { 2, 6 /* , ... */ })
{
s = s.Insert(index, "-");
}
I like this
StringBuilder sb = new StringBuilder("abcdefg012345");
sb.Insert(6, '-').Insert(2, '-').ToString();
String s1 = "abcdefg012345";
String separator = "-";
s1 = s1.Insert(2, separator).Insert(6, separator).Insert(9, separator);
Chaining them like that keeps your line count down. This works because Insert returns a new string with the separator inserted, the next Insert is then called on that returned string, and so on.
It's also worth noting that String is immutable, so every operation that appears to change it actually creates a new string. String is also special in that you can create an instance without calling a constructor yourself: the first line above simply uses the literal in the quotation marks.
Just for the sake of completion and to show the use of the lesser known Aggregate function, here's another one-liner:
string result = new[] { 2, 5, 8, 15 }.Aggregate("abcdefg012345", (s, i) => s.Insert(i, "-"));
result is ab-cd-ef-g01234-5. I wouldn't recommend this variant, though. It's way too hard to grasp on first sight.
Edit: this solution is not valid anyway, as each "-" is inserted at an index of the already modified string, not at a position relative to the original string. But then again, most of the answers here suffer from the same problem.
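One way around the shifting-index problem is to insert from the highest position down, so earlier insertions never move the positions still to be processed. The following is a sketch under that assumption; `InsertAtOriginalPositions` is a hypothetical helper, and the positions are interpreted as indices into the original string.

```csharp
using System;
using System.Linq;
using System.Text;

static class InsertDemo
{
    // Hypothetical helper: inserts separator before each given index of the ORIGINAL string.
    public static string InsertAtOriginalPositions(string text, string separator, params int[] positions)
    {
        var sb = new StringBuilder(text, text.Length + positions.Length * separator.Length);

        // Highest index first: inserting at the back leaves the lower indices untouched.
        foreach (int position in positions.OrderByDescending(p => p))
            sb.Insert(position, separator);

        return sb.ToString();
    }

    static void Main()
    {
        string result = InsertAtOriginalPositions("abcdefg012345", "-", 2, 5, 7);
        Console.WriteLine(result); // ab-cde-fg-012345
    }
}
```

With this ordering the positions 2, 5 and 7 can be read straight off the original string, instead of being adjusted for each dash already inserted.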
You should use a StringBuilder in this case, as String objects are immutable and your code would essentially create a completely new string for each one of those operations.
http://msdn.microsoft.com/en-us/library/2839d5h5(v=vs.71).aspx
Some more information available here:
http://www.dotnetperls.com/stringbuilder
Example:
using System;
using System.Text;

namespace ConsoleApplication10
{
class Program
{
static void Main(string[] args)
{
StringBuilder sb = new StringBuilder("abcdefg012345");
sb.Insert(2, '-');
sb.Insert(6, '-');
Console.WriteLine(sb);
Console.Read();
}
}
}
If you really want it on a single line you could simply do something like this:
StringBuilder sb = new StringBuilder("abcdefg012345").Insert(2, '-').Insert(6, '-');
Assume I have the following string constants:
const string constString1 = "Const String 1";
const string constString2 = "Const String 2";
const string constString3 = "Const String 3";
const string constString4 = "Const String 4";
Now I can append the strings in two ways:
Option1:
string resultString = constString1 + constString2 + constString3 + constString4;
Option2:
string resultString = string.Format("{0}{1}{2}{3}",constString1,constString2,constString3,constString4);
Internally string.Format uses StringBuilder.AppendFormat. Now given the fact that I am appending constant strings, which of the options (option1 or option 2) is better with respect to performance and/or memory?
The first one will be done by the compiler (at least the Microsoft C# Compiler) (in the same way that the compiler does 1+2), the second one must be done at runtime. So clearly the first one is faster.
As an added benefit, in the first one the string is interned; in the second one it isn't.
And String.Format is quite slow :-) (read this:
http://msmvps.com/blogs/jon_skeet/archive/2008/10/06/formatting-strings.aspx). NOT "slow enough to be a problem", UNLESS all your program does all day is format strings (MILLIONS of them, not TENS). Then you could probably do it faster by appending them to a StringBuilder.
The first variant will be best, but only when you are using constant strings.
There are two compiler optimizations (from the C# compiler, not the JIT compiler) that are in effect here. Let's take an example program:
const string A = "Hello ";
const string B = "World";
...
string test = A + B;
First optimization is constant propagation that will change your code basically into this:
string test = "Hello " + "World";
Then a second optimization, folding the concatenation of literal strings (which they now are, thanks to the first optimization), will kick in and change it to
string test = "Hello World";
So if you write any variants of the program shown above, the actual IL will be the same (or at least very similar) due to the optimizations done by the C# compiler.
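A quick way to observe the difference is to compare references. In the sketch below (`ConstantFoldingDemo`, `Folded` and `Formatted` are hypothetical names), the folded constant is normally the same interned object as the literal, while the string.Format result is a separate string built at runtime. The reference comparisons reflect typical runtime behavior and are not something the code should rely on.

```csharp
using System;

static class ConstantFoldingDemo
{
    const string A = "Hello ";
    const string B = "World";

    // A + B is a constant expression, so the compiler emits the single literal "Hello World".
    public static string Folded() => A + B;

    // string.Format runs at runtime and builds a new string on every call.
    public static string Formatted() => string.Format("{0}{1}", A, B);

    static void Main()
    {
        // Reference comparisons: same object or not?
        Console.WriteLine(object.ReferenceEquals(Folded(), "Hello World"));
        Console.WriteLine(object.ReferenceEquals(Formatted(), "Hello World"));

        // Value comparison: the contents are identical either way.
        Console.WriteLine(Folded() == Formatted()); // True
    }
}
```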