Looking for better algorithm / technique for replacing strings in a string variable. I have to loop through an unknown number of database records and for each one, I need to replace some text in a string variable. Right now it looks like this, but there has to be a better way:
using (eds ctx = new eds())
{
string targetText = "This is a sample string with words that will get replaced based on data pulled from the database";
List<parameter> lstParameters = ctx.ciParameters.ToList();
foreach (parameter in lstParameters)
{
string searchKey = parameter.searchKey;
string newValue = parameter.value;
targetText = targetText.Replace(searchKey, newValue);
}
}
From my understanding this is not good because I'm over writing the targetText variable, over and over in the loop. However, I'm not sure how structure the find and replace...
Appreciate any feedback.
there has to be a better way
Strings are immutable - you can't "change" them - all you can do is create a new string and replace the variable value (which is not as bad as you think). You could try using a StringBuilder as other suggest, but it's not 100% guaranteed to improve your performance.
You could change your algorithm to loop through the "words" in targetText, see if there's a match in parameters , take the "replacement" value and build up a new string, but I suspect the extra lookups will cost more than recreating the string value multiple times.
In any case, two important principles of performance improvement should be considered:
Start with the slowest part of your app first - you may see some improvement but if it does not improve the overall performance significantly then it doesn't matter that much
The only way to know if a particular change will improve your performance (and by how much) is to try it both ways and measure it.
StringBuilder will have less memory overhead and better performance, especially on large strings. String.Replace() vs. StringBuilder.Replace()
using (eds ctx = new eds())
{
string targetText = "This is a sample string with words that will get replaced based on data pulled from the database";
var builder = new StringBuilder(targetText);
List<parameter> lstParameters = ctx.ciParameters.ToList();
foreach (parameter in lstParameters)
{
string searchKey = parameter.searchKey;
string newValue = parameter.value;
targetText = builder.Replace(searchKey, newValue);
}
}
Actually, there is a better answer, assuming you're doing a large number of replacements. You can use a StringBuilder. As you know, strings are immutable. So as you said, you're creating strings over and over again in your loop.
If you convert your string to a StringBuilder
StringBuilder s = new StringBuilder(s, s.Length*2); // Adjust the capacity based on how much bigger you think the string will get due to replacements. The more accurate your estimate, the better this will perform.
foreach (parameter in lstParameters)
{
s.Replace(parameter.searchKey, parameter.value);
}
string targetString = s.ToString();
Now a caveat, if your list only has 2-3 items in it, this might not be any better. The answer to this question provides a nice analysis of the performance improvement you can expect to see.
Related
I would like to know the different ways of inserting a variable into a string, in C#.
I am currently trying to insert values into a json string that I am building:
Random rnd = new Random();
int ID = rnd.Next(1, 999);
string body = #"{""currency"":""country"",""gold"":1,""detail"":""detailid-979095986"",""tId"":""help here""}";
How could I add the "ID" to the string body?
In a typical string inserting scenario, I'd do one of these:
string body = string.Format("My ID is {0}", ID);
string body = "My ID is " + ID;
However, your string is apparently JSON serialized data. I'd expect that I'd want to parse that into a class in order to work with it.
var myObj = JsonConvert.DeserializeObject<MyClass>(someString);
myObj.TID = ID;
// maybe do other things with it, then if I need JSON again...
string body = JsonConvert.SerializeObject(myObj);
One reason to take this approach is to make sure that any data I put in still makes the JSON valid. For example, if my ID were, instead of an int, a string with characters that needed escaping, directly inserting "\"\n\"" would not be the right thing to do.
String interpolation is the easiest way these days:
int myIntValue = 123;
string myStringValue = "one two three";
string interpolatedString = $"my int is: {myIntValue}. My string is: {myStringValue}.";
Output would be "my int is: 123. My string is: one two three.".
You can experiment with this sample yourself, over here.
The $ special character identifies a string literal as an interpolated
string. An interpolated string is a string literal that might contain
interpolation expressions. When an interpolated string is resolved to
a result string, items with interpolation expressions are replaced by
the string representations of the expression results. This feature is available starting with C# 6.
You could try this:
string body = #"{""currency"":""country"",""gold"":1,""detail"":""detailid-979095986"",""tId"":""" + ID + #"""}";
You can also use string.Concat:
string body = string.Concat(#"{""currency"":""country"",""gold"":1,""detail"":""detailid-979095986"",""tId"":""", ID, #"""}");
There are a number of ways to inject values into strings, however it's easy to lose sight of encodings, and cause major breakage.
If you just want to inject a value into another string, you can use:
string concatenation
string building
string formatting
Concatenation:
The simplest and most common way to build strings is by simply concatenating them together with the + operator:
var foo = 5;
var bar = "example-" + foo;
Concatenation can be difficult to read which makes it easy to introduce bugs, but for most simple tasks is the right tool for the job.
In this case, it's a poor choice:
string body = #"{""currency"":""country"",""gold"":1,""detail"":""detailid-979095986"",""tId"":""" + ID.ToString() + #"""}";
String Building
The StringBuilder class is useful for building large strings particularly when built iteratively.
var sb = new StringBuilder();
for (var i = 0; i < 1000; i++) {
sb.Append(i.ToString());
sb.Append(" ");
}
var output = sb.ToString();
It can still be difficult to read and hard to debug, but for cases where you're joining lots of strings together, it's super efficient
In this case, it's a poor choice:
StringBuilder sb = new StringBuilder();
sb.Append(#"{""currency"":""country"",""gold"":1,""detail"":""detailid-979095986"",""tId"":""");
sb.Append(ID.ToString());
sb.Append(#"""}");
string body = sb.ToString();
String formatting
The string.Format method makes templating data into a string super easy and efficient. If you plan on reusing the same string over and over, using a format string makes it much easier to read and debug code, particularly when there are lots of replacements:
var foo = 5;
var bar = string.Format("example-{0}", foo);
Format strings can also automatically apply culturally accurate formatting to particular data types, so that a DateTime is appropriately displayed, or so that a number has the appropriate number of trailing zeros.
In this case, it's a poor choice:
string string.Format(#"{""currency"":""country"",""gold"":1,""detail"":""detailid-979095986"",""tId"":""{0}""}", ID);
The right choice
You're not dumping data into any old string. That's JSON encoded data. If you just concatenate/build/format in any old value, you can break your string. For example, if the ID variable contained a " character, you'd break the entire JSON dataset.
Additionally, the length of the string and necessary quotes make it super difficult to read, which makes it difficult to maintain. Good luck when you get around to needing to add another formatted value, it's going to be a pain to change any existing value or add in new dynamic ones.
Instead of writing a JSON literal, write an object and encode it to JSON:
var bodyData =
new
{
currency = "country",
gold = 1,
detail = "detailid-979095986",
tId = ID //here's where you set the ID
};
var jss = new JavaScriptSerializer();
var body = jss.Serialize(bodyData);
This code is much easier to modify when the data changes, and will actually encode your data correctly. You don't need to worry about all those annoying double quote characters any more either.
You can use the
String.Format(#"{""currency"":""country"",""gold"":1,""detail"":""detailid-979095986"",""tId"":""{0}""}", ID)
Since this is params object[], you can use as many {n} as you want.
Instead of using on string, you could concatenate strings together using +, which would allow you to insert text between the generated strings.
string body = #"***" + ID + #"***";
Not Quite what the title suggests, what i need is a way to count a string backwards like
string i = "3027"
i[0] = label1.Text
Result = 7 not 3 is there a way?
not sure if you need my code or not its not really important.
You can reverse the string using a number of approaches including
public static string ReverseString(string s)
{
char[] arr = s.ToCharArray();
Array.Reverse(arr);
return new string(arr);
}
http://www.dotnetperls.com/reverse-string
then access the portion of the reversed string that you are interested in.
Note that you cannot assign to i[0] as shown in your example code because strings are immutable in C# (why). If you want to construct a string a bit at a time, it is often most efficient to use StringBuilder.
Im still learning in C#, and there is one thing i cant really seem to find the answer to.
If i have a string that looks like this "abcdefg012345", and i want to make it look like "ab-cde-fg-012345"
i tought of something like this:
string S1 = "abcdefg012345";
string S2 = S1.Insert(2, "-");
string S3 = S2.Insert(6, "-");
string S4 = S3.Insert.....
...
..
Now i was looking if it would be possible to get this al into 1 line somehow, without having to make all those strings.
I assume this would be possible somehow ?
Whether or not you can make this a one-liner (you can), it will always cause multiple strings to be created, due to the immutability of the String in .NET
If you want to do this somewhat efficiently, without creating multiple strings, you could use a StringBuilder. An extension method could also be useful to make it easier to use.
public static class StringExtensions
{
public static string MultiInsert(this string str, string insertChar, params int[] positions)
{
StringBuilder sb = new StringBuilder(str.Length + (positions.Length*insertChar.Length));
var posLookup = new HashSet<int>(positions);
for(int i=0;i<str.Length;i++)
{
sb.Append(str[i]);
if(posLookup.Contains(i))
sb.Append(insertChar);
}
return sb.ToString();
}
}
Note that this example initialises StringBuilder to the correct length up-front, therefore avoiding the need to grow the StringBuilder.
Usage: "abcdefg012345".MultiInsert("-",2,5); // yields "abc-def-g012345"
Live example: http://rextester.com/EZPQ89741
string S1 = "abcdefg012345".Insert(2, "-").Insert(6, "-")..... ;
If the positions for the inserted strings are constant you could consider using string.Format() method. For example:
string strTarget = String.Format("abc{0}def{0}g012345","-");
string s = "abcdefg012345";
foreach (var index in [2, 6, ...]
{
s = s.Insert(index, "-");
}
I like this
StringBuilder sb = new StringBuilder("abcdefg012345");
sb.Insert(6, '-').Insert(2, '-').ToString();
String s1 = "abcdefg012345";
String seperator = "-";
s1 = s1.Insert(2, seperator).Insert(6, seperator).Insert(9, seperator);
Chaining them like that keeps your line count down. This works because the Insert method returns the string value of s1 with the parameters supplied, then the Insert function is being called on that returned string and so on.
Also it's worth noting that String is a special immutable class so each time you set a value to it, it is being recreated. Also worth noting that String is a special type that allows you to set it to a new instance with calling the constructor on it, the first line above will be under the hood calling the constructor with the text in the speech marks.
Just for the sake of completion and to show the use of the lesser known Aggregate function, here's another one-liner:
string result = new[] { 2, 5, 8, 15 }.Aggregate("abcdefg012345", (s, i) => s.Insert(i, "-"));
result is ab-cd-ef-g01234-5. I wouldn't recommend this variant, though. It's way too hard to grasp on first sight.
Edit: this solution is not valid, anyway, as the "-" will be inserted at the index of the already modified string, not at the positions wrt to the original string. But then again, most of the answers here suffer from the same problem.
You should use a StringBuilder in this case as Strings objects are immutable and your code would essentially create a completely new string for each one of those operations.
http://msdn.microsoft.com/en-us/library/2839d5h5(v=vs.71).aspx
Some more information available here:
http://www.dotnetperls.com/stringbuilder
Example:
namespace ConsoleApplication10
{
class Program
{
static void Main(string[] args)
{
StringBuilder sb = new StringBuilder("abcdefg012345");
sb.Insert(2, '-');
sb.Insert(6, '-');
Console.WriteLine(sb);
Console.Read();
}
}
}
If you really want it on a single line you could simply do something like this:
StringBuilder sb = new StringBuilder("abcdefg012345").Insert(2, '-').Insert(6, '-');
I have the following intentionally trivial function:
void ReplaceSome(ref string text)
{
StringBuilder sb = new StringBuilder(text);
sb[5] = 'a';
text = sb.ToString();
}
It appears to be inefficient to convert this to a StringBuilder to index into and replace some of the characters only to copy it back to the ref'd param. Is it possible to index directly into the text param as an L-Value?
Or how else can I improve this?
C# strings are "immutable," which means that they can't be modified. If you have a string, and you want a similar but different string, you must create a new string. Using a StringBuilder as you do above is probably as easy a method as any.
Armed with Reflector and the decompiled IL - On a pure LOC basis then the StringBuilder approach is definitely the most efficient. Eg tracing the IL calls that StringBuilder makes internally vs the IL calls for String::Remove and String::Insert etc.
I couldn't be bothered testing the memory overhead of each approach, but would imagine it would be in line with reflector results - the StringBuilder approach would be the best.
I think the fact the StringBuilder has a set memory size using the constructor
StringBuilder sb = new StringBuilder(text);
would help overall too.
Like others have mentioned, it would come down to readability vs efficiency...
text = text.Substring(0, 4) + "a" + text.Substring(5);
Not dramatically different than your StringBuilder solution, but slightly more concise than the Remove(), Insert() answer.
I don't know if this is more efficient, but it works. Either way you'll have to recreate the string after each change since they're immutable.
string test = "hello world";
Console.WriteLine(test);
test = test.Remove(5, 1);
test = test.Insert(5, "z");
Console.WriteLine(test);
Or if you want it more concise:
string test = "hello world".Remove(5, 1).Insert(5, "z");
I have an Address object that has properties AddressLine1, AddressLine2, Suburb, State, ZipCode. (there are more but this is enough for the example). Also, each of these properties are strings. And I'm using C# 3.0.
I'm wanting to represent it as a string but as I'm doing that I'm finding that I'm creating a method with a high cyclomatic complexity due to all the if-statements...
Assuming the string assigned to each property is the same as the name of the property (ie AddressLine1 = "AddressLine1")...I want the address to be represented as follows:
"AddressLine1 AddressLine2 Suburb State ZipCode".
Now, the original way I had done this was by a simple String.Format()
String.Format("{0} {1} {2} {3} {4}", address.AddressLine1,
address.AddressLine2, address.Suburb, address.State, address.ZipCode);
This is all well and good until I've found that some of these fields can be empty...in particular AddressLine2. The result is additional unneeded spaces...which are particularly annoying when you get several in a row.
To get around this issue, and the only solution I can think of, I've had to build the string manually and only add the address properties to the string if they are not null or empty.
ie
string addressAsString = String.Empty;
if (!String.IsNullOrEmpty(address.AddressLine1))
{
addressAsString += String.Format("{0}", address.AddressLine1);
}
if(!String.IsNullOrEmpty(address.AddressLine2))
{
addressAsString += String.Format(" {0}", address.AddressLine2);
}
etc....
Is there a more elegant and/or concise way to achieve this that I'm not thinking about? My solution just feels messy and bloated...but I can't think of a better way to do it...
For all I know, this is my only option given what I want to do...but I just thought I'd throw this out there to see if someone with more experience than me knows of a better way. If theres no better option then oh well...but if there is, then I'll get to learn something I didn't know before.
Thanks in advance!
This possibly isn't the most efficient way, but it is concise. You could put the items you want into an array and filter out the ones without a significant value, then join them.
var items = new[] { line1, line2, suburb, state, ... };
var values = items.Where(s => !string.IsNullOrEmpty(s));
var addr = string.Join(" ", values.ToArray());
Probably more efficient, but somewhat harder to read, would be to aggregate the values into a StringBuilder, e.g.
var items = new[] { line1, line2, suburb, state, ... };
var values = items.Where(s => !string.IsNullOrEmpty(s));
var builder = new StringBuilder(128);
values.Aggregate(builder, (b, s) => b.Append(s).Append(" "));
var addr = builder.ToString(0, builder.Length - 1);
I'd probably lean towards something like the first implementation as it's much simpler and more maintainable code, and then if the performance is a problem consider something more like the second one (if it does turn out to be faster...).
(Note this requires C# 3.0, but you don't mention your language version, so I'm assuming this is OK).
When putting strings together I recommend you use the StringBuilder class. Reason for this is that a System.String is immutable, so every change you make to a string simply returns a new string.
If you want to represent an object in text, it might be a good idea to override the ToString() method and put your implementation in there.
Last but not least, with Linq in C# 3.5 you can join those together like Greg Beech just did here, but instead of using string.Join() use:
StringBuilder sb = new StringBuilder();
foreach (var item in values) {
sb.Append(item);
sb.Append(" ");
}
Hope this helps.
I would recommend overriding the ToString method, and take an IFormatProvider implementation that defines your custom types.
See MSDN at http://msdn.microsoft.com/en-us/library/system.iformatprovider.aspx for information on implementing IFormatProvider.
You could then code such as this:
address.ToString("s"); // short address
address.ToString("whatever"); // whatever custom format you define.
Definitely not the easiest way to do it, but the cleanest IMHO. An example of such implementation is the DateTime class.
Cheers
Ash
First, an address requires certain information, so you should not allow an address that does not have, say, a zip-code. You can also make sure that they are never null by initialize each property to an empty string, or by requiring each property as arguments to the constructor. This way you know that you are dealing with valid data already, so you can just output a formatted string with no need for the IsNullOrEmpty checks.
I had a similar issue building a multi-line contact summary that could have potentially had several empty fields. I pretty much did the same thing you did, except I used a string builder as to not keep concatenating to a string over and over.
If your data isn't completely reliable, you'll need to have that logic somewhere - I'd suggest that you create another read-only property (call it something like FormattedAddress) that does that logic for you. That way, you don't have to change any of the code it, at some point, you clean up or otherwise change the rules.
And +1 for the suggestion to use a stringbuilder, as opposed to concatenating strings.
I know this is really old, but I sensed a general solution and implemented mine this way:
private static string GetFormattedAddress(
string address1,
string address2,
string city,
string state,
string zip)
{
var addressItems =
new []
{
new[] { address1, "\n" },
new[] { address2, "\n" },
new[] { city, ", " },
new[] { state, " " },
new[] { zip, null }
};
string suffix = null;
var sb = new StringBuilder(128);
foreach (var item in addressItems)
{
if (!string.IsNullOrWhiteSpace(item[0]))
{
// Append the last item's suffix
sb.Append(suffix);
// Append the address component
sb.Append(item[0]);
// Cache the suffix
suffix = item[1];
}
}
return sb.ToString();
}