Below is the code I am using:
private void TestFunction()
{
    foreach (MySampleClass c in dictSampleClass)
    {
        String sText = c.VAR1 + c.VAR2 + c.VAR3;
        PerformSomeTask(sText, c.VAR4);
    }
}
My friend has suggested changing it to the following to improve performance (dictSampleClass is a dictionary with 10K objects):
private void TestFunction()
{
    String sText = "";
    foreach (MySampleClass c in dictSampleClass)
    {
        sText = c.VAR1 + c.VAR2 + c.VAR3;
        PerformSomeTask(sText, c.VAR4);
    }
}
My question is: "Does the above change improve performance? If yes, how?"
WOW, that's more of a response than I expected. Most people said "the C# compiler would take care of that". So what about the C compiler?
The compiler is smart enough to move variable declarations into or out of loops where required. In your example, however, you are using a string, which is immutable. By declaring it outside the loop I believe you are trying to "create once, use many times", but strings are re-created each time they are modified, so you can't achieve that.
Not to sound harsh, but this is a premature optimisation, and likely a failing one due to string immutability.
If the collection is large, go down the route of appending the strings into a StringBuilder, separated by a delimiter. Then split the result on this delimiter and iterate the array, instead of concatenating and processing them in one loop.
StringBuilder sb = new StringBuilder();
foreach (MySampleClass c in dictSampleClass)
{
    sb.Append(c.VAR1);
    sb.Append(c.VAR2);
    sb.Append(c.VAR3);
    sb.Append("|");
}

string[] results = sb.ToString().Split('|');
for (int i = 0; i < dictSampleClass.Count; i++)
{
    string s = results[i];
    MySampleClass c = dictSampleClass[i];
    PerformSomeTask(s, c.VAR4);
}
I see no benefit to using this code - it most likely doesn't even work!
UPDATE: given how fast creating a string from multiple strings already is, if PerformSomeTask is your bottleneck, try breaking the iteration into multiple threads - it won't improve the efficiency of the code, but you'll be able to utilise multiple cores.
Run the two functions through Reflector and have a look at the generated IL. It might give some insights, but at best the performance improvement would be minimal.
I'd go for this instead:
private void TestFunction()
{
    foreach (MySampleClass c in dictSampleClass)
    {
        PerformSomeTask(c.VAR1 + c.VAR2 + c.VAR3, c.VAR4);
    }
}
There's still probably no real performance benefit, but it removes creating a variable that you don't really need.
No, it does not. Things like these are handled by the C# compiler, which is very intelligent, and you basically do not need to care about these details.
Always use code that promotes readability.
You can check this by disassembling the program.
It will not improve performance. But on another note: assumed improvements are error-prone. You have to measure your application and only optimise the slow parts if needed. I don't believe that loop is your application's bottleneck.
I am developing an application in C# to replace variables across strings. I need suggestions on how to do that very efficiently.
private string MyStrTr(string source, string frm, string to)
{
    char[] input = source.ToCharArray();
    bool[] replaced = new bool[input.Length];
    for (int j = 0; j < input.Length; j++)
        replaced[j] = false;
    for (int i = 0; i < frm.Length; i++)
    {
        for (int j = 0; j < input.Length; j++)
            if (replaced[j] == false && input[j] == frm[i])
            {
                input[j] = to[i];
                replaced[j] = true;
            }
    }
    return new string(input);
}
The above code works fine, but the string has to be traversed once for every variable.
Exact requirement:
parent.name = am;
parent.number = good;
Input: I {parent.name} a {parent.number} boy.
The output should be: I am a good boy.
Assume the source string will be huge.
For example, if I have 5 different variables, I need to traverse the full string 5 times.
I need suggestions on how to process all the variables in a single pass, during the first traversal.
I think you're suffering from premature optimization. Write the simplest thing first and see if it works for you. If you don't suffer a performance problem, then you're done. Don't waste time trying to make it faster when you don't know how fast it is, or if it's the cause of your performance problem.
By the way, Facebook's Terms of Service, the entire HTML page that includes a lot of Javascript, is only 164 kilobytes. That's not especially large.
String.Replace should work quite well, even if you have multiple strings to replace. That is, you can write:
string result = source.Replace("{parent.name}", "am");
result = result.Replace("{parent.number}", "good");
// more replacements here
return result;
That will exercise the garbage collector a little bit, but it shouldn't be a problem unless you have a truly massive page or a whole mess of replacements.
You can potentially save yourself some garbage collection by converting the string to a StringBuilder, and calling StringBuilder.Replace multiple times. I honestly don't know, though, whether that will have any appreciable effect. I don't know how StringBuilder.Replace is implemented.
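For what it's worth, a minimal sketch of that StringBuilder route might look like this (the placeholder names mirror the example above, and source is assumed to be your template string; StringBuilder.Replace is the real API, but I'm making no claim about its relative speed):

// requires: using System.Text;
StringBuilder sb = new StringBuilder(source);
sb.Replace("{parent.name}", "am");
sb.Replace("{parent.number}", "good");
// more replacements here
string result = sb.ToString();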
There is a way to make this faster, by writing code that will parse the string and do all the replacements in a single pass. It's a lot of code, though. You have to build a state machine from the multiple search strings and go through the source text one character at a time. It's doable, but it's a difficult enough task that you probably don't want to do it unless the simple method flat doesn't work quickly enough.
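If hand-rolling a state machine is more than you want to take on, a regular expression with a MatchEvaluator can approximate the single-pass idea: the regex engine scans the source once and a dictionary lookup supplies each replacement. This is only a sketch under that assumption (the ReplaceAll name and the vars dictionary are mine, not part of the question):

// requires: using System.Collections.Generic; using System.Linq;
//           using System.Text.RegularExpressions;
static string ReplaceAll(string source, IDictionary<string, string> replacements)
{
    // Build one alternation pattern from the escaped search keys,
    // so the whole source is scanned only once.
    string pattern = string.Join("|", replacements.Keys.Select(Regex.Escape));
    return Regex.Replace(source, pattern, m => replacements[m.Value]);
}

// Usage:
var vars = new Dictionary<string, string>
{
    { "{parent.name}", "am" },
    { "{parent.number}", "good" }
};
string output = ReplaceAll("I {parent.name} a {parent.number} boy.", vars);
// output == "I am a good boy."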
I have a string array of about 20,000,000 values.
And I need to convert it to a single string.
I've tried:
string data = "";
foreach (var i in tm)
{
    data = data + i;
}
But that takes too long.
Does someone know a faster way?
Try StringBuilder:
StringBuilder sb = new StringBuilder();
foreach (var i in tm)
{
    sb.Append(i);
}
To get the resulting String use ToString():
string result = sb.ToString();
The answer is going to depend on the size of the output string and the amount of memory you have available and usable. The hard limit on string length appears to be 2^31-1 (int.MaxValue) characters, occupying just over 4GB of memory. Whether you can actually allocate that is dependent on your framework version, etc. If you're going to be producing a larger output then you can't put it into a single string anyway.
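As a rough feasibility check before you start (a sketch only; tm is assumed to be the string[] from the question), you can sum the lengths up front and see whether the result can fit in a single string at all:

// requires: using System.Linq;
long totalChars = tm.Sum(s => (long)s.Length);
if (totalChars > int.MaxValue)
{
    // A single string cannot hold this much; write to a file or stream instead.
}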
You've already discovered that naive concatenation is going to be tragically slow. The problem is that every pass through the loop creates a new string, then immediately discards it on the next iteration. This is going to fill up memory pretty quickly, forcing the Garbage Collector to work overtime finding old strings to clear out of memory, not to mention the amount of memory fragmentation and all that stuff that modern programmers don't pay much attention to.
A StringBuilder is a reasonable solution. Internally it allocates blocks of characters that it then stitches together at the end using pointers and memory copies. That saves a lot of hassle and is quite speedy.
As for String.Join... it uses a StringBuilder. So does String.Concat although it is certainly quicker when not inserting separator characters.
For simplicity I would use String.Concat and be done with it.
But then I'm not much for simplicity.
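For reference, the String.Concat route really is a one-liner (assuming tm is the string[] from the question):

string data = string.Concat(tm);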
Here's an untested and possibly horribly slow answer using LINQ. When I get time I'll test it and see how it performs, but for now:
string result = new String(lines.SelectMany(l => (IEnumerable<char>)l).ToArray());
Obviously there is a potential overflow here since the ToArray call can potentially create an array larger than the String constructor can handle. Try it out and see if it's as quick as String.Concat.
So you can do it in LINQ, like so:
string data = tm.Aggregate("", (current, i) => current + i);
Or you can use the string.Join function
string data = string.Join("", tm);
Can't check it right now, but I'm curious how this option would perform:
var data = String.Join(string.Empty, tm);
Is Join optimized to skip the concatenation with String.Empty?
For data this big, unfortunately, memory-based methods will fail and this will be a real headache for the GC. For this operation, create a file and write every string to it, like this:
using (StreamWriter sw = new StreamWriter("some_file_to_write.txt"))
{
    for (int i = 0; i < tm.Length; i++)
        sw.Write(tm[i]);
}
Try to avoid using "var" in this performance-demanding approach. Correction: "var" does not affect performance; "dynamic" does.
In C#, I have a string like this:
"1 3.14 (23, 23.2, 43,88) 8.27"
I need to convert this string to other types according to the value, like int/float/vector3. Right now I have some code like this:
public static int ReadInt(this string s, ref string op)
{
    s = s.Trim();
    string ss = "";
    int idx = s.IndexOf(" ");
    if (idx > 0)
    {
        ss = s.Substring(0, idx);
        op = s.Substring(idx);
    }
    else
    {
        ss = s;
        op = "";
    }
    return Convert.ToInt32(ss);
}
This reads the first int value out, and I have similar functions to read float, vector3, etc. The problem is that in my application I have to do this a lot: I receive the string from a plugin and need to parse it every single frame, so I create a lot of temporary strings, and the resulting GC pressure hurts performance. Is there a way I can do similar parsing without creating temp strings?
Generation 0 objects such as those created here may well not impact performance too much, as they are relatively cheap to collect. I would change from using Convert to calling int.Parse() with the invariant culture before I started worrying about the GC overhead of the extra strings.
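For example, that change is a one-liner (a sketch; ss is the substring variable from the code above):

// requires: using System.Globalization;
int value = int.Parse(ss, CultureInfo.InvariantCulture);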
Also, you don't really need to create a new string to accomplish the Trim() behavior. After all, you're scanning and indexing the string anyway. Just do your initial scan for whitespace, and then for the space delimiter between ss and op, so you get just the substrings you need. Right now you're creating 50% more string instances than you really need.
All that said, no...there's not anything built into the basic .NET framework that would parse a substring without actually creating a new string instance. You would have to write your own parsing routines to accomplish that.
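To give a feel for what such a routine might look like, here is a sketch (not the questioner's API; ReadIntAt and its parameters are hypothetical) that parses an int directly from the characters of the string and reports where it stopped, so no Trim or Substring allocations are needed:

public static int ReadIntAt(string s, int startIndex, out int nextIndex)
{
    int i = startIndex;

    // Skip leading whitespace without allocating a trimmed copy.
    while (i < s.Length && char.IsWhiteSpace(s[i]))
        i++;

    bool negative = false;
    if (i < s.Length && (s[i] == '-' || s[i] == '+'))
    {
        negative = (s[i] == '-');
        i++;
    }

    // Accumulate digits directly; no overflow checking in this sketch.
    int value = 0;
    while (i < s.Length && char.IsDigit(s[i]))
    {
        value = value * 10 + (s[i] - '0');
        i++;
    }

    nextIndex = i; // the caller resumes parsing from here instead of taking a substring
    return negative ? -value : value;
}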
You should measure the actual real-world performance impact first, to make sure these substrings really are a significant issue.
I don't know what the "some plugin" is or how you have to handle the input from it, but I would not be surprised to hear that the overhead in acquiring the original input string(s) for this scenario swamps the overhead of the substrings for parsing.
In my case, I'm using C#, but the concept of the question would apply to Java as well. Hopefully the answer would be generic enough to cover both languages. Otherwise it's better to split the question into two.
I've always wondered which one is the better practice.
Does the compiler take care of enhancing the 'second' code so its performance would be as good as the 'first' code?
Could it be worked around to get a 'better performance' and 'optimized' code at the same time?
Redundant/Better Performance Code:
string name = GetName(); // returned string could be empty
List<string> myListOfStrings = GetListOfStrings();
if (string.IsNullOrWhiteSpace(name))
{
    foreach (string s in myListOfStrings)
        Console.WriteLine(s);
}
else
{
    foreach (string s in myListOfStrings)
        Console.WriteLine(s + " (Name is: " + name);
}
Optimized/Less Performance Code:
string name = GetName(); // returned string could be empty
List<string> myListOfStrings = GetListOfStrings();
foreach (string s in myListOfStrings)
    Console.WriteLine(string.IsNullOrWhiteSpace(name) ? s : s + " (Name is: " + name);
Obviously the execution time of the 'first' code is less because it evaluates the condition 'string.IsNullOrWhiteSpace(name)' only once, whereas the 'second' code (which is nicer) evaluates the condition on every iteration.
Please consider a long-running loop, not a short one, because I know that when it is short the performance won't differ.
Does the compiler take care of enhancing the 'second' code so its performance would be as good as the 'first' code?
No, it cannot.
It doesn't know that the boolean expression will not change between iterations of the loop. It's possible for the code to not return the same value each time, so it is forced to perform the check in each iteration.
It's also possible that the boolean expression could have side effects. In this case it doesn't, but there's no way for the compiler to know that. It's important that such side effects would be performed in order to meet the specs, so it needs to execute the check in each iteration.
So, the next question you need to ask is, in a case such as this, is it important to perform the optimization that you've mentioned? In any situation I can imagine for the exact code you showed, probably not. The check is simply going to be so fast that it's almost certainly not going to be a bottleneck. If there are performance problems there are almost certainly bigger fish.
That said, with only a few changes to the example it can be made to matter. If the boolean expression itself is computationally expensive (i.e. it is the result of a database call, a web service call, some expensive CPU computation, etc.) then it could be a performance optimization that matters. Another case to consider is what would happen if the boolean expression had side effects. What if it was a MoveNext call on an IEnumerator? If it was important that it only be executed exactly once because you don't want the side effects to happen N times then that makes this a very important issue.
There are several possible solutions in such a case.
The easiest is most likely to just compute the boolean expression once and then store it in a variable:
bool someValue = ComputeComplexBooleanValue();
foreach (var item in collection)
{
    if (someValue)
        doStuff(item);
    else
        doOtherStuff(item);
}
If you want to evaluate the boolean expression 0-1 times (i.e. avoid calling it even once in the event that the collection is empty), then we can use Lazy to lazily compute the value while ensuring it is still computed at most once:
var someValue = new Lazy<bool>(() => ComputeComplexBooleanValue());
foreach (var item in collection)
{
    if (someValue.Value)
        doStuff(item);
    else
        doOtherStuff(item);
}
You should always go the way that is easier to understand and maintain first. This means reducing duplicate code to an absolute minimum (DRY). In addition, this kind of micro-optimization is not that important for many systems. Also note that shorter code is not always better.
I think I would go with something like this:
string name = GetName(); // returned string could be empty
bool nameIsEmpty = string.IsNullOrWhiteSpace(name);
foreach (string s in GetListOfStrings())
{
    string messageAddition = "";
    if (!nameIsEmpty)
    {
        messageAddition = " (Name is: " + name + ")";
    }
    Console.WriteLine(s + messageAddition);
    // more code which uses the computed value..
    // otherwise the condition can be moved out of the loop
}
I find an extra if statement easier to read than the ?: operator within a method call, but this might be personal taste.
If you want to improve performance later, you should profile your application and start optimizing the slowest sections of code first. Maybe your GetListOfStrings() method is so slow that the performance of the other code is totally irrelevant. If you measure that duplicating the loop improves the performance by a significant amount, you can think about changing it.
So...I have this scenario where I have a foreach loop that loops through a List of CheckBoxes to check which are selected. For every selected checkbox, I have to do a pretty long string concatenation, involving 30 different strings with an average length of 20 characters, and then send it out as an HTTP request. 2 of the strings are dependent on the index/value of the checkbox selected.
The length of the List of Checkboxes is also variable depending upon the user's data. I would say the average length of the List would be 20, but it can go up to 50-60. So the worst case scenario would be performing the whole string concatenation 60 or so times.
For now I'm doing it with simple string concatenation via the '+' operator, but I'm wondering if it would be faster to do it with StringBuilder. Of course, that means I'd have to either create a StringBuilder object within the loop, or create it before the loop and reset it after sending out each HTTP request.
I appreciate any insights anybody can share regarding this issue.
EDIT
Thanks for all the replies everybody, so from what I've gathered, the best way for me to go about doing this would be something like:
StringBuilder sb = new StringBuilder();
foreach (CheckBox item in FriendCheckboxList)
{
    if (item.Checked)
    {
        sb.Append(string1);
        sb.Append(string2);
        sb.Append(string3);
        .
        .
        .
        sb.Append(stringLast);
        SendRequest(sb.ToString());
        sb.Length = 0;
    }
}
Use StringBuilder. That's what it's for.
Strings are immutable. String concatenation creates a new string, needing more memory, and is generally considered slow:
string a = "John" + " " + "Saunders";
This creates a string "John ", then creates another string "John Saunders", then finally, assigns that to "a". The "John " is left for garbage collection.
string a = "John";
a += " ";
a += "Saunders";
This is about the same, as "John" is replaced by a new string "John ", which is replaced by a new string "John Saunders". The originals are left to be garbage collected.
On the other hand, StringBuilder is designed to be appended, removed, etc.
Example:
StringBuilder sb = new StringBuilder();
for (int i = 0; i < n; i++)
{
    sb.Length = 0;
    sb.Append(field1[i]);
    sb.Append(field2[i]);
    ...
    sb.Append(field30[i]);
    // Do something with sb.ToString();
}
This topic has been analysed to death over the years. The end result is that if you are doing a small, known number of concatenations, use '+'; otherwise use StringBuilder. From what you've said, concatenating with '+' should be faster. There are a gazillion (give or take) sites out there analysing this - google it.
For the size of string you are talking about, it's negligible anyway.
EDIT: on second thought, SB is probably faster. But like I said, who cares?
I know this has been answered, but I wanted to point out that I actually think the "blindly accepted as gospel" approach of always using StringBuilder is sometimes wrong, and this is one case.
For background, see this blog entry: http://geekswithblogs.net/johnsperfblog/archive/2005/05/27/40777.aspx
The short of it is, for this particular case, as described, you will see better performance by avoiding StringBuilder and making use of the + operator thusly:
foreach (CheckBox item in FriendCheckboxList)
{
    if (item.Checked)
    {
        string request = string1 +
                         string2 +
                         string3 +
                         .
                         .
                         .
                         stringLast;
        SendRequest(request);
    }
}
The reason is that the C# compiler (as of .NET 1.1) will convert that statement into a single IL call to String.Concat, passing an array of strings as an argument. The blog entry does an excellent job outlining the implementation details of String.Concat, but suffice it to say, it is extremely efficient for this case.
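In other words, the concatenation above is roughly equivalent to writing the call explicitly (a sketch; the stringN names are the same placeholders used above):

string request = string.Concat(string1, string2, string3, /* ... */ stringLast);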
If you're asking this question, chances are you should use StringBuilder, for many reasons, but I'll provide two.
When you use string concatenation, a new buffer has to be allocated and the data from both strings copied into the new string. So you are going to incur many repeated allocations, which ends up fragmenting memory, using up heap space, and making more work for the garbage collector.
StringBuilder, on the other hand, pre-allocates a buffer, and as you add strings to it it doesn't need to keep re-allocating (assuming the initial buffer is large enough), which increases performance and is far less taxing on memory.
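If you do go with StringBuilder, you can also pre-size that buffer so it never has to grow; the capacity below is just a guess based on the roughly 30 strings of about 20 characters mentioned in the question:

// ~30 fragments of ~20 chars each, so reserve about 640 characters up front.
StringBuilder sb = new StringBuilder(640);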
As developers we should try to anticipate future growth. Let's say that your list grows substantially over time and then all of a sudden starts performing slowly. If you can prevent this with little effort now, why wouldn't you do it?
In general I would recommend to use a StringBuilder.
Have you tested this and checked the performance? Is the performance an issue vs how long it will take you to rewrite the code?