For loop to create Tasks going over end condition [duplicate] - c#

I have a for loop that creates a number of parameterised Tasks:
int count = 16;
List<Tuple<ulong, ulong>> brackets = GetBrackets(0L, (ulong)int.MaxValue, count);
Task[] tasks = new Task[count];
s.Reset();
s.Start();
for(int i = 0; i < count; i++)
{
tasks[i] = Task.Run(() => TestLoop(brackets[i].Item1, brackets[i].Item2));
}
Task.WaitAll(tasks);
s.Stop();
times.Add(count, s.Elapsed);
However, when this runs, an exception is thrown by the line inside the For loop, that brackets[i] does not exist, because i at that point is 16, even though the loop is set to run while i < count.
If I change the line to this:
tasks[i] = new Task(() => TestLoop(brackets[0].Item1, brackets[0].Item2));
Then no error is thrown. Also, if I walk through the loop with breakpoints, no issue is thrown.
For repro, I also include GetBrackets, which just breaks a number range into blocks:
private List<Tuple<ulong, ulong>> GetBrackets(ulong start, ulong end, int threads)
{
ulong all = (end - start);
ulong block = (ulong)(all / (ulong)threads);
List<Tuple<ulong, ulong>> brackets = new System.Collections.Generic.List<Tuple<ulong, ulong>>();
ulong last = 0;
for (int i=0; i < threads; i++)
{
brackets.Add(new Tuple<ulong, ulong>(last, (last + block - 1)));
last += block;
}
// Hack
brackets[brackets.Count - 1] = new Tuple<ulong, ulong>(
brackets[brackets.Count - 1].Item1, end);
return brackets;
}
Could anyone shed some light on this?

(This is a duplicate of similar posts, but they're often quite hard to find and the symptoms often differ slightly.)
The problem is that you're capturing the variable i in your loop:
for(int i = 0; i < count; i++)
{
tasks[i] = Task.Run(() => TestLoop(brackets[i].Item1, brackets[i].Item2));
}
You've got a single i variable, and the lambda expression captures it - so by the time your task actually starts executing the code in the lambda expression, it probably won't have the same value as it did before. You need to introduce a separate variable inside the loop, so that each iteration captures a different variable:
for (int i = 0; i < count; i++)
{
int index = i;
tasks[i] = Task.Run(() => TestLoop(brackets[index].Item1, brackets[index].Item2));
}
Alternatively, use LINQ to create the task array:
var tasks = brackets.Select(t => Task.Run(() => TestLoop(t.Item1, t.Item2)))
.ToArray(); // Or ToList
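The capture behaviour can be reproduced without any threads, which makes it easier to see. The sketch below is illustrative only: the class and method names are made up and do not come from the question. It stores lambdas in a list and invokes them after the loop has finished, so the version that shares i sees its final value every time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CaptureDemo
{
    // Every lambda captures the same loop variable i, so once the loop
    // has ended they all observe its final value (count).
    public static int[] SharedVariable(int count)
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            funcs.Add(() => i);
        }
        return funcs.Select(f => f()).ToArray();
    }

    // Each iteration declares a fresh variable, so each lambda
    // captures its own copy of the index.
    public static int[] CopyPerIteration(int count)
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            int index = i;
            funcs.Add(() => index);
        }
        return funcs.Select(f => f()).ToArray();
    }
}
```

With count = 3, SharedVariable returns 3, 3, 3 while CopyPerIteration returns 0, 1, 2. With Task.Run the effect is the same, except that the value each task sees depends on when it happens to start.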

Related

C# - Code optimization to get all substrings from a string

I was working on a code snippet to get all substrings from a given string.
Here is the code that I use
var stringList = new List<string>();
for (int length = 1; length < mainString.Length; length++)
{
for (int start = 0; start <= mainString.Length - length; start++)
{
var substring = mainString.Substring(start, length);
stringList.Add(substring);
}
}
It looks not so great to me, with two for loops. Is there any other way that I can achieve this with better time complexity.
I am stuck on the point that, for getting a substring, I will surely need two loops. Is there any other way I can look into ?
The number of substrings in a string is O(n^2), so one loop inside another is the best you can do. You are correct in your code structure.
Here's how I would've phrased your code:
void Main()
{
var stringList = new List<string>();
string s = "1234";
for (int i=0; i <s.Length; i++)
for (int j=i; j < s.Length; j++)
stringList.Add(s.Substring(i,j-i+1));
}
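To make the O(n^2) claim concrete: a string of length n has n substrings of length 1, n-1 of length 2, and so on down to 1 of length n, which adds up to n(n+1)/2. The following is a small sketch that checks this count; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class SubstringCount
{
    // Enumerates every non-empty substring with two nested loops.
    public static List<string> All(string s)
    {
        var result = new List<string>();
        for (int start = 0; start < s.Length; start++)
            for (int end = start; end < s.Length; end++)
                result.Add(s.Substring(start, end - start + 1));
        return result;
    }

    // Closed form: n + (n - 1) + ... + 1.
    public static int Expected(int n)
    {
        return n * (n + 1) / 2;
    }
}
```

For "1234" both All("1234").Count and Expected(4) give 10. Note that a loop bounded by length < mainString.Length, as in the question, stops one short and produces 9, because it never emits the full string; whether that is intended depends on whether the whole string should count as a substring.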
You do need two for loops:
var input = "asd sdf dfg";
var stringList = new List<string>();
for (int i = 0; i < input.Length; i++)
{
for (int j = i; j < input.Length; j++)
{
var substring = input.Substring(i, j-i+1);
stringList.Add(substring);
}
}
foreach(var item in stringList)
{
Console.WriteLine(item);
}
Update
You cannot improve on the iterations.
However, you can improve performance by using fixed arrays and pointers.
In some cases you can significantly increase execution speed by reducing object allocations. In this case, that means using a single char[] and ArraySegment<char> values to represent the substrings. This also uses less address space and reduces the load on the garbage collector.
Relevant excerpt from Using the StringBuilder Class in .NET page on Microsoft Docs:
The String object is immutable. Every time you use one of the methods in the System.String class, you create a new string object in memory, which requires a new allocation of space for that new object. In situations where you need to perform repeated modifications to a string, the overhead associated with creating a new String object can be costly.
Example implementation:
static List<ArraySegment<char>> SubstringsOf(char[] value)
{
var substrings = new List<ArraySegment<char>>(capacity: value.Length * (value.Length + 1) / 2 - 1);
for (int length = 1; length < value.Length; length++)
for (int start = 0; start <= value.Length - length; start++)
substrings.Add(new ArraySegment<char>(value, start, length));
return substrings;
}
For more information check Fundamentals of Garbage Collection page on Microsoft Docs, what is the use of ArraySegment class? discussion on StackOverflow, ArraySegment<T> Structure page on MSDN and List<T>.Capacity page on MSDN.
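As a rough illustration of why the segments are cheap: an ArraySegment<char> stores only the array reference, an offset and a count, so no characters are copied until a string is actually needed. The helper below is a hypothetical sketch, not part of the answer above.

```csharp
using System;

public static class SegmentDemo
{
    // Materializes the characters a segment refers to.
    // This is the only place where a new string is allocated.
    public static string AsString(ArraySegment<char> segment)
    {
        return new string(segment.Array, segment.Offset, segment.Count);
    }

    public static string Example()
    {
        char[] buffer = "abcdef".ToCharArray();
        // The segment is a view over buffer: offset 1, three characters.
        var segment = new ArraySegment<char>(buffer, 1, 3);
        return AsString(segment);
    }
}
```

Example() returns "bcd". Because the segment is only a view, later changes to the underlying char[] are visible through every segment that refers to it.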
Well, O(n^2) time complexity is inevitable; however, you can try to improve space consumption. In many cases, you don't want all the substrings to be materialized as, say, a List<string>:
public static IEnumerable<string> AllSubstrings(string value) {
if (value == null)
yield break; // Or throw ArgumentNullException
for (int length = 1; length < value.Length; ++length)
for (int start = 0; start <= value.Length - length; ++start)
yield return value.Substring(start, length);
}
For instance, let's count all substrings of "abracadabra" that start with a and are longer than 3 characters. Note that all we have to do is loop over the substrings, without saving them into a list:
int count = AllSubstrings("abracadabra")
.Count(item => item.StartsWith("a") && item.Length > 3);
If for any reason you want a List<string>, just add .ToList():
var stringList = AllSubstrings(mainString).ToList();
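The benefit of the iterator shows up most clearly when the consumer stops early. The sketch below is an assumption-laden illustration: the Produced counter and the FirstOfLengthTwo helper are invented for the demo, and, unlike the version above, the loop uses length <= value.Length so that the whole string is included.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDemo
{
    // Counts how many substrings were actually created (demo only).
    public static int Produced;

    public static IEnumerable<string> AllSubstrings(string value)
    {
        for (int length = 1; length <= value.Length; ++length)
            for (int start = 0; start <= value.Length - length; ++start)
            {
                Produced++;
                yield return value.Substring(start, length);
            }
    }

    // Enumeration stops as soon as the first two-character substring appears.
    public static string FirstOfLengthTwo(string value)
    {
        Produced = 0;
        return AllSubstrings(value).First(s => s.Length == 2);
    }
}
```

For "abcd" this returns "ab" after creating only 5 substrings (the four single characters plus "ab"), whereas building the full list first would create all 10.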

C# threading passing integer argument passes bad number

Good day to everyone.
Today I made up a school project with one thing bothering me.
My problem is, that I am passing argument to Thread function and when I print it to console via Console.WriteLine, it shows bad numbers.
for (i = 0; i < 10; i++) autari[i] = new Thread(() => autar(i));
for (i = 0; i < 10; i++) motorkari[i] = new Thread(() => motorkar(i + 10));
When I start them in same cycles, their functions do this:
static void motorkar(int id)
{
Console.WriteLine("motorkar {0}", id);
...
The order is not the problem: when I pass, for example, 0, Visual Studio writes 2 to the console in Debug, and without Debug it writes 1.
What can be the problem? I know that I can solve this by setting string name, but I am confused with this.
This is due to the compiler creating a closure for you under the hood: the lambda captures the variable i itself, not its value at that moment. If you change the code as below, you should get your expected output:
for (i = 0; i < 10; i++)
{
var local = i;
autari[i] = new Thread(() => autar(local));
}
for (i = 0; i < 10; i++)
{
var local = i + 10;
motorkari[i] = new Thread(() => motorkar(local));
}
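Another option, shown here only as a sketch, is to avoid capturing the loop variable altogether and hand the value to the thread through Thread.Start(object), which uses the ParameterizedThreadStart delegate. The class name and the ConcurrentBag used to collect the ids are assumptions made for the demo.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;

public static class ThreadArgDemo
{
    public static int[] Run(int count)
    {
        var seen = new ConcurrentBag<int>();
        var threads = new Thread[count];
        for (int i = 0; i < count; i++)
        {
            // The lambda captures 'seen' but not the loop variable;
            // the id arrives through its own parameter.
            threads[i] = new Thread(state => seen.Add((int)state));
            // Start(object) boxes the current value of i and passes that copy.
            threads[i].Start(i);
        }
        foreach (var t in threads)
        {
            t.Join();
        }
        // Sort, because the threads may finish in any order.
        return seen.OrderBy(x => x).ToArray();
    }
}
```

Run(10) returns the ids 0 through 9, each exactly once. The trade-off is that the argument travels as object and has to be cast back, so the local-copy approach is usually the more readable fix.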

How To Make This For Loop Faster

I have the problem that this for loop takes so much time to complete.
I want a faster way to complete it.
ArrayList arrayList = new ArrayList();
byte[] encryptedBytes = null;
for (int i = 0; i < iterations; i++)
{
encryptedBytes = Convert.FromBase64String(inputString.Substring(base64BlockSize * i,
base64BlockSize));
arrayList.AddRange(rsaCryptoServiceProvider.Decrypt(encryptedBytes, true));
}
The iterations variable is sometimes larger than 100,000, and then the loop takes forever.
Have you considered running the decryption in a parallel loop? Your input strings have to be prepared first in a regular loop, but that's a quick process. Then you run the decryption in Parallel.For:
var inputs = new List<string>();
// Decrypt returns a byte[], so keep one byte array per block.
var result = new byte[iterations][];
// Create inputs from the input string.
for (int i = 0; i < iterations; ++i)
{
inputs.Add(inputString.Substring(base64BlockSize * i, base64BlockSize));
}
Parallel.For(0, iterations, i =>
{
var encryptedBytes = Convert.FromBase64String(inputs[i]);
result[i] = rsaCryptoServiceProvider.Decrypt(encryptedBytes, true);
});
Decrypt returns a byte[], so result holds one decrypted block per index, in the original order. If you need a single buffer, concatenate the blocks after the loop.
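The reason for writing into an array slot per index, rather than into an unordered collection such as ConcurrentBag<T>, is that block order is preserved and no locking is needed. A minimal sketch of that pattern, with squaring standing in for the decryption step (the class and method names are hypothetical):

```csharp
using System;
using System.Threading.Tasks;

public static class IndexedResults
{
    // Each iteration writes only to its own slot, so the iterations never
    // contend with each other and the output order matches the input order.
    public static int[] SquareAll(int[] inputs)
    {
        var results = new int[inputs.Length];
        Parallel.For(0, inputs.Length, i =>
        {
            results[i] = inputs[i] * inputs[i];
        });
        return results;
    }
}
```

SquareAll(new[] { 1, 2, 3 }) returns 1, 4, 9 regardless of which iteration finishes first.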

ThreadPool behaves different for debug mode and runtime

I want to use ThreadPool to complete long-running jobs in less time. My methods do more work than this, of course, but I prepared a simple example to show my situation. If I run this application, it throws ArgumentOutOfRangeException on the commented line, and it shows that i is equal to 10. How can it enter the for loop if i is 10?
If I debug this code instead of just running it, it does not throw an exception and works fine.
public void Test()
{
List<int> list1 = new List<int>();
List<int> list2 = new List<int>();
for (int i = 0; i < 10; i++) list1.Add(i);
for (int i = 0; i < 10; i++) list2.Add(i);
int toProcess = list1.Count;
using (ManualResetEvent resetEvent = new ManualResetEvent(false))
{
for (int i = 0; i < list1.Count; i++)
{
ThreadPool.QueueUserWorkItem(
new WaitCallback(delegate(object state)
{
// ArgumentOutOfRangeException with i=10
Sum(list1[i], list2[i]);
if (Interlocked.Decrement(ref toProcess) == 0)
resetEvent.Set();
}), null);
}
resetEvent.WaitOne();
}
MessageBox.Show("Done");
}
private void Sum(int p, int p2)
{
int sum = p + p2;
}
What is the problem here?
The problem is that i==10, but your lists have 10 items (i.e. a maximum index of 9).
This is because you have a race condition over a captured variable that is being changed before your delegate runs. Will the next iteration of the loop increment the value before the delegate runs, or will your delegate run before the loop increments the value? It's all down to the timing of that specific run.
Your instinct is that i will have a value of 0-9. However, when the loop reaches its termination, i will have a value of 10. Because the delegate captures i, the value of i may well be used after the loop has terminated.
Change your loop as follows:
for (int i = 0; i < list1.Count; i++)
{
var idx=i;
ThreadPool.QueueUserWorkItem(
new WaitCallback(delegate(object state)
{
// idx is a per-iteration copy, so it is always in range
Sum(list1[idx], list2[idx]);
if (Interlocked.Decrement(ref toProcess) == 0)
resetEvent.Set();
}), null);
}
Now your delegate is getting a "private", independent copy of i instead of referring to a single, changing value that is shared between all invocations of the delegate.
I wouldn't worry too much about the difference in behaviour between debug and non-debug modes. That's the nature of race conditions.
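QueueUserWorkItem also accepts a state argument, which gives a way to pass the index without capturing it. The following is a sketch under assumptions: the SumPairs name is made up, CountdownEvent replaces the manual counter plus ManualResetEvent, and the sums are accumulated so there is something to observe.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class StateArgDemo
{
    public static int SumPairs(List<int> list1, List<int> list2)
    {
        int total = 0;
        using (var done = new CountdownEvent(list1.Count))
        {
            for (int i = 0; i < list1.Count; i++)
            {
                // i is boxed when the item is queued, so the callback receives
                // the value from this iteration, not the live loop variable.
                ThreadPool.QueueUserWorkItem(state =>
                {
                    int idx = (int)state;
                    Interlocked.Add(ref total, list1[idx] + list2[idx]);
                    done.Signal();
                }, i);
            }
            // Block until every work item has signalled.
            done.Wait();
        }
        return total;
    }
}
```

With both lists holding 0 through 9, SumPairs returns 90, and it behaves the same with or without a debugger attached because no callback reads the shared loop variable.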
What is the problem here?
Closure. You're capturing the i variable which isn't doing what you expect it to do.
You'll need to create a copy inside your for loop:
var currentIndex = i;
Sum(list1[currentIndex], list2[currentIndex]);

Aggregation of parallel for does not capture all iterations

I have code that works great using a simple For loop, but I'm trying to speed it up. I'm trying to adapt the code to use multiple cores and landed on Parallel For.
At a high level, I'm collecting the results from CalcRoutine for several thousand accounts and storing the results in an array with 6 elements. I'm then re-running this process 1,000 times. The order of the elements within each 6 element array is important, but the order for the final 1,000 iterations of these 6 element arrays is not important. When I run the code using a For loop, I get a 6,000 element long list. However, when I try the Parallel For version, I'm getting something closer to 600. I've confirmed that the line "return localResults" gets called 1,000 times, but for some reason not all 6 element arrays get added to the list TotalResults. Any insight as to why this isn't working would be greatly appreciated.
object locker = new object();
Parallel.For(0, iScenarios, () => new double[6], (int k, ParallelLoopState state, double[] localResults) =>
{
List<double> CalcResults = new List<double>();
for (int n = iStart; n < iEnd; n++)
{
CalcResults.AddRange(CalcRoutine(n, k));
}
localResults = this.SumOfResults(CalcResults);
return localResults;
},
(double[] localResults) =>
{
lock (locker)
{
TotalResults.AddRange(localResults);
}
});
EDIT: Here's the "non parallel" version:
for (int k = 0; k < iScenarios; k++)
{
CalcResults.Clear();
for (int n = iStart; n < iEnd; n++)
{
CalcResults.AddRange(CalcRoutine(n, k));
}
TotalResults.AddRange(SumOfResults(CalcResults));
}
The output for 1 scenario is a list of 6 doubles, 2 scenarios is a list of 12 doubles, ... n scenarios 6n doubles.
Also per one of the questions, I checked the number of times "TotalResults.AddRange..." gets called, and it's not the full 1,000 times. Why wouldn't this be called each time? With the lock, shouldn't each thread wait for this section to become available?
Check the documentation for Parallel.For
These initial states are passed to the first body invocations on each task. Then, every subsequent body invocation returns a possibly modified state value that is passed to the next body invocation. Finally, the last body invocation on each task returns a state value that is passed to the localFinally delegate
But your body delegate is ignoring the incoming value of localResults which the previous iteration within this task returned. Having the loop state being an array makes it tricky to write a correct version. This will work but looks messy:
//EDIT - Create an array of length 0 here V for input to first iteration
Parallel.For(0, iScenarios, () => new double[0],
(int k, ParallelLoopState state, double[] localResults) =>
{
List<double> CalcResults = new List<double>();
for (int n = iStart; n < iEnd; n++)
{
CalcResults.AddRange(CalcRoutine(n, k));
}
localResults = localResults.Concat(
this.SumOfResults(CalcResults)
).ToArray();
return localResults;
},
(double[] localResults) =>
{
lock (locker)
{
TotalResults.AddRange(localResults);
}
});
(Assuming Linq's enumerable extensions are in scope, for Concat)
I'd suggest using a different data structure (e.g. a List<double> rather than double[]) for the state that more naturally allows more elements to be added to it - but that would mean changing SumOfResults that you've not shown. Or just keep it all a bit more abstract:
Parallel.For(0, iScenarios, () => Enumerable.Empty<double>(),
(int k, ParallelLoopState state, IEnumerable<double> localResults) =>
{
List<double> CalcResults = new List<double>();
for (int n = iStart; n < iEnd; n++)
{
CalcResults.AddRange(CalcRoutine(n, k));
}
return localResults.Concat(this.SumOfResults(CalcResults));
},
(IEnumerable<double> localResults) =>
{
lock (locker)
{
TotalResults.AddRange(localResults);
}
});
(If it had worked the way you seem to have assumed, why would they have you provide two separate delegates, if all it did, on the return from body, was to immediately invoke localFinally with the return value?)
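The per-task life cycle described in the documentation excerpt can be observed with a much smaller loop. The sketch below is illustrative and uses made-up names: it sums the indices with a long as the local state and counts how often localFinally runs.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LocalStateDemo
{
    // Number of times localFinally ran during the last call (demo only).
    public static int FinallyCalls;

    public static long SumRange(int count)
    {
        long total = 0;
        FinallyCalls = 0;
        Parallel.For<long>(0, count,
            () => 0L,                        // localInit: runs once per task
            (i, state, local) => local + i,  // body: receives and returns the running local value
            local =>                         // localFinally: runs once per task
            {
                Interlocked.Add(ref total, local);
                Interlocked.Increment(ref FinallyCalls);
            });
        return total;
    }
}
```

SumRange(1000) returns 499500, and FinallyCalls ends up at the number of tasks the scheduler used, which is typically far below 1000. That is why a body that discards its incoming local value loses every result except the last one produced on each task.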
Try this:
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
class Program
{
static void Main(string[] args)
{
var iScenarios = 6;
var iStart = 0;
var iEnd = 1000;
var totalResults = new List<double>();
Parallel.For(0, iScenarios, k => {
List<double> calcResults = new List<double>();
for (int n = iStart; n < iEnd; n++)
calcResults.AddRange(CalcRoutine(n, k));
lock (totalResults)
{
totalResults.AddRange(calcResults);
}
});
}
static IEnumerable<double> CalcRoutine(int a, int b)
{
yield return 0;
}
static double[] SumOfResults(IEnumerable<double> source)
{
return source.ToArray();
}
}
