Below is an example:
public class Printer
{
// Lock token.
private object threadLock = new object();
public void PrintNumbers()
{
// Use the lock token.
lock (threadLock)
{
...
}
}
}
but I still don't get the concept of a thread token. Why is it necessary? Is a thread token the same thing as a semaphore in C? In C, a semaphore is just an integer, isn't it?
lock is a mutex, and works like POSIX pthread_mutex_lock and pthread_mutex_unlock in C.
Only one piece of code is allowed to acquire a lock on a given object at once, so it's a way of synchronizing threads (not necessarily the best way, but that's a much more detailed and highly contextual answer).
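For what it's worth, the compiler expands a lock block into Monitor.Enter/Monitor.Exit calls wrapped in try/finally. A minimal runnable sketch of the (roughly) equivalent code, with a made-up body:

```csharp
using System;
using System.Threading;

class Printer
{
    // Lock token: any reference object works; what matters is that
    // every thread competes for the *same* object.
    private readonly object threadLock = new object();

    public void PrintNumbers()
    {
        // What `lock (threadLock) { ... }` roughly compiles to (C# 4+):
        bool lockTaken = false;
        try
        {
            Monitor.Enter(threadLock, ref lockTaken);
            Console.WriteLine("inside the critical section");
        }
        finally
        {
            if (lockTaken) Monitor.Exit(threadLock);
        }
    }

    static void Main() => new Printer().PrintNumbers();
}
```

The Monitor.Exit in the finally block guarantees the lock is released even if the body throws, which is why lock is safer than hand-rolled Enter/Exit pairs.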
As an example, the following code runs two tasks at the same time:
one increments each element of an array of numbers by 10,
the other prints the contents of the array
var numbers = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
Task.WhenAll(
Task.Run(() =>
{
for (var i = 0; i < 10; i++)
{
numbers[i] += 10;
Thread.Sleep(10);
}
}),
Task.Run(() =>
{
foreach (var i in numbers)
{
Console.Write(i + " ");
Thread.Sleep(10);
}
})
);
Since they run at the same time, the output is something like:
11 2 13 4 5 6 7 18 9 10
Some numbers are incremented, others are not, and it's different every time.
The same code with the loops wrapped in a lock, however:
object threadLock = new object();
var numbers = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
Task.WhenAll(
Task.Run(() =>
{
lock (threadLock)
{
for (var i = 0; i < 10; i++)
{
numbers[i] += 10;
Thread.Sleep(10);
}
}
}),
Task.Run(() =>
{
lock (threadLock)
{
foreach (var i in numbers)
{
Console.Write(i + " ");
Thread.Sleep(10);
}
}
})
);
This only ever outputs one of the two things, depending on which loop acquires the lock first:
11 12 13 14 15 16 17 18 19 20
or
1 2 3 4 5 6 7 8 9 10
There's no actual coordination between the two tasks, so which set you get (incremented or not) just depends on which happens to acquire the lock first.
Related
I'm trying to use the Parallel library for my code and I'm facing a strange issue.
I made a short program to demonstrate the behavior. In short, I run 2 loops (one inside the other). The first loop generates a random array of 200 integers and the second loop adds all the arrays into a big list.
The issue is that, in the end, I don't get a multiple of 200 integers; instead I see that some runs don't wait for the random array to be fully loaded.
It's difficult to explain so here the sample code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
namespace TestParallel
{
class Program
{
static int RecommendedDegreesOfParallelism = 8;
static int DefaultMaxPageSize = 200;
static void Main(string[] args)
{
int maxPage = 50;
List<int> lstData = new List<int>();
Parallel.For(0, RecommendedDegreesOfParallelism, new ParallelOptions() { MaxDegreeOfParallelism = RecommendedDegreesOfParallelism },
(index) =>
{
int cptItems = 0;
int cptPage = 1 - RecommendedDegreesOfParallelism + index;
int idx = index;
do
{
cptPage += RecommendedDegreesOfParallelism;
if (cptPage > maxPage) break;
int Min = 0;
int Max = 20;
Random randNum = new Random();
int[] test2 = Enumerable
.Repeat(0, DefaultMaxPageSize)
.Select(i => randNum.Next(Min, Max))
.ToArray();
var lstItems = new List<int>();
lstItems.AddRange(test2);
var lstRes = new List<int>();
lstItems.AsParallel().WithDegreeOfParallelism(8).ForAll((item) =>
{
lstRes.Add(item);
});
Console.WriteLine($"{Task.CurrentId} = {lstRes.Count}");
lstData.AddRange(lstRes);
cptItems = lstRes.Count;
} while (cptItems == DefaultMaxPageSize);
}
);
Console.WriteLine($"END: {lstData.Count}");
Console.ReadKey();
}
}
}
And here is an execution log :
4 = 200
1 = 200
2 = 200
3 = 200
6 = 200
5 = 200
7 = 200
8 = 200
1 = 200
6 = 194
2 = 191
5 = 200
7 = 200
8 = 200
4 = 200
5 = 200
3 = 182
4 = 176
8 = 150
7 = 200
5 = 147
1 = 200
7 = 189
1 = 200
1 = 198
END: 4827
We can see that some loops return fewer than 200 items.
How is it possible?
This code is not thread-safe:
lstItems.AsParallel().WithDegreeOfParallelism(8).ForAll((item) =>
{
lstRes.Add(item);
});
From the documentation for List<T>:
It is safe to perform multiple read operations on a List<T>, but
issues can occur if the collection is modified while it's being read.
To ensure thread safety, lock the collection during a read or write
operation. To enable a collection to be accessed by multiple threads
for reading and writing, you must implement your own synchronization.
It doesn't explicitly mention it, but .Add() can also fail when called simultaneously by multiple threads.
The solution would be to lock the calls to List<T>.Add() in the loop above, but if you do that it will likely make it slower than just adding the items in a loop in a single thread.
var locker = new object();
lstItems.AsParallel().WithDegreeOfParallelism(8).ForAll((item) =>
{
lock (locker)
{
lstRes.Add(item);
}
});
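Alternatively, you can avoid the manual lock entirely by letting PLINQ buffer the results itself. A minimal sketch (the input data here is made up for the demo):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Demo
{
    static void Main()
    {
        List<int> lstItems = Enumerable.Range(0, 200).ToList();

        // ToList() drains the parallel query into a new list on the calling
        // thread; there are no concurrent List<T>.Add() calls, so no lock.
        List<int> lstRes = lstItems.AsParallel().WithDegreeOfParallelism(8).ToList();

        Console.WriteLine(lstRes.Count);
    }
}
```

Note that the result order may differ from the source order unless you add AsOrdered() to the query.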
Can anyone please tell me how it is possible that this code:
for (byte i = 0; i < someArray.Length; i++)
{
pool.QueueTask(() =>
{
if (i > 0 && i < someArray.Length)
{
myFunction(i, someArray[i], ID);
}
});
}
falls on the line where myFunction is called with an IndexOutOfRangeException, because the i variable gets a value equal to someArray.Length? I really do not understand it...
Note: pool is an instance of simple thread pool with 2 threads.
Note2: The type byte in for loop is intentionally placed because the array length can not go over byte max value (according to preceding logic that creates the array) and I need variable i to be of type byte.
Your code is creating a closure over i, and i will end up being someArray.Length every time the delegate is executed. The Action that you end up passing into QueueTask() retains the state of the for loop and uses the value of i at execution time. Here is a compilable code sample that expresses this same problem:
static void Main(string[] args)
{
var someArray = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
var fns = new List<Action>();
for (int i = 0; i < someArray.Length; i++)
{
fns.Add(() => myFunction(i, someArray[i]));
}
foreach (var fn in fns) fn();
}
private static void myFunction(int i, int v)
{
Console.WriteLine($"{v} at idx:{i}");
}
You can break this by copying the closed-over variable into a local, which retains the value of i at the creation time of the Action.
static void Main(string[] args)
{
var someArray = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
var fns = new List<Action>();
for (int i = 0; i < someArray.Length; i++)
{
var local = i;
fns.Add(() => myFunction(local, someArray[local]));
}
foreach (var fn in fns) fn();
}
private static void myFunction(int i, int v)
{
Console.WriteLine($"{v} at idx:{i}");
}
Related reading: http://csharpindepth.com/Articles/Chapter5/Closures.aspx
Using a byte as index seems like a premature micro-optimization, and it will indeed cause trouble if your array has more than 255 elements.
Also, while we're at it: you mention that you're running on a thread pool. Are you making sure that someArray does not go out of scope while the code is running?
Imagine I had a waiting list with the following in the queue
Service 1 - 5 minutes
Service 2 - 10 minutes
Service 3 - 5 minutes
Service 4 - 15 minutes
Service 5 - 20 minutes
If I have two staff to service these 5 clients in the queue, how could I estimate the waiting time for the next person to walk into the store?
Actually it's pretty simple - it's the "W" queue model as described by Eric Lippert.
Set up an array of two "staff" members:
List<int>[] staff = new [] {new List<int>(), new List<int>()};
define your queue:
int[] queue = new int[] {5, 10, 5, 15, 20};
Then simulate the processing - each subsequent customer will go to the servicer that is done first:
foreach (int i in queue)
{
List<int> shortest = staff.OrderBy(s=>s.Sum()).First();
shortest.Add(i);
}
The "next" person to come in will have to wait until the first servicer is free, which is the smaller of the two servicers' total service times:
int nextTime = staff.Min(s=>s.Sum());
Console.WriteLine("The wait time for the next customer is {0} minutes",nextTime);
Output:
The wait time for the next customer is 25 minutes.
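Putting the fragments above together into one runnable sketch:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Demo
{
    static void Main()
    {
        // Two "staff" members, each tracking the service times assigned to them.
        List<int>[] staff = { new List<int>(), new List<int>() };
        int[] queue = { 5, 10, 5, 15, 20 };

        // Each customer goes to whichever servicer is done first.
        foreach (int i in queue)
        {
            List<int> shortest = staff.OrderBy(s => s.Sum()).First();
            shortest.Add(i);
        }

        // The next walk-in waits until the first servicer frees up.
        int nextTime = staff.Min(s => s.Sum());
        Console.WriteLine("The wait time for the next customer is {0} minutes", nextTime);
    }
}
```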
Here's a not-so-elegant way to do it...
private static int GetEstimatedWaitTime(Queue<int> currentQueue, int numServers)
{
int waitTime = 0;
// Short-circuit if there are more servers than items in the queue
if (currentQueue.Count < numServers) return waitTime;
// Create a copy of the queue so we can dequeue from it
var remainingItems = new Queue<int>();
foreach (var item in currentQueue)
{
remainingItems.Enqueue(item);
}
// Grab an item for each server
var itemsBeingServiced = new List<int>();
for (int i = 0; i < numServers; i++)
{
itemsBeingServiced.Add(remainingItems.Dequeue());
}
do
{
// Get the shortest item left, increment our wait time, and adjust other items
itemsBeingServiced.Sort();
var shortestItem = itemsBeingServiced.First();
waitTime += shortestItem;
itemsBeingServiced.RemoveAll(item => item == shortestItem);
for (int i = 0; i < itemsBeingServiced.Count; i++)
{
itemsBeingServiced[i] = itemsBeingServiced[i] - shortestItem;
}
// Add more items for available servers if there are any
while (itemsBeingServiced.Count < numServers && remainingItems.Any())
{
itemsBeingServiced.Add(remainingItems.Dequeue());
}
} while (itemsBeingServiced.Count >= numServers);
return waitTime;
}
When I run the following code:
public static double SumRootN(int root)
{
double result = 0;
for (int i = 1; i < 10000000; i++)
{
result += Math.Exp(Math.Log(i) / root);
}
return result;
}
static void Main()
{
ParallelOptions options = new ParallelOptions();
options.MaxDegreeOfParallelism = 2; // -1 is for unlimited. 1 is for sequential.
try
{
Parallel.For(
0,
9,
options,
(i) =>
{
var result = SumRootN(i);
Console.WriteLine("Thread={0}, root {1} : {2} ", Thread.CurrentThread.ManagedThreadId, i, result);
});
}
catch (AggregateException e)
{
Console.WriteLine("Parallel.For has thrown the following (unexpected) exception:\n{0}", e);
}
}
I see that the output is:
There are 3 thread IDs here, but I have specified that MaxDegreeOfParallelism is only 2. So why are there 3 threads doing the work instead of 2?
Quote from http://msdn.microsoft.com/en-us/library/system.threading.tasks.paralleloptions.maxdegreeofparallelism(v=vs.110).aspx
By default, For and ForEach will utilize however many threads the underlying scheduler provides, so changing MaxDegreeOfParallelism from the default only limits how many concurrent tasks will be used.
Translation: only 2 threads will be running at any given moment, but more (or even fewer) than 2 distinct threads may be drawn from the thread pool over the lifetime of the loop. You can test this with another WriteLine at the start of the task; you'll see that no 3 threads ever run concurrently.
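One way to check this claim is to track the observed concurrency with an Interlocked counter; the sketch below (with Thread.Sleep as a stand-in workload) never sees more than 2 loop bodies running at once:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class Demo
{
    static int current;       // bodies running right now
    static int maxObserved;   // highest concurrency seen

    static void Main()
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = 2 };
        Parallel.For(0, 20, options, i =>
        {
            int now = Interlocked.Increment(ref current);
            // Lock-free update of the running maximum.
            int seen;
            do
            {
                seen = maxObserved;
            } while (now > seen &&
                     Interlocked.CompareExchange(ref maxObserved, now, seen) != seen);

            Thread.Sleep(20); // simulate work
            Interlocked.Decrement(ref current);
        });

        Console.WriteLine(maxObserved <= 2 ? "at most 2 concurrent" : "more than 2 concurrent");
    }
}
```

Printing the ManagedThreadId inside the body, however, may still show more than 2 distinct IDs over the whole run.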
I'm trying to determine the optimal solution for this tough problem. I've got a length (let's say 11), so it's a one-dimensional space 0-10. Now I've got intervals of the same length (let's assume 2 in this example), randomly distributed (overlapping or not). Let me draw an example:
Situation:
|00|01|02|03|04|05|06|07|08|09|10| <- Space (length = 11)
|-----|
|-----|
|-----|
|-----|
|-----|
|-----| <- single interval of length = 2
Now the solution needs to find the maximal number of intervals that can fit at once without overlap.
The solution is: 4 intervals
There are three results of 4 intervals:
|00|01|02|03|04|05|06|07|08|09|10|
|-----| |-----|-----| |-----| <- result 1
|-----| |-----| |-----| |-----| <- result 2
|-----| |-----|-----| |-----| <- result 3
But there are two more constraints as well.
If there are multiple results with the best interval count (in this case 4), prefer the one with the fewest gaps.
If there are still multiple results, prefer the one with the largest minimal gap length. For example, a result with gaps of lengths 2 and 3 has a minimal gap of 2, which is better than gaps of 1 and 4, where the minimal gap is only 1.
Result 2 consists of 4 contiguous chunks while the other two have only 3 (i.e. fewer gaps), so the refinement is:
|00|01|02|03|04|05|06|07|08|09|10|
|-----| |-----------| |-----| <- result 1
|-----| |-----------| |-----| <- result 3
Those two have the same gap distribution, so let's take the first one.
The result for the input set is:
Interval count : 4
Optimal solution: |-----| |-----------| |-----|
The algorithm has to work universally: for any space length (not only 11), any interval length (the interval length is always <= the space length), and any number of intervals.
Update:
Problematic scenario:
|00|01|02|03|04|05|06|07|08|09|
|-----|
|-----|
|-----|
|-----|
|-----|
This is a simple dynamic programming problem.
Let the total length be N and the length of a task be L.
Let F(T) be the maximum number of tasks that can be selected from the sub-interval (T, N). Then at each unit time T, there are 3 possibilities:
There is no task that starts at T.
There is a task that starts at T, but we do not include it in the result set.
There is a task that starts at T, and we do include it in the result set.
Case 1 is simple, we just have F(T) = F(T + 1).
In case 2/3, notice that selecting a task that starts at T means we must reject all tasks that start while this task is running, i.e. between T and T + L. So we get F(T) = max(F(T + 1), F(T + L) + 1).
Finally, F(N) = 0. So you just start from F(N) and work your way back to F(0).
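As a sketch, the recurrence takes only a few lines to implement; the start times below are taken from the question's example (N = 11, L = 2):

```csharp
using System;
using System.Linq;

class Demo
{
    static void Main()
    {
        int N = 11, L = 2;
        int[] starts = { 0, 3, 4, 5, 6, 9 };

        var F = new int[N + 1]; // F[N] = 0 by default
        for (int t = N - 1; t >= 0; t--)
        {
            F[t] = F[t + 1];    // case 1/2: no task at t, or skip it
            if (starts.Contains(t) && t + L <= N)
                F[t] = Math.Max(F[t], F[t + L] + 1); // case 3: take the task at t
        }

        Console.WriteLine(F[0]); // maximum number of non-overlapping tasks
    }
}
```

For this input F[0] works out to 4, matching the table in the next edit.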
EDIT: This will give you the maximum number of intervals, but not the set that fulfils your 2 constraints. Your explanation of the constraints is unclear to me, so I'm not sure how to help you there. In particular, I can't tell what constraint 1 means since all the solutions to your example set are apparently equal.
EDIT 2: Some further explanation as requested:
Consider your posted example, we have N = 11 and L = 2. There are tasks that start at T = 0, 3, 4, 5, 6, 9. Starting from F(11) = 0 and working backwards:
F(11) = 0
F(10) = F(11) = 0 (Since no task starts at T = 10)
F(9) = max(F(10), F(11) + 1) = 1
...
Eventually we get to F(0) = 4:
T |00|01|02|03|04|05|06|07|08|09|10|
F(T)| 4| 3| 3| 3| 3| 2| 2| 1| 1| 1| 0|
EDIT 3: Well I was curious enough about this that I wrote a solution, so may as well post it. This will give you the set that has the most tasks, with the least number of gaps, and the smallest minimum gap. The output for the examples in the question is:
(0, 2) -> (4, 6) -> (6, 8) -> (9, 11)
(0, 2) -> (4, 6) -> (8, 10)
Obviously, I make no guarantees about correctness! :)
private class Task
{
public int Start { get; set; }
public int Length { get; set; }
public int End { get { return Start + Length; } }
public override string ToString()
{
return string.Format("({0:d}, {1:d})", Start, End);
}
}
private class CacheEntry : IComparable
{
public int Tasks { get; set; }
public int Gaps { get; set; }
public int MinGap { get; set; }
public Task Task { get; set; }
public Task NextTask { get; set; }
public int CompareTo(object obj)
{
var other = obj as CacheEntry;
if (Tasks != other.Tasks)
return Tasks - other.Tasks; // More tasks is better
if (Gaps != other.Gaps)
return other.Gaps - Gaps; // Fewer gaps is better
return MinGap - other.MinGap; // Larger minimum gap is better
}
}
private static IList<Task> F(IList<Task> tasks)
{
var end = tasks.Max(x => x.End);
var tasksByTime = tasks.ToLookup(x => x.Start);
var cache = new List<CacheEntry>[end + 1];
cache[end] = new List<CacheEntry> { new CacheEntry { Tasks = 0, Gaps = 0, MinGap = end + 1 } };
for (int t = end - 1; t >= 0; t--)
{
if (!tasksByTime.Contains(t))
{
cache[t] = cache[t + 1];
continue;
}
foreach (var task in tasksByTime[t])
{
var oldCEs = cache[t + task.Length];
var firstOldCE = oldCEs.First();
var lastOldCE = oldCEs.Last();
var newCE = new CacheEntry
{
Tasks = firstOldCE.Tasks + 1,
Task = task,
Gaps = firstOldCE.Gaps,
MinGap = firstOldCE.MinGap
};
// If there is a task that starts at time T + L, then that will always
// be the best option for us, as it will have one less Gap than the others
if (firstOldCE.Task == null || firstOldCE.Task.Start == task.End)
{
newCE.NextTask = firstOldCE.Task;
}
// Otherwise we want the one that maximises MinGap.
else
{
var ce = oldCEs.OrderBy(x => Math.Min(x.Task.Start - newCE.Task.End, x.MinGap)).Last();
newCE.NextTask = ce.Task;
newCE.Gaps++;
newCE.MinGap = Math.Min(ce.MinGap, ce.Task.Start - task.End);
}
var toComp = cache[t] ?? cache[t + 1];
if (newCE.CompareTo(toComp.First()) < 0)
{
cache[t] = toComp;
}
else
{
var ceList = new List<CacheEntry> { newCE };
// We need to keep track of all subsolutions X that start on the interval [T, T+L] that
// have an equal number of tasks and gaps, but possibly a smaller MinGap. This is
// because an earlier task may have an even smaller gap to this task.
int idx = newCE.Task.Start + 1;
while (idx < newCE.Task.End)
{
toComp = cache[idx];
if
(
newCE.Tasks == toComp.First().Tasks &&
newCE.Gaps == toComp.First().Gaps &&
newCE.MinGap >= toComp.First().MinGap
)
{
ceList.AddRange(toComp);
idx += toComp.First().Task.End;
}
else
idx++;
}
cache[t] = ceList;
}
}
}
var rv = new List<Task>();
var curr = cache[0].First();
while (true)
{
rv.Add(curr.Task);
if (curr.NextTask == null) break;
curr = cache[curr.NextTask.Start].First();
}
return rv;
}
public static void Main()
{
IList<Task> tasks, sol;
tasks = new List<Task>
{
new Task { Start = 0, Length = 2 },
new Task { Start = 3, Length = 2 },
new Task { Start = 4, Length = 2 },
new Task { Start = 5, Length = 2 },
new Task { Start = 6, Length = 2 },
new Task { Start = 9, Length = 2 },
};
sol = F(tasks);
foreach (var task in sol)
Console.Out.WriteLine(task);
Console.Out.WriteLine();
tasks = new List<Task>
{
new Task { Start = 0, Length = 2 },
new Task { Start = 3, Length = 2 },
new Task { Start = 4, Length = 2 },
new Task { Start = 8, Length = 2 },
};
sol = F(tasks);
foreach (var task in sol)
Console.Out.WriteLine(task);
Console.Out.WriteLine();
tasks = new List<Task>
{
new Task { Start = 0, Length = 5 },
new Task { Start = 6, Length = 5 },
new Task { Start = 7, Length = 3 },
new Task { Start = 8, Length = 9 },
new Task { Start = 19, Length = 1 },
};
sol = F(tasks);
foreach (var task in sol)
Console.Out.WriteLine(task);
Console.Out.WriteLine();
Console.In.ReadLine();
}