I need to execute strategy.AllTablesUpdated(); for 50 strategies in 2 ms (and I need to repeat that ~500 times per second).
Using the code below I discovered that a single Monitor.TryEnter call spends up to 1 ms (!!!), and I do that 50 times!
// must be called ~500 times per second
public void FinishUpdatingTables()
{
    foreach (Strategy strategy in strategies) // about ~50, should be executed in 2 ms
    {
        // this is slow and can be parallelized
        strategy.AllTablesUpdated();
    }
}
...................
public override bool AllTablesUpdated(Stopwatch sw)
{
    this.sw = sw;
    Checkpoint(this + " TryEnter attempt ");
    if (Monitor.TryEnter(desiredOrdersBuy))
    {
        Checkpoint(this + " TryEnter success ");
        try
        {
            OnAllTablesUpdated();
        }
        finally
        {
            Monitor.Exit(desiredOrdersBuy);
        }
        return true;
    }
    else
    {
        Checkpoint(this + " TryEnter failed ");
    }
    return false;
}
public void Checkpoint(string message)
{
    if (sw == null)
    {
        return;
    }
    // convert elapsed ticks to microseconds
    long time = sw.ElapsedTicks / (Stopwatch.Frequency / (1000L * 1000L));
    Log.Push(LogItemType.Debug, message + time);
}
From the logs (times in µs), a failed attempt spends ~1 ms:
12:55:43:778 Debug: TryEnter attempt 1264
12:55:43:779 Debug: TryEnter failed 2123
From the logs (times in µs), a successful attempt spends ~0.01 ms:
12:55:49:701 Debug: TryEnter attempt 889
12:55:49:701 Debug: TryEnter success 900
So now I think that Monitor.TryEnter is too expensive to execute one by one for 50 strategies, and I want to parallelize this work using Task, like this:
// must be called ~500 times per second
public void FinishUpdatingTables()
{
    foreach (Strategy strategy in strategies) // about ~50, should be executed in 2 ms
    {
        // this is slow and can be parallelized
        Task.Factory.StartNew(() =>
        {
            strategy.AllTablesUpdated();
        });
    }
}
I will also probably replace Monitor.TryEnter with a plain lock, since with this approach everything will be asynchronous anyway.
My questions:
Why is Monitor.TryEnter so slow? (~1 ms when the lock is not obtained)
How good would it be to start 50 Tasks every 2 ms, i.e. 25,000 Tasks per second? Can .NET manage this effectively? I could also use a producer-consumer pattern with BlockingCollection: start 50 "workers" only ONCE and then submit a new pack of 50 items to the BlockingCollection every 2 ms. Would that be better?
How would you execute 50 parallelizable methods every 2 ms (500 times per second), i.e. 25,000 method calls per second in total?
Monitor.TryEnter(obj) is equivalent to Monitor.TryEnter(obj, 0), i.e. a 0-millisecond timeout. The ~1 ms you measured when the lock is not obtained is just the overhead of attempting to acquire a contended lock.
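A minimal sketch of that equivalence (the gate object here is illustrative; both overloads return immediately when the lock is held by another thread):

object gate = new object();

bool takenA = Monitor.TryEnter(gate);    // implicit zero timeout
bool takenB = Monitor.TryEnter(gate, 0); // explicit 0 ms timeout

// Monitor is reentrant, so release once per successful TryEnter.
if (takenA) Monitor.Exit(gate);
if (takenB) Monitor.Exit(gate);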
You can start as many tasks as you like; they all use the ThreadPool, though, which is limited to a maximum number of threads. The maximum depends on your system: number of cores, memory, etc. It will certainly not be 25,000 threads. However, if you start meddling with the TPL scheduler you'll get into trouble. I'd just use Parallel.ForEach and see how far it gets me.
Parallel.ForEach. I'd also ensure that strategies is of type IList so that as many items as possible are fired off without waiting on an iterator.
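A minimal sketch of what that could look like, reusing the names from the question (whether it meets the 2 ms budget would need measuring):

// must be called ~500 times per second
public void FinishUpdatingTables()
{
    // Parallel.ForEach blocks until every strategy has run,
    // so the caller keeps the original synchronous contract.
    Parallel.ForEach(strategies, strategy =>
    {
        strategy.AllTablesUpdated();
    });
}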
You haven't pasted the code of OnAllTablesUpdated(), but you keep the lock for the duration of that procedure. In all likelihood that is going to be your bottleneck.
Some questions: why are you using a lock for when the table is ready to be processed?
Are delegates not possible?
Why lock it while you're running the strategy? Are you modifying that table inside each strategy? If so, can you not take a copy of it?
Related
I have this piece of code:
bool hasData = true;
var tasks = new List<Task>();
using (Context context = new Context())
{
    using (SemaphoreSlim concurrencySemaphore = new SemaphoreSlim(MAX_THREADS))
    {
        while (hasData)
        {
            Message message = context.Database.SqlQuery<Message>($@"
                select top(1) * from Message where
                ( Status = {(int)MessageStatusEnum.Pending} ) or
                ( Status = {(int)MessageStatusEnum.Paused} and ResumeOn < GETUTCDATE() )
            ").FirstOrDefault();
            if (message == null)
            {
                hasData = false;
            }
            else
            {
                concurrencySemaphore.Wait();
                tasks.Add(Task.Factory.StartNew(() =>
                {
                    Process(message);
                    concurrencySemaphore.Release();
                }, this.CancellationToken));
            }
        }
        Task.WaitAll(tasks.ToArray());
    }
}
And my Process function is something like this:
private void Process(Message message)
{
    System.Threading.Thread.Sleep(10000);
}
Now, if I have only 1 item to Process, the total execution time is 10 sec, so the execution time per item (1 item) is 10 sec.
But if I have 10 items, for example, the execution time per item increases to 15-20 sec.
I tried changing the value of MAX_THREADS, but whenever I have more than 10 items in my queue and start the parallel execution, the execution time per item is about 15 sec.
What am I missing?
Not unexpected. Parallel slowdown is a very real problem. On one end we have pleasingly parallel operations; on the other end, inherently serial work like calculating the Fibonacci sequence. And there is a lot of area in between.
Multitasking in general and multithreading in particular are not a magic bullet that makes everything faster. I like to say: multitasking has to pick its problems carefully. If you throw more tasks at a problem that is not designed for them, you end up with code that is:
more complex
more memory demanding
and, most importantly, slower than the simple, sequential operation
At best you just added all that task-management overhead to the operation. At worst you run into deadlocks and resource contention. And your example operation of sleeping the thread is not something that will benefit from multitasking anyway. Please show us the actual operation you want to speed up via multitasking/-threading, so we can tell you whether there is any chance it can work out.
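As an aside, if you keep the semaphore throttle from your snippet, releasing inside a finally block prevents a permit from leaking when Process throws. A minimal sketch of just that part, using the names from the question:

concurrencySemaphore.Wait();
tasks.Add(Task.Factory.StartNew(() =>
{
    try
    {
        Process(message);
    }
    finally
    {
        // released even if Process throws, so the throttle keeps working
        concurrencySemaphore.Release();
    }
}, this.CancellationToken));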
I am currently using a background worker to process a time-consuming operation while showing progress in the UI.
But now I am looking for performance improvements in terms of time [since at present the background worker takes a long time to complete the processing].
The current Do_Work solution in the background worker looks like this:
foreach (string _entry in ArrayList)
{
    // process the _entry (communicate with a service and store the response
    // in the Db); takes almost 2-3 seconds per entry to complete
}
ArrayList contains almost 25000 records, so almost 25000 * 3 = 75000 seconds are currently needed to do the processing.
The new solution I am thinking of is to start 50 threads of 500 items each at the same time and wait for the completion of all threads, something like this:
int failed = 0;
var tasks = new List<Task>();
for (int i = 0; i < 50; i++) // 0..49 gives the intended 50 tasks
{
    tasks.Add(Task.Run(() =>
    {
        try
        {
            // Process 500 items from the array
            // (communicate with a service and store the response in the Db)
        }
        catch
        {
            Interlocked.Increment(ref failed);
            throw;
        }
    }));
}
Task t = Task.WhenAll(tasks); // completes when all 50 tasks have finished
try
{
    await t;
}
catch (Exception) // await rethrows only the first inner exception, so inspect t.Exception
{
    string _error = "";
    foreach (Exception ed in t.Exception.InnerExceptions)
    {
        _error += ed.Message;
    }
    MessageBox.Show(_error);
}
if (t.Status == TaskStatus.RanToCompletion)
    MessageBox.Show("All ping attempts succeeded.");
else if (t.Status == TaskStatus.Faulted)
    MessageBox.Show(failed.ToString() + " ping attempts failed");
Will this help to reduce the processing time? Or is there a better approach?
I tried with a small sample size of 10 and debugged, but I can't see much difference (is something wrong in my choice of WhenAll?).
You need to handle a few records at a time.
Process n records with n different tasks. Let's say n = 5, so 5 different tasks t1, t2, t3, t4 and t5. As soon as any of them finishes processing its row, it should pick up row n+1, then n+2, n+3, and so on until all records are processed.
Store all the processed values in a dictionary or list so you can identify which result belongs to which record.
It's like a recursive function.
I have used this kind of approach in the past and it really improves performance a lot. You can fine-tune the value of n based on your PC configuration.
I would suggest trying BlockingCollection.
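A minimal producer-consumer sketch with BlockingCollection, assuming the Message type and Process method from the question (the worker count and the hypothetical LoadPendingMessages loader are illustrative):

var queue = new BlockingCollection<Message>();

// Start a fixed pool of workers once; each blocks until an item is available.
const int WorkerCount = 5;
var workers = new Task[WorkerCount];
for (int i = 0; i < WorkerCount; i++)
{
    workers[i] = Task.Run(() =>
    {
        foreach (Message message in queue.GetConsumingEnumerable())
        {
            Process(message); // same Process as in the question
        }
    });
}

// Producer: enqueue work, then signal that no more items are coming.
foreach (Message message in LoadPendingMessages()) // hypothetical loader
{
    queue.Add(message);
}
queue.CompleteAdding(); // lets GetConsumingEnumerable complete
Task.WaitAll(workers);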
This is what AsyncMethods class looks like:
public class AsyncMethods
{
    public static async Task<double> GetdoubleAsync()
    {
        Console.WriteLine("Thread.CurrentThread.ManagedThreadId: " + Thread.CurrentThread.ManagedThreadId);
        await Task.Delay(1000);
        return 80d;
    }

    public static async Task<string> GetStringAsync()
    {
        Console.WriteLine("Thread.CurrentThread.ManagedThreadId: " + Thread.CurrentThread.ManagedThreadId);
        await Task.Delay(1000);
        return "async";
    }

    public static async Task<DateTime> GetDateTimeAsync()
    {
        Console.WriteLine("Thread.CurrentThread.ManagedThreadId: " + Thread.CurrentThread.ManagedThreadId);
        await Task.Delay(1000);
        return DateTime.Now;
    }
}
This is what my Main method looks like:
static void Main(string[] args)
{
    while (Console.ReadLine() != "exit")
    {
        Console.WriteLine("Thread.CurrentThread.ManagedThreadId: " + Thread.CurrentThread.ManagedThreadId);
        DateTime dt = DateTime.Now;
        var res = GetStuffAsync().Result;
        var ts = DateTime.Now - dt;
        Console.WriteLine(res);
        Console.WriteLine("Seconds taken: " + ts.Seconds + " milliseconds taken: " + ts.Milliseconds);
    }
    Console.ReadLine();
    return;
}

static async Task<object> GetStuffAsync()
{
    var doubleTask = AsyncMethods.GetdoubleAsync();
    var StringTask = AsyncMethods.GetStringAsync();
    var DateTimeTask = AsyncMethods.GetDateTimeAsync();
    return new
    {
        _double = await doubleTask,
        _String = await StringTask,
        _DateTime = await DateTimeTask,
    };
}
As can be seen, in each method I added a delay of 1 second. Here is the output:
Thread.CurrentThread.ManagedThreadId: 10
Thread.CurrentThread.ManagedThreadId: 10
Thread.CurrentThread.ManagedThreadId: 10
Thread.CurrentThread.ManagedThreadId: 10
{ _double = 80, _String = async, _DateTime = 2/15/2017 4:32:00 AM }
Seconds taken: 1 milliseconds taken: 40
Thread.CurrentThread.ManagedThreadId: 10
Thread.CurrentThread.ManagedThreadId: 10
Thread.CurrentThread.ManagedThreadId: 10
Thread.CurrentThread.ManagedThreadId: 10
{ _double = 80, _String = async, _DateTime = 2/15/2017 4:32:03 AM }
Seconds taken: 1 milliseconds taken: 16
Now I have 2 questions:
How come everything happened on a single thread?
Why was the delay only 1 second when I expected 3 seconds?
First off: if you have two questions please ask two questions. Don't put two questions in one question.
How come everything happened on a single thread?
That's the wrong question to ask. The correct question is: why do you think anything should happen on a second thread?
Here, I'll give you a task: wait five minutes, and then check your email. While you're waiting, make a sandwich. Did you have to hire someone to either do the waiting or make the sandwich? Obviously not. Threads are workers. There's no need to hire a worker if the job can be done by one worker.
The whole point of await is to avoid going to extra threads if you don't need to. In this case you don't need to.
Why was the delay only 1 second when I expected 3 seconds?
Compare these two workflows.
Wait five minutes; while you're waiting, make a sandwich
then check your email
then wait five minutes; while you're waiting, make a sandwich
then check your email
then wait five minutes; while you're waiting, make a sandwich
then check your email
If you execute that workflow, you'll wait a total of fifteen minutes.
The workflow you wrote was:
Wait five minutes
simultaneously, wait five minutes
simultaneously, wait five minutes
while you're waiting, make a sandwich
then check your email
You only wait five minutes with that workflow; all the delays happen at the same time.
Do you see how you wrote your program incorrectly now?
The key insight to understand here is that an await is a point in a program where the continuation of the await is delayed until after the awaited task completes.
If you don't put in an await, the program continues by itself without waiting. That's the meaning of await.
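In code, the two workflows differ only in where the awaits sit. A sketch using the methods from the question (CompareAsync is a hypothetical wrapper):

static async Task CompareAsync()
{
    // Sequential: each await completes before the next call starts (~3 s total).
    var d1 = await AsyncMethods.GetdoubleAsync();
    var s1 = await AsyncMethods.GetStringAsync();
    var t1 = await AsyncMethods.GetDateTimeAsync();

    // Concurrent: start all three first, then await (~1 s total);
    // this is what GetStuffAsync() in the question already does.
    var doubleTask = AsyncMethods.GetdoubleAsync();
    var stringTask = AsyncMethods.GetStringAsync();
    var dateTimeTask = AsyncMethods.GetDateTimeAsync();
    var d2 = await doubleTask;
    var s2 = await stringTask;
    var t2 = await dateTimeTask;
}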
They all start on the same thread. When you call your three Async methods in sequence, they all execute synchronously up until the first await call. (After the await, they become state machines that pick up where they left off whenever they get scheduled. If you checked the thread ID after the await Task.Delay call, you would probably find that the continuations ran on different threads -- at least here in a console app.)
As for why it's only delaying 1 second... that's what you're telling it to do. You've got three async tasks, all running simultaneously, each delaying for one second. You're not saying "[a]wait until the first task is done before starting the second" -- in fact you're carefully doing the opposite, starting all three and then awaiting all three -- so they run in parallel.
Your Console.WriteLine() calls in GetdoubleAsync(), GetStringAsync(), and GetDateTimeAsync() happened in the calling thread because they happened before the first continuation.
Your await Task.Delay() calls yielded the thread back to the calling code.
When the tasks returned by Task.Delay() completed, the continuations returned their values and marked the outer tasks as completed.
This allowed your 3 awaits (in sequential, synchronous order) in GetStuffAsync() to return. Each one had to wait 1 second before being marked as completed, but they were all yielding and waiting at the same time.
I think you are looking for System.Threading.Tasks.Parallel to do things at the same time. async/await is useful for yielding threads.
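For example, a minimal sketch (Parallel.Invoke runs the delegates on worker threads and blocks until all of them finish, unlike await, which yields):

// Each action may run on a different thread pool thread.
Parallel.Invoke(
    () => Console.WriteLine("A on thread " + Thread.CurrentThread.ManagedThreadId),
    () => Console.WriteLine("B on thread " + Thread.CurrentThread.ManagedThreadId),
    () => Console.WriteLine("C on thread " + Thread.CurrentThread.ManagedThreadId));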
You're starting all your tasks at the same time so they're all going to run in parallel, not in sequence. That's why everything completes after 1000 milliseconds.
Additionally, async doesn't create new threads; it uses the current thread asynchronously. You can see this kind of behaviour in async JavaScript (which is a single-threaded environment) or coroutines in Unity3D. Both allow async behaviour without threads.
So each of your tasks is being run on the same thread and completes in 1 second.
I have a ConcurrentBag urls whose items are being processed in parallel (nothing is being written back to the collection):
urls.AsParallel<UrlInfo>().WithDegreeOfParallelism(17).ForAll(item =>
{
    UrlInfo info = MakeSynchronousWebRequest(item);
    (myProgress as IProgress<UrlInfo>).Report(info);
});
I have the timeout set to 30 seconds in the web request. When a url that is very slow to respond is encountered, all of the parallel processing grinds to a halt. Is this expected behavior, or should I be searching out some problem in my code?
Here's the progress handler:
myProgress = new Progress<UrlInfo>(info =>
{
    Action action = () =>
    {
        Interlocked.Increment(ref itested);
        if (info.status == UrlInfo.UrlStatusCode.dead)
        {
            Interlocked.Increment(ref idead);
            this.BadUrls.Add(info);
        }
        dead.Content = idead.ToString();
        tested.Content = itested.ToString();
    };
    try
    {
        Dispatcher.BeginInvoke(action);
    }
    catch (Exception ex)
    {
    }
});
It's the expected behavior. The AsParallel().ForAll(...) call doesn't return until all the operations are finished. Since you're making synchronous requests, you've got to wait until your slowest one is finished. However, note that even while one really slow task is hogging a thread, the scheduler continues to schedule new tasks on the remaining threads as old ones finish.
Here's a rather instructive example. It creates 101 tasks. The first task hogs one thread for 5000 ms; the other 100 churn on the remaining 20 threads for 1000 ms each. So it schedules 20 of those tasks, they run for one second each, and it goes through that cycle 5 times to get through all 100 tasks, for a total of 5000 ms. However, if you change the 101 to 102, you've got 101 tasks churning on the 20 threads, which ends up taking 6000 ms; that 101st task just doesn't have a thread to churn on until the 5-second mark. If you change the 101 to, say, 2, note that it still takes 5000 ms, because you have to wait for the slow task to complete.
static void Main()
{
    ThreadPool.SetMinThreads(21, 21);
    var sw = new Stopwatch();
    sw.Start();
    Enumerable.Range(0, 101).AsParallel().WithDegreeOfParallelism(21).ForAll(i => Thread.Sleep(i == 0 ? 5000 : 1000));
    Console.WriteLine(sw.ElapsedMilliseconds);
}
Is there any chance that multiple BackgroundWorkers perform better than Tasks on 5-second running processes? I remember reading in a book that a Task is designed for short-running processes.
The reason I ask is this:
I have a process that takes 5 seconds to complete, and there are 4000 processes to complete. At first I did:
for (int i = 0; i < 4000; i++)
{
    Task.Factory.StartNew(action);
}
and this had poor performance (after the first minute, only 3-4 tasks had completed, and the console application had 35 threads). Maybe this was stupid, but I thought the thread pool would handle this kind of situation (put all actions in a queue and, when a thread is free, take an action and execute it).
The second step was to manually create Environment.ProcessorCount background workers and put all the actions into a ConcurrentQueue. So the code looks something like this:
var workers = new List<BackgroundWorker>();
//initialize workers
workers.ForEach((bk) =>
{
    bk.DoWork += (s, e) =>
    {
        while (toDoActions.Count > 0)
        {
            Action a;
            if (toDoActions.TryDequeue(out a))
            {
                a();
            }
        }
    };
    bk.RunWorkerAsync();
});
This performed way better. It performed much better than the Tasks, even when I had 30 background workers (with as many work items as in the first case).
Later edit:
I start the Tasks like this:
public static Task IndexFile(string file)
{
    Action<object> indexAction = new Action<object>((f) =>
    {
        Index((string)f);
    });
    return Task.Factory.StartNew(indexAction, file);
}
And the Index method is this one:
private static void Index(string file)
{
    AudioDetectionServiceReference.AudioDetectionServiceClient client = new AudioDetectionServiceReference.AudioDetectionServiceClient();
    client.IndexCompleted += (s, e) =>
    {
        if (e.Error != null)
        {
            if (FileError != null)
            {
                FileError(client,
                    new FileIndexErrorEventArgs((string)e.UserState, e.Error));
            }
        }
        else
        {
            if (FileIndexed != null)
            {
                FileIndexed(client, new FileIndexedEventArgs((string)e.UserState));
            }
        }
    };
    using (IAudio proxy = new BassProxy())
    {
        List<int> max = new List<int>();
        if (proxy.ReadFFTData(file, out max))
        {
            // trim leading and trailing zeros before sending
            while (max.Count > 0 && max.First() == 0)
            {
                max.RemoveAt(0);
            }
            while (max.Count > 0 && max.Last() == 0)
            {
                max.RemoveAt(max.Count - 1);
            }
            client.IndexAsync(max.ToArray(), file, file);
        }
        else
        {
            throw new CouldNotIndexException(file, "The audio proxy did not return any data for this file.");
        }
    }
}
This method reads some data from an mp3 file using the Bass.net library; that data is then sent to a WCF service via the async method.
The IndexFile(string file) method, which creates the tasks, is called 4000 times in a for loop.
The two events, FileIndexed and FileError, are not handled, so they are never raised.
The reason the performance of the Tasks was so poor is that you queued too many small tasks (4000). Remember that the CPU needs to schedule the tasks as well, so queuing lots of short-lived tasks puts extra load on the CPU. More information can be found in the second paragraph of the TPL documentation:
Starting with the .NET Framework 4, the TPL is the preferred way to write multithreaded and parallel code. However, not all code is suitable for parallelization; for example, if a loop performs only a small amount of work on each iteration, or it doesn't run for many iterations, then the overhead of parallelization can cause the code to run more slowly.
When you used the background workers, you limited the number of live threads to ProcessorCount, which removed a lot of scheduling overhead.
Given that you have a strictly defined list of things to do, I'd use the Parallel class (either For or ForEach, depending on what suits you better). Furthermore, you can pass a configuration parameter to either of these methods to control how many tasks actually run at the same time:
System.Threading.Tasks.Parallel.For(0, 20000, new ParallelOptions() { MaxDegreeOfParallelism = 5 }, i =>
{
    //do something
});
The above code will perform 20000 operations, but will NOT perform more than 5 operations at the same time.
I SUSPECT the reason the background workers did better for you is that you created and started them once at the beginning, while in your sample Task code you create a new Task object for every operation.
Alternatively, have you thought about using a fixed number of Task objects instantiated at the start, performing a similar action with a ConcurrentQueue as you did with the background workers? That should also prove quite efficient; a sketch follows below.
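A minimal sketch of that alternative, assuming the toDoActions queue from your background-worker version (pinning the worker count to Environment.ProcessorCount is an illustrative choice):

var workers = new Task[Environment.ProcessorCount];
for (int i = 0; i < workers.Length; i++)
{
    // LongRunning hints the scheduler to give each worker a dedicated thread.
    workers[i] = Task.Factory.StartNew(() =>
    {
        Action a;
        while (toDoActions.TryDequeue(out a))
        {
            a(); // drain the shared queue until it is empty
        }
    }, TaskCreationOptions.LongRunning);
}
Task.WaitAll(workers);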
Have you considered using threadpool?
http://msdn.microsoft.com/en-us/library/system.threading.threadpool.aspx
If your performance is slower when using threads, it is most likely due to threading overhead (allocating and destroying individual threads).