Creating Tasks dynamically and waiting for completion (C#)

In my C# project I have to open a bunch of images.
Let's say we need to open 50. My plan is to create 10 Tasks, do some stuff, and then wait for each to complete before the next 10 Tasks are created.
var fd = new OpenFileDialog
{
    Multiselect = true,
    Title = "Open Image",
    Filter = "Image|*.jpg"
};
using (fd)
{
    if (fd.ShowDialog() == DialogResult.OK)
    {
        int i = 1;
        foreach (String file in fd.FileNames)
        {
            if (i <= 10)
            {
                i++;
                Console.WriteLine(i + ";" + file);
                Task task = new Task(() =>
                {
                    // do some stuff
                });
                task.Start();
            }
            else
            {
                Task.WaitAll();
                i = 1;
            }
        }
    }
}
Console.WriteLine("Wait for Tasks");
Task.WaitAll();
Console.WriteLine("Waited");
The code does not wait when i reaches 10, and it does not wait at the end either.
Does anyone have an idea how to fix it?

Task.WaitAll expects an array of tasks to wait on; you never pass anything in. The following change will wait for all the tasks you start.
List<Task> tasksToWait = new List<Task>();
foreach (String file in fd.FileNames)
{
    if (i <= 10)
    {
        i++;
        Console.WriteLine(i + ";" + file);
        Task task = new Task(() =>
        {
            // do some stuff
        });
        task.Start();
        tasksToWait.Add(task);
    }
    else
    {
        Task.WaitAll(tasksToWait.ToArray());
        tasksToWait.Clear();
        i = 1;
    }
}
This is a fragment of your code above with the relevant changes.
Note: this answer does not critique your choice of design or its possible pitfalls.
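For comparison, the same batching idea can also be written without the counter, by collecting the started tasks into a list and waiting every 10 files. This is only a sketch, and ProcessFile stands in for whatever "do some stuff" does per file:
var batch = new List<Task>();
foreach (String file in fd.FileNames)
{
    string current = file;                            // capture the loop variable for the lambda
    batch.Add(Task.Run(() => ProcessFile(current)));  // ProcessFile is a placeholder for the real work
    if (batch.Count == 10)
    {
        Task.WaitAll(batch.ToArray());                // wait for the current batch of 10
        batch.Clear();
    }
}
Task.WaitAll(batch.ToArray());                        // wait for whatever is left in the last batch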

Related

Stop foreach in asynchronous task on key press passing a cancellation token

I'm running a foreach and I would like to cancel its execution on a key press.
While I succeeded by putting a single if (keypress) check inside the loop, I'm now trying to achieve the same with a CancellationToken while a task listens for a key stroke.
var ts = new CancellationTokenSource();
CancellationToken ct = ts.Token;
Task.Factory.StartNew(() =>
{
    while (true)
    {
        foreach (var station in stations)
        {
            /*if (Console.KeyAvailable)
            {
                break;
            }*/
            Console.WriteLine(station.name + " ");
            Thread.Sleep(100);
        }
        Thread.Sleep(100);
        if (ct.IsCancellationRequested)
        {
            // another thread decided to cancel
            Console.WriteLine("task canceled");
            break;
        }
    }
}, ct);
ts.Cancel();
Console.ReadLine();
I came from this answer, How do I abort/cancel TPL Tasks?, which helped me a lot.
However, while it works without the foreach, right now the foreach has to finish before the task is cancelled.
It looks like the iteration has to complete before the cancellation check is reached, and what I don't understand is how I can make the foreach stop.
Cancellation is cooperative.
You need to check inside the foreach, and at any other appropriate point in your routines, whether cancellation has been requested.
If it has, you can exit the foreach early (if your logic allows it), making sure you clean up any resources and complete any actions that must be completed.
Also, instead of Thread.Sleep(100), I suggest using await Task.Delay(100);.
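A rough sketch of both suggestions combined (checking the token inside the foreach and replacing Thread.Sleep with await Task.Delay), assuming stations comes from your surrounding code:
var ts = new CancellationTokenSource();
CancellationToken ct = ts.Token;
var loop = Task.Run(async () =>
{
    while (!ct.IsCancellationRequested)
    {
        foreach (var station in stations)
        {
            if (ct.IsCancellationRequested)
                break;                     // leave the foreach as soon as cancellation is requested
            Console.WriteLine(station.name + " ");
            await Task.Delay(100);         // yields the thread instead of blocking it
        }
    }
    Console.WriteLine("task canceled");
});
// later, e.g. on a key press:
ts.Cancel();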
It seems as though it's as simple as repeating your if:
var ts = new CancellationTokenSource();
CancellationToken ct = ts.Token;
Task.Factory.StartNew(() =>
{
    while (true)
    {
        foreach (var station in stations)
        {
            /*if (Console.KeyAvailable)
            {
                break;
            }*/
            if (ct.IsCancellationRequested)
            {
                // another thread decided to cancel
                break;
            }
            Console.WriteLine(station.name + " ");
            Thread.Sleep(100);
        }
        Thread.Sleep(100);
        if (ct.IsCancellationRequested)
        {
            // another thread decided to cancel
            Console.WriteLine("task canceled");
            break;
        }
    }
}, ct);
ts.Cancel();
Console.ReadLine();
As I commented before, I think you're trying to achieve something like this:
public class Program
{
    static ConcurrentQueue<int> queue = new(); // >= .NET 5 / C# 9 only

    public static void Main(string[] args)
    {
        var ts = new CancellationTokenSource();
        var rand = new Random();
        Task activeTask = null;
        while (true)
        {
            var keyPressed = Console.ReadKey(true).Key;
            if (keyPressed == ConsoleKey.Q)
            {
                for (int i = 0; i < 1000; i++)
                    queue.Enqueue(rand.Next(0, 1000));
                Console.WriteLine("Random elements enqueued");
            }
            if (keyPressed == ConsoleKey.D)
            {
                if (activeTask == null)
                {
                    ts.Dispose();
                    ts = new CancellationTokenSource();
                    CancellationToken ct = ts.Token;
                    activeTask = BackgroundLoopAction(ct);
                }
                else
                    Console.WriteLine("Background loop task already running");
            }
            if (keyPressed == ConsoleKey.X)
            {
                ts.Cancel();
                activeTask = null;
            }
            if (keyPressed == ConsoleKey.Escape)
                break;
        }
    }

    private static Task BackgroundLoopAction(CancellationToken ct)
    {
        return Task.Run(() =>
        {
            while (queue.Count > 0)
            {
                queue.TryDequeue(out int q);
                Console.WriteLine($"Dequeued element: {q}");
                Thread.Sleep(100);
                if (ct.IsCancellationRequested)
                    break;
            }
            Console.WriteLine(ct.IsCancellationRequested ? "Task canceled by user event" : "Task completed");
        }, ct);
    }
}

Why doesn't my workers' work distribution count total the number of produced items in this System.Threading.Channels sample?

Following this post, I have been playing with System.Threading.Channels to get confident enough to use it in my production code, replacing the Threads/Monitor.Pulse/Wait based approach I currently use (described in the referenced post).
Basically, I created a sample with a bounded channel where I run a couple of producer tasks at the beginning and, without waiting, start my consumer tasks, which begin pulling elements from the channel.
After waiting for the producer tasks to complete, I signal the channel as complete so the consumer tasks can stop listening for new channel elements.
My channel is a Channel<Action>, and in each action I increment the count for the given worker in the WorkDistribution concurrent dictionary; at the end of the sample I print it so I can check that I consumed as many items as I expected, and also see how the channel distributed the actions between the consumers.
For some reason this "Work Distribution footer" does not print the same number of items as the total produced by the producer tasks.
What am I missing?
Some of the variables present were added for the sole purpose of helping troubleshoot.
Here's the full code:
public class ChannelSolution
{
object LockObject = new object();
Channel<Action<string>> channel;
int ItemsToProduce;
int WorkersCount;
int TotalItemsProduced;
ConcurrentDictionary<string, int> WorkDistribution;
CancellationToken Ct;
public ChannelSolution(int workersCount, int itemsToProduce, int maxAllowedItems,
CancellationToken ct)
{
WorkersCount = workersCount;
ItemsToProduce = itemsToProduce;
channel = Channel.CreateBounded<Action<string>>(maxAllowedItems);
Console.WriteLine($"Created channel with max {maxAllowedItems} items");
WorkDistribution = new ConcurrentDictionary<string, int>();
Ct = ct;
}
async Task ProduceItems(int cycle)
{
for (var i = 0; i < ItemsToProduce; i++)
{
var index = i + 1 + (ItemsToProduce * cycle);
bool queueHasRoom;
var stopwatch = new Stopwatch();
stopwatch.Start();
do
{
if (Ct.IsCancellationRequested)
{
Console.WriteLine("exiting read loop - cancellation requested !");
break;
}
queueHasRoom = await channel.Writer.WaitToWriteAsync();
if (!queueHasRoom)
{
if (Ct.IsCancellationRequested)
{
Console.WriteLine("exiting read loop - cancellation"
+ " requested !");
break;
}
if (stopwatch.Elapsed.Seconds % 3 == 0)
Console.WriteLine("Channel reached maximum capacity..."
+ " producer waiting for items to be freed...");
}
}
while (!queueHasRoom);
channel.Writer.TryWrite((workerName) => action($"A{index}", workerName));
Console.WriteLine($"Channel has room, item {index} added"
+ $" - channel items count: [{channel.Reader.Count}]");
Interlocked.Increment(ref TotalItemsProduced);
}
}
List<Task> GetConsumers()
{
var tasks = new List<Task>();
for (var i = 0; i < WorkersCount; i++)
{
var workerName = $"W{(i + 1).ToString("00")}";
tasks.Add(Task.Run(async () =>
{
while (await channel.Reader.WaitToReadAsync())
{
if (Ct.IsCancellationRequested)
{
Console.WriteLine("exiting write loop - cancellation"
+ "requested !");
break;
}
if (channel.Reader.TryRead(out var action))
{
Console.WriteLine($"dequed action in worker [{workerName}]");
action(workerName);
}
}
}));
}
return tasks;
}
void action(string actionNumber, string workerName)
{
Console.WriteLine($"processing {actionNumber} in worker {workerName}...");
var secondsToWait = new Random().Next(2, 5);
Thread.Sleep(TimeSpan.FromSeconds(secondsToWait));
Console.WriteLine($"action {actionNumber} completed by worker {workerName}"
+ $" after {secondsToWait} secs! channel items left:"
+ $" [{channel.Reader.Count}]");
if (WorkDistribution.ContainsKey(workerName))
{
lock (LockObject)
{
WorkDistribution[workerName]++;
}
}
else
{
var succeeded = WorkDistribution.TryAdd(workerName, 1);
if (!succeeded)
{
Console.WriteLine($"!!! failed incremeting dic value !!!");
}
}
}
public void Summarize(Stopwatch stopwatch)
{
Console.WriteLine("--------------------------- Thread Work Distribution "
+ "------------------------");
foreach (var kv in this.WorkDistribution)
Console.WriteLine($"thread: {kv.Key} items consumed: {kv.Value}");
Console.WriteLine($"Total actions consumed: "
+ $"{WorkDistribution.Sum(w => w.Value)} - Elapsed time: "
+ $"{stopwatch.Elapsed.Seconds} secs");
}
public void Run(int producerCycles)
{
var stopwatch = new Stopwatch();
stopwatch.Start();
var producerTasks = new List<Task>();
Console.WriteLine($"Started running at {DateTime.Now}...");
for (var i = 0; i < producerCycles; i++)
{
producerTasks.Add(ProduceItems(i));
}
var consumerTasks = GetConsumers();
Task.WaitAll(producerTasks.ToArray());
Console.WriteLine($"-------------- Completed waiting for PRODUCERS -"
+ " total items produced: [{TotalItemsProduced}] ------------------");
channel.Writer.Complete(); //just so I can complete this demo
Task.WaitAll(consumerTasks.ToArray());
Console.WriteLine("----------------- Completed waiting for CONSUMERS "
+ "------------------");
//Task.WaitAll(GetConsumers().Union(producerTasks/*.Union(
// new List<Task> { taskKey })*/).ToArray());
//Console.WriteLine("Completed waiting for tasks");
Summarize(stopwatch);
}
}
And here is the calling code in Program.cs
var workersCount = 5;
var itemsToProduce = 10;
var maxItemsInQueue = 5;
var cts = new CancellationTokenSource();
var channelSolution = new ChannelSolution(workersCount, itemsToProduce,
maxItemsInQueue, cts.Token);
channelSolution.Run(2);
From a quick look there is a race condition in the ProduceItems method, around the queueHasRoom variable. You don't need this variable. The channel.Writer.TryWrite method will tell you whether there is room in the channel's buffer or not. Alternatively you could simply await the WriteAsync method, instead of using the WaitToWriteAsync/TryWrite combo. AFAIK this combo is intended as a performance optimization of the former method. If you absolutely need to know whether there is available space before attempting to post a value, then the Channel<T> is probably not a suitable container for your use case. You'll need to find something that can be locked during the whole operation of "check-for-available-space -> create-the-value -> post-the-value", so that this operation can be made atomic.
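A minimal sketch of the WriteAsync-based producer loop, assuming the rest of the class stays as posted (action, ItemsToProduce, TotalItemsProduced and Ct are the existing members):
async Task ProduceItems(int cycle)
{
    for (var i = 0; i < ItemsToProduce; i++)
    {
        var index = i + 1 + (ItemsToProduce * cycle);
        try
        {
            // WriteAsync waits until the bounded channel has room and then writes the item.
            await channel.Writer.WriteAsync(
                (workerName) => action($"A{index}", workerName), Ct);
            Interlocked.Increment(ref TotalItemsProduced);
        }
        catch (OperationCanceledException)
        {
            Console.WriteLine("exiting produce loop - cancellation requested!");
            break;
        }
    }
}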
As a side note, using a lock to protect the updating of the ConcurrentDictionary is redundant. The ConcurrentDictionary offers the AddOrUpdate method, which can atomically replace a value it contains with another value. You might have to lock if the dictionary contained mutable objects and you needed to mutate those objects in a thread-safe way. But in your case the values are of type Int32, which is an immutable struct. You don't change it; you just replace it with a new Int32 that is created from the existing value:
WorkDistribution.AddOrUpdate(workerName, 1, (_, existing) => existing + 1);

Is it possible to limit the number of web requests per second?

Hi, I am spidering a site and reading its contents. I want to keep the request rate reasonable; up to approximately 10 requests per second should probably be OK. Currently it is about 5k requests per minute, and it is raising security alarms because it looks like bot activity.
How can I do this? Here is my code:
protected void Iterareitems(List<Item> items)
{
    foreach (var item in items)
    {
        GetImagesfromItem(item);
        if (item.HasChildren)
        {
            Iterareitems(item.Children.ToList());
        }
    }
}

protected void GetImagesfromItem(Item childitems)
{
    var document = new HtmlWeb().Load(completeurl);
    var urls = document.DocumentNode.Descendants("img")
        .Select(e => e.GetAttributeValue("src", null))
        .Where(s => !string.IsNullOrEmpty(s)).ToList();
}
You can use System.Threading.Semaphore to control the maximum number of concurrent threads/tasks. Here is an example:
var maxThreads = 3;
var semaphore = new Semaphore(maxThreads, maxThreads);
for (int i = 0; i < 10; i++) // 10 tasks in total
{
    var j = i;
    Task.Factory.StartNew(() =>
    {
        semaphore.WaitOne();
        Console.WriteLine("start " + j.ToString());
        Thread.Sleep(1000);
        Console.WriteLine("end " + j.ToString());
        semaphore.Release();
    });
}
You can see that at most 3 tasks are running; the others are blocked in semaphore.WaitOne() because the maximum limit has been reached, and a pending thread continues once another thread releases the semaphore with semaphore.Release().
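Applied to your crawl, a rough sketch (it assumes GetImagesfromItem can safely run concurrently, and requestSlots is a new field introduced here for illustration) would gate each page load with the semaphore:
private readonly Semaphore requestSlots = new Semaphore(3, 3); // at most 3 pages loading at once

protected void Iterareitems(List<Item> items)
{
    foreach (var item in items)
    {
        var current = item; // capture for the lambda
        Task.Factory.StartNew(() =>
        {
            requestSlots.WaitOne();          // block until a slot is free
            try
            {
                GetImagesfromItem(current);
            }
            finally
            {
                requestSlots.Release();      // always give the slot back
            }
        });
        if (current.HasChildren)
        {
            Iterareitems(current.Children.ToList());
        }
    }
}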

Parallel.ForEach to get directories (C#)

Here is what I've done so far. I don't know if this is the best way to use Parallel.ForEach, because sometimes it crashes and sometimes it doesn't. Can you tell me what I'm doing wrong, or what I can improve in this code?
I also have a problem with the Stopwatch: it doesn't display correctly in my textbox and always stops right after the list of directories finishes.
private async void omplirParallel()
{
    Stopwatch clock = new Stopwatch();
    clock.Restart();
    int contador = 0;
    DirectoryInfo nodeDir = new DirectoryInfo(@"c:\files");
    Parallel.ForEach(nodeDir.GetDirectories(), async dir =>
    {
        foreach (string s in Directory.GetFiles(dir.FullName))
        {
            Invoke(new MethodInvoker(delegate { lbxParallel.Items.Add(s); }));
            contador++;
            await Task.Delay(1);
        }
    });
    await Task.Delay(1);
    clock.Stop();
    tbTimerParallel.Text = clock.Elapsed.TotalSeconds.ToString() + " segons";
    tbcontadorParallel.Text = contador + " arxius";
}
EDIT
This is my foreach without Parallel; what I tried was to adapt this code to Parallel.ForEach.
Stopwatch stopWatch = new Stopwatch();
foreach (string d in Directory.GetDirectories(@"C:\files"))
{
    foreach (string s in Directory.GetFiles(d))
    {
        stopWatch.Start();
        listBox1.Items.Add(s);
        await Task.Delay(1);
        btIniciar1.Enabled = false;
    }
}
btIniciar1.Enabled = true;
stopWatch.Stop();
TimeSpan ts = stopWatch.Elapsed;
textBox1.Text = ts.ToString("mm\\:ss\\.ff") + (" minuts");
Parallel loops or threads may crash when they access a common (shared) object, so you have to make sure there is no unsynchronized shared object: if two of them try to write to, edit, or delete the shared object at the same time it can crash, and if they don't, it won't.
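In the code above, the ListBox updated through Invoke and the contador counter are the shared objects touched from several threads at once, on top of the async lambda that Parallel.ForEach does not await. A minimal sketch that avoids the sharing, assuming the goal is just to list the files and time the work (it needs using System.Collections.Concurrent and runs inside the async omplirParallel method), collects the results in a thread-safe collection and touches the UI only once at the end:
var files = new ConcurrentBag<string>();           // thread-safe, no locking needed
int contador = 0;
var clock = Stopwatch.StartNew();

await Task.Run(() =>
{
    Parallel.ForEach(new DirectoryInfo(@"c:\files").GetDirectories(), dir =>
    {
        foreach (string s in Directory.GetFiles(dir.FullName))
        {
            files.Add(s);
            Interlocked.Increment(ref contador);   // safe shared counter
        }
    });
});

clock.Stop();
lbxParallel.Items.AddRange(files.ToArray());       // single UI update, on the UI thread
tbTimerParallel.Text = clock.Elapsed.TotalSeconds + " segons";
tbcontadorParallel.Text = contador + " arxius";
Because the UI and the Stopwatch are only touched after the parallel work completes, the timer also measures the whole run instead of stopping early.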

Use Task.Run instead of Delegate.BeginInvoke

I have recently upgraded my projects to ASP.NET 4.5 and I have been waiting a long time to use 4.5's asynchronous capabilities. After reading the documentation I'm not sure whether I can improve my code at all.
I want to execute a task asynchronously and then forget about it. The way that I'm currently doing this is by creating delegates and then using BeginInvoke.
Here's one of the filters in my project which creates an audit in our database every time a user accesses a resource that must be audited:
public override void OnActionExecuting(ActionExecutingContext filterContext)
{
    var request = filterContext.HttpContext.Request;
    var id = WebSecurity.CurrentUserId;
    var invoker = new MethodInvoker(delegate
    {
        var audit = new Audit
        {
            Id = Guid.NewGuid(),
            IPAddress = request.UserHostAddress,
            UserId = id,
            Resource = request.RawUrl,
            Timestamp = DateTime.UtcNow
        };
        var database = (new NinjectBinder()).Kernel.Get<IDatabaseWorker>();
        database.Audits.InsertOrUpdate(audit);
        database.Save();
    });
    invoker.BeginInvoke(StopAsynchronousMethod, invoker);
    base.OnActionExecuting(filterContext);
}
But in order to finish this asynchronous task, I need to always define a callback, which looks like this:
public void StopAsynchronousMethod(IAsyncResult result)
{
    var state = (MethodInvoker)result.AsyncState;
    try
    {
        state.EndInvoke(result);
    }
    catch (Exception e)
    {
        var username = WebSecurity.CurrentUserName;
        Debugging.DispatchExceptionEmail(e, username);
    }
}
I would rather not use the callback at all, since I do not need a result from the task I am invoking asynchronously.
How can I improve this code with Task.Run() (or async and await)?
If I understood your requirements correctly, you want to kick off a task and then forget about it. When the task completes, and if an exception occurred, you want to log it.
I'd use Task.Run to create a task, followed by ContinueWith to attach a continuation task. This continuation task will log any exception that was thrown from the parent task. Also, use TaskContinuationOptions.OnlyOnFaulted to make sure the continuation only runs if an exception occurred.
Task.Run(() =>
{
    var audit = new Audit
    {
        Id = Guid.NewGuid(),
        IPAddress = request.UserHostAddress,
        UserId = id,
        Resource = request.RawUrl,
        Timestamp = DateTime.UtcNow
    };
    var database = (new NinjectBinder()).Kernel.Get<IDatabaseWorker>();
    database.Audits.InsertOrUpdate(audit);
    database.Save();
}).ContinueWith(task =>
{
    task.Exception.Handle(ex =>
    {
        var username = WebSecurity.CurrentUserName;
        Debugging.DispatchExceptionEmail(ex, username);
        return true; // mark the exception as handled
    });
}, TaskContinuationOptions.OnlyOnFaulted);
As a side-note, background tasks and fire-and-forget scenarios in ASP.NET are highly discouraged. See The Dangers of Implementing Recurring Background Tasks In ASP.NET
It may sound a bit out of scope, but if you just want to fire and forget after you launch it, why not use the ThreadPool directly?
Something like:
ThreadPool.QueueUserWorkItem(
    x =>
    {
        try
        {
            // Do something
            ...
        }
        catch (Exception e)
        {
            // Log something
            ...
        }
    });
I had to do some performance benchmarking of different async call methods, and I found that (not surprisingly) the ThreadPool performs much better, but also that BeginInvoke is not that bad (I am on .NET 4.5). These are the results I got with the code at the end of this post. I did not find a comparison like this online, so I took the time to check it myself. Each call is not exactly identical, but they are more or less functionally equivalent in terms of what they do:
ThreadPool: 70.80ms
Task: 90.88ms
BeginInvoke: 121.88ms
Thread: 4657.52ms
public class Program
{
public delegate void ThisDoesSomething();
// Perform a very simple operation to see the overhead of
// different async calls types.
public static void Main(string[] args)
{
const int repetitions = 25;
const int calls = 1000;
var results = new List<Tuple<string, double>>();
Console.WriteLine(
"{0} parallel calls, {1} repetitions for better statistics\n",
calls,
repetitions);
// Threads
Console.Write("Running Threads");
results.Add(new Tuple<string, double>("Threads", RunOnThreads(repetitions, calls)));
Console.WriteLine();
// BeginInvoke
Console.Write("Running BeginInvoke");
results.Add(new Tuple<string, double>("BeginInvoke", RunOnBeginInvoke(repetitions, calls)));
Console.WriteLine();
// Tasks
Console.Write("Running Tasks");
results.Add(new Tuple<string, double>("Tasks", RunOnTasks(repetitions, calls)));
Console.WriteLine();
// Thread Pool
Console.Write("Running Thread pool");
results.Add(new Tuple<string, double>("ThreadPool", RunOnThreadPool(repetitions, calls)));
Console.WriteLine();
Console.WriteLine();
// Show results
results = results.OrderBy(rs => rs.Item2).ToList();
foreach (var result in results)
{
Console.WriteLine(
"{0}: Done in {1}ms avg",
result.Item1,
(result.Item2 / repetitions).ToString("0.00"));
}
Console.WriteLine("Press a key to exit");
Console.ReadKey();
}
/// <summary>
/// The do stuff.
/// </summary>
public static void DoStuff()
{
Console.Write("*");
}
public static double RunOnThreads(int repetitions, int calls)
{
var totalMs = 0.0;
for (var j = 0; j < repetitions; j++)
{
Console.Write(".");
var toProcess = calls;
var stopwatch = new Stopwatch();
var resetEvent = new ManualResetEvent(false);
var threadList = new List<Thread>();
for (var i = 0; i < calls; i++)
{
threadList.Add(new Thread(() =>
{
// Do something
DoStuff();
// Safely decrement the counter
if (Interlocked.Decrement(ref toProcess) == 0)
{
resetEvent.Set();
}
}));
}
stopwatch.Start();
foreach (var thread in threadList)
{
thread.Start();
}
resetEvent.WaitOne();
stopwatch.Stop();
totalMs += stopwatch.ElapsedMilliseconds;
}
return totalMs;
}
public static double RunOnThreadPool(int repetitions, int calls)
{
var totalMs = 0.0;
for (var j = 0; j < repetitions; j++)
{
Console.Write(".");
var toProcess = calls;
var resetEvent = new ManualResetEvent(false);
var stopwatch = new Stopwatch();
var list = new List<int>();
for (var i = 0; i < calls; i++)
{
list.Add(i);
}
stopwatch.Start();
for (var i = 0; i < calls; i++)
{
ThreadPool.QueueUserWorkItem(
x =>
{
// Do something
DoStuff();
// Safely decrement the counter
if (Interlocked.Decrement(ref toProcess) == 0)
{
resetEvent.Set();
}
},
list[i]);
}
resetEvent.WaitOne();
stopwatch.Stop();
totalMs += stopwatch.ElapsedMilliseconds;
}
return totalMs;
}
public static double RunOnBeginInvoke(int repetitions, int calls)
{
var totalMs = 0.0;
for (var j = 0; j < repetitions; j++)
{
Console.Write(".");
var beginInvokeStopwatch = new Stopwatch();
var delegateList = new List<ThisDoesSomething>();
var resultsList = new List<IAsyncResult>();
for (var i = 0; i < calls; i++)
{
delegateList.Add(DoStuff);
}
beginInvokeStopwatch.Start();
foreach (var delegateToCall in delegateList)
{
resultsList.Add(delegateToCall.BeginInvoke(null, null));
}
// We lose a bit of accuracy, but if the loop is big enough,
// it should not really matter
while (resultsList.Any(rs => !rs.IsCompleted))
{
Thread.Sleep(10);
}
beginInvokeStopwatch.Stop();
totalMs += beginInvokeStopwatch.ElapsedMilliseconds;
}
return totalMs;
}
public static double RunOnTasks(int repetitions, int calls)
{
var totalMs = 0.0;
for (var j = 0; j < repetitions; j++)
{
Console.Write(".");
var resultsList = new List<Task>();
var stopwatch = new Stopwatch();
stopwatch.Start();
for (var i = 0; i < calls; i++)
{
resultsList.Add(Task.Factory.StartNew(DoStuff));
}
// We lose a bit of accuracy, but if the loop is big enough,
// it should not really matter
while (resultsList.Any(task => !task.IsCompleted))
{
Thread.Sleep(10);
}
stopwatch.Stop();
totalMs += stopwatch.ElapsedMilliseconds;
}
return totalMs;
}
}
Here's one of the filters in my project which creates an audit in our database every time a user accesses a resource that must be audited
Auditing is certainly not something I would call "fire and forget". Remember, on ASP.NET, "fire and forget" means "I don't care whether this code actually executes or not". So, if your desired semantics are that audits may occasionally be missing, then (and only then) you can use fire and forget for your audits.
If you want to ensure your audits are all correct, then either wait for the audit save to complete before sending the response, or queue the audit information to reliable storage (e.g., Azure queue or MSMQ) and have an independent backend (e.g., Azure worker role or Win32 service) process the audits in that queue.
But if you want to live dangerously (accepting that occasionally audits may be missing), you can mitigate the problems by registering the work with the ASP.NET runtime. Using the BackgroundTaskManager from my blog:
public override void OnActionExecuting(ActionExecutingContext filterContext)
{
    var request = filterContext.HttpContext.Request;
    var id = WebSecurity.CurrentUserId;
    BackgroundTaskManager.Run(() =>
    {
        try
        {
            var audit = new Audit
            {
                Id = Guid.NewGuid(),
                IPAddress = request.UserHostAddress,
                UserId = id,
                Resource = request.RawUrl,
                Timestamp = DateTime.UtcNow
            };
            var database = (new NinjectBinder()).Kernel.Get<IDatabaseWorker>();
            database.Audits.InsertOrUpdate(audit);
            database.Save();
        }
        catch (Exception e)
        {
            var username = WebSecurity.CurrentUserName;
            Debugging.DispatchExceptionEmail(e, username);
        }
    });
    base.OnActionExecuting(filterContext);
}
