I'm trying to implement a multiple-download function that fires off and downloads around 10 files at the same time.
I also implemented the IProgress interface so I can track the progress of each download.
What I need now is a simple counter inside each one to say: this is download #1, this is download #2, etc.
I can't use a normal counter, since they may all run at the same time and update the counter before I use/store the value, leaving me with a wrong value or the same value for several of them.
I've looked into the Interlocked class, but I just can't find a suitable implementation that would store a specific number for each async function that is triggered.
I'm using the number to store the progress in an array, so I'll just call something like:
IProgress<double> progressHandler = new Progress<double>(p => HandleUnitProgressBar(p, downloadIndex));
and let the handler store the progress in the designated cell of the array.
Can anyone point me in the right direction?
Edit #1: Adding code I'm trying to use:
for (int i = 0; i < _downloadList.Count; i++)
{
    var url = _downloadList.ToArray()[i];
    Task.Factory.StartNew
    (
        async () =>
        {
            try
            {
                MegaApiClient client = new MegaApiClient();
                //client.LoginAnonymous();
                downloadIndex = i;
                IProgress<double> progressHandler = new Progress<double>(p => HandleUnitProgressBar(p, downloadIndex));
                await client.DownloadFileAsync(fileLink, url.Value, progressHandler);
            }
            catch (Exception e)
            {
                //will add later
            }
        }
        , CancellationToken.None
        , TaskCreationOptions.None
        , TaskScheduler.Current
    );
}
The problem with this code is that by the time it reaches downloadIndex = i, i is already 4 (for 4 simultaneous downloads), whereas I want to send 0, 1, 2, 3 to the handlers instead of 4 to all of them.
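For what it's worth, a common way around this closure problem is to copy the loop variable into a local declared inside the loop body, so each lambda captures its own value instead of the shared i. A minimal sketch based on the code above (MegaApiClient, fileLink, _downloadList and HandleUnitProgressBar are taken from the question and assumed to exist):

// Copy the loop variable into a per-iteration local so each task
// captures its own index rather than the shared variable `i`.
for (int i = 0; i < _downloadList.Count; i++)
{
    var url = _downloadList.ToArray()[i];
    int downloadIndex = i; // captured per iteration, not shared

    Task.Run(async () =>
    {
        var client = new MegaApiClient();
        IProgress<double> progressHandler =
            new Progress<double>(p => HandleUnitProgressBar(p, downloadIndex));
        await client.DownloadFileAsync(fileLink, url.Value, progressHandler);
    });
}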
I don't know if I fully understand your question, but I'll give it a shot. If I understand correctly, you are asking for an integer to be assigned to each request for tracking purposes, right? I've done this before while doing a bit of stress testing, sending the requests out in waves of 100. The code looked a little like this:
static void Main(string[] args)
{
    var app = new Program();
    var i = 0;
    var j = 0;
    var tasks = new List<Task>();
    while (j < 1000)
    {
        tasks.Add(app.CreateOpp(j));
        Console.WriteLine(i);
        if (i == 100)
        {
            var done = Task.WhenAll(tasks);
            done.Wait();
            i = 0;
            tasks = new List<Task>();
        }
        i++;
        j++;
    }
}
private async Task CreateOpp(int i)
{
    var client = new RestClient("https...");
    var request = new RestRequest(Method.POST);
    request.AddHeader("Authorization", "Bearer Token");
    request.AddHeader("Content-Type", "application/json");
    var response = await client.ExecuteTaskAsync(request);
    Console.WriteLine(i + " Status: " + response.StatusCode);
    Console.WriteLine();
}
After each task completes, instead of having it write to the console like I did on the line that says Console.WriteLine(i + " Status: " + response.StatusCode);, you can probably just increment some value, as sketched below. Hope this helps or at least leads you down a new path!
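If several requests can finish at the same time, incrementing that value safely is easiest with Interlocked. A minimal sketch (the completedCount field is hypothetical and the request-sending part is elided):

// Hypothetical shared counter for completed requests.
private static int completedCount;

private async Task CreateOpp(int i)
{
    // ... build and send the request as shown above ...

    // Atomically bump the shared counter; safe even if many tasks finish at once.
    int doneSoFar = Interlocked.Increment(ref completedCount);
    Console.WriteLine(i + " done, " + doneSoFar + " completed so far");
}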
Related
I am working on a WinForms project where I update a label inside a for loop. I want the label to refresh each time after executing the label.Text statement, but it doesn't update each time; instead it only shows after the for loop has finished.
I tried to achieve this by using Thread.Sleep(), but it didn't work. Please help me.
NOTE: lblProgress is a Label.
Here's my code.
for (int i = 1; i <= sourceTable.Rows.Count - 1; i++)
{
    string checkout;
    checkout = sourceTable.Rows[i].Field<string>(0);
    dest = new SqlConnection(System.Configuration.ConfigurationManager.ConnectionStrings["local"].ConnectionString);
    dest.Open();
    destcmd = new SqlCommand(checkout, dest);
    destcmd.ExecuteNonQuery();
    dest.Close();
    prcmail();
    prcmessagecheck();
    lblProgress.Text = "Hello World" + i;
    Thread.Sleep(10000);
}
Whenever you start a WinForms application, it runs in its own process and a UI thread is created for it. Any updates to the user interface are done on that same UI thread. This means that when your application is doing "busy work" on the UI thread, your UI will be blocked, because they share the same thread. So, in order to achieve what you're trying to achieve, you have to do a little extra work.
The first step is to create a function for your work routine (we could use an anonymous function, but since you are new to C#, I think it'll be easier to understand if we break it out), like this:
private void DoWork()
{
    for (int i = 1; i <= sourceTable.Rows.Count - 1; i++)
    {
        string checkout;
        checkout = sourceTable.Rows[i].Field<string>(0);
        dest = new SqlConnection(System.Configuration.ConfigurationManager.ConnectionStrings["local"].ConnectionString);
        dest.Open();
        destcmd = new SqlCommand(checkout, dest);
        destcmd.ExecuteNonQuery();
        dest.Close();
        prcmail();
        prcmessagecheck();
        lblProgress.Text = "Hello World" + i;
        Thread.Sleep(1000); // I changed this from 10000 to 1000 (10 seconds down to 1 second)
    }
}
Next, we need to create a new thread that executes our DoWork() function. It's unclear what the "trigger" is for doing your work, but I'm going to assume it's a button click:
private void button1_click(object sender, EventArgs e)
{
    var work = new Thread(DoWork);
    work.Start();
}
So now, whenever someone clicks the button, we start a new thread that executes our DoWork function. The new thread spawns, execution returns immediately, and our GUI can now update in real time while our thread executes in the background.
But wait! We still have one more problem to take care of. Windows Forms controls are not thread safe, and if we try to update a control from a thread other than the GUI thread, we will get a cross-thread operation error. The key to fixing this is to use InvokeRequired and Invoke.
First, we need to make another function that does just the label update:
private void SetProgressLabel(int progress)
{
    lblProgress.Text = "Hello World" + progress;
}
In your form class, we also need to create a new delegate:
public partial class Form1 : Form
{
    private delegate void ProgressCallback(int progress);

    // ..
    // The rest of your code
    // ..
}
Finally, change your DoWork() method to something like this:
private void DoWork()
{
    for (int i = 1; i <= sourceTable.Rows.Count - 1; i++)
    {
        string checkout;
        checkout = sourceTable.Rows[i].Field<string>(0);
        dest = new SqlConnection(System.Configuration.ConfigurationManager.ConnectionStrings["local"].ConnectionString);
        dest.Open();
        destcmd = new SqlCommand(checkout, dest);
        destcmd.ExecuteNonQuery();
        dest.Close();
        prcmail();
        prcmessagecheck();
        if (lblProgress.InvokeRequired)
        {
            lblProgress.Invoke(new ProgressCallback(SetProgressLabel), new object[] { i });
        }
        else
        {
            SetProgressLabel(i);
        }
        Thread.Sleep(1000); // I changed this from 10000 to 1000 (10 seconds down to 1 second)
    }
}
This uses the label's InvokeRequired property (inherited from Control) to determine whether an Invoke is required. It returns true or false. If it's false, we can just call our SetProgressLabel() function like we normally would. If it's true, we must use Invoke to call our function instead.
Congratulations! You just made your first thread-safe application.
Now, just as a side note, you are not properly releasing and disposing of your objects. I recommend you change your DoWork() code to something like this:
private void DoWork()
{
    for (int i = 1; i <= sourceTable.Rows.Count - 1; i++)
    {
        string checkout;
        checkout = sourceTable.Rows[i].Field<string>(0);
        using (dest = new SqlConnection(System.Configuration.ConfigurationManager.ConnectionStrings["local"].ConnectionString))
        {
            dest.Open();
            using (destcmd = new SqlCommand(checkout, dest))
            {
                destcmd.ExecuteNonQuery();
                dest.Close();
                prcmail();
                prcmessagecheck();
                if (lblProgress.InvokeRequired)
                {
                    lblProgress.Invoke(new ProgressCallback(SetProgressLabel), new object[] { i });
                }
                else
                {
                    SetProgressLabel(i);
                }
                Thread.Sleep(1000); // I changed this from 10000 to 1000 (10 seconds down to 1 second)
            }
        }
    }
}
Because I wrapped your IDisposables in using blocks, the resources will automatically be disposed of once they go out of scope.
Although threading would be the more ideal solution, another option is:
Application.DoEvents()
this will give the UI thread time to update.
Example
for (int i = 1; i <= sourceTable.Rows.Count - 1; i++)
{
    string checkout;
    checkout = sourceTable.Rows[i].Field<string>(0);
    dest = new SqlConnection(System.Configuration.ConfigurationManager.ConnectionStrings["local"].ConnectionString);
    dest.Open();
    destcmd = new SqlCommand(checkout, dest);
    destcmd.ExecuteNonQuery();
    dest.Close();
    prcmail();
    prcmessagecheck();
    lblProgress.Text = "Hello World" + i;
    Application.DoEvents();
}
var ui = TaskScheduler.FromCurrentSynchronizationContext();
Task.Factory.StartNew(() =>
{
    for (int i = 1; i <= sourceTable.Rows.Count - 1; i++)
    {
        string checkout;
        checkout = sourceTable.Rows[i].Field<string>(0);
        dest = new SqlConnection(System.Configuration.ConfigurationManager.ConnectionStrings["local"].ConnectionString);
        dest.Open();
        destcmd = new SqlCommand(checkout, dest);
        destcmd.ExecuteNonQuery();
        dest.Close();
        prcmail();
        prcmessagecheck();
        var task = Task.Factory.StartNew(() =>
        {
            //Thread.Sleep(1000);
            lblProgress.Text = "Hello World" + i;
        }, CancellationToken.None, TaskCreationOptions.None, ui);
        task.Wait();
    }
});
If you execute the code in question on the UI thread, the UI will be refreshed only after the entire for loop has executed. Based on your needs, a progress bar/background worker kind of setup looks suitable.
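For illustration, here is a rough sketch of that background-worker style approach. It reuses the question's names (lblProgress, sourceTable); Form1_Load and button1_Click are assumed handlers, and the SQL work itself is elided:

// Sketch: run the loop on a BackgroundWorker and report progress back to
// the UI thread via ProgressChanged, which updates the label safely.
private readonly BackgroundWorker worker = new BackgroundWorker { WorkerReportsProgress = true };

private void Form1_Load(object sender, EventArgs e)
{
    worker.DoWork += (s, args) =>
    {
        for (int i = 1; i <= sourceTable.Rows.Count - 1; i++)
        {
            // ... do the SQL work for row i here, off the UI thread ...
            worker.ReportProgress(i); // ProgressChanged will fire on the UI thread
        }
    };
    worker.ProgressChanged += (s, args) =>
    {
        lblProgress.Text = "Hello World" + args.ProgressPercentage;
    };
}

private void button1_Click(object sender, EventArgs e)
{
    if (!worker.IsBusy)
        worker.RunWorkerAsync();
}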
Following this post, I have been playing with System.Threading.Channels to get confident enough to use it in my production code, replacing the Threads/Monitor.Pulse/Wait based approach I am currently using (described in the referenced post).
Basically, I created a sample with a bounded channel where I run a couple of producer tasks at the beginning and, without waiting, start my consumer tasks, which begin pulling elements from the channel.
After waiting for the producer tasks to complete, I signal the channel as complete, so the consumer tasks can stop listening for new channel elements.
My channel is a Channel<Action<string>>, and in each action I increment the count for the given worker in the WorkDistribution concurrent dictionary; at the end of the sample I print it so I can check that I consumed as many items as I expected, and also see how the channel distributed the actions between the consumers.
For some reason this "Work Distribution" footer is not printing the same number of items as the total number of items produced by the producer tasks.
What am I missing?
Some of the variables present were added for the sole purpose of helping troubleshoot.
Here's the full code:
public class ChannelSolution
{
    object LockObject = new object();
    Channel<Action<string>> channel;
    int ItemsToProduce;
    int WorkersCount;
    int TotalItemsProduced;
    ConcurrentDictionary<string, int> WorkDistribution;
    CancellationToken Ct;

    public ChannelSolution(int workersCount, int itemsToProduce, int maxAllowedItems,
        CancellationToken ct)
    {
        WorkersCount = workersCount;
        ItemsToProduce = itemsToProduce;
        channel = Channel.CreateBounded<Action<string>>(maxAllowedItems);
        Console.WriteLine($"Created channel with max {maxAllowedItems} items");
        WorkDistribution = new ConcurrentDictionary<string, int>();
        Ct = ct;
    }

    async Task ProduceItems(int cycle)
    {
        for (var i = 0; i < ItemsToProduce; i++)
        {
            var index = i + 1 + (ItemsToProduce * cycle);
            bool queueHasRoom;
            var stopwatch = new Stopwatch();
            stopwatch.Start();
            do
            {
                if (Ct.IsCancellationRequested)
                {
                    Console.WriteLine("exiting read loop - cancellation requested !");
                    break;
                }
                queueHasRoom = await channel.Writer.WaitToWriteAsync();
                if (!queueHasRoom)
                {
                    if (Ct.IsCancellationRequested)
                    {
                        Console.WriteLine("exiting read loop - cancellation"
                            + " requested !");
                        break;
                    }
                    if (stopwatch.Elapsed.Seconds % 3 == 0)
                        Console.WriteLine("Channel reached maximum capacity..."
                            + " producer waiting for items to be freed...");
                }
            }
            while (!queueHasRoom);
            channel.Writer.TryWrite((workerName) => action($"A{index}", workerName));
            Console.WriteLine($"Channel has room, item {index} added"
                + $" - channel items count: [{channel.Reader.Count}]");
            Interlocked.Increment(ref TotalItemsProduced);
        }
    }

    List<Task> GetConsumers()
    {
        var tasks = new List<Task>();
        for (var i = 0; i < WorkersCount; i++)
        {
            var workerName = $"W{(i + 1).ToString("00")}";
            tasks.Add(Task.Run(async () =>
            {
                while (await channel.Reader.WaitToReadAsync())
                {
                    if (Ct.IsCancellationRequested)
                    {
                        Console.WriteLine("exiting write loop - cancellation"
                            + " requested !");
                        break;
                    }
                    if (channel.Reader.TryRead(out var action))
                    {
                        Console.WriteLine($"dequeued action in worker [{workerName}]");
                        action(workerName);
                    }
                }
            }));
        }
        return tasks;
    }

    void action(string actionNumber, string workerName)
    {
        Console.WriteLine($"processing {actionNumber} in worker {workerName}...");
        var secondsToWait = new Random().Next(2, 5);
        Thread.Sleep(TimeSpan.FromSeconds(secondsToWait));
        Console.WriteLine($"action {actionNumber} completed by worker {workerName}"
            + $" after {secondsToWait} secs! channel items left:"
            + $" [{channel.Reader.Count}]");
        if (WorkDistribution.ContainsKey(workerName))
        {
            lock (LockObject)
            {
                WorkDistribution[workerName]++;
            }
        }
        else
        {
            var succeeded = WorkDistribution.TryAdd(workerName, 1);
            if (!succeeded)
            {
                Console.WriteLine($"!!! failed incrementing dic value !!!");
            }
        }
    }

    public void Summarize(Stopwatch stopwatch)
    {
        Console.WriteLine("--------------------------- Thread Work Distribution "
            + "------------------------");
        foreach (var kv in this.WorkDistribution)
            Console.WriteLine($"thread: {kv.Key} items consumed: {kv.Value}");
        Console.WriteLine($"Total actions consumed: "
            + $"{WorkDistribution.Sum(w => w.Value)} - Elapsed time: "
            + $"{stopwatch.Elapsed.Seconds} secs");
    }

    public void Run(int producerCycles)
    {
        var stopwatch = new Stopwatch();
        stopwatch.Start();
        var producerTasks = new List<Task>();
        Console.WriteLine($"Started running at {DateTime.Now}...");
        for (var i = 0; i < producerCycles; i++)
        {
            producerTasks.Add(ProduceItems(i));
        }
        var consumerTasks = GetConsumers();
        Task.WaitAll(producerTasks.ToArray());
        Console.WriteLine($"-------------- Completed waiting for PRODUCERS -"
            + $" total items produced: [{TotalItemsProduced}] ------------------");
        channel.Writer.Complete(); //just so I can complete this demo
        Task.WaitAll(consumerTasks.ToArray());
        Console.WriteLine("----------------- Completed waiting for CONSUMERS "
            + "------------------");
        //Task.WaitAll(GetConsumers().Union(producerTasks/*.Union(
        //    new List<Task> { taskKey })*/).ToArray());
        //Console.WriteLine("Completed waiting for tasks");
        Summarize(stopwatch);
    }
}
And here is the calling code in Program.cs
var workersCount = 5;
var itemsToProduce = 10;
var maxItemsInQueue = 5;
var cts = new CancellationTokenSource();
var channelSolution = new ChannelSolution(workersCount, itemsToProduce,
    maxItemsInQueue, cts.Token);
channelSolution.Run(2);
From a quick look there is a race condition in the ProduceItems method, around the queueHasRoom variable. You don't need this variable. The channel.Writer.TryWrite method will tell you whether there is room in the channel's buffer or not. Alternatively you could simply await the WriteAsync method, instead of using the WaitToWriteAsync/TryWrite combo. AFAIK this combo is intended as a performance optimization of the former method. If you absolutely need to know whether there is available space before attempting to post a value, then the Channel<T> is probably not a suitable container for your use case. You'll need to find something that can be locked during the whole operation of "check-for-available-space -> create-the-value -> post-the-value", so that this operation can be made atomic.
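To illustrate the simpler shape, here is a rough sketch of what ProduceItems could look like with a single await of WriteAsync. It reuses the names from the code above (channel, action, Ct, ItemsToProduce, TotalItemsProduced) and handles cancellation by letting the call throw, rather than printing and breaking:

// Sketch: WriteAsync waits until the bounded channel has room and then
// posts the item in one call, so no queueHasRoom flag is needed.
async Task ProduceItems(int cycle)
{
    for (var i = 0; i < ItemsToProduce; i++)
    {
        var index = i + 1 + (ItemsToProduce * cycle);
        // Completes once the item is accepted; throws OperationCanceledException if Ct is cancelled.
        await channel.Writer.WriteAsync(workerName => action($"A{index}", workerName), Ct);
        Interlocked.Increment(ref TotalItemsProduced);
    }
}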
As a side note, using a lock to protect the updating of the ConcurrentDictionary is redundant. The ConcurrentDictionary offers the AddOrUpdate method, which can atomically replace a value it contains with another value. You might have to lock if the dictionary contained mutable objects and you needed to mutate those objects in a thread-safe way, but in your case the values are of type Int32, which is an immutable struct. You don't change it; you just replace it with a new Int32, which is created based on the existing value:
WorkDistribution.AddOrUpdate(workerName, 1, (_, existing) => existing + 1);
I would like to call my API in parallel x number of times so the processing can be done quickly.
I have three approaches below for calling the API in parallel. I am trying to understand which is the best way to perform this action.
Base Code
var client = new System.Net.Http.HttpClient();
client.DefaultRequestHeaders.Add("Accept", "application/json");
client.BaseAddress = new Uri("https://jsonplaceholder.typicode.com");

var list = new List<int>();
var listResults = new List<string>();
for (int i = 1; i < 5; i++)
{
    list.Add(i);
}
1st Method using Parallel.ForEach
Parallel.ForEach(list, new ParallelOptions() { MaxDegreeOfParallelism = 3 }, index =>
{
    var response = client.GetAsync("posts/" + index).Result;
    var contents = response.Content.ReadAsStringAsync().Result;
    listResults.Add(contents);
    Console.WriteLine(contents);
});
Console.WriteLine("After all parallel tasks are done with Parallel for each");
2nd Method using Tasks. I am not sure if this runs in parallel; let me know if it does.
var loadPosts = new List<Task<string>>();
foreach (var post in list)
{
    var response = await client.GetAsync("posts/" + post);
    var contents = response.Content.ReadAsStringAsync();
    loadPosts.Add(contents);
    Console.WriteLine(contents.Result);
}
await Task.WhenAll(loadPosts);
Console.WriteLine("After all parallel tasks are done with Task When All");
3rd Method using ActionBlock - this is what I believe I should always do, but I want to hear from the community.
var responses = new List<string>();

var block = new ActionBlock<int>(
    async x =>
    {
        var response = await client.GetAsync("posts/" + x);
        var contents = await response.Content.ReadAsStringAsync();
        Console.WriteLine(contents);
        responses.Add(contents);
    },
    new ExecutionDataflowBlockOptions
    {
        MaxDegreeOfParallelism = 6, // Parallelize on all cores
    });

for (int i = 1; i < 5; i++)
{
    block.Post(i);
}

block.Complete();
await block.Completion;
Console.WriteLine("After all parallel tasks are done with Action block");
Approach number 2 is close, but as written it awaits each GetAsync inside the loop, so the requests run one after another. Here's a rule of thumb: I/O-bound operations => use Tasks/WhenAll (asynchrony); compute-bound operations => use parallelism. HTTP requests are network I/O.
var tasks = new List<Task<string>>();
foreach (var post in list)
{
    async Task<string> func()
    {
        var response = await client.GetAsync("posts/" + post);
        return await response.Content.ReadAsStringAsync();
    }

    tasks.Add(func());
}

await Task.WhenAll(tasks);

var postResponses = new List<string>();
foreach (var t in tasks)
{
    var postResponse = await t; //t.Result would be okay too.
    postResponses.Add(postResponse);
    Console.WriteLine(postResponse);
}
I made a little console app to test all of the methods by pinging the API "https://jsonplaceholder.typicode.com/todos/{i}" 200 times.
#MikeLimaSierra: Method 1 or 3 were the fastest!
| Method                     | DegreeOfParallelism | Time      |
|----------------------------|---------------------|-----------|
| Not Parallel               | n/a                 | 8.4 sec   |
| #LearnAspNet (OP) Method 1 | 2                   | 5.494 sec |
| #LearnAspNet (OP) Method 1 | 30                  | 1.235 sec |
| #LearnAspNet (OP) Method 3 | 2                   | 4.750 sec |
| #LearnAspNet (OP) Method 3 | 30                  | 1.795 sec |
| #jamespconnor Method       | n/a                 | 21.5 sec  |
| #YuliBonner Method         | n/a                 | 21.4 sec  |
I would use the following. It has no control over concurrency (it will dispatch all HTTP requests in parallel, unlike your 3rd method), but it is a lot simpler - it only has a single await.
var client = new HttpClient();
var list = new[] { 1, 2, 3, 4, 5 };
var postTasks = list.Select(p => client.GetStringAsync("posts/" + p));
var posts = await Task.WhenAll(postTasks);
foreach (var postContent in posts)
{
    Console.WriteLine(postContent);
}
I found an osrm machine, and it returns a JSON string when I send a request. The JSON has some specific information about two locations, and I am processing it and reading the value of the distance property to build a distance matrix of these locations. I have more than 2000 locations and this processing takes approximately 4 hours. I need to decrease the execution time with parallelism, but I am very new to the topic. Here is my work; what should I do to optimize the parallel loop, or maybe you can point me to a new approach? Thanks.
var client = new RestClient("http://127.0.0.1:5000/route/v1/table/");
var watch = System.Diagnostics.Stopwatch.StartNew();

//rowCount = 2500
Parallel.For(0, rowCount, i =>
{
    Parallel.For(0, rowCount, j =>
    {
        //request the server with specific lats/longs
        var request = new RestRequest(String.Format("{0},{1};{2},{3}", le.LocationList[i].longitude, le.LocationList[i].latitude,
            le.LocationList[j].longitude, le.LocationList[j].latitude));
        //read the response and deserialize it into an object
        var response = client.Execute<RootObject>(request);
        //objem holds the list of attributes in routes
        var objem = response.Data.routes;
        //this part reads all distances and durations in each response and puts them into the dist and dur matrices
        Parallel.ForEach(objem, (o) =>
        {
            dist[i, j] = o.distance;
            dur[i, j] = o.duration;
            threads[i, j] = Thread.CurrentThread.ManagedThreadId;
            Thread.Sleep(10);
        });
    });
});
watch.Stop();
var elapsedMs = watch.ElapsedMilliseconds;
I cleaned it up a bit. One use of parallelism is enough. There is still one loop that you need to look at, because it's overwriting data and I do not know what to do with it; that's your call.
You need to experiment with the maxThreads variable's value a bit. Normally, .NET would spin up only as many threads as your processor can handle; in your case you can spin up more, because you know that all of them are just idling, waiting on the network stack.
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using RestSharp; // RestClient/RestRequest come from the RestSharp package

namespace ConsoleApp7
{
    class Program
    {
        static void Process(int i, int j)
        {
            var client = new RestClient("http://127.0.0.1:5000/route/v1/table/");

            //request the server with specific lats/longs
            var request = new RestRequest(String.Format("{0},{1};{2},{3}", le.LocationList[i].longitude, le.LocationList[i].latitude, le.LocationList[j].longitude, le.LocationList[j].latitude));

            //read the response and deserialize it into an object
            var response = client.Execute<RootObject>(request);

            //objem holds the list of attributes in routes
            var objem = response.Data.routes;

            //this part reads all distances and durations in each response and puts them into the dist and dur matrices
            // !!!
            // !!! THIS LOOP NEEDS TO GO.
            // !!! IT MAKES NO SENSE!
            // !!! YOU ARE OVERWRITING YOUR OWN DATA!
            // !!!
            Parallel.ForEach(objem, (o) =>
            {
                dist[i, j] = o.distance;
                dur[i, j] = o.duration;
                threads[i, j] = Thread.CurrentThread.ManagedThreadId;
                Thread.Sleep(10);
            });
        }

        static void Main(string[] args)
        {
            var watch = System.Diagnostics.Stopwatch.StartNew();

            var rowCount = 2500;
            var maxThreads = 100;

            var allPairs = Enumerable.Range(0, rowCount)
                .SelectMany(x => Enumerable.Range(0, rowCount).Select(y => new { X = x, Y = y }));

            Parallel.ForEach(allPairs, new ParallelOptions { MaxDegreeOfParallelism = maxThreads }, pair => Process(pair.X, pair.Y));

            watch.Stop();
            var elapsedMs = watch.ElapsedMilliseconds;
        }
    }
}
Hi, I am spidering a site and reading its contents. I want to keep the request rate reasonable; up to approximately 10 requests per second should probably be OK. Currently it is 5k requests per minute, and it is causing security issues, as this looks like bot activity.
How can I do this? Here is my code:
protected void Iterareitems(List<Item> items)
{
    foreach (var item in items)
    {
        GetImagesfromItem(item);
        if (item.HasChildren)
        {
            Iterareitems(item.Children.ToList());
        }
    }
}

protected void GetImagesfromItem(Item childitems)
{
    var document = new HtmlWeb().Load(completeurl);
    var urls = document.DocumentNode.Descendants("img")
        .Select(e => e.GetAttributeValue("src", null))
        .Where(s => !string.IsNullOrEmpty(s)).ToList();
}
You need System.Threading.Semaphore, with which you can control the maximum number of concurrent threads/tasks. Here is an example:
var maxThreads = 3;
var semaphore = new Semaphore(maxThreads, maxThreads);
for (int i = 0; i < 10; i++) //10 tasks in total
{
    var j = i;
    Task.Factory.StartNew(() =>
    {
        semaphore.WaitOne();
        Console.WriteLine("start " + j.ToString());
        Thread.Sleep(1000);
        Console.WriteLine("end " + j.ToString());
        semaphore.Release();
    });
}
You can see that at most 3 tasks are working; the others are held pending at semaphore.WaitOne() because the maximum limit has been reached, and a pending thread will continue once another thread releases the semaphore with semaphore.Release().
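Since the goal in the question is a request rate of roughly 10 per second rather than just a concurrency cap, a variation worth considering (not part of the answer above, just a sketch) is an async SemaphoreSlim whose slots are held for at least one second, so no more than 10 requests start in any one-second window. FetchPageAsync is a hypothetical stand-in for the actual download call (e.g. wrapping new HtmlWeb().Load(url)):

// Sketch: 10 slots, each held for at least one second from the moment a
// request starts, which caps the start rate at roughly 10 requests per second.
var throttle = new SemaphoreSlim(10, 10);

async Task FetchThrottledAsync(string url)
{
    await throttle.WaitAsync();
    try
    {
        var minimumHold = Task.Delay(TimeSpan.FromSeconds(1));
        await FetchPageAsync(url); // hypothetical download helper
        await minimumHold;         // keep the slot for at least one second
    }
    finally
    {
        throttle.Release();
    }
}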