I've coded a void method to handle multiple threads for Selenium web browsing. The issue is that right now, for example, if I input 4 tasks and 2 threads, the program says it finished when it has only finished 2 tasks.
Edit: Basically I want the program to wait for all the tasks to complete. I also want that if one thread finishes while the other is still running and there are tasks left to do, it starts another task right away instead of waiting for the 2nd thread to finish.
Thanks, and sorry for the code; I made it quickly as an example of how it works.
{
static void Main(string[] args)
{
Threads(4, 4);
Console.WriteLine("Program has finished");
Console.ReadLine();
}
static Random ran = new Random();
static int loop;
public static void Threads(int number, int threads)
{
for (int i = 0; i < number; i++)
{
if (threads == 1)
{
generateDriver();
}
else if (threads > 1)
{
start:
if (loop < threads)
{
loop++;
Thread thread = new Thread(() => generateDriver());
thread.Start();
}
else
{
Task.Delay(2000).Wait();
goto start;
}
}
}
}
public static void test(IWebDriver driver)
{
driver.Navigate().GoToUrl("https://google.com/");
int timer = ran.Next(100, 2000);
Task.Delay(timer).Wait();
Console.WriteLine("[" + DateTime.Now.ToString("hh:mm:ss") + "] - " + "Task done.");
loop--;
driver.Close();
}
public static void generateDriver()
{
ChromeOptions options = new ChromeOptions();
options.AddArguments("--disable-dev-shm-usage");
options.AddArguments("--disable-extensions");
options.AddArguments("--disable-gpu");
options.AddArguments("window-size=1024,768");
options.AddArguments("--test-type");
ChromeDriverService service = ChromeDriverService.CreateDefaultService(Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location));
service.HideCommandPromptWindow = true;
service.SuppressInitialDiagnosticInformation = true;
IWebDriver driver = new ChromeDriver(service, options);
test(driver);
}
Manually keeping track of running threads, waiting for them to finish and reusing ones that are already finished is not trivial.
However, the .NET runtime provides ready-made solutions that you should prefer over handling it yourself.
The simplest way to achieve your desired result is to use a Parallel.For loop and set the MaxDegreeOfParallelism, e.g.:
public static void Threads(int number, int threads)
{
Parallel.For(0, number,
new ParallelOptions { MaxDegreeOfParallelism = threads },
_ => generateDriver());
}
If you really want to do it manually, you will need to keep an array of Thread (or Task) objects and iterate over them, checking whether they have finished and, if they have, replacing them with a new thread. This requires quite a bit more code than the Parallel.For solution (and is unlikely to perform better).
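For completeness, here is a rough sketch of that manual approach using a list of Tasks and Task.WaitAny to recycle finished slots. It reuses the generateDriver method from the question and is meant as an illustration, not a drop-in replacement:
public static void Threads(int number, int threads)
{
    var running = new List<Task>();
    for (int i = 0; i < number; i++)
    {
        // If every slot is busy, wait for any task to finish and free its slot.
        if (running.Count == threads)
        {
            int finished = Task.WaitAny(running.ToArray());
            running.RemoveAt(finished);
        }
        running.Add(Task.Run(() => generateDriver()));
    }
    // Wait for the remaining tasks so "Program has finished" only prints once all work is done.
    Task.WaitAll(running.ToArray());
}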
Related
I have been researching (including looking at all other SO posts on this topic) the best way to implement a (most likely) Windows Service worker that will pull items of work from a database and process them in parallel asynchronously in a 'fire-and-forget' manner in the background (the work item management will all be handled in the asynchronous method). The work items will be web service calls and database queries. There will be some throttling applied to the producer of these work items to ensure some kind of measured approach to scheduling the work. The examples below are very basic and are just there to highlight the logic of the while loop and for loop in place. Which is the ideal method or does it not matter? Is there a more appropriate/performant way of achieving this?
async/await...
private static int counter = 1;
static void Main(string[] args)
{
Console.Title = "Async";
Task.Run(() => AsyncMain());
Console.ReadLine();
}
private static async void AsyncMain()
{
while (true)
{
// Imagine calling a database to get some work items to do, in this case 5 dummy items
for (int i = 0; i < 5; i++)
{
var x = DoSomethingAsync(counter.ToString());
counter++;
Thread.Sleep(50);
}
Thread.Sleep(1000);
}
}
private static async Task<string> DoSomethingAsync(string jobNumber)
{
try
{
// Simulated mostly IO work - some could be long running
await Task.Delay(5000);
Console.WriteLine(jobNumber);
}
catch (Exception ex)
{
LogException(ex);
}
Log("job {0} has completed", jobNumber);
return "fire and forget so not really interested";
}
Task.Run...
private static int counter = 1;
static void Main(string[] args)
{
Console.Title = "Task";
while (true)
{
// Imagine calling a database to get some work items to do, in this case 5 dummy items
for (int i = 0; i < 5; i++)
{
var x = Task.Run(() => { DoSomethingAsync(counter.ToString()); });
counter++;
Thread.Sleep(50);
}
Thread.Sleep(1000);
}
}
private static string DoSomethingAsync(string jobNumber)
{
try
{
// Simulated mostly IO work - some could be long running
Task.Delay(5000);
Console.WriteLine(jobNumber);
}
catch (Exception ex)
{
LogException(ex);
}
Log("job {0} has completed", jobNumber);
return "fire and forget so not really interested";
}
pull items of work from a database and process them in parallel asynchronously in a 'fire-and-forget' manner in the background
Technically, you want concurrency. Whether you want asynchronous concurrency or parallel concurrency remains to be seen...
The work items will be web service calls and database queries.
The work is I/O-bound, so that implies asynchronous concurrency as the more natural approach.
There will be some throttling applied to the producer of these work items to ensure some kind of measured approach to scheduling the work.
The idea of a producer/consumer queue is implied here. That's one option. TPL Dataflow provides some nice producer/consumer queues that are async-compatible and support throttling.
Alternatively, you can do the throttling yourself. For asynchronous code, there's a built-in throttling mechanism called SemaphoreSlim.
TPL Dataflow approach, with throttling:
private static int counter = 1;
static void Main(string[] args)
{
Console.Title = "Async";
var x = Task.Run(() => MainAsync());
Console.ReadLine();
}
private static async Task MainAsync()
{
var blockOptions = new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = 7
};
var block = new ActionBlock<string>(DoSomethingAsync, blockOptions);
while (true)
{
var dbData = await ...; // Imagine calling a database to get some work items to do, in this case 5 dummy items
for (int i = 0; i < 5; i++)
{
block.Post(counter.ToString());
counter++;
Thread.Sleep(50);
}
Thread.Sleep(1000);
}
}
private static async Task DoSomethingAsync(string jobNumber)
{
try
{
// Simulated mostly IO work - some could be long running
await Task.Delay(5000);
Console.WriteLine(jobNumber);
}
catch (Exception ex)
{
LogException(ex);
}
Log("job {0} has completed", jobNumber);
}
Asynchronous concurrency approach with manual throttling:
private static int counter = 1;
private static SemaphoreSlim semaphore = new SemaphoreSlim(7);
static void Main(string[] args)
{
Console.Title = "Async";
var x = Task.Run(() => MainAsync());
Console.ReadLine();
}
private static async Task MainAsync()
{
while (true)
{
var dbData = await ...; // Imagine calling a database to get some work items to do, in this case 5 dummy items
for (int i = 0; i < 5; i++)
{
var x = DoSomethingAsync(counter.ToString());
counter++;
Thread.Sleep(50);
}
Thread.Sleep(1000);
}
}
private static async Task DoSomethingAsync(string jobNumber)
{
await semaphore.WaitAsync();
try
{
try
{
// Simulated mostly IO work - some could be long running
await Task.Delay(5000);
Console.WriteLine(jobNumber);
}
catch (Exception ex)
{
LogException(ex);
}
Log("job {0} has completed", jobNumber);
}
finally
{
semaphore.Release();
}
}
As a final note, I hardly ever recommend my own book on SO, but I do think it would really benefit you. In particular, see sections 8.10 (Blocking/Asynchronous Queues), 11.5 (Throttling), and 4.4 (Throttling Dataflow Blocks).
First of all, let's fix a few things.
In the second example you are calling
Task.Delay(5000);
without await. That is a bad idea: it creates a new Task which completes after 5 seconds, but nobody is waiting for it, so execution just continues. Task.Delay is only useful with await. Mind you, do not use Task.Delay(5000).Wait() either; blocking on a task like that ties up a thread for the entire delay, and blocking on async code in general is how you end up with deadlocks.
In your second example you are effectively trying to make the DoSomethingAsync method synchronous, so let's call it DoSomethingSync and replace the Task.Delay(5000); with Thread.Sleep(5000);
Now the second example is almost the old-school ThreadPool.QueueUserWorkItem. There is nothing wrong with that as long as you are not using an already-async API inside. For fire-and-forget work, Task.Run and ThreadPool.QueueUserWorkItem are essentially the same thing; I would use the latter for clarity.
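To make that concrete, here is a minimal sketch of the second example rewritten along those lines: a genuinely synchronous DoSomethingSync (a name introduced here) that sleeps instead of creating an un-awaited Task.Delay, queued through ThreadPool.QueueUserWorkItem. The Log and LogException helpers are assumed to exist as in the question:
private static void DoSomethingSync(string jobNumber)
{
    try
    {
        // Simulated mostly IO work - blocks this pool thread for 5 seconds
        Thread.Sleep(5000);
        Console.WriteLine(jobNumber);
    }
    catch (Exception ex)
    {
        LogException(ex);
    }
    Log("job {0} has completed", jobNumber);
}

// Fire-and-forget scheduling of a single job:
string job = counter.ToString();
counter++;
ThreadPool.QueueUserWorkItem(_ => DoSomethingSync(job));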
This slowly brings us to the answer to the main question. Async or not async - that is the question! I would say: do not create async methods unless you have to use some async I/O inside your code. If, however, there is an async API you have to use, then the first approach is what people reading your code years later will expect.
In this example, is this the correct use of the Parallel.For loop if I want to limit the number of threads that can perform the function DoWork to ten at a time? Will other threads be blocked until one of the ten threads becomes available? If not, what is a better multi-threaded solution that would still let me execute that function 6000+ times?
class Program
{
static void Main(string[] args)
{
ThreadExample ex = new ThreadExample();
}
}
public class ThreadExample
{
int limit = 6411;
public ThreadExample()
{
Console.WriteLine("Starting threads...");
int temp = 0;
Parallel.For(temp, limit, new ParallelOptions { MaxDegreeOfParallelism = 10 }, i =>
{
DoWork(temp);
temp++;
});
}
public void DoWork(int info)
{
//Thread.Sleep(50); //doing some work here.
int num = info * 5;
Console.WriteLine("Thread: {0} Result: {1}", info.ToString(), num.ToString());
}
}
You need to use the i passed to the lambda function as the index. Parallel.For relieves you of the hassle of managing the loop counter, but you still need to use the value it hands you!
Parallel.For(0, limit, new ParallelOptions { MaxDegreeOfParallelism = 10 }, i =>
{
DoWork(i);
});
As for your other questions:
Yes, this will correctly limit the amount of threads working simultaneously.
There are no threads being blocked. The iterations are queued and as soon as a thread becomes available, it takes the next iteration (in a synchronized manner) from the queue to process.
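If you want to verify that yourself, here is a small sketch (not part of the original answer) that tracks how many iterations are in flight at the same time; with MaxDegreeOfParallelism = 10 the reported peak should never exceed 10:
object gate = new object();
int current = 0, maxObserved = 0;
Parallel.For(0, 6411, new ParallelOptions { MaxDegreeOfParallelism = 10 }, i =>
{
    int now = Interlocked.Increment(ref current);
    lock (gate) { if (now > maxObserved) maxObserved = now; }  // record the peak concurrency
    Thread.Sleep(10);                                          // stand-in for DoWork(i)
    Interlocked.Decrement(ref current);
});
Console.WriteLine("Peak concurrent iterations: " + maxObserved);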
I have an app that takes on an unknown amount of tasks. The tasks are blocking (they wait on the network) so I'll need multiple threads to keep busy.
Is there an easy way for me to have a giant list of tasks and worker threads which pull a task when they are idle? ATM I just start a new thread for each task, which is fine, but I'd like some control so that if there are 100 tasks I don't have 100 threads.
Assuming that the network I/O classes that you are dealing with expose Begin/End style async methods, then what you want to do is use the TPL TaskFactory.FromAsync method. As laid out in TPL TaskFactory.FromAsync vs Tasks with blocking methods, the FromAsync method will use async I/O under the covers, rather than keeping a thread busy just waiting for the I/O to complete (which is actually not what you want).
The way that Async I/O works is that you have a pool of threads that can handle the result of I/O when the result is ready, so that if you have 100 outstanding I/Os you don't have 100 threads blocked waiting for those I/Os. When the whole pool is busy handling I/O results, subsequent results get queued up automatically until a thread frees up to handle them. Keeping a huge pool of threads waiting like that is a scalability disaster- threads are hugely expensive objects to keep around idling.
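As a rough sketch of what that looks like, assuming the classic WebRequest Begin/End pair (substitute whatever Begin/End methods your own network classes expose):
// Wrap an APM Begin/End pair in a Task; no thread is blocked while the I/O is in flight.
WebRequest request = WebRequest.Create("http://example.com/");
Task<WebResponse> responseTask = Task.Factory.FromAsync<WebResponse>(
    request.BeginGetResponse,
    request.EndGetResponse,
    null);

responseTask.ContinueWith(t =>
{
    using (WebResponse response = t.Result)
    {
        Console.WriteLine("Response length: {0}", response.ContentLength);
    }
});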
Here is an MSDN sample that manages many work items through the ThreadPool:
using System;
using System.Threading;
public class Fibonacci
{
public Fibonacci(int n, ManualResetEvent doneEvent)
{
_n = n;
_doneEvent = doneEvent;
}
// Wrapper method for use with thread pool.
public void ThreadPoolCallback(Object threadContext)
{
int threadIndex = (int)threadContext;
Console.WriteLine("thread {0} started...", threadIndex);
_fibOfN = Calculate(_n);
Console.WriteLine("thread {0} result calculated...", threadIndex);
_doneEvent.Set();
}
// Recursive method that calculates the Nth Fibonacci number.
public int Calculate(int n)
{
if (n <= 1)
{
return n;
}
return Calculate(n - 1) + Calculate(n - 2);
}
public int N { get { return _n; } }
private int _n;
public int FibOfN { get { return _fibOfN; } }
private int _fibOfN;
private ManualResetEvent _doneEvent;
}
public class ThreadPoolExample
{
static void Main()
{
const int FibonacciCalculations = 10;
// One event is used for each Fibonacci object
ManualResetEvent[] doneEvents = new ManualResetEvent[FibonacciCalculations];
Fibonacci[] fibArray = new Fibonacci[FibonacciCalculations];
Random r = new Random();
// Configure and launch threads using ThreadPool:
Console.WriteLine("launching {0} tasks...", FibonacciCalculations);
for (int i = 0; i < FibonacciCalculations; i++)
{
doneEvents[i] = new ManualResetEvent(false);
Fibonacci f = new Fibonacci(r.Next(20,40), doneEvents[i]);
fibArray[i] = f;
ThreadPool.QueueUserWorkItem(f.ThreadPoolCallback, i);
}
// Wait for all calculations in the pool to complete...
WaitHandle.WaitAll(doneEvents);
Console.WriteLine("All calculations are complete.");
// Display the results...
for (int i = 0; i < FibonacciCalculations; i++)
{
Fibonacci f = fibArray[i];
Console.WriteLine("Fibonacci({0}) = {1}", f.N, f.FibOfN);
}
}
}
Here is some code that perpetually generates GUIDs. I've written it to learn about threading. In it you'll notice that I've got a lock around where I generate GUIDs and enqueue them, even though the ConcurrentQueue is thread safe. It's because my actual code will need to use NHibernate and so I must make sure that only one thread gets to fill the queue.
While I monitor this code in Task Manager, I notice the process drops the number of threads from 18 (on my machine) to 14 but no less. Is this because my code isn't good?
Also can someone refactor this if they see fit? I love shorter code.
class Program
{
ConcurrentNewsBreaker Breaker;
static void Main(string[] args)
{
new Program().Execute();
Console.Read();
}
public void Execute()
{
Breaker = new ConcurrentNewsBreaker();
QueueSome();
}
public void QueueSome()
{
ThreadPool.QueueUserWorkItem(DoExecute);
}
public void DoExecute(Object State)
{
String Id = Breaker.Pop();
Console.WriteLine(String.Format("- {0} {1}", Thread.CurrentThread.ManagedThreadId, Breaker.Pop()));
if (Breaker.Any())
QueueSome();
else
Console.WriteLine(String.Format("- {0} XXXX ", Thread.CurrentThread.ManagedThreadId));
}
}
public class ConcurrentNewsBreaker
{
static readonly Object LockObject = new Object();
ConcurrentQueue<String> Store = new ConcurrentQueue<String>();
public String Pop()
{
String Result = null;
if (Any())
Store.TryDequeue(out Result);
return Result;
}
public Boolean Any()
{
if (!Store.Any())
{
Task FillTask = new Task(FillupTheQueue, Store);
FillTask.Start();
FillTask.Wait();
}
return Store.Any();
}
private void FillupTheQueue(Object StoreObject)
{
ConcurrentQueue<String> Store = StoreObject as ConcurrentQueue<String>;
lock(LockObject)
{
for(Int32 i = 0; i < 100; i++)
Store.Enqueue(Guid.NewGuid().ToString());
}
}
}
You are using .NET's ThreadPool so .NET/Windows manages the number of threads based on the amount of work waiting to be processed.
While I monitor this code in Task Manager, I notice the process drops the number of threads from 18 (on my machine) to 14 but no less. Is this because my code isn't good?
This does not indicate a problem. 14 is still high, unless you've got a 16-core CPU.
The threadpool will try to adjust and do the work with as few threads as possible.
You should start to worry when the number of threads goes up significantly.
Possible Duplicate:
C# Spawn Multiple Threads for work then wait until all finished
I have two method calls that I want to make on two separate threads. Then I want to wait until both method executions have completed before continuing. My sample solution is something like below.
public static void Main()
{
Console.WriteLine("Main thread starting.");
String[] strThreads = new String[] { "one", "two" };
String ctemp = string.Empty;
foreach (String c in strThreads)
{
ctemp = c;
Thread thread = new Thread(delegate() { MethodCall(ctemp); });
thread.Start();
thread.Join();
}
Console.WriteLine("Main thread ending.");
Console.Read();
}
public static void MethodCall(string number)
{
Console.WriteLine("Method call " + number);
}
Will this do the job? Or is there a better way to do the same thing?
I'd look into running your method via ThreadPool.QueueUserWorkItem and then using WaitHandle.WaitAll to wait for all of them to complete.
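A bare-bones sketch of that suggestion, reusing the method from the question (note that WaitHandle.WaitAll is limited to 64 handles, which is plenty for two):
String[] strThreads = new String[] { "one", "two" };
ManualResetEvent[] doneEvents = new ManualResetEvent[strThreads.Length];

for (int i = 0; i < strThreads.Length; i++)
{
    doneEvents[i] = new ManualResetEvent(false);
    ManualResetEvent done = doneEvents[i];
    String arg = strThreads[i];
    ThreadPool.QueueUserWorkItem(_ =>
    {
        try { MethodCall(arg); }
        finally { done.Set(); }   // signal completion even if the call throws
    });
}

WaitHandle.WaitAll(doneEvents);   // block here until both work items have signalled
Console.WriteLine("Main thread ending.");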
This sequence of statements...:
Thread thread = new Thread(delegate() { MethodCall(ctemp); });
thread.Start();
thread.Join();
is equivalent to just calling the method directly -- since you're waiting for the new thread to finish right after starting it, there's no benefit from threading! You need to first start all threads in a loop (put them in an array list or some similar container), then join them in a separate loop, to get concurrent execution of the methods.
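Roughly, that restructuring looks like this (a sketch only; the loop value is copied into a local so each thread's closure captures its own string):
String[] strThreads = new String[] { "one", "two" };
Thread[] threads = new Thread[strThreads.Length];

// First loop: start every thread.
for (int i = 0; i < strThreads.Length; i++)
{
    String arg = strThreads[i];   // local copy so each closure gets its own value
    threads[i] = new Thread(() => MethodCall(arg));
    threads[i].Start();
}

// Second loop: wait for all of them to finish.
foreach (Thread t in threads)
    t.Join();

Console.WriteLine("Main thread ending.");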
What you're doing there is creating a thread and then waiting for it to finish, one by one. You have, at any time, at most two threads running: the main thread and the one you just started.
What you want is to start all threads, then wait for all to complete:
public static void Main()
{
Console.WriteLine("Main thread starting.");
String[] strThreads = new String[] { "one", "two" };
int threadCount = strThreads.Length;
AutoResetEvent eventDone = new AutoResetEvent(false);
foreach (String c in strThreads)
{
String ctemp = c; // fresh local per iteration so each thread's closure captures its own value
Thread thread = new Thread(delegate() {
try
{
MethodCall(ctemp);
}
finally
{
if (0 == Interlocked.Decrement(ref threadCount))
{
eventDone.Set();
}
}
});
thread.Start();
}
eventDone.WaitOne();
Console.WriteLine("Main thread ending.");
Console.Read();
}
public static void MethodCall(string number)
{
Console.WriteLine("Method call " + number);
}
If you intended for your two threads to execute one after the other, then yes, the above code will suffice (though my C# syntax knowledge is a little fuzzy off the top of my head so I can't say if the above compiles nicely or not), but why use threads if you want ordered, synchronous execution?
If instead what you want is for the two method calls to execute in parallel, you need to take the thread.Join(); out of the for-loop (you'll need to hang on to the thread objects, likely in an array.)
Take a look at the BackgroundWorker component; I believe it works with Windows Forms, WPF and Silverlight, basically anywhere UI is involved.
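For illustration, a minimal BackgroundWorker sketch (assuming a UI SynchronizationContext such as Windows Forms; DoWork runs on a thread-pool thread, while RunWorkerCompleted is raised back on the UI thread):
var worker = new BackgroundWorker();
worker.DoWork += (sender, e) =>
{
    // Runs on a thread-pool thread; do not touch UI controls here.
    MethodCall((string)e.Argument);
};
worker.RunWorkerCompleted += (sender, e) =>
{
    // Raised back on the UI thread when a SynchronizationContext is present; safe to update controls.
    Console.WriteLine("Worker finished.");
};
worker.RunWorkerAsync("one");   // start the background work, passing "one" as the argument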