Running parallel in background - c#

Main question is: How to run the code within TestingButton_Click on several threads in background (similar to BackgroundWorker) so I will be able to:
1. Get all the raw data to the methods
2. Cancel test for all threads simultaneously
3. Report progress
4. Retrieve all the result tables to main thread.
The following code is within TestingButton_Click
List<Thread> threads = new List<Thread>();
//Testing for each pair
foreach (InterfaceWithClassName aCompound in Group1)
{
foreach (InterfaceWithClassName bCompound in Group2)
{
InstancePair pair = new InstancePair();
//some code
if (testModeParallel)
{
Thread thr = new Thread(TestPairParallel);
thr.Start(pair);
threads.Add(thr);
}
else
{
Thread thr = new Thread(TestPairSerial);
thr.Start(pair);
threads.Add(thr);
}
}
}
while (true)
{
int i = 0;
foreach (Thread thread in threads)
{
if (thread.IsAlive)
break;
i++;
}
if (i == threads.Count)
break;
Thread.Sleep(1000);
}
pairsResultsDataGrid.ItemsSource = tab.DefaultView;
User is able to choose what compounds to test so every time I have different number of pairs to test.
I made two different methods, TestPairSerial() and TestPairParallel(), just in case.
TestPairSerial() structure is
do
{
do
{
} while (isSetbCompaundParams);
} while (isSetaCompaundParams);
//filling up results into tables (main window variables) later to be connected to DataGrids
TestPairParallel() is implemented with InfinitePartitioner and using similar structure only with Parallel.ForEach(new InfinitePartitioner(),...
Thank you for your help.

Use .NET 4.0 Tasks instead of creating new Threads yourself. Tasks give you finer granularity of control, make it easy to pass data into the background operation, and provide excellent support for waiting for results across multiple concurrent tasks and for cancellation of everything in one fell swoop if needed. Highly recommended.
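As a rough sketch of how the four requirements in the question could map onto Tasks: the class name PairTestRunner and the method TestPair below are hypothetical stand-ins (an int is used in place of the question's InstancePair), and Task.Run and IProgress<T> assume .NET 4.5 or later rather than 4.0.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class PairTestRunner
{
    // Hypothetical stand-in for TestPairSerial/TestPairParallel:
    // receives one pair's raw data and returns that pair's result.
    public static int TestPair(int pairData, CancellationToken token)
    {
        token.ThrowIfCancellationRequested(); // cooperative cancellation point
        return pairData * 2;                  // placeholder for the real test
    }

    // Starts one task per pair; the returned task completes with all results.
    public static Task<int[]> RunAllAsync(
        IEnumerable<int> pairs, IProgress<int> progress, CancellationToken token)
    {
        int completed = 0;
        Task<int>[] tasks = pairs.Select(pair => Task.Run(() =>
        {
            int result = TestPair(pair, token);
            // Report how many pairs have finished so far (if anyone listens).
            progress?.Report(Interlocked.Increment(ref completed));
            return result;
        }, token)).ToArray();

        // WhenAll returns results in the same order as the input tasks.
        return Task.WhenAll(tasks);
    }
}
```

In a click handler marked async, one could await RunAllAsync(...) and then assign the results to the grid. A Progress<int> created on the UI thread delivers its callbacks on that thread, and calling Cancel() on the CancellationTokenSource from a Cancel button stops all pairs that have not yet passed their cancellation check.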

How to run the code within TestingButton_Click on several threads in
background.
I would use a Task, as they were designed to do exactly what you want.
The only other question I will answer until you get closer to the actual solution is the following:
Report progress
There are lots of ways to report the progress of a given thread. Typically you subscribe to an event and write code that raises it as the thread makes progress. To update a control on the form from that event, you have to Invoke the change on the UI thread, which is not a trivial feature.

Related

Multi-threading potentially long running operations

I am writing a Windows Service. I have a 'backlog' (in SQL) of records that have to be processed by the service. The backlog might also be empty. The record processing is a potentially very long running operation (3+ minutes).
I have a class with a method that goes to SQL, chooses a record and processes it, if there are any records to process. Then the method exits and that's it. Note: I can't know in advance which records will be processed - the class method decides this as part of its logic.
I want to achieve parallel processing. I want to have X number of workers (where X is the optimal for the host PC) at any time. While the backlog is empty, those workers finish their jobs and exit pretty quickly (~50-100ms, maybe). I want any 'freed' worker to start over again (i.e. re-run).
I have done some reading and I deduce that ThreadPool is not a good option for long-running operations. The .NET 4.0+ parallel library is not a good option either, as I don't want to wait for all workers to finish and I don't want to predefine/declare the tasks in advance.
In layman terms I want to have X workers who query the data source for items and when some of them find such - operate on it, the rest would continue to look for newly pushed items into the backlog.
What would be the best approach? I think I will have to manage the threads entirely by myself? i.e. first step - determine the optimum number of threads (perhaps by checking the Environment.ProcessorCount) and then start the X threads. Monitor for IsAlive on each thread and restart it? This seems awfully unprofessional.
Any suggestions?
You can start one task per core and, as tasks finish, start new ones. Set numOfThreads from ProcessorCount or to a specific number:
int numOfThreads = System.Environment.ProcessorCount;
// int numOfThreads = X;
var tasks = new List<Task>();
for (int i = 0; i < numOfThreads; i++)
    tasks.Add(Task.Factory.StartNew(() => { /* process one record */ }));
while (tasks.Count > 0) // wait for any task to finish
{
    int index = Task.WaitAny(tasks.ToArray());
    tasks.RemoveAt(index);
    if (incompleteWork) // placeholder: true while there is still work to do
        tasks.Add(Task.Factory.StartNew(() => { /* .... */ }));
}
or
var options = new ParallelOptions();
options.MaxDegreeOfParallelism = System.Environment.ProcessorCount;
Parallel.For(0, N, options, (i) => { /* long running computation */ });
or
You can implement the Producer-Consumer pattern with BlockingCollection.
This topic is excellently taught by Dr. Joe Hummel in his Pluralsight course "Async and parallel programming: Application design".
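A minimal sketch of the producer-consumer idea with BlockingCollection, assuming integer record IDs and a trivial placeholder for the processing step. The names BacklogWorkers and ProcessAll are hypothetical, and the bounded capacity of 100 is an arbitrary choice.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BacklogWorkers
{
    // Runs workerCount consumers over a shared queue and returns
    // how many records were processed in total.
    public static int ProcessAll(int[] recordIds, int workerCount)
    {
        using (var queue = new BlockingCollection<int>(boundedCapacity: 100))
        {
            int processed = 0;
            Task[] workers = Enumerable.Range(0, workerCount)
                .Select(_ => Task.Factory.StartNew(() =>
                {
                    // Blocks while the queue is empty; the loop ends once
                    // CompleteAdding was called and the queue has drained.
                    foreach (int id in queue.GetConsumingEnumerable())
                    {
                        // Placeholder for the long-running record processing.
                        Interlocked.Increment(ref processed);
                    }
                }, TaskCreationOptions.LongRunning))
                .ToArray();

            // Producer: Add blocks when the bounded queue is full.
            foreach (int id in recordIds) queue.Add(id);
            queue.CompleteAdding();

            Task.WaitAll(workers);
            return processed;
        }
    }
}
```

In a service, the producer would presumably poll SQL in a loop instead of iterating a fixed array, and CompleteAdding would only be called on shutdown. TaskCreationOptions.LongRunning hints to the scheduler that these workers should not occupy ordinary thread-pool threads.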
Consider using ActionBlock<T> from the TPL Dataflow library. It can be configured to process multiple messages concurrently.
ActionBlock<QueueItem> _processor;
Task _completionTask;
bool _done;
async Task ReadQueueAsync(int pollingInterval)
{
while (!_done)
{
// Get a list of items to process from SQL database
var list = ...;
// Schedule the work
foreach(var item in list)
{
_processor.Post(item);
}
// Give SQL server time to re-fill the queue
await Task.Delay(pollingInterval);
}
// Signal the processor that we are done
_processor.Complete();
}
void ProcessItem(QueueItem item)
{
// Do your work here
}
void Setup()
{
// Configure action block to process items concurrently,
// with no fixed upper limit on the degree of parallelism
_processor = new ActionBlock<QueueItem>(new Action<QueueItem>(ProcessItem),
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded });
_done = false;
var queueReaderTask = ReadQueueAsync(QUEUE_POLL_INTERVAL);
_completionTask = Task.WhenAll(queueReaderTask, _processor.Completion);
}
void Complete()
{
_done = true;
_completionTask.Wait();
}
Per MaxDegreeOfParallelism's documentation: "Generally, you do not need to modify this setting. However, you may choose to set it explicitly in advanced usage scenarios such as these:
When you know that a particular algorithm you're using won't scale beyond a certain number of cores. You can set the property to avoid wasting cycles on additional cores.
When you're running multiple algorithms concurrently and want to manually define how much of the system each algorithm can utilize. You can set a MaxDegreeOfParallelism value for each.
When the thread pool's heuristics is unable to determine the right number of threads to use and could end up injecting too many threads. For example, in long-running loop body iterations, the thread pool might not be able to tell the difference between reasonable progress or livelock or deadlock, and might not be able to reclaim threads that were added to improve performance. In this case, you can set the property to ensure that you don't use more than a reasonable number of threads."
If you do not have an advanced usage scenario like the 3 cases above, you may want to hand your list of items or tasks to be run to the Task Parallel Library and let the framework handle the processor count.
List<InfoObject> infoList = GetInfo();
ConcurrentQueue<ResultObject> output = new ConcurrentQueue<ResultObject>();
await Task.Run(() =>
{
Parallel.ForEach<InfoObject>(infoList, (item) =>
{
ResultObject result = ProcessInfo(item);
output.Enqueue(result);
});
});
foreach(var resultObj in output)
{
ReportOnResultObject(resultObj);
}
OR
List<InfoObject> infoList = GetInfo();
List<Task<ResultObject>> tasks = new List<Task<ResultObject>>();
foreach (var item in infoList)
{
tasks.Add(Task.Run(() => ProcessInfo(item)));
}
var results = await Task.WhenAll(tasks);
foreach(var resultObj in results)
{
ReportOnResultObject(resultObj);
}
H/T to IAmTimCorey tutorials:
https://www.youtube.com/watch?v=2moh18sh5p4
https://www.youtube.com/watch?v=ZTKGRJy5P2M

Freezing UI and odd thread behavior (WPF)

I am building an application with WPF and C# and I have run into an odd problem. There are multiple threads in my application, many of which are created through System.Timers.Timer timers. There is a particular action I can take that will cause the UI of my application to freeze permanently. The oddest thing about this is that if I check the number of available threads with ThreadPool.GetAvailableThreads, the number will continuously diminish if I don't stop the application. All of the threads in my application are created either via timer or the Task.Factory.StartNew method.
There are several parts of my code that spin up new threads, I will try to show all of them here in an abbreviated way.
DataManager.cs
// Two System.Timers.Timer(s) handle generating data values on 3 second and 1.5 second intervals.
_averageTimer = new System.Timers.Timer(1000 * 3);
_averageTimer.Elapsed += GenerateNewAveragePoint;
_averageTimer.Enabled = false;
_flowGenTimer = new System.Timers.Timer(1000 * 1.5);
_flowGenTimer.Elapsed += GenerateNewFlowSet;
_flowGenTimer.Enabled = false;
// I also have a primary work loop in this class which is created via Task
// This function runs continuously
Task.Run(() => ProcessData());
// I also send off a Task to do file I/O about every 3 seconds
Task.Run(() =>
{
lock (_sharedData.FileLock)
{
_continuousFileManager.WriteDataSubset(tmpData);
}
});
// Finally, I have a similar file I/O function to the one above, but it is
// called in the timer which fires every 3 seconds
Task.Run(() =>
{
lock (_sharedData.FileLock)
{
_averageFileManager.WriteDataSubset(tmpData);
}
});
So from this class it appears that I probably utilize a max of 6 threads.
In another class I have 3 calls to Dispatcher.Invoke
AdvancedPanel.cs
// RealtimeDataFlowItems is an ObservableCollection
Application.Current.Dispatcher.Invoke(delegate
{
RealtimeDataFlowItems.Insert(0, avgFlow);
});
// RealtimeDataHRItems is an ObservableCollection
Application.Current.Dispatcher.Invoke(delegate
{
RealtimeDataHRItems.Insert(0, avgFlow);
});
// RealtimeDataTimeItems is an ObservableCollection
Application.Current.Dispatcher.Invoke(delegate
{
RealtimeDataTimeItems.Insert(0, avgFlow);
});
In another class I have one call to Dispatcher.Invoke
DataInfo.cs
Application.Current.Dispatcher.Invoke(delegate
{
IndicatorPoints = new PointCollection(new[] { /* Removed for brevity */});
});
In another class there are two System.Timers.Timer(s) which execute at intervals of 15 ms and 2 seconds.
PlotData.cs
_plotTimer = new System.Timers.Timer(15);
_plotTimer.Elapsed += ProcessLoop;
_plotTimer.Enabled = false;
_boundsTimer = new System.Timers.Timer(1000 * 2);
_boundsTimer.Enabled = false;
_boundsTimer.Elapsed += CheckYAxisBounds;
Also, this class utilizes the OxyPlot plotting library in order to present realtime data in the UI. OxyPlot obviously has to modify the UI, so will be performing actions on the UI thread.
As a final note, I use Caliburn.Micro in my application and make use of its EventAggregator heavily. Each time I wish to publish a message a new Task is started with the StartNew method. Nearly all of the classes in my application make use of the EventAggregator in some capacity, so there are several threads that become utilized from these actions.
I hope this information is helpful, please let me know if there is more that I should provide. I am hoping that one of you could provide me any insight on what might be happening, and how I might be able to go about debugging and solving my UI freezing issue. Thank you so much for your help!

Threading and collections modification in WPF / C#

I'm currently developing a system in C# / WPF which accesses an SQL database, retrieves some data (around 10000 items) and then should update a collection of data points that is used as data for a WPF chart I'm using in my application (Visifire charting solution, in case anyone was wondering).
When I wrote the straight-forward, single-threaded solution, the system would, as you might expect, hang for the period of time it took the application to query the database, retrieve the data and render the charts. However, I wanted to make this task quicker by adding a wait animation to the user while the data was being fetched and processed using multithreading. However, two problems arise:
I'm having trouble updating my collections and keeping them synchronized when using multithreading. I'm not very familiar with the Dispatcher class, so I'm not very sure what to do.
Since I'm obviously not handling the multi-threading very well, the wait animation won't show up (since the UI is frozen).
I'm trying to figure out if there's a good way to use multi-threading effectively for collections. I found that Microsoft had Thread-Safe collections but none seems to fit my needs.
Also, if anyone has a good reference to learn and understand the Dispatcher I would highly appreciate it.
EDIT:
Here's a code snippet of what I'm trying to do, maybe it can shed some more light on my question:
private List<DataPoint> InitializeDataSeries(RecentlyPrintedItemViewModel item)
{
var localDataPoints = new List<DataPoint>();
// Stopping condition for recursion - if we've hit a childless (roll) item
if (item.Children.Count == 0)
{
// Populate DataPoints and return it as one DataSeries
_dataPoints.AddRange(InitializeDataPoints(item));
}
else
{
// Iterate through all children and activate this function on them (recursion)
var datapointsCollection = new List<DataPoint>();
Parallel.ForEach(item.Children, child => datapointsCollection = (InitializeDataSeries((RecentlyPrintedItemViewModel)child)));
foreach (var child in item.Children)
{
localDataPoints.AddRange(InitializeDataSeries((RecentlyPrintedItemViewModel)child));
}
}
RaisePropertyChanged("DataPoints");
AreDataPointsInitialized = true;
return localDataPoints;
}
Thanks
The Dispatcher is an object used to manage multiple queues of work items on a single thread, and each queue has a different priority for when it should execute its work items.
The Dispatcher usually references WPF's main application thread, and is used to schedule code at different DispatcherPriorities so they run in a specific order.
For example, suppose you want to show a loading graphic, load some data, then hide the graphic.
IsLoading = true;
LoadData();
IsLoading = false;
If you do this all at once, it will lock up your application and you won't ever see the loading graphic. This is because all the code runs by default in the DispatcherPriority.Normal queue, so by the time it's finished running the loading graphic will be hidden again.
Instead, you could use the Dispatcher to load the data and hide the graphic at a lower dispatcher priority than DispatcherPriority.Render, such as DispatcherPriority.Background, so all tasks in the other queues get completed before the loading occurs, including rendering the loading graphic.
IsLoading = true;
Dispatcher.BeginInvoke(DispatcherPriority.Background,
new Action(delegate() {
LoadData();
IsLoading = false;
}));
But this still isn't ideal because the Dispatcher references the single UI thread of the application, so you will still be locking up the thread while your long running process occurs.
A better solution is to use a separate thread for your long running process. My personal preference is to use the Task Parallel Library because it's simple and easy to use.
IsLoading = true;
Task.Factory.StartNew(() =>
{
LoadData();
IsLoading = false;
});
But this can still give you problems because WPF objects can only be modified from the thread that created them.
So if you create an ObservableCollection<DataItem> on a background thread, you cannot modify that collection from anywhere in your code other than that background thread.
The typical solution is to obtain your data on a background thread and return it to the main thread in a temp variable, and have the main UI thread create the object and fill it with data obtained from the background thread.
So often your code ends up looking something like this :
IsLoading = true;
Task.Factory.StartNew(() =>
{
// run long process and return results in temp variable
return LoadData();
})
.ContinueWith((t) =>
{
// this block runs on the UI thread once the background code finishes
// update with results from temp variable
UpdateData(t.Result);
// reset loading flag
IsLoading = false;
}, TaskScheduler.FromCurrentSynchronizationContext());

Why does this Parallel.ForEach code freeze the program up?

More newbie questions:
This code grabs a number of proxies from the list in the main window (I couldn't figure out how to make variables be available between different functions) and does a check on each one (simple httpwebrequest) and then adds them to a list called finishedProxies.
For some reason when I press the start button, the whole program hangs up. I was under the impression that Parallel creates separate threads for each action leaving the UI thread alone so that it's responsive?
private void start_Click(object sender, RoutedEventArgs e)
{
// Populate a list of proxies
List<string> proxies = new List<string>();
List<string> finishedProxies = new List<string>();
foreach (string proxy in proxiesList.Items)
{
proxies.Add(proxy);
}
Parallel.ForEach<string>(proxies, (i) =>
{
string checkResult;
checkResult = checkProxy(i);
finishedProxies.Add(checkResult);
// update ui
/*
status.Dispatcher.Invoke(
System.Windows.Threading.DispatcherPriority.Normal,
new Action(
delegate()
{
status.Content = "hello" + checkResult;
}
)); */
// update ui finished
//Console.WriteLine("[{0}] F({1}) = {2}", Thread.CurrentThread.Name, i, CalculateFibonacciNumber(i));
});
}
I've tried using the code that's commented out to make changes to the UI inside the Parallel.ForEach, and it makes the program freeze after the start button is pressed. It has worked for me before, but then I used the Thread class.
How can I update the UI from inside the Parallel.ForEach, and how do I make Parallel.ForEach work so that it doesn't freeze the UI while it's working?
Here's the whole code.
You must not start the parallel processing in your UI thread. See the example under the "Avoid Executing Parallel Loops on the UI Thread" header in this page.
Update: Or, you can simply create a new thread manually and start the processing inside that, as I see you have done. There's nothing wrong with that either.
Also, as Jim Mischel points out, you are accessing the lists from multiple threads at the same time, so there are race conditions there. Either substitute ConcurrentBag for List, or wrap the lists inside a lock statement each time you access them.
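To illustrate both points together (keeping the loop off the UI thread and collecting results in a thread-safe collection), here is a minimal sketch. ProxyChecker and CheckAllAsync are hypothetical names, CheckProxy is a trivial stand-in for the question's checkProxy, and Task.Run assumes .NET 4.5 or later.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ProxyChecker
{
    // Hypothetical stand-in for the question's checkProxy,
    // which performs a web request in the real code.
    public static string CheckProxy(string proxy) => proxy + ":ok";

    public static Task<List<string>> CheckAllAsync(IEnumerable<string> proxies)
    {
        // Task.Run moves the blocking Parallel.ForEach call onto a
        // thread-pool thread, so the caller's thread is not blocked.
        return Task.Run(() =>
        {
            // ConcurrentBag allows Add from several threads at once.
            var finished = new ConcurrentBag<string>();
            Parallel.ForEach(proxies, proxy => finished.Add(CheckProxy(proxy)));
            // Copy to a list for the caller; input order is not preserved.
            return new List<string>(finished);
        });
    }
}
```

From an async click handler, the result can be awaited and the UI updated afterwards on the UI thread, without any Dispatcher call inside the loop.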
A good way around not being able to write to the UI from inside a Parallel loop is to use the Task factory and delegates. I used the following code to iterate over the files in a directory and process them in a Parallel.ForEach loop; after each file is processed, the UI thread is signaled and updated:
var files = GetFiles(directoryToScan);
tokenSource = new CancellationTokenSource();
CancellationToken ct = tokenSource.Token;
Task task = Task.Factory.StartNew(delegate
{
// Were we already canceled?
ct.ThrowIfCancellationRequested();
Parallel.ForEach(files, currentFile =>
{
// Poll on this property if you have to do
// other cleanup before throwing.
if (ct.IsCancellationRequested)
{
// Clean up here, then...
ct.ThrowIfCancellationRequested();
}
ProcessFile(directoryToScan, currentFile, directoryToOutput);
// Update calling thread's UI
BeginInvoke((Action)(() =>
{
WriteProgress(currentFile);
}));
});
}, tokenSource.Token); // Pass same token to StartNew.
task.ContinueWith((t) =>
BeginInvoke((Action)(() =>
{
SignalCompletion(sw);
}))
);
And the methods that do the actual UI changes:
void WriteProgress(string fileName)
{
progressBar.Visible = true;
lblResizeProgressAmount.Visible = true;
lblResizeProgress.Visible = true;
progressBar.Value += 1;
Interlocked.Increment(ref counter);
lblResizeProgressAmount.Text = counter.ToString();
ListViewItem lvi = new ListViewItem(fileName);
listView1.Items.Add(lvi);
listView1.FullRowSelect = true;
}
private void SignalCompletion(Stopwatch sw)
{
sw.Stop();
if (tokenSource.IsCancellationRequested)
{
InitializeFields();
lblFinished.Visible = true;
lblFinished.Text = String.Format("Processing was cancelled after {0}", sw.Elapsed.ToString());
}
else
{
lblFinished.Visible = true;
if (counter > 0)
{
lblFinished.Text = String.Format("Resized {0} images in {1}", counter, sw.Elapsed.ToString());
}
else
{
lblFinished.Text = "Nothing to resize";
}
}
}
Hope this helps!
If anyone's curious, I kinda figured it out but I'm not sure if that's good programming or any way to deal with the issue.
I created a new thread like so:
Thread t = new Thread(do_checks);
t.Start();
and put away all of the parallel stuff inside of do_checks().
Seems to be doing okay.
One problem with your code is that you're calling finishedProxies.Add from multiple threads concurrently. That's going to cause a problem because List<T> isn't thread-safe. You'll need to protect it with a lock or some other synchronization primitive, or use a concurrent collection.
Whether that causes the UI lockup, I don't know. Without more information, it's hard to say. If the proxies list is very long and checkProxy doesn't take long to execute, then your tasks will all queue up behind that Invoke call. That's going to cause a whole bunch of pending UI updates. That will lock up the UI because the UI thread is busy servicing those queued requests.
This is what I think might be happening in your code-base.
Normal Scenario: You click on button. Do not use Parallel.Foreach loop. Use Dispatcher class and push the code to run on separate thread in background. Once the background thread is done processing, it will invoke the main UI thread for updating the UI. In this scenario, the background thread(invoked via Dispatcher) knows about the main UI thread, which it needs to callback. Or simply said the main UI thread has its own identity.
Using a Parallel.ForEach loop: Once you invoke Parallel.ForEach, the framework uses thread-pool threads. ThreadPool threads are chosen arbitrarily, and the executing code should never make any assumption about the identity of the chosen thread. In the original code it's very much possible that the dispatcher call made from inside the Parallel.ForEach loop is not able to figure out the thread it is associated with. When you use an explicit thread, it works fine because the explicit thread has its own identity, which the executing code can rely upon.
Ideally, if your main concern is keeping the UI responsive, you should first use the Dispatcher class to push the code to a background thread, and then use whatever logic you want in there to speed up the overall execution.
If you want to use Parallel.ForEach in a GUI event handler such as a button click, wrap the Parallel.ForEach in Task.Factory.StartNew, like this (note that the handler must be marked async to use await):
private async void start_Click(object sender, EventArgs e)
{
    await Task.Factory.StartNew(() =>
        Parallel.ForEach(YourArrayList, (ArraySingleValue) =>
        {
            Console.WriteLine("your background process code goes here for:" + ArraySingleValue);
        })
    );
}//func end
This resolves the freeze/hang issue.

Basic Threading Question

This question has probably been asked in various ways before, but here is what I want to do. I am going to have a Windows form with many tabs. Each tab will contain a grid object. For each tab/grid that is created by the user, I would like a spawn off a dedicated thread to populate the contents of that grid with constantly arriving information. Could anyone provide an example of how to do this safely?
Thanks.
Inside the initialization for the tab (assuming WinForms until I see otherwise):
Thread newThread = new Thread(() =>
{
// Get your data
dataGridView1.Invoke(new Action(() => { /* add data to the grid here */ }));
});
newThread.Start();
That is obviously the most simple example. You could also spawn the threads using the ThreadPool (which is more commonly done in server side applications).
If you're using .NET 4.0 you also have the Task Parallel library which could help as well.
There are two basic approaches you can use. Choose the one that makes the most sense in your situation. Often there is no right or wrong choice. They can both work equally well in many situations. Each has its own advantages and disadvantages. Oddly, the community seems to overlook the pull method too often. I am not sure why that is. I recently stumbled upon this question in which everyone recommended the push approach despite it being the perfect situation for the pull method (there was one poor soul who did go against the herd and got downvoted and eventually deleted his answer, leaving only me as the lone dissenter).
Push Method
Have the worker thread push the data to the form. You will need to use the ISynchronizeInvoke.Invoke method to accomplish this. The advantage here is that as each data item arrives it will immediately be added to the grid. The disadvantage is that you have to use an expensive marshaling operation and the UI could bog down if the worker thread acquires the data too fast.
void WorkerThread()
{
while (true)
{
object data = GetNewData();
yourForm.Invoke(
(Action)(() =>
{
// Add data to your grid here.
}));
}
}
Pull Method
Have the UI thread pull the data from the worker thread. You will have the worker thread enqueue new data items into a shared queue and the UI thread will dequeue the items periodically. The advantage here is that you can throttle the amount of work each thread is performing independently. The queue is your buffer that will shrink and grow as CPU usage ebbs and flows. It also decouples the logic of the worker thread from the UI thread. The disadvantage is that if your UI thread does not poll fast enough to keep up, the worker thread could overrun the queue. And, of course, the data items would not appear in real time on your grid. However, if you set the System.Windows.Forms.Timer interval short enough, that might not be an issue for you.
private Queue<object> m_Data = new Queue<object>();
private void YourTimer_Tick(object sender, EventArgs args)
{
lock (m_Data)
{
while (m_Data.Count > 0)
{
object data = m_Data.Dequeue();
// Add data to your grid here.
}
}
}
void WorkerThread()
{
while (true)
{
object data = GetNewData();
lock (m_Data)
{
m_Data.Enqueue(data);
}
}
}
You should keep a list of threads, to be able to control them:
List<Thread> tabs = new List<Thread>();
...
To add a new one, would be like:
tabs.Add( new Thread( new ThreadStart( TabRefreshHandler ) ) );
//Now starting:
tabs[tabs.Count - 1].Start();
And finally, in the TabRefreshHandler you should check which is the calling thread number and you'll know which is the tab that should be refreshed!
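One way to let the handler know which tab it serves, sketched here under assumptions, is to pass the tab index as the thread's start argument instead of inferring it from the calling thread. The TabThreads class and the Refreshed queue are hypothetical and only make the effect observable; the handler in this variant takes an object parameter, so it uses ParameterizedThreadStart rather than ThreadStart.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class TabThreads
{
    // Records which tab indexes were handled (illustration only).
    public static readonly ConcurrentQueue<int> Refreshed = new ConcurrentQueue<int>();

    // Receives the tab index that was passed to Thread.Start.
    static void TabRefreshHandler(object state)
    {
        int tabIndex = (int)state;
        // Real code would fetch data here and marshal the grid update
        // to the UI thread, e.g. with Control.Invoke.
        Refreshed.Enqueue(tabIndex);
    }

    // Starts one background thread per tab and returns them for later control.
    public static List<Thread> StartAll(int tabCount)
    {
        var tabs = new List<Thread>();
        for (int i = 0; i < tabCount; i++)
        {
            var thread = new Thread(new ParameterizedThreadStart(TabRefreshHandler))
            {
                IsBackground = true // do not keep the process alive on exit
            };
            tabs.Add(thread);
            thread.Start(i); // i is boxed and handed to the handler
        }
        return tabs;
    }
}
```

Passing the index explicitly avoids having to search the list for Thread.CurrentThread inside the handler.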
