Multiple 'Tasks' running in parallel - C#

I am trying to run multiple tasks in parallel (multithreaded); however, tests show they are still running sequentially, one after another. The code below looks similar to other examples I have read here on SO and in the MS docs, so it's probably something I am missing or not understanding.
Task.Run(async () =>
{
    while (_runSensors)
    {
        var tasks = new List<Task>
        {
            Task.Factory.StartNew(() => {
                _sensors.Find(item => item.Name == nameof(SensorA)).Refresh(); }, token),
            Task.Factory.StartNew(() => {
                _sensors.Find(item => item.Name == nameof(SensorB)).Refresh(); }, token),
            Task.Factory.StartNew(() => {
                _sensors.Find(item => item.Name == nameof(SensorC)).Refresh(); }, token)
        };
        await Task.WhenAll(tasks);
    }
});
The signature for the 'Refresh()' method that is called is simply:
public void Refresh() { ... }
The internal workings are irrelevant to the question; basically it reads from an input source, processes it (converts/formats/etc.), and updates public properties of the sensor object.
Each of the 'sensors' above takes approximately 15-20 ms per Refresh (as measured with Stopwatch). With each sensor added to the list above, the time until await Task.WhenAll(tasks); completes increases by 15-20 ms; i.e. with 8 sensors, each iteration is roughly 120-160 ms.
What I am trying to do is get them to run in parallel, so that, say, 5 sensors takes approximately the same amount of time per iteration as 3 sensors. I do realize there would be some overhead from thread switching as those numbers get higher.
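For reference, the per-iteration timing is measured roughly like this (a simplified sketch; the exact output line is illustrative):

var sw = Stopwatch.StartNew();
await Task.WhenAll(tasks);
sw.Stop();
Debug.WriteLine($"{tasks.Count} sensors refreshed in {sw.ElapsedMilliseconds} ms");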
Edit #1 - Refresh Method from Sensor
public override void Refresh()
{
    Read(Offset, Range);                         // Reads a portion of the screen (bitblt copy) ~8ms
    _preprocesed = Preprocess(_buffer);          // OpenCV methods to prep for OCR
    var text = _preprocesed.ToText(_ocrOptions); // Tesseract OCR conversion ~10ms
    var vals = text.Trim().Replace(" ", string.Empty).Split('/'); // Reformat
    if (vals.Length != 2)
        return;

    int.TryParse(vals[0], out var current);
    // Setter (only updates if changed, INotifyPropertyChanged)
    Current = current;

    int.TryParse(vals[1], out var refreshCap);
    // Setter (only updates if changed, INotifyPropertyChanged)
    RefreshCap = refreshCap;
}

Related

Whether to convert from Task<>.Factory.StartNew / Task.Factory.ContinueWhenAll() to async/await?

I have the simplified code below that gets shipping rates from multiple carriers asynchronously, and I was wondering whether it would be worthwhile to convert it to an async/await approach, and if so, what the best way would be to go about that. Or, if it's working okay now, is it really not worth the effort? Thank you.
List<Task<ShippingRateCollection>> lstTasks = new List<Task<ShippingRateCollection>>();
Task<ShippingRateCollection> t;

t = Task<ShippingRateCollection>.Factory.StartNew(() => { return GetUPSRates(...); });
lstTasks.Add(t);
t = Task<ShippingRateCollection>.Factory.StartNew(() => { return GetUSPSRates(...); });
lstTasks.Add(t);
t = Task<ShippingRateCollection>.Factory.StartNew(() => { return GetFedExRates(...); });
lstTasks.Add(t);

// wait until all requests complete (or error out)
Task.Factory.ContinueWhenAll(lstTasks.ToArray(), (tasks) =>
{
    // wrap all exceptions into 1 AggregateException
    Task.WaitAll(tasks);
});

foreach (Task<ShippingRateCollection> task in lstTasks)
{
    foreach (ShippingRate rate in task.Result)
    {
        ... // display rate
    } // next returned rate
}
You should definitely refactor it to async/await if you can make the Get*Rates methods asynchronous. In your current code you execute them on separate threads only to block the threads waiting for I/O to complete. That's a waste.
If you can make these methods asynchronous, the code might look like this:
var tasks = new Task<ShippingRateCollection>[]
{
    GetUPSRatesAsync(),
    GetUSPSRatesAsync(),
    GetFedExRatesAsync()
};

ShippingRateCollection[] results = await Task.WhenAll(tasks);
// process results
Even if you have to work with synchronous GetRates methods, refactoring the code to async/await will simplify it.
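For instance, a minimal sketch of that approach, assuming the Get*Rates methods from the question stay synchronous (the "..." argument lists are elided as in the question):

// Sketch: run the existing synchronous methods on thread-pool threads
// and await them together.
var tasks = new[]
{
    Task.Run(() => GetUPSRates(...)),
    Task.Run(() => GetUSPSRates(...)),
    Task.Run(() => GetFedExRates(...))
};

ShippingRateCollection[] results = await Task.WhenAll(tasks);
// process results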
I would look at using Microsoft's Reactive Framework for this. You can use this code:
var fs = new Func<ShippingRateCollection>[]
{
    () => GetUPSRates(...),
    () => GetUSPSRates(...),
    () => GetFedExRates(...),
};

var query =
    from f in fs.ToObservable()
    from rate in Observable.Start(f)
    select rate;

query
    .Subscribe(rate =>
    {
        // display rate
    });
The code runs asynchronously and the .Subscribe(...) method will return results as soon as they are available rather than waiting for all of them to finish.
If you do want them to finish then you can change the code like this:
query
    .ToArray()
    .Subscribe(rates =>
    {
        foreach (ShippingRate rate in rates)
        {
            // display rate
        }
    });
Just NuGet "Rx-Main" to get the required bits.
A less verbose version of Jakub Lortz's answer:
var results = await Task.WhenAll(
    GetUPSRatesAsync(),
    GetUSPSRatesAsync(),
    GetFedExRatesAsync());

Poll a webservice using Reactive Extensions and bind the last x results

I'm trying to use Reactive Extensions (Rx) for a task where it seems to be a good fit: polling a web service at a specific interval and displaying its last x results.
I have a web service that sends me the status of an instrument I want to monitor. I would like to poll this instrument at a specific rate and display in a list the last 20 statuses that have been polled.
So my list would be like a "moving window" of the service result.
I'm developing a WPF app with Caliburn.Micro, but I don't think this is very relevant.
What I have managed to get so far is the following (just a sample app that I hacked together quickly; I'm not going to do this in the ShellViewModel in the real app):
public class ShellViewModel : Caliburn.Micro.PropertyChangedBase, IShell
{
    private ObservableCollection<string> times;
    private string currentTime;

    public ShellViewModel()
    {
        times = new ObservableCollection<string>();

        Observable
            .Interval(TimeSpan.FromSeconds(1))
            .SelectMany(x => this.GetCurrentDate().ToObservable())
            .ObserveOnDispatcher()
            .Subscribe(x =>
            {
                this.CurrentTime = x;
                this.times.Add(x);
            });
    }

    public IEnumerable<string> Times
    {
        get
        {
            return this.times;
        }
    }

    public string CurrentTime
    {
        get
        {
            return this.currentTime;
        }
        set
        {
            this.currentTime = value;
            this.NotifyOfPropertyChange(() => this.CurrentTime);
        }
    }

    private async Task<string> GetCurrentDate()
    {
        var client = new RestClient("http://www.timeapi.org");
        var request = new RestRequest("/utc/now.json");
        var response = await client.ExecuteGetTaskAsync(request);
        return response.Content;
    }
}
In the view I have just a label bound to the CurrentTime property and a list bound to the Times property.
The issues I have are:
It's not limited to the last 20 items in the list, since I always add items to the ObservableCollection, and I can't find a better way to data-bind.
The Interval doesn't work as I'd like. If a query takes more than 1 second to run, two queries will run in parallel, which I'd like to avoid. My goal is for the query to repeat indefinitely, but at a pace of no more than 1 query per second; if a query takes more than 1 second to finish, the next query should start directly after it completes.
Second edit:
The previous edit below was me being confused: it triggers events continuously because Interval is an endless sequence that never completes. Brandon's solution is correct and works as expected.
Edit:
Based on Brandon's example, I tried to do the following code in LinqPad:
Observable
    .Merge(Observable.Interval(TimeSpan.FromSeconds(2)), Observable.Interval(TimeSpan.FromSeconds(10)))
    .Repeat()
    .Scan(new List<double>(), (list, item) => { list.Add(item); return list; })
    .Subscribe(x => Console.Out.WriteLine(x));
And I can see that the write to the console occurs every 2 seconds, not every 10. So the Repeat doesn't wait for both Observables to finish before repeating.
Try this:
// timer that completes after 1 second
var intervalTimer = Observable
    .Empty<string>()
    .Delay(TimeSpan.FromSeconds(1));

// queries one time whenever subscribed
var query = Observable.FromAsync(GetCurrentDate);

// query + interval timer, which completes
// only after both the query and the timer
// have expired
var intervalQuery = Observable.Merge(query, intervalTimer);

// Re-issue the query whenever intervalQuery completes
var queryLoop = intervalQuery.Repeat();

// Keep the 20 most recent results.
// Note: use an immutable list for this
// (https://www.nuget.org/packages/microsoft.bcl.immutable),
// otherwise you will have problems with
// the list changing while an observer
// is still observing it.
var recentResults = queryLoop.Scan(
    ImmutableList.Create<string>(), // starts off empty
    (acc, item) =>
    {
        acc = acc.Add(item);
        if (acc.Count > 20)
        {
            acc = acc.RemoveAt(0);
        }
        return acc;
    });

// store the results
recentResults
    .ObserveOnDispatcher()
    .Subscribe(items =>
    {
        this.CurrentTime = items[0];
        this.RecentItems = items;
    });
This should skip the interval ticks while a GetCurrentDate call is in progress:
Observable
    .Interval(TimeSpan.FromSeconds(1))
    .GroupByUntil(p => 1, p => GetCurrentDate().ToObservable().Do(x =>
    {
        this.CurrentTime = x;
        this.times.Add(x);
    }))
    .SelectMany(p => p.LastAsync())
    .Subscribe();

Parallel.Invoke - Dynamically creating more 'threads'

I am educating myself on Parallel.Invoke, and parallel processing in general, for use in a current project. I need a push in the right direction to understand how you can dynamically/intelligently allocate more parallel 'threads' as required.
As an example. Say you are parsing large log files. This involves reading from file, some sort of parsing of the returned lines and finally writing to a database.
So to me this is a typical problem that can benefit from parallel processing.
As a simple first pass the following code implements this.
Parallel.Invoke(
    () => readFileLinesToBuffer(),
    () => parseFileLinesFromBuffer(),
    () => updateResultsToDatabase()
);
Behind the scenes:
readFileLinesToBuffer() reads each line and stores it in a buffer.
parseFileLinesFromBuffer() consumes lines from that buffer and, let's say, puts them on another buffer so that updateResultsToDatabase() can come along and consume that one.
The code shown assumes that each of the three steps uses the same amount of time/resources. But let's say parseFileLinesFromBuffer() is a long-running process, so instead of running just one instance of that method you want to run two in parallel.
How can you have the code intelligently decide to do this based on any bottlenecks it might perceive?
Conceptually I can see how some approach of monitoring the buffer sizes might work, for example spawning a new 'thread' to consume a buffer at an increased rate... but I figure this type of issue was considered when the TPL library was put together.
Some sample code would be great, but I really just need a clue as to what concepts I should investigate next. It looks like maybe System.Threading.Tasks.TaskScheduler holds the key?
Have you tried the Reactive Extensions?
http://msdn.microsoft.com/en-us/data/gg577609.aspx
Rx is a technology from Microsoft; its focus, as stated on the official site, is:
The Reactive Extensions (Rx) ... is a library to compose asynchronous and event-based programs using observable collections and LINQ-style query operators.
You can download it as a Nuget package
https://nuget.org/packages/Rx-Main/1.0.11226
Since I am currently learning Rx, I wanted to take this example and just write code for it. The code I ended up with is not actually executed in parallel, but it is completely asynchronous and guarantees the source lines are processed in order.
Perhaps this is not the best implementation, but like I said, I am learning Rx (making it thread-safe would be a good improvement).
This is a DTO that I am using to return data from the background threads
class MyItem
{
    public string Line { get; set; }
    public int CurrentThread { get; set; }
}
These are the basic methods doing the real work. I simulate the processing time with a simple Thread.Sleep, and I return the thread used to execute each method via Thread.CurrentThread.ManagedThreadId. Note that the delay in ProcessLine is 4 seconds; it's the most time-consuming operation.
private IEnumerable<MyItem> ReadLinesFromFile(string fileName)
{
    var source = from e in Enumerable.Range(1, 10)
                 let v = e.ToString()
                 select v;

    foreach (var item in source)
    {
        Thread.Sleep(1000);
        yield return new MyItem { CurrentThread = Thread.CurrentThread.ManagedThreadId, Line = item };
    }
}

private MyItem UpdateResultToDatabase(string processedLine)
{
    Thread.Sleep(700);
    return new MyItem { Line = "s" + processedLine, CurrentThread = Thread.CurrentThread.ManagedThreadId };
}

private MyItem ProcessLine(string line)
{
    Thread.Sleep(4000);
    return new MyItem { Line = "p" + line, CurrentThread = Thread.CurrentThread.ManagedThreadId };
}
I use the following method just to update the UI:
private void DisplayResults(MyItem myItem, Color color, string message)
{
    this.listView1.Items.Add(
        new ListViewItem(
            new[]
            {
                message,
                myItem.Line,
                myItem.CurrentThread.ToString(),
                Thread.CurrentThread.ManagedThreadId.ToString()
            })
        {
            ForeColor = color
        });
}
And finally this is the method that calls the Rx API
private void PlayWithRx()
{
    // we init the observable with the lines read from the file
    var source = this.ReadLinesFromFile("some file").ToObservable(Scheduler.TaskPool);

    source.ObserveOn(this).Subscribe(x =>
    {
        // for each line read, we update the UI
        this.DisplayResults(x, Color.Red, "Read");

        // for each line read, we subscribe the line to the ProcessLine method
        var process = Observable.Start(() => this.ProcessLine(x.Line), Scheduler.TaskPool)
            .ObserveOn(this).Subscribe(c =>
            {
                // for each line processed, we update the UI
                this.DisplayResults(c, Color.Blue, "Processed");

                // for each line processed, we subscribe to the final step, the UpdateResultToDatabase method;
                // finally, we update the UI when the processed line has been saved to the database
                var persist = Observable.Start(() => this.UpdateResultToDatabase(c.Line), Scheduler.TaskPool)
                    .ObserveOn(this).Subscribe(z => this.DisplayResults(z, Color.Black, "Saved"));
            });
    });
}
This process runs entirely in the background.
In an async/await world, you'd have something like:
public async Task ProcessFileAsync(string filename)
{
    var lines = await ReadLinesFromFileAsync(filename);
    var parsed = await ParseLinesAsync(lines);
    await UpdateDatabaseAsync(parsed);
}
Then a caller could just do var tasks = filenames.Select(ProcessFileAsync).ToArray(); and whatever fits (WaitAll, WhenAll, etc., depending on context).
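A minimal sketch of such a caller, where filenames is assumed to be a collection of log-file paths:

// Sketch: one pipeline per file, all awaited together.
// "filenames" is assumed to be an IEnumerable<string> of log-file paths.
var tasks = filenames.Select(ProcessFileAsync).ToArray();
await Task.WhenAll(tasks);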
Use a couple of BlockingCollections. Here is an example.
The idea is that you create a producer that puts data into the collection:
while (true)
{
    var data = ReadData();
    blockingCollection1.Add(data);
}
Then you create any number of consumers that read from the collection:
while (true)
{
    var data = blockingCollection1.Take();
    var processedData = ProcessData(data);
    blockingCollection2.Add(processedData);
}
and so on
You can also let the TPL handle the number of consumers by using Parallel.ForEach:
Parallel.ForEach(blockingCollection1.GetConsumingPartitioner(),
    data =>
    {
        var processedData = ProcessData(data);
        blockingCollection2.Add(processedData);
    });
(Note that you need to use GetConsumingPartitioner, not GetConsumingEnumerable; see here.)
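Tying those fragments together, here is a self-contained sketch of the same idea; it uses GetConsumingEnumerable with explicit Task.Run consumers instead of the partitioner, and the ReadData/ProcessData/database stages are stand-ins:

using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

class PipelineSketch
{
    static void Main()
    {
        // Bounded buffers so a fast stage can't run away from a slow one.
        var lineQueue = new BlockingCollection<string>(boundedCapacity: 100);
        var resultQueue = new BlockingCollection<string>(boundedCapacity: 100);

        // Producer: stands in for readFileLinesToBuffer().
        var producer = Task.Run(() =>
        {
            foreach (var line in Enumerable.Range(1, 1000).Select(i => "line " + i))
                lineQueue.Add(line);
            lineQueue.CompleteAdding();
        });

        // Two consumers for the slow parsing stage (parseFileLinesFromBuffer()).
        var parsers = Enumerable.Range(0, 2).Select(_ => Task.Run(() =>
        {
            foreach (var line in lineQueue.GetConsumingEnumerable())
                resultQueue.Add(line.ToUpperInvariant()); // stand-in for real parsing
        })).ToArray();

        // Writer: stands in for updateResultsToDatabase().
        var writer = Task.Run(() =>
        {
            foreach (var item in resultQueue.GetConsumingEnumerable())
                Console.WriteLine(item);
        });

        producer.Wait();
        Task.WaitAll(parsers);
        resultQueue.CompleteAdding();
        writer.Wait();
    }
}

To scale the bottleneck stage, you simply change how many parser tasks you start; the bounded collections provide the back-pressure between stages.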

C# Rx (System.Reactive) - Async - Publish an IEnumerable<DataRow> to multiple observing data handlers

I'm new to Rx.
I'd like to traverse an IEnumerable and publish it to multiple DataHandlers that process the data in their respective threads.
Below is my sample program. The publish works and a new thread is created, but the 3 RowHandlers are all running in 1 thread. I need 3 threads. What is the best way to implement this?
class Program
{
    public class MyDataGenerator
    {
        public IEnumerable<int> myData()
        {
            // Heavy lifting....Don't want to process more than once.
            yield return 1;
            yield return 2;
            yield return 3;
            yield return 4;
            yield return 5;
            yield return 6;
        }
    }

    static void Main(string[] args)
    {
        MyDataGenerator h = new MyDataGenerator();
        Console.WriteLine("Thread id " + Thread.CurrentThread.ManagedThreadId.ToString());

        var shared = h.myData().ToObservable().Publish();

        ///////////////////////////////
        // Row Handling Requirements
        //
        // 1. Single scan of the IEnumerable.
        // 2. Row handlers process data in their own threads.
        // 3. OK if the scanning thread blocks while data is processed.
        //

        // Create the RowHandlers
        MyRowHandler rn1 = new MyRowHandler();
        rn1.ido = shared.Subscribe(i => rn1.processID(i));

        MyRowHandler rn2 = new MyRowHandler();
        rn2.ido = shared.Subscribe(i => rn2.processID(i));

        MyRowHandler rn3 = new MyRowHandler();
        rn3.ido = shared.Subscribe(i => rn3.processID(i));

        shared.Connect();
    }

    public class MyRowHandler
    {
        public IDisposable ido = null;

        public void processID(int i)
        {
            var o = Observable.Start(() =>
            {
                Console.WriteLine(String.Format("Start Thread ID {0} Int{1}", Thread.CurrentThread.ManagedThreadId, i));
                Thread.Sleep(30);
                Console.WriteLine("Done Thread ID" + Thread.CurrentThread.ManagedThreadId.ToString());
            });
            o.First();
        }
    }
}
Discovery:
The coding-speed and code-quality gains one gets from Rx come at the expense of performance; tasks/delegates are without a doubt several times faster. That means the most important thing to learn about Rx is when to use it. Below is a draft summary guideline: for large volumes I can see a use for Rx in chunking, combining, and other many-streams-to-many-handlers models; however, basic async work should not use Rx.
I'd post an image with a matrix guideline, but the site won't let me post images.
If I understand your sequencing requirements correctly and you want three parallel running scans, you can just observe on the TaskPool and subscribe from there:
...
//Create the RowHandlers
MyRowHandler rn1 = new MyRowHandler();
rn1.ido = shared.ObserveOn(Scheduler.TaskPool).Subscribe(i => rn1.processID(i));
...
Note that since you're then running asynchronously and your main thread doesn't wait for the scans to finish, your program will terminate right away unless you, for example, put a Console.ReadKey() at the end of the program.
EDIT: Regarding running on the same thread "all the way", you're scheduling a bit strangely for that. If you drop the observable in the row handler, you can use Scheduler.NewThread and get good results:
...
var rowHandler1 = new MyRowHandler();
rowHandler1.ido = shared.ObserveOn(Scheduler.NewThread).Subscribe(rowHandler1.ProcessID);
...

public void ProcessID(int i)
{
    Console.WriteLine(String.Format("Start Thread ID {0} Int{1}", Thread.CurrentThread.ManagedThreadId, i));
    Thread.Sleep(30);
    Console.WriteLine("Done Thread ID" + Thread.CurrentThread.ManagedThreadId.ToString(CultureInfo.InvariantCulture));
}
That will give each subscription its own thread, and stay with it.

How to ContinueWith another function with result from previous task when using Tasks?

I have a WCF connector that gets a small amount of data for me; it usually takes up to 20 seconds to get this data for each item (which is fine). I want to use a Task to get the data and then add WinForms controls populated with values from this Task.
I've created a list of objects that will hold this data.
I use a first Task to update the list, and I want a Task that runs right after the first Task is done to create the controls.
This is the code so far:
List<IpVersionCounter> ipVersionCounters = new List<IpVersionCounter>();

Task task = Task.Factory.StartNew(() =>
{
    foreach (var site in settings.Sites)
    {
        string ip = site.ip;
        string version = "undefined";

        using (WcfConnector wcfConnector =
            WcfConnector.CreateConnectorWithoutException((ip)))
        {
            if (wcfConnector != null)
            {
                version = string.Format("{0} {1} {2}",
                    wcfConnector.VersionController.GetBranchName(),
                    wcfConnector.VersionController.GetBuildNumber(),
                    wcfConnector.VersionController.GetCurrentVersion());
            }
        }

        counter++;
        ipVersionCounters.Add(new IpVersionCounter
        {
            Ip = ip,
            Version = version,
            Counter = counter
        });
    }
    return ipVersionCounters;
}).ContinueWith();

AddProgressBar(ipVersionCounter);
I don't know if I'm going the right way, or how to use ContinueWith to pass the value from the first method to the second.
In the example below, previousTask references the previous task; use its Result property to get the return value from it.
Task task = Task.Factory.StartNew(() =>
{
    // Background work
    return ipVersionCounters;
}).ContinueWith((previousTask) =>
{
    var ipVersionCounters = previousTask.Result;
});
Update
If you want the ContinueWith to execute on the UI thread (assuming you are starting from the UI thread), use:
Task.Factory.StartNew(() =>
{
    // Background work
}).ContinueWith((previousTask) =>
{
    // Update UI thread
}, TaskScheduler.FromCurrentSynchronizationContext());
(Taken from this answer; see it for more info.)
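Applied to the original snippet, that might look roughly like the following sketch. It assumes AddProgressBar takes a single IpVersionCounter, as it appears to in the question:

Task.Factory.StartNew(() =>
{
    // the foreach over settings.Sites from the question goes here,
    // filling ipVersionCounters
    return ipVersionCounters;
}).ContinueWith(previousTask =>
{
    // Runs on the UI thread, so it is safe to create controls here.
    foreach (var ipVersionCounter in previousTask.Result)
    {
        AddProgressBar(ipVersionCounter); // assumed to take a single item
    }
}, TaskScheduler.FromCurrentSynchronizationContext());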
