Non-reentrant observable in C#

Given the following method:
If I leave the hack in place, my unit test completes immediately with "observable has no data".
If I take the hack out, there are multiple threads all attempting to log in at the same time.
The host service does not allow this.
How do I ensure that only one thread is producing observable data at any given point in time?
private static object obj = new object();
private static bool here = true;

public IObservable<Party> LoadAllParties(CancellationToken token)
{
    var parties = Observable.Create<Party>(
        async (observer, cancel) =>
        {
            // this is just a hack to test behavior
            lock (obj)
            {
                if (!here)
                    return;
                here = false;
            }
            // end of hack.
            try
            {
                if (!await this.RequestLogin(observer, cancel))
                    return;

                // request list.
                await this._request.GetAsync(this._configuration.Url.RequestList);
                if (this.IsCancelled(observer, cancel))
                    return;

                while (!cancel.IsCancellationRequested)
                {
                    var entities = await this._request.GetAsync(this._configuration.Url.ProcessList);
                    if (this.IsCancelled(observer, cancel))
                        return;

                    var tranche = this.ExtractParties(entities);

                    // break out if it's the last page.
                    if (!tranche.Any())
                        break;

                    Array.ForEach(tranche, observer.OnNext);

                    await this._request.GetAsync(this._configuration.Url.ProceedList);
                    if (this.IsCancelled(observer, cancel))
                        return;
                }

                observer.OnCompleted();
            }
            catch (Exception ex)
            {
                observer.OnError(ex);
            }
        });

    return parties;
}
My Unit Test:
var sut = container.Resolve<SyncDataManager>();
var count = 0;
var token = new CancellationTokenSource();
var observable = sut.LoadAllParties(token.Token);
observable.Subscribe(party => count++);
await observable.ToTask(token.Token);
count.Should().BeGreaterThan(0);

I do think your question is suffering from the XY Problem: the code contains several calls to methods not shown here which may have important side effects, and I feel that advice based only on the information available won't be the best advice.
That said, I suspect you did not intend to subscribe to the observable twice: once with the explicit Subscribe call, and once with the ToTask() call. This would certainly explain the concurrent calls, which occur in two different subscriptions.
EDIT:
How about asserting on the length instead (tweak the timeout to suit):
var length = await observable.Count().Timeout(TimeSpan.FromSeconds(3));
Better would be to look into Rx-Testing and mock your dependencies. That's a big topic, but this long blog post from the Rx team explains it very well and this answer regarding TPL-Rx interplay may help: Executing TPL code in a reactive pipeline and controlling execution via test scheduler
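For reference, a minimal sketch of the test with only a single subscription (assuming the same SyncDataManager API and FluentAssertions as in the question); the count is taken through the one awaited pipeline rather than a second Subscribe call:
var sut = container.Resolve<SyncDataManager>();
var cts = new CancellationTokenSource();

// Count() materialises the number of emitted parties, Timeout() guards against a
// stream that never completes, and ToTask() bridges the single subscription to await.
var count = await sut.LoadAllParties(cts.Token)
    .Count()
    .Timeout(TimeSpan.FromSeconds(3))
    .ToTask(cts.Token);

count.Should().BeGreaterThan(0);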

Related

How can I make sure a thread gets dibs after a certain Task

I'm in a bit of a conundrum regarding multithreading.
I'm currently working on a real-time service using SignalR. The idea is that a connected user can request data from another user.
Below is a gist of what the request and response functions look like.
Consider the following code:
private readonly ConcurrentBag<MyObject> _sharedObjects = new ConcurrentBag<MyObject>();
The request:
[...]
var sharedObject = new MyObject();
_sharedObjects.Add(sharedObject);
ForwardRequestFireAndForget();
try
{
    await Task.Delay(30000, sharedObject.myCancellationToken);
}
catch
{
    return sharedObject.ResponseProperty;
}
_sharedObjects.TryTake(out sharedObject);
[...]
The response:
[...]
var result = DoSomePossiblyVeryLengthyTaskHere();
var sharedObject = _sharedObjects
    .Where(x)
    .FirstOrDefault();

// The request has timed out so the object isn't there anymore.
if (sharedObject == null)
{
    return someResponse;
}

sharedObject.ResponseProperty = result;
// triggers the cancellation source
sharedObject.Cancel();
return someOtherResponse;
[...]
So basically a request is made to the server, forwarded to the other host, and the function waits until it is cancelled or times out.
The other host calls the respond function, which sets the response property and triggers myCancellationToken.
I am however unsure whether this represents a race condition.
In theory, could the responding thread retrieve the sharedObject while the other thread is still sitting in the catch block?
That would mean the request has already timed out, but the task just hasn't gotten around to removing the object from the bag, which leaves the data inconsistent.
What would be some guaranteed ways to make sure that the first thing that happens after the Task.Delay() call is the TryTake() call?
You don't want to have the producer cancel the consumer's wait. That's way too much conflation of responsibilities.
Instead, what you really want is for the producer to send an asynchronous signal. This is done via TaskCompletionSource<T>. The consumer can add the object with an incomplete TCS, and then the consumer can (asynchronously) wait for that TCS to complete (or timeout). Then the producer just gives its value to the TCS.
Something like this:
class MyObject
{
    public TaskCompletionSource<MyProperty> ResponseProperty { get; } = new TaskCompletionSource<MyProperty>();
}

// request (consumer):
var sharedObject = new MyObject();
_sharedObjects.Add(sharedObject);
ForwardRequestFireAndForget();
var responseTask = sharedObject.ResponseProperty.Task;
if (await Task.WhenAny(Task.Delay(30000), responseTask) != responseTask)
    return null;
_sharedObjects.TryTake(out sharedObject);
return await responseTask;

// response (producer):
var result = DoSomePossiblyVeryLengthyTaskHere();
var sharedObject = _sharedObjects
    .Where(x)
    .FirstOrDefault();
// The request has timed out so the object isn't there anymore.
if (sharedObject == null)
    return someResponse;
sharedObject.ResponseProperty.TrySetResult(result);
return someOtherResponse;
The code above can be cleaned up a bit; specifically, it's not a bad idea to have the producer have a "producer view" of the shared object, and the consumer have a "consumer view", with both interfaces implemented by the same type. But the code above should give you the general idea.
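For illustration, here is a minimal sketch of that "two views" idea (the interface and member names are mine, not from the answer; MyProperty is the payload type used above):
// Consumer (request side) only sees a task to await.
interface IRequestView
{
    Task<MyProperty> Response { get; }
}

// Producer (response side) only sees a way to supply the value.
interface IResponseView
{
    bool TrySetResponse(MyProperty value);
}

class SharedRequest : IRequestView, IResponseView
{
    private readonly TaskCompletionSource<MyProperty> _tcs =
        new TaskCompletionSource<MyProperty>(TaskCreationOptions.RunContinuationsAsynchronously);

    public Task<MyProperty> Response => _tcs.Task;

    public bool TrySetResponse(MyProperty value) => _tcs.TrySetResult(value);
}
The consumer keeps the SharedRequest in the bag but hands the producer only the IResponseView, so completing the request stays a single, well-defined responsibility.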

Timeout for asynchronous Task<T> with additional exception handling

In my project, I reference types and interfaces from a dynamic link library.
The very first thing I have to do when using this specific library is to create an instance of EA.Repository, which is defined within the library and serves as kind of an entry point for further usage.
The instantiation EA.Repository repository = new EA.Repository() performs some complex stuff in the background, and I find myself confronted with three possible outcomes:
Instantiation takes some time but finishes successfully in the end
An exception is thrown (either immediately or after some time)
The instantiation blocks forever (in which case I'd like to cancel and inform the user)
I was able to come up with an asynchronous approach using Task:
public static void Connect()
{
    // Do the lengthy instantiation asynchronously
    Task<EA.Repository> task = Task.Run(() => { return new EA.Repository(); });

    bool isCompletedInTime;
    try
    {
        // Timeout after 5.0 seconds
        isCompletedInTime = task.Wait(5000);
    }
    catch (Exception)
    {
        // If the instantiation fails (in time), throw a custom exception
        throw new ConnectionException();
    }

    if (isCompletedInTime)
    {
        // If the instantiation finishes in time, store the object for later
        EapManager.Repository = task.Result;
    }
    else
    {
        // If the instantiation did not finish in time, throw a custom exception
        throw new TimeoutException();
    }
}
(I know, you can probably already spot a lot of issues here. Please be patient with me... Recommendations would be appreciated!)
This approach works so far - I can simulate both the "exception" and the "timeout" scenario and I obtain the desired behavior.
However, I have identified another edge case: Let's assume the instantiation task takes long enough that the timeout expires and then throws an exception. In this case, I sometimes end up with an AggregateException saying that the task has not been observed.
I'm struggling to find a feasible solution to this. I can't really cancel the task when the timeout expires, because the blocking instantiation obviously prevents me from using the CancellationToken approach.
The only thing I could come up with is to start observing the task asynchronously (i.e. start another task) right before throwing my custom TimeoutException:
Task observerTask = Task.Run(() =>
{
    try { task.Wait(); }
    catch (Exception) { }
});
throw new TimeoutException();
Of course, if the instantiation really blocks forever, the first task never finishes; with the observer task, now I even have two that never finish!
I'm quite insecure about this whole approach, so any advice would be welcome!
Thank you very much in advance!
I'm not sure if I fully understood what you're trying to achieve, but what if you do something like this:
public static void Connect()
{
    Task<EA.Repository> _realWork = Task.Run(() => { return new EA.Repository(); });
    Task _timeoutTask = Task.Delay(5000);
    Task.WaitAny(new Task[] { _realWork, _timeoutTask });
    if (_timeoutTask.IsCompleted)
    {
        // timed out
    }
    else
    {
        // all good, access _realWork.Result
    }
}
or you can even go a bit shorter:
public static void Connect()
{
    Task<EA.Repository> _realWork = Task.Run(() => { return new EA.Repository(); });
    var completedTaskIndex = Task.WaitAny(new Task[] { _realWork }, 5000);
    if (completedTaskIndex == -1)
    {
        // timed out
    }
    else
    {
        // all good, access _realWork.Result
    }
}
You can also always call Task.Run with a CancellationToken that will time out, but that will raise an exception. The above solutions give you control of the behaviour without an exception being thrown (even though you can always try/catch).
Here is an extension method that you could use to explicitly observe the tasks that may fail while unobserved:
public static Task<T> AsObserved<T>(this Task<T> task)
{
    task.ContinueWith(t => t.Exception);
    return task;
}
Usage example:
var task = Task.Run(() => new EA.Repository()).AsObserved();
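For illustration, a sketch (not from the original answer) of how AsObserved might slot into the Connect method from the question; error handling is kept minimal, and a fault inside the five-second window would still surface from Wait:
var task = Task.Run(() => new EA.Repository()).AsObserved();
if (!task.Wait(5000))
    throw new TimeoutException();   // a fault that happens later is still observed
EapManager.Repository = task.Result;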

Await all Tasks in the list, and after that update the list based on task result

I want to run all the tasks with WhenAll (not one by one).
But after that I need to update the list (the LastReport property) based on the result.
I think I have a solution, but I would like to check if there is a better way.
The idea is to:
Run all tasks
Remember relation between configuration and task
Update configuration
My solution is:
var lastReportAllTasks = new List<Task<Dictionary<string, string>>>();
var configurationTaskRelation = new Dictionary<int, Task<Dictionary<string, string>>>();

foreach (var configuration in MachineConfigurations)
{
    var task = machineService.GetReports(configuration);
    lastReportAllTasks.Add(task);
    configurationTaskRelation.Add(configuration.Id, task);
}

await Task.WhenAll(lastReportAllTasks);

foreach (var configuration in MachineConfigurations)
{
    var lastReportTask = configurationTaskRelation[configuration.Id];
    configuration.LastReport = await lastReportTask;
}
The Select function can be asynchronous itself. You can await the report and return both the configuration and result in the same result object (anonymous type or tuple, whatever you prefer):
var tasks = MachineConfigurations.Select(async conf =>
{
    var report = await machineService.GetReports(conf);
    return new { conf, report };
});

var results = await Task.WhenAll(tasks);
foreach (var pair in results)
{
    pair.conf.LastReport = pair.report;
}
EDIT - Loops and error handling
As Servy suggested, Task.WhenAll can be omitted and the awaiting can be moved inside the loop:
foreach (var task in tasks)
{
    var pair = await task;
    pair.conf.LastReport = pair.report;
}
The tasks will still execute concurrently. In case of exception though, some configuration objects will be modified and some not.
In general, this would be an ugly situation, requiring extra exception handling code to clean up the modified objects. Exception handling is a lot easier when modifications are done on-the-side and finalized/applied when the happy path completes. That's one reason why updating the Configuration objects inside the Select() requires careful consideration.
In this particular case though it may be better to "skip" the failed reports, possibly move them to an error queue and reprocess them at a later time. It may be better to have partial results than no results at all, as long as this behaviour is expected:
foreach (var task in tasks)
{
    try
    {
        var pair = await task;
        pair.conf.LastReport = pair.report;
    }
    catch (Exception exc)
    {
        //Make sure the error is logged
        Log.Error(exc);
        ErrorQueue.Enqueue(new ProcessingError(conf, exc));
    }
}
//Handle errors after the loop
//Handle errors after the loop
EDIT 2 - Dataflow
For completeness, I do have several thousand ticket reports to generate each day, and each GDS call (the service through which every travel agency sells tickets) takes considerable time. I can't run all requests at the same time - I start getting server serialization errors if I try more than 10 concurrent requests. I can't retry everything either.
In this case I used TPL DataFlow combined with some Railway oriented programming tricks. An ActionBlock with a DOP of 8 processes the ticket requests. The results are wrapped in a Success class and sent to the next block. Failed requests and exceptions are wrapped in a Failure class and sent to another block. Both classes inherit from IFlowEnvelope which has a Successful flag. Yes, that's F# Discriminated Union envy.
This is combined with some retry logic for timeouts etc.
In pseudocode the pipeline looks like this :
var reportingBlock = new TransformBlock<Ticket, IFlowEnvelope<TicketReport>>(reportFunc, dopOptions);
var happyBlock = new ActionBlock<IFlowEnvelope<TicketReport>>(storeToDb);
var errorBlock = new ActionBlock<IFlowEnvelope<TicketReport>>(logError);

reportingBlock.LinkTo(happyBlock, linkOptions, msg => msg.Success);
reportingBlock.LinkTo(errorBlock, linkOptions, msg => !msg.Success);

foreach (var ticket in tickets)
{
    reportingBlock.Post(ticket);
}
reportFunc catches any exceptions and wraps them as Failure<T> objects:
async Task<IFlowEnvelope<TicketReport>> reportFunc(Ticket ticket)
{
    try
    {
        //Do the heavy processing
        return new Success<TicketReport>(report);
    }
    catch (Exception exc)
    {
        //Construct an error message, msg
        return new Failure<TicketReport>(report, msg);
    }
}
The real pipeline includes steps that parse daily reports and individual tickets. Each call to the GDS takes 1-6 seconds so the complexity of the pipeline is justified.
I think you don't need Lists or Dictionaries. Why not a simple loop which updates LastReport with the results:
foreach (var configuration in MachineConfigurations)
{
    configuration.LastReport = await machineService.GetReports(configuration);
}
For executing all reports "in parallel"
Func<Configuration, Task> loadReport =
    async config => config.LastReport = await machineService.GetReports(config);

await Task.WhenAll(MachineConfigurations.Select(loadReport));
And a very poor attempt at being more functional:
Func<Configuration, Task<Configuration>> getConfigWithReportAsync =
    async config =>
    {
        var report = await machineService.GetReports(config);
        return new Configuration
        {
            Id = config.Id,
            LastReport = report
        };
    };

var configsWithUpdatedReports =
    await Task.WhenAll(MachineConfigurations.Select(getConfigWithReportAsync));
using System.Linq;

var taskResultsWithConfiguration = MachineConfigurations.Select(conf =>
    new { Conf = conf, Task = machineService.GetReports(conf) }).ToList();

await Task.WhenAll(taskResultsWithConfiguration.Select(pair => pair.Task));

foreach (var pair in taskResultsWithConfiguration)
    pair.Conf.LastReport = pair.Task.Result;

When should TaskCompletionSource<T> be used?

AFAIK, all it knows is that at some point, its SetResult or SetException method is being called to complete the Task<T> exposed through its Task property.
In other words, it acts as the producer for a Task<TResult> and its completion.
I saw here the example:
If I need a way to execute a Func<T> asynchronously and have a Task<T>
to represent that operation.
public static Task<T> RunAsync<T>(Func<T> function)
{
    if (function == null) throw new ArgumentNullException("function");
    var tcs = new TaskCompletionSource<T>();
    ThreadPool.QueueUserWorkItem(_ =>
    {
        try
        {
            T result = function();
            tcs.SetResult(result);
        }
        catch (Exception exc) { tcs.SetException(exc); }
    });
    return tcs.Task;
}
Which could be used if I didn’t have Task.Factory.StartNew -
But I do have Task.Factory.StartNew.
Question:
Can someone please explain by example a scenario related directly to TaskCompletionSource
and not to a hypothetical situation in which I don't have Task.Factory.StartNew?
I mostly use it when only an event based API is available (for example Windows Phone 8 sockets):
public Task<Args> SomeApiWrapper()
{
    TaskCompletionSource<Args> tcs = new TaskCompletionSource<Args>();

    var obj = new SomeApi();

    // will get raised, when the work is done
    obj.Done += (args) =>
    {
        // this will notify the caller
        // of the SomeApiWrapper that
        // the task just completed
        tcs.SetResult(args);
    };

    // start the work
    obj.Do();

    return tcs.Task;
}
So it's especially useful when used together with the C#5 async keyword.
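Consumed with await, the wrapper then reads like a synchronous call (a hypothetical call site):
var args = await SomeApiWrapper();
// use args here, after SomeApi has raised Done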
In my experiences, TaskCompletionSource is great for wrapping old asynchronous patterns to the modern async/await pattern.
The most beneficial example I can think of is when working with Socket. It has the old APM and EAP patterns, but not the awaitable Task methods that TcpListener and TcpClient have.
I personally have several issues with the NetworkStream class and prefer the raw Socket. Being that I also love the async/await pattern, I made an extension class SocketExtender which creates several extension methods for Socket.
All of these methods make use of TaskCompletionSource<T> to wrap the asynchronous calls like so:
public static Task<Socket> AcceptAsync(this Socket socket)
{
    if (socket == null)
        throw new ArgumentNullException("socket");

    var tcs = new TaskCompletionSource<Socket>();

    socket.BeginAccept(asyncResult =>
    {
        try
        {
            var s = asyncResult.AsyncState as Socket;
            var client = s.EndAccept(asyncResult);

            tcs.SetResult(client);
        }
        catch (Exception ex)
        {
            tcs.SetException(ex);
        }
    }, socket);

    return tcs.Task;
}
I pass the socket into the BeginAccept method so that I get a slight performance boost out of the compiler not having to hoist the local parameter.
Then the beauty of it all:
var listener = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
listener.Bind(new IPEndPoint(IPAddress.Loopback, 2610));
listener.Listen(10);
var client = await listener.AcceptAsync();
To me, a classic scenario for using TaskCompletionSource is when it's possible that my method won't necessarily have to do a time-consuming operation. It lets us choose the specific cases where we'd like to use a new thread.
A good example of this is when you use a cache. You can have a GetResourceAsync method which looks in the cache for the requested resource and returns a completed task at once (without using a new thread, by using TaskCompletionSource) if the resource was found. Only if the resource wasn't found do we use a new thread and retrieve it using Task.Run().
A code example can be seen here: How to conditionally run a code asynchonously using tasks
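As a rough sketch of that cache idea (Resource, _cache, and LoadResourceFromStore are illustrative names, not from the linked answer):
private readonly ConcurrentDictionary<string, Resource> _cache =
    new ConcurrentDictionary<string, Resource>();

public Task<Resource> GetResourceAsync(string key)
{
    if (_cache.TryGetValue(key, out var cached))
    {
        // Cache hit: complete synchronously via a TaskCompletionSource,
        // no new thread involved (Task.FromResult would also do).
        var tcs = new TaskCompletionSource<Resource>();
        tcs.SetResult(cached);
        return tcs.Task;
    }

    // Cache miss: only now pay for a thread-pool thread.
    return Task.Run(() =>
    {
        var resource = LoadResourceFromStore(key);   // hypothetical slow lookup
        _cache[key] = resource;
        return resource;
    });
}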
In this blog post, Levi Botelho describes how to use the TaskCompletionSource to write an asynchronous wrapper for a Process such that you can launch it and await its termination.
public static Task RunProcessAsync(string processPath)
{
    var tcs = new TaskCompletionSource<object>();
    var process = new Process
    {
        EnableRaisingEvents = true,
        StartInfo = new ProcessStartInfo(processPath)
        {
            RedirectStandardError = true,
            UseShellExecute = false
        }
    };
    process.Exited += (sender, args) =>
    {
        if (process.ExitCode != 0)
        {
            var errorMessage = process.StandardError.ReadToEnd();
            tcs.SetException(new InvalidOperationException("The process did not exit correctly. " +
                "The corresponding error message was: " + errorMessage));
        }
        else
        {
            tcs.SetResult(null);
        }
        process.Dispose();
    };
    process.Start();
    return tcs.Task;
}
and its usage
await RunProcessAsync("myexecutable.exe");
It looks like no one has mentioned it, but I guess unit tests can be considered real-life enough too.
I find TaskCompletionSource to be useful when mocking a dependency with an async method.
In the actual program under test:
public interface IEntityFacade
{
    Task<Entity> GetByIdAsync(string id);
}
In unit tests:
// set up mock dependency (here with NSubstitute)
TaskCompletionSource<Entity> queryTaskDriver = new TaskCompletionSource<Entity>();
IEntityFacade entityFacade = Substitute.For<IEntityFacade>();
entityFacade.GetByIdAsync(Arg.Any<string>()).Returns(queryTaskDriver.Task);

// later on, in the "Act" phase
private void When_Task_Completes_Successfully()
{
    queryTaskDriver.SetResult(someExpectedEntity);
    // ...
}

private void When_Task_Gives_Error()
{
    queryTaskDriver.SetException(someExpectedException);
    // ...
}
After all, this usage of TaskCompletionSource seems another case of "a Task object that does not execute code".
TaskCompletionSource is used to create Task objects that don't execute code.
In real world scenarios, TaskCompletionSource is ideal for I/O bound operations. This way, you get all the benefits of tasks (e.g. return values, continuations, etc) without blocking a thread for the duration of the operation. If your "function" is an I/O bound operation, it isn't recommended to block a thread using a new Task. Instead, using TaskCompletionSource, you can create a slave task to just indicate when your I/O bound operation finishes or faults.
There's a real world example with a decent explanation in this post from the "Parallel Programming with .NET" blog. You really should read it, but here's a summary anyway.
The blog post shows two implementations for:
"a factory method for creating “delayed” tasks, ones that won’t
actually be scheduled until some user-supplied timeout has occurred."
The first implementation shown is based on Task<> and has two major flaws. The post goes on to mitigate these with a second implementation that uses TaskCompletionSource<>.
Here's that second implementation:
public static Task StartNewDelayed(int millisecondsDelay, Action action)
{
    // Validate arguments
    if (millisecondsDelay < 0)
        throw new ArgumentOutOfRangeException("millisecondsDelay");
    if (action == null) throw new ArgumentNullException("action");

    // Create a trigger used to start the task
    var tcs = new TaskCompletionSource<object>();

    // Start a timer that will trigger it
    var timer = new Timer(
        _ => tcs.SetResult(null), null, millisecondsDelay, Timeout.Infinite);

    // Create and return a task that will be scheduled when the trigger fires.
    return tcs.Task.ContinueWith(_ =>
    {
        timer.Dispose();
        action();
    });
}
This may be oversimplifying things, but the TaskCompletionSource allows one to await an event. Since tcs.SetResult is only called once the event occurs, the caller can await the task.
Watch this video for more insights:
http://channel9.msdn.com/Series/Three-Essential-Tips-for-Async/Lucian03-TipsForAsyncThreadsAndDatabinding
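As a concrete sketch of awaiting an event this way (FileSystemWatcher is just an example of an ordinary .NET event source):
public static Task<FileSystemEventArgs> WhenChangedAsync(FileSystemWatcher watcher)
{
    var tcs = new TaskCompletionSource<FileSystemEventArgs>();
    FileSystemEventHandler handler = null;
    handler = (sender, args) =>
    {
        watcher.Changed -= handler;   // unsubscribe so the source is completed only once
        tcs.TrySetResult(args);
    };
    watcher.Changed += handler;
    return tcs.Task;
}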
A real-world scenario where I have used TaskCompletionSource is when implementing a download queue. In my case, if the user starts 100 downloads I don't want to fire them all off at once, so instead of returning a started task I return a task attached to a TaskCompletionSource. Once the download completes, the thread that is working the queue completes the task.
The key concept here is that I am decoupling when a client asks for a task to be started from when it actually gets started, in this case because I don't want the client to have to deal with resource management.
Note that you can use async/await in .NET 4 as long as you are using a C# 5 compiler (VS 2012+); see here for more details.
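A rough sketch of that queue idea (all type and member names here are illustrative, not the poster's actual code): callers get a task immediately, and a worker completes the corresponding TaskCompletionSource whenever it decides to run the download.
public class DownloadQueue
{
    private readonly ConcurrentQueue<(Uri Uri, TaskCompletionSource<byte[]> Tcs)> _pending =
        new ConcurrentQueue<(Uri, TaskCompletionSource<byte[]>)>();

    public Task<byte[]> EnqueueAsync(Uri uri)
    {
        var tcs = new TaskCompletionSource<byte[]>(TaskCreationOptions.RunContinuationsAsynchronously);
        _pending.Enqueue((uri, tcs));
        return tcs.Task;   // nothing has started yet; the worker completes it later
    }

    // Called by a worker loop that throttles how many downloads run at once.
    public async Task ProcessOneAsync(HttpClient client)
    {
        if (!_pending.TryDequeue(out var item)) return;
        try
        {
            var data = await client.GetByteArrayAsync(item.Uri);
            item.Tcs.TrySetResult(data);
        }
        catch (Exception ex)
        {
            item.Tcs.TrySetException(ex);
        }
    }
}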
I've used TaskCompletionSource to run a Task until it is cancelled. In this case it's a ServiceBus subscriber that I normally want to run for as long as the application runs.
public async Task RunUntilCancellation(
    CancellationToken cancellationToken,
    Func<Task> onCancel)
{
    var doneReceiving = new TaskCompletionSource<bool>();
    cancellationToken.Register(
        async () =>
        {
            await onCancel();
            doneReceiving.SetResult(true); // Signal to quit message listener
        });
    await doneReceiving.Task.ConfigureAwait(false); // Listen until quit signal is received.
}
Blazor's WebAssemblyHost also uses this to keep the .NET VM from stopping:
await new TaskCompletionSource().Task;

Query on Queues and Thread Safety

Thread safety is not an aspect I have worried about much, as the simple apps and libraries I have written usually run only on the main thread, or do not directly modify properties or fields in any classes I needed to worry about before.
However, I have started working on a personal project in which I am using a WebClient to download data asynchronously from a remote server. There is a Queue<Uri> that contains a pre-built queue of URIs to download.
So consider the following snippet (this is not my real code, but something I hope illustrates my question):
private WebClient webClient = new WebClient();
private Queue<Uri> requestQueue = new Queue<Uri>();

public Boolean DownloadNextASync()
{
    if (webClient.IsBusy)
        return false;

    if (requestQueue.Count == 0)
        return false;

    var uri = requestQueue.Dequeue();
    webClient.DownloadDataAsync(uri);
    return true;
}
If I am understanding correctly, this method is not thread-safe (assuming this specific instance of the object is known to multiple threads). My reasoning is that webClient could become busy between the IsBusy check and the DownloadDataAsync() call, and requestQueue could become empty between the Count check and the Dequeue() call.
My question is what is the best way to handle this type of situation to make it thread-safe?
This is more of an abstract question as I realize for this specific method that there would have to be an exceptionally inconvenient timing for this to actually cause a problem, and to cover that case I could just wrap the method in an appropriate try-catch since both pieces would throw an exception. But is there another option? Would a lock statement be applicable here?
If you're targeting .Net 4.0, you could use the Task Parallel Library for help:
var queue = new BlockingCollection<Uri>();
var maxClients = 4;

// Optionally provide another producer/consumer collection for the data
// var data = new BlockingCollection<Tuple<Uri,byte[]>>();
// Optionally implement CancellationTokenSource

var clients = from id in Enumerable.Range(0, maxClients)
              select Task.Factory.StartNew(
                  () =>
                  {
                      var client = new WebClient();
                      while (!queue.IsCompleted)
                      {
                          Uri uri;
                          if (queue.TryTake(out uri))
                          {
                              byte[] datum = client.DownloadData(uri); // already "async"
                              // Optionally pass datum along to the other collection
                              // or work on it here
                          }
                          else Thread.SpinWait(100);
                      }
                  });

// Add URI's to search
// queue.Add(...);

// Notify our clients that we've added all the URI's
queue.CompleteAdding();

// Wait for all of our clients to finish
Task.WaitAll(clients.ToArray());
To use this approach for progress indication, you can use TaskCompletionSource<TResult> to manage the event-based pattern:
public static Task<byte[]> DownloadAsync(Uri uri, Action<double> progress)
{
    var source = new TaskCompletionSource<byte[]>();

    Task.Factory.StartNew(
        () =>
        {
            var client = new WebClient();
            client.DownloadProgressChanged
                += (sender, e) => progress(e.ProgressPercentage);
            client.DownloadDataCompleted
                += (sender, e) =>
                {
                    if (!e.Cancelled)
                    {
                        if (e.Error == null)
                        {
                            source.SetResult((byte[])e.Result);
                        }
                        else
                        {
                            source.SetException(e.Error);
                        }
                    }
                    else
                    {
                        source.SetCanceled();
                    }
                };

            // Kick off the download; the handlers above complete the TaskCompletionSource.
            client.DownloadDataAsync(uri);
        });

    return source.Task;
}
Used like so:
// var urls = new List<Uri>(...);
// var progressBar = new ProgressBar();

Task.Factory.StartNew(
    () =>
    {
        foreach (var uri in urls)
        {
            var task = DownloadAsync(
                uri,
                p =>
                    progressBar.Invoke(
                        new MethodInvoker(
                            delegate { progressBar.Value = (int)p; })) // ProgressPercentage is already 0-100
                );

            // Will Block!
            // data = task.Result;
        }
    });
I highly recommend reading "Threading In C#" by Joseph Albahari. I have taken a look through it in preparation for my first (mis)adventure into threading and it's pretty comprehensive.
You can read it here: http://www.albahari.com/threading/
Both of the thread-safety concerns you raised are valid. Furthermore, both WebClient and Queue are documented as not being thread-safe (at the bottom of their MSDN docs). For example, if two threads were dequeuing simultaneously, they might cause the queue to become internally inconsistent or return nonsensical values. If the implementation of Dequeue() was something like:
1. var valueToDequeue = this._internalList[this._startPointer];
2. this._startPointer = (this._startPointer + 1) % this._internalList.Count;
3. return valueToDequeue;
and two threads each executed line 1 before either continued to line 2, then both would return the same value (there are other potential issues here as well). This would not necessarily throw an exception, so you should use a lock statement to guarantee that only one thread can be inside the method at a time:
private readonly object _lock = new object();
...
lock (this._lock)
{
    // body of method
}
You could also lock on the WebClient or the Queue if you know that no-one else will be synchronizing on them.
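Applied to the method from the question, a minimal sketch with the lock in place could look like this (field names as in the question):
private readonly object _lock = new object();

public Boolean DownloadNextASync()
{
    lock (_lock)
    {
        // The IsBusy/Count checks and the subsequent calls now happen atomically
        // with respect to other callers of this method.
        if (webClient.IsBusy)
            return false;
        if (requestQueue.Count == 0)
            return false;

        var uri = requestQueue.Dequeue();
        webClient.DownloadDataAsync(uri);
        return true;
    }
}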
