Force C# async tasks to be lazy? - c#

I have a situation where I have an object tree created by a special factory. This is somewhat similar to a DI container, but not quite.
Creation of objects always happens via constructor, and the objects are immutable.
Some parts of the object tree may not be needed in a given execution and should be created lazily. So the constructor argument should be something that is just a factory for on-demand creation. This looks like a job for Lazy.
However, object creation may need to access slow resources and is thus always async. (The object factory's creation function returns a Task.) This means that the creation function for the Lazy would need to be async, and thus the injected type needs to be Lazy<Task<Foo>>.
But I'd rather not have the double wrapping. I wonder if it is possible to force a Task to be lazy, i.e. to create a Task that is guaranteed to not execute until it is awaited. As I understand it, a Task.Run or Task.Factory.StartNew may start executing at any time (e.g. if a thread from the pool is idle), even if nothing is waiting for it.
public class SomePart
{
// Factory should create OtherPart immediately, but SlowPart
// creation should not run until and unless someone actually
// awaits the task.
public SomePart(OtherPart eagerPart, Task<SlowPart> lazyPart)
{
EagerPart = eagerPart;
LazyPart = lazyPart;
}
public OtherPart EagerPart {get;}
public Task<SlowPart> LazyPart {get;}
}

I'm not sure exactly why you want to avoid using Lazy<Task<>>, but if it's just to keep the API easier to use, then since this is a property you could do it with a backing field:
public class SomePart
{
private readonly Lazy<Task<SlowPart>> _lazyPart;
public SomePart(OtherPart eagerPart, Func<Task<SlowPart>> lazyPartFactory)
{
_lazyPart = new Lazy<Task<SlowPart>>(lazyPartFactory);
EagerPart = eagerPart;
}
public OtherPart EagerPart { get; }
public Task<SlowPart> LazyPart => _lazyPart.Value;
}
That way, the usage is as if it were just a task, but the initialisation is lazy and will only incur the work if needed.
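To make the "only if needed" part concrete, here is a minimal sketch with a trimmed-down SomePart (the eager part is left out, and LazyPartDemo and the call counter are illustrative names, not part of the original code). It assumes the default Lazy<T> thread-safety mode, under which the factory is invoked at most once.

```csharp
using System;
using System.Threading.Tasks;

public sealed class SlowPart { }

public sealed class SomePart
{
    private readonly Lazy<Task<SlowPart>> _lazyPart;

    public SomePart(Func<Task<SlowPart>> lazyPartFactory)
    {
        // Nothing runs here; the factory is only stored.
        _lazyPart = new Lazy<Task<SlowPart>>(lazyPartFactory);
    }

    public Task<SlowPart> LazyPart => _lazyPart.Value;
}

public static class LazyPartDemo
{
    // Returns the factory call count before the first access and after two awaits.
    public static async Task<(int Before, int After)> RunAsync()
    {
        int calls = 0;
        var part = new SomePart(async () =>
        {
            calls++;
            await Task.Delay(10); // stand-in for slow I/O
            return new SlowPart();
        });

        int before = calls;   // the factory has not run yet
        await part.LazyPart;  // first access invokes the factory
        await part.LazyPart;  // second access reuses the same task
        return (before, calls);
    }
}
```

Awaiting LazyPartDemo.RunAsync() should give Before == 0 and After == 1: constructing the object costs nothing, and repeated awaits share one task.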

@Max's answer is good, but I'd like to add a version built on top of Stephen Toub's article mentioned in the comments:
public class SomePart: Lazy<Task<SlowPart>>
{
public SomePart(OtherPart eagerPart, Func<Task<SlowPart>> lazyPartFactory)
: base(() => Task.Run(lazyPartFactory))
{
EagerPart = eagerPart;
}
public OtherPart EagerPart { get; }
public TaskAwaiter<SlowPart> GetAwaiter() => Value.GetAwaiter();
}
SomePart explicitly inherits from Lazy<Task<>>, so it's clear that it's lazy and asynchronous.
The base constructor call wraps lazyPartFactory in Task.Run to avoid a long block if that factory needs to do some CPU-heavy work before the real async part. If that's not your case, just change it to base(lazyPartFactory).
SlowPart is accessible through the TaskAwaiter, so SomePart's public interface is:
var eagerValue = somePart.EagerPart;
var slowValue = await somePart;

Declaration:
private Lazy<Task<ServerResult>> _lazyServerResult;
ctor()
{
_lazyServerResult = new Lazy<Task<ServerResult>>(async () => await
GetServerResultAsync());
}
Usage:
ServerResult result = await _lazyServerResult.Value;

Creating a Task through its constructor makes it lazy, i.e. it does not run until you explicitly start it, so you could do something like this:
public class TestLazyTask
{
private Task<int> lazyPart;
public TestLazyTask(Task<int> lazyPart)
{
this.lazyPart = lazyPart;
}
public Task<int> LazyPart
{
get
{
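// You have to start it manually at some point; this is the naive way to do it.
// Start() throws if the task has already been started, so only start it once
// (this check is still not safe against concurrent callers).
if (this.lazyPart.Status == TaskStatus.Created)
{
this.lazyPart.Start();
}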
return this.lazyPart;
}
}
}
public static async void Test()
{
Trace.TraceInformation("Creating task");
var lazyTask = new Task<int>(() =>
{
Trace.TraceInformation("Task run");
return 0;
});
var taskWrapper = new TestLazyTask(lazyTask);
Trace.TraceInformation("Calling await on task");
await taskWrapper.LazyPart;
}
Result:
SandBox.exe Information: 0 : Creating task
SandBox.exe Information: 0 : Calling await on task
SandBox.exe Information: 0 : Task run
However, I strongly recommend using Rx.NET and IObservable: in your case it will give you far less trouble when handling the less naive cases of starting your task at the right moment.
It also makes the code a bit cleaner, in my opinion:
public class TestLazyObservable
{
public TestLazyObservable(IObservable<int> lazyPart)
{
this.LazyPart = lazyPart;
}
public IObservable<int> LazyPart { get; }
}
public static async void TestObservable()
{
Trace.TraceInformation("Creating observable");
// From async to demonstrate the Task compatibility of observables
var lazyTask = Observable.FromAsync(() => Task.Run(() =>
{
Trace.TraceInformation("Observable run");
return 0;
}));
var taskWrapper = new TestLazyObservable(lazyTask);
Trace.TraceInformation("Calling await on observable");
await taskWrapper.LazyPart;
}
Result:
SandBox.exe Information: 0 : Creating observable
SandBox.exe Information: 0 : Calling await on observable
SandBox.exe Information: 0 : Observable run
To be clearer: the observable here handles when to start the task. It is lazy by default and will run the task every time it is subscribed to (here the subscription is made by the awaiter that enables the use of the await keyword).
If you need to, you could make the task run only once every minute (or only once ever) and have its result published to all subscribers, for instance to save work. In a real-world app, all of this and much more is handled by observables.

Related

Kick off async method in constructor in c#

I'm wondering whether it is safe to call an async method in a constructor in the following way.
Let's say we have an async method Refresh that fetches data from the internet. We are also using Reactive Extensions to notify everyone who is interested that new data was fetched.
Is it safe to call Refresh for the first time in the class constructor? Can I use one of these constructs?
Task.Run(Refresh);
or
Refresh().ConfigureAwait(false)
I'm not really interested here if the method has finished or not, since I will get notified through Reactive Extensions when data is fetched.
Is it ok to do something like this?
public class MyClass
{
BehaviorSubject<Data> _dataObservable = new BehaviorSubject<Data>(Data.Default);
IObservable<Data> DataObservable => _dataObservable;
public MyClass()
{
Refresh().ConfigureAwait(false);
}
public async Task Refresh()
{
try
{
var data = await FetchDataFromNetwork();
_dataObservable.OnNext(data);
}
catch (VariousExceptions e)
{
//do some appropriate stuff
}
catch(Exception)
{
//do some appropriate stuff
}
}
}
Though people are against the idea, we have similar things in our project :)
The thing is, you have to properly handle any exceptions thrown from that Task so that they don't go unobserved. You might also need to expose the task via a method or a property, so that callers can await it (when necessary) to know that the async part has finished.
class MyClass
{
public MyClass()
{
InitTask = Task.Delay(3000);
// Handle task exception.
InitTask.ContinueWith(task => task.Exception, TaskContinuationOptions.OnlyOnFaulted);
}
public Task InitTask { get; }
}
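An alternative to the ContinueWith approach is to wrap the call in a small async method that catches the exception itself. This is only a sketch: RefreshingService, InitError and the refresh parameter are hypothetical names, with refresh standing in for the question's Refresh().

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical class; 'refresh' stands in for the question's Refresh().
public sealed class RefreshingService
{
    public RefreshingService(Func<Task> refresh)
    {
        // Start the work without blocking the constructor.
        InitTask = RunSafelyAsync(refresh);
    }

    // Completes when the initial refresh has finished, successfully or not.
    public Task InitTask { get; }

    // Set if the initial refresh failed; null otherwise.
    public Exception InitError { get; private set; }

    private async Task RunSafelyAsync(Func<Task> refresh)
    {
        try
        {
            await refresh();
        }
        catch (Exception e)
        {
            // Observe the exception here so it never goes unobserved.
            InitError = e;
        }
    }
}
```

Callers that care can await InitTask and then check InitError; callers that don't can ignore both without risking an unobserved task exception.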

Using Task.Run() for hardware interfacing thread invocations

I need to invoke a method that meets the following criteria.
The method may run for hours.
The method may interface with hardware.
The method may request user input (parameter values, confirmation, etc). The request should block the method until input has been received.
I have a prototype implementation that fulfills these criteria using the following design.
Assume a Form exists and contains a Panel.
The IntegerInput class is a UserControl with a TextBox and a Button.
public partial class IntegerInput : UserControl
{
public TaskCompletionSource<int> InputVal = new TaskCompletionSource<int>(0);
public IntegerInput()
{
InitializeComponent();
}
private void button1_Click(object sender, EventArgs e)
{
int val = 0;
Int32.TryParse(textBox1.Text, out val);
InputVal.SetResult(val);
}
}
The Form1UserInput class is instanced by Form1. The container is a Panel set by Form1 before being provided to the invoking class.
public interface IUserInput
{
Task<int> GetInteger();
}
public class Form1UserInput : IUserInput
{
public Control container;
private IntegerInput integerInput = new IntegerInput();
public IntegerInput IntegerInput { get { return integerInput; } }
public async Task<int> GetInteger()
{
container.Invoke(new Action(() =>
{
container.Controls.Clear();
container.Controls.Add(integerInput);
}));
await integerInput.InputVal.Task;
return integerInput.InputVal.Task.Result;
}
}
The Demo class contains the method I want to invoke.
public class Demo
{
public IUserInput ui;
public async void MethodToInvoke()
{
// Interface with hardware...
// Block waiting on input
int val = await ui.GetInteger();
// Interface with hardware some more...
}
public async void AnotherMethodToInvoke()
{
// Interface with hardware...
// Block waiting on multiple input
int val1 = await ui.GetInteger();
int val2 = await ui.GetInteger();
// Interface with hardware...
}
}
This is a rough outline of what the invoking class looks like. The call to Task.Run() is accurate for my prototype.
public class Invoker
{
public async Task RunTestAsync(IUserInput ui)
{
object DemoInstance = Activator.CreateInstance(typeof(Demo));
MethodInfo method = typeof(Demo).GetMethod("MethodToInvoke");
object[] args = null;
((Demo)DemoInstance).ui = ui;
var t = await Task.Run(() => method.Invoke(DemoInstance, args));
// Report completion information back to Form1
}
}
The Form1 controller class instances the Invoker and calls RunTestAsync passing in an instance of Form1UserInput.
I am aware of some concerns about long running Tasks that may block and what that would mean for ThreadPool resources. However, the ability to invoke multiple methods at once is not provided by the application I am building. It's possible that the application may provide some other limited functionality while the invoked method is running but the current requirements do not specify such functionality in detail. I anticipate that there would only be one long running thread in service at any time.
Is the use of Task.Run() for this type of method invocation a reasonable implementation? If not, what would a more reasonable implementation be that provides for the required criteria? Should I consider a dedicated thread outside of the ThreadPool for this invocation?
Is the use of Task.Run() for this type of method invocation a reasonable implementation?
Assuming that your "interface with hardware" can only be done using synchronous APIs, then yes, Task.Run is fine for that.
However, I would change when it's called. Right now, Task.Run is wrapping an async void method that executes on the thread pool (and uses Invoke to jump back on the UI thread). These are each problematic: Task.Run over async void will seem to complete "early" (i.e., at the first await); and using Invoke indicates that there's some tight coupling going on (UI calls background service which calls UI).
I would replace the async void with async Task and also change where Task.Run is used to avoid Invoke:
public async Task<int> GetInteger()
{
container.Controls.Clear();
container.Controls.Add(integerInput);
// Note: not `Result`, which will wrap exceptions.
return await integerInput.InputVal.Task;
}
public async Task MethodToInvokeAsync()
{
await Task.Run(...); // Interface with hardware...
// Block waiting on input
int val = await ui.GetInteger();
await Task.Run(...); // Interface with hardware some more...
}
await (Task)method.Invoke(DemoInstance, args);
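The "completes early" point can be seen in a small sketch. Everything here (EarlyCompletionDemo, WorkVoid, WorkTask, the gate) is an illustrative name rather than code from the question; the gate simply keeps the async void method suspended at its first await.

```csharp
using System;
using System.Threading.Tasks;

public static class EarlyCompletionDemo
{
    public static async Task<(bool VoidDone, bool TaskDone)> RunAsync()
    {
        bool voidDone = false, taskDone = false;
        var gate = new TaskCompletionSource<bool>();

        // async void: the caller gets nothing it can await.
        async void WorkVoid() { await gate.Task; voidDone = true; }

        // async Task: the returned task represents the whole method.
        async Task WorkTask() { await Task.Delay(20); taskDone = true; }

        // Binds to Task.Run(Action); completes at WorkVoid's first await.
        await Task.Run(() => WorkVoid());
        bool voidDoneAfterRun = voidDone;

        // Binds to Task.Run(Func<Task>); completes when WorkTask completes.
        await Task.Run(() => WorkTask());
        bool taskDoneAfterRun = taskDone;

        gate.SetResult(true); // let the async void method finish
        return (voidDoneAfterRun, taskDoneAfterRun);
    }
}
```

Awaiting RunAsync() should give VoidDone == false and TaskDone == true, which is why the reflection call needs an async Task method to report completion correctly.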

How to cache data as long as Task executes?

private Data GetDefaultData()
{
var data = Task.Factory.StartNew(() => GetData());
return data.Result;
}
If GetData() executes in 100ms and I call GetDefaultData() once every 10ms, is it correct that the first 10 calls will use the same data.Result? GetData() collects data inside a lock statement. If not, how can I change the code to make this possible?
Let's say we have a first call to GetDefaultData() (GetData() executes in 100ms), followed by 10 more calls (one GetDefaultData() per 10ms). I want the rest of the calls to get the same answer as the first one.
It sounds like you want the Lazy<T> class.
public class YourClass
{
private readonly Lazy<Data> _lazyData;
public YourClass()
{
_lazyData = new Lazy<Data>(() => GetData());
}
private Data GetDefaultData()
{
return _lazyData.Value;
}
public Data GetData()
{
//...
}
}
The first thread to call GetDefaultData() will run GetData() when it hits _lazyData.Value; all the other threads will block on _lazyData.Value until the first thread finishes, and then use the result from that first thread's call. GetData() will only ever be called once.
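As a rough check of that behaviour, the sketch below has ten parallel callers read one Lazy<object> whose factory sleeps for 100ms. LazyOnceDemo and the counter are made-up names, and it relies on the default thread-safety mode of the Lazy<T>(Func<T>) constructor.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class LazyOnceDemo
{
    // Returns how often the factory ran and how many distinct objects callers saw.
    public static (int Calls, int DistinctResults) Run()
    {
        int calls = 0;
        var lazy = new Lazy<object>(() =>
        {
            Interlocked.Increment(ref calls);
            Thread.Sleep(100); // stand-in for the slow GetData()
            return new object();
        });

        var results = new object[10];
        // Concurrent callers; all but the first block until the value exists.
        Parallel.For(0, 10, i => results[i] = lazy.Value);

        return (calls, results.Distinct().Count());
    }
}
```

LazyOnceDemo.Run() should return (1, 1): one factory call, and every caller holding the same instance.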
If you don't want the call to block, you can easily make an AsyncLazy<T> class that uses threads internally.
public class AsyncLazy<T> : Lazy<Task<T>>
{
public AsyncLazy(Func<T> valueFactory) :
base(() => Task.Run(valueFactory))
{
}
public AsyncLazy(Func<Task<T>> taskFactory, bool runFactoryInNewTask = true) :
base(() => runFactoryInNewTask ? Task.Run(taskFactory) : taskFactory())
{
}
//This lets you use `await _lazyData` instead of doing `await _lazyData.Value`
public TaskAwaiter<T> GetAwaiter()
{
return Value.GetAwaiter();
}
}
Then your code becomes (I also made GetData an async function too, but the overloads of AsyncLazy let it be either or)
public class YourClass
{
private readonly AsyncLazy<Data> _lazyData;
public YourClass()
{
_lazyData = new AsyncLazy<Data>(() => GetData(), false);
}
private async Task<Data> GetDefaultData()
{
//I await here to defer any exceptions till the returned task is awaited.
return await _lazyData;
}
public Task<Data> GetData()
{
//...
}
}
EDIT: There are some possible issues with AsyncLazy, see here.
In short: No.
Each time you call GetDefaultData() a new task is started, so Data.Result will remain unchanged for the duration of GetData() and then contain what you assigned to it in GetData().
Also, returning a value from a new Task object will do you no good; this is how multitasking works. Your code will continue to execute in the main thread, but your result value will only be set once the separate task has finished executing, whether it contains lock statements or not.
A ReaderWriterLock will probably suit this purpose. ReaderWriterLock is used to synchronize access to a resource. At any given time, it allows either concurrent read access for multiple threads or write access for a single thread. This logic should probably be embedded in your GetData method, so that, depending on some timeout, it either takes the write lock and holds it for the timeout period, or otherwise performs a read operation.

TPL Fire and Forget using a Separate Class

I'm trying to implement fire and forget functionality, using the Task Parallel Library. With an inline call to Task.Factory.StartNew, everything works as expected. However, I want to move the Task.Factory.StartNew call into a separate class so that I can add logging, error handling, etc, and potentially upgrade the code in the future as better threading classes, etc are added to the .NET Framework, without duplicating code.
Below is a unit test that I would expect to pass, but that does not. I would appreciate help trying to figure out how to make this work.
[TestFixture]
public class ThreadingServiceFixture
{
public static bool methodFired = false;
[Test]
public void CanFireAndForgetWithThreadingService()
{
try
{
var service = new ThreadingService();
service.FireAndForget(() => methodFired = true);
var endTime = DateTime.Now.AddSeconds(1);
while(DateTime.Now < endTime)
{
//wait
}
Assert.IsTrue(methodFired == true);
}
finally
{
methodFired = false;
}
}
}
public class ThreadingService
{
public Task FireAndForget(Action action)
{
return Task.Factory.StartNew(() => action);
}
}
You're not executing the action, you're just returning it.
Try:
return Task.Factory.StartNew(() => action());
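The difference is easy to miss because both forms compile. In this small sketch (StartNewDemo is an illustrative name), the first call binds to StartNew<Action>(Func<Action>) and merely returns the delegate, while the second actually runs it.

```csharp
using System;
using System.Threading.Tasks;

public static class StartNewDemo
{
    public static (bool WithoutCall, bool WithCall) Run()
    {
        bool fired = false;
        Action action = () => fired = true;

        // Produces a Task<Action> whose result is the delegate; nothing is invoked.
        Task.Factory.StartNew(() => action).Wait();
        bool withoutCall = fired;

        // Invokes the delegate on a thread pool thread.
        Task.Factory.StartNew(() => action()).Wait();
        bool withCall = fired;

        return (withoutCall, withCall);
    }
}
```

StartNewDemo.Run() should return (false, true).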
If it is "fire and forget", you don't need to return the Task from the FireAndForget method, because the caller could then get hold of that Task and cancel it (strictly speaking, the caller would "remember" the call).
If you want to invoke this method from many services that do not inherit from a common ThreadingService, you can implement it as an extension method on an interface.
public interface IFireAndForget
{
// no member needed.
}
public static class FireAndForgetExtensions
{
public static void FireAndForget(this IFireAndForget obj, Action action)
{
// pass the action, not a new lambda
Task.Factory.StartNew(action);
}
}
// using
public class ThreadingService : IFireAndForget
{
}
Also note that in your method you have to pass the action itself to StartNew, instead of passing a lambda that returns the action parameter.
You did not invoke the action in the ThreadingService
The code should read something like
public class ThreadingService
{
public Task FireAndForget(Action action)
{
return Task.Factory.StartNew(() => action.Invoke());
}
}
Additional note: testing state with a public field is evil. Think about repeatability, maintenance, running tests in different order. You should move bool methodFired inside the test. I would also assume there is a better technique to test this (but I am not sure which one).
Testing threaded code is hard.
Basing your tests on timing is a bad idea: they may become non-deterministic, and you might observe erratic behavior on your build server. Imagine a test that sometimes passes and sometimes doesn't!
Your code has a bug, since you are not actually invoking the action.
But consider this variation:
[Test]
[Timeout(5000)]
public void CanFireAndForgetWithThreadingService()
{
var service = new ThreadingService();
ManualResetEvent mre = new ManualResetEvent(false); // false = initially unsignaled
service.FireAndForget(() => mre.Set() /* will release the test asynchronously */);
mre.WaitOne(); // blocks, will timeout if FireAndForget does not fire the action.
}
Yes, we are still using timing. But the timeout will happen only if the code breaks!
In all other scenarios, the test is absolutely predictable and takes a very short amount of time to execute, no waiting and praying for timing issues not to happen ;-)

Is there a common pattern for initializing object on a background thread?

I have an object that takes a long time to initialize. Therefore I need the ability to start initializing it on application startup. Any subsequent calls to methods on the class need a delay mechanism that waits for the class to finish initializing.
I have a couple of potential solutions however I am not entirely satisfied with either of them. The first uses Task.Delay in a while loop and the second uses SemaphoreSlim but involves some unnecessary blocking. I feel this must be a fairly common requirement, can anybody provide some advice on how to best manage this?
Oh, by the way, this is a Metro application, so we have limited APIs.
Here is the pseudocode:
public class ExposeSomeInterestingItems
{
private InitialisationState _initialised;
private readonly SemaphoreSlim _waiter =
new SemaphoreSlim(0);
public async Task StartInitialize()
{
if (_initialised == InitialisationState.Initialised)
{
throw new InvalidOperationException(
"Attempted to initialise ActiveTrackDown" +
"loads when it is already initialized");
}
_initialised =
InitialisationState.StartedInitialisation;
new TaskFactory().StartNew(async () =>
{
// This takes some time to load
this._interestingItems =
InterestingItemsLoader.LoadItems();
_waiter.Release();
_initialised = InitialisationState.Initialised;
});
}
public InterestingItem GetItem(string id)
{
DelayUntilLoaded();
DelayUntilLoadedAlternative();
}
private async Task DelayUntilLoaded()
{
if (_initialised == InitialisationState.NotInitialised)
{
throw new InvalidOperationException("Error " +
"occurred attempting to access details on " +
"ActiveTrackDownloads before calling initialise");
}
while (true)
{
if (_initialised == InitialisationState.Initialised)
{
return;
}
await Task.Delay(300);
}
}
private async Task DelayUntilLoadedAlternative()
{
if (_initialised == InitialisationState.NotInitialised)
{
throw new InvalidOperationException(
"Error occurred attempting to access details " +
"on ActiveTrackDownloads before calling initialise");
}
try
{
await _waiter.WaitAsync();
}
finally
{
_waiter.Release();
}
}
}
I think that a better design would be an asynchronous factory, where the calling code awaits the object creation and then receives a regular object instance.
Stealing liberally from Stephen Toub:
public class AsyncLazy<T> : Lazy<Task<T>>
{
public AsyncLazy(Func<T> valueFactory) :
base(() => Task.Run(valueFactory)) { }
public AsyncLazy(Func<Task<T>> taskFactory) :
base(() => Task.Run(taskFactory)) { }
public TaskAwaiter<T> GetAwaiter() { return Value.GetAwaiter(); }
}
public static class ExposeSomeInterestingItemsFactory
{
public static AsyncLazy<ExposeSomeInterestingItems> Instance
{
get { return _instance; }
}
private static readonly AsyncLazy<ExposeSomeInterestingItems> _instance =
new AsyncLazy<ExposeSomeInterestingItems>(() => new ExposeSomeInterestingItems());
public static void StartInitialization()
{
var unused = Instance.Value;
}
}
public class ExposeSomeInterestingItems
{
public ExposeSomeInterestingItems()
{
// This takes some time to load
this._interestingItems = InterestingItemsLoader.LoadItems();
}
public InterestingItem GetItem(string id)
{
// Regular logic. No "delays".
}
}
...
var exposeSomeInterestingItems = await ExposeSomeInterestingItemsFactory.Instance;
var item = exposeSomeInterestingItems.GetItem("id");
That way, you keep the Single Responsibility Principle nicely:
AsyncLazy<T> combines Task<T> with Lazy<T> (so the instance is created asynchronously only when needed).
ExposeSomeInterestingItemsFactory contains construction logic.
ExposeSomeInterestingItems is only concerned with exposing interesting items, rather than having to pollute all its members with asynchronous delays.
Also, this solution is asynchronous throughout (no blocking), which is good (particularly for Metro apps).
Update, 2012-09-14: I've taken this code and cleaned it up and commented it on my blog.
You can use Task<T> for this. It will take care of all the synchronisation for you and allows you to block until the value is available:
private static Task<HeavyObject> heavyObjectInitializer;
// Call this method during application initialization
public static void Bootstrap()
{
heavyObjectInitializer = new Task<HeavyObject>(() =>
{
// creation of heavy object here
return new HeavyObject();
});
// Start running the initialization right now on a
// background thread. We don't have to wait on this.
heavyObjectInitializer.Start();
}
// Call this method whenever you need to use the object.
public static HeavyObject GetHeavyObject()
{
// Get the initialized object, or block until this
// instance becomes available.
return heavyObjectInitializer.Result;
}
Optionally, you can also query to see if the object is available or not:
public static bool IsHeavyObjectAvailable
{
get { return heavyObjectInitializer.IsCompleted; }
}
Put the method calls into a queue which you process when you finish initialising. Only put methods into the queue when you have not yet initialised.
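A minimal sketch of that idea, assuming the calls can be represented as Action delegates (DeferredCalls, Invoke and MarkInitialised are hypothetical names):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: queues calls until initialisation is reported complete.
public sealed class DeferredCalls
{
    private readonly object _gate = new object();
    private readonly Queue<Action> _pending = new Queue<Action>();
    private bool _initialised;

    // Runs the call now if initialised, otherwise queues it for later.
    public void Invoke(Action call)
    {
        lock (_gate)
        {
            if (!_initialised)
            {
                _pending.Enqueue(call);
                return;
            }
        }
        call();
    }

    // Call once when initialisation has finished; drains the queue in order.
    public void MarkInitialised()
    {
        lock (_gate)
        {
            while (_pending.Count > 0)
            {
                _pending.Dequeue()();
            }
            _initialised = true;
        }
    }
}
```

Queued calls run under the lock here to keep their order simple to reason about; calls that need to return a value would need something richer, such as a Task per call.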
You could move to an event-driven architecture where your application is in different states.
Initially the application moves into the Starting state. In this state HeavyObject is created using a background task. When the initialization is complete an event is fired. (You don't have to use an actual .NET event. You can use callbacks or something similar, and frameworks like Reactive Extensions allow you to compose sequences of events.)
When all initialization events have fired you move into the Started state of your application. For a UI application this could modify the UI to enable some previously disabled operations.
Check this Prototype Pattern. Maybe it can help you
You only need to create your object once and clone it when you need another one.
