C# Reusable or Persistent Tasks that behave like Threads

With threads, you can create persistent, reusable local variables which are useful for things like client connections. However, with Tasks like ActionBlock from System.Threading.Tasks.Dataflow, there does not appear to be any sort of persistence or reusability of the action block. So for an ActionBlock that involves interacting with a client, my understanding is that you either need to initialize a client connection from scratch or reuse one in a higher scope (with locking?).
The use case: I am using a .NET library that inverts control. The bulk of the logic (aside from startup and shutdown) must be in a single Task method named ProcessEventsAsync, called by the library, that receives an IEnumerable of data. ProcessEventsAsync must do some processing of all the data, then send it out to some downstream consumers. To improve performance, I am trying to parallelize the logic within ProcessEventsAsync using Tasks. I also want to gather some performance metrics from this Task.
Let me give a detailed example of what I'm doing:
internal class MyClass
{
private String firstDownStreamConnectionString;
private String secondDownStreamConnectionString;
private SomeClient firstClient;
private SomeClient secondClient;
private ReportingClient reportingClient;
private int totalUnhandledDataCount;
public MyClass(String firstDownStreamConnectionString, String secondDownStreamConnectionString, String reportingClientKey)
{
this.firstDownStreamConnectionString = firstDownStreamConnectionString;
this.secondDownStreamConnectionString = secondDownStreamConnectionString;
this.DegreeOfParallelism = Math.Max(Environment.ProcessorCount - 1, 1);
this.reportingClient = new ReportingClient (reportingClientKey, DegreeOfParallelism);
this.totalUnhandledDataCount = 0;
}
// called once when the framework signals that processing is about to be ready
public override async Task OpenAsync(CancellationToken cancellationToken, PartitionContext context)
{
this.firstClient = SomeClient.CreateFromConnectionString(this.firstDownStreamConnectionString);
this.secondClient = SomeClient.CreateFromConnectionString(this.secondDownStreamConnectionString );
await Task.Yield();
}
// this is called repeatedly by the framework
// outside of startup and shutdown, it is the only entrypoint to my logic
public override async Task ProcessEventsAsync(CancellationToken cancellationToken, PartitionContext context, IEnumerable<Data> inputData)
{
ActionBlock<List<Data>> processorActionBlock = new ActionBlock<List<Data>>(
batch =>
{
SomeData firstDataset = new SomeData();
SomeData secondDataset = new SomeData();
int unhandledDataCount = 0;
foreach (Data data in batch)
{
// if data fits one set of criteria, put it in firstDataSet
// if data fits other set of criteria, put it in secondDataSet
// otherwise increment unhandledDataCount
}
Interlocked.Add(ref this.totalUnhandledDataCount, unhandledDataCount);
lock (this.firstClient)
{
try
{
firstDataset.SendData(this.firstClient);
} catch (Exception e)
{
lock(this.reportingClient)
{
this.reportingClient.LogTrace(e);
}
}
}
lock (this.secondClient)
{
try
{
secondDataset.SendData(this.secondClient);
} catch (Exception e)
{
lock(this.reportingClient)
{
this.reportingClient.LogTrace(e);
}
}
}
},
new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = this.DegreeOfParallelism
});
// construct as many List<Data> from inputData as there is DegreeOfParallelism
// put that in a variable called batches
for(int i = 0; i < DegreeOfParallelism; i++)
{
processorActionBlock.Post(batches[i]);
}
processorActionBlock.Complete();
await processorActionBlock.Completion;
await context.CheckpointAsync();
}
}
I tried to keep this to only the relevant code; I omitted the processing logic, most metric gathering, how data is sent out, shutdown logic, etc.
I want to utilize some flavor of Task that allows for reusability. I don't want to reuse a single client connection for all running Tasks of this type, nor do I want each Task to create a new client connection each time it is invoked. I do want each Thread-like Task to have a persistent set of client connections. Ideally, I also do not want to create a new class that wraps a Task or which extends an abstract class/interface in System.Threading.Tasks.Dataflow.

It sounds like you just need a class that stores the dependencies?
void Main()
{
var doer1 = new ThingDoer();
var doer2 = new ThingDoer();
// A & B use one pair of clients, and C & D use another pair
var taskA = doer1.DoTheThing();
var taskB = doer1.DoTheThing();
var taskC = doer2.DoTheThing();
var taskD = doer2.DoTheThing();
}
public class ThingDoer
{
private SomeClient _someClient;
private SomeErrorReportingClient _someErrorReportingClient;
public ThingDoer(SomeClient someClient, SomeErrorReportingClient someErrorReportingClient)
{
_someClient = someClient;
_someErrorReportingClient = someErrorReportingClient;
}
public ThingDoer()
: this(new SomeClient(), new SomeErrorReportingClient())
{
}
public async Task DoTheThing()
{
// Implementation here
}
}
The concept of "reusability" isn't really compatible with tasks.
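If the goal is that every running task gets exclusive use of a long-lived client, without locking and without reconnecting on each call, one option is a small pool that hands out idle clients and takes them back afterwards. The following is only a sketch under assumptions: `ClientPool<T>` and `PoolDemo` are hypothetical names, and a plain `object` stands in for the question's `SomeClient`.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: hands each caller an idle client, creating one only when none is free.
public sealed class ClientPool<T>
{
    private readonly ConcurrentBag<T> _idle = new ConcurrentBag<T>();
    private readonly Func<T> _factory;

    public ClientPool(Func<T> factory) => _factory = factory;

    public async Task UseAsync(Func<T, Task> work)
    {
        // Reuse an idle client if there is one, otherwise create a new one.
        if (!_idle.TryTake(out T client)) client = _factory();
        try { await work(client); }
        finally { _idle.Add(client); } // return the client for later reuse
    }
}

public static class PoolDemo
{
    // Runs three rounds of four concurrent jobs and reports how many clients were created.
    public static async Task<int> RunAsync()
    {
        int created = 0;
        var pool = new ClientPool<object>(() =>
        {
            Interlocked.Increment(ref created);
            return new object(); // stand-in for SomeClient.CreateFromConnectionString(...)
        });

        for (int round = 0; round < 3; round++)
        {
            await Task.WhenAll(Enumerable.Range(0, 4)
                .Select(_ => pool.UseAsync(client => Task.Delay(10))));
        }
        return created; // never more than the peak concurrency (4 here)
    }
}
```

The pool never grows beyond the peak number of concurrent jobs, so clients persist across invocations much like thread-local state would. Note that this sketch returns a client to the pool even when the work threw; real code would probably want to discard a client that is in a faulted state.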

What you're describing sounds like an async delegate, or Func.
For example:
Func<Task> TestFunc = async () =>
{
Console.WriteLine("Begin");
await Task.Delay(100);
Console.WriteLine("Delay");
await Task.Delay(100);
Console.WriteLine("End");
};
If the function is in scope, you'd just have to:
await TestFunc();
You can reuse it as many times as you need. You can also change the function to accept parameters.
Edit
You can also try AsyncLocal<T>. Per the documentation:
Because the task-based asynchronous programming model tends to abstract the use of threads, AsyncLocal instances can be used to persist data across threads.
The AsyncLocal class also provides optional notifications when the value associated with the current thread changes, either because it was explicitly changed by setting the Value property, or implicitly changed when the thread encountered an await or other context transition.
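To make the quoted behavior concrete, here is a minimal sketch (the class and method names are made up for illustration). A value stored in an `AsyncLocal<T>` follows the asynchronous flow across an `await`, even if the continuation resumes on a different thread, and a value set inside one async method does not leak into a sibling flow.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncLocalDemo
{
    private static readonly AsyncLocal<string> Current = new AsyncLocal<string>();

    public static async Task<string> RunAsync(string name)
    {
        Current.Value = name;   // visible only to this async flow
        await Task.Delay(10);   // the continuation may run on another thread
        return Current.Value;   // still the value set above
    }
}
```

Keep in mind that this ties data to a logical call flow, not to a worker. It does not by itself give each parallel worker its own long-lived client connection.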

Related

How to efficiently count HTTP Calls in asp.net core?

I have an abstract class called HttpHelper with basic methods such as GET, POST, PATCH, and PUT.
What I need to achieve is this:
Store the URL, time, and date in the database each time one of GET, POST, PATCH, or PUT is called.
I don't want to write directly to the database on every call (that would be slow). Instead I want to put the entry somewhere fast and non-blocking (like a static in-memory queue or cache) and have a long-running background process read from that store and save the values to the database.
I have no clear idea how to do this, but the main purpose is to count the calls per hour or day, by domain, resource, and URL query.
I'm thinking if I could do the following:
Create a static class which uses ConcurrentQueue<T> to store data and call that class in each function inside HttpHelper class
Create a background task similar to this: Asp.Net core long running/background task
Or use Hangfire, but that might be too much for simple task
Or is there a built-in method for this in .netcore?
Both Hangfire and background tasks would do the trick as consumers of the queue items.
Hangfire predates long-running background tasks (it was around before .NET Core), so for .NET Core implementations go with the long-running tasks.
There is a but here, though.
How important is it to you that you never miss a call? If it is important, then neither option can help you on its own.
The queue, or whatever static construct you use, is lost when your application crashes, the machine restarts, or the application pool is simply recycled.
You need to consider some kind of external queuing mechanism, such as RabbitMQ with persistence turned on.
You can also append to a file, but the file reads and writes may add some delay.
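For the in-memory variant (queue plus background consumer), a rough sketch using `System.Threading.Channels` could look like the following. The names `CallLogQueue`, `Enqueue`, `DrainAsync` and the `save` callback are assumptions made up for this example; the callback is where the database write would go. As noted above, anything still in the channel is lost if the process dies.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

public sealed class CallLogQueue
{
    private readonly Channel<(string Url, DateTime At)> _channel =
        Channel.CreateUnbounded<(string Url, DateTime At)>();

    // Called from the HTTP helper methods; does not block the caller.
    public bool Enqueue(string url) => _channel.Writer.TryWrite((url, DateTime.UtcNow));

    // Signals that no more entries will arrive (e.g. on shutdown).
    public void Complete() => _channel.Writer.Complete();

    // Background loop: reads entries in batches and passes each batch to save.
    // Returns the total number of entries handed to save.
    public async Task<int> DrainAsync(
        Func<IReadOnlyList<(string Url, DateTime At)>, Task> save, int batchSize = 100)
    {
        var batch = new List<(string Url, DateTime At)>(batchSize);
        int total = 0;
        while (await _channel.Reader.WaitToReadAsync())
        {
            while (batch.Count < batchSize && _channel.Reader.TryRead(out var item))
            {
                batch.Add(item);
            }
            await save(batch);  // hypothetical: bulk insert into the database
            total += batch.Count;
            batch.Clear();      // the list is reused, so save must not keep a reference to it
        }
        return total;
    }
}
```

In ASP.NET Core the `DrainAsync` loop would typically be started from a hosted background service, with `Complete()` called during shutdown so the remaining entries get flushed.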
I do not know how complex your problem is, but I would consider two solutions.
The first is to call an async insert method, which starts a task instead of blocking the request thread. You can return the response without waiting for the log entry to be written to the database. Since you want this in only some methods, I would implement it using attributes and middleware.
Simplified example:
public IActionResult SomePostMethod()
{
LogActionAsync("This Is Post Method");
return StatusCode(201);
}
public static Task LogActionAsync(string someParameter)
{
return Task.Run(() => {
// Communicate with database (X ms)
});
}
A better solution is to create a buffer that does not talk to the database on every call, but only when it fills up or at a fixed interval. It would look like this:
public IActionResult SomePostMethod()
{
APILog.Log(new APILog.Item() { Date = DateTime.Now, Item1 = "Something" });
return StatusCode(201);
}
public partial class APILog
{
private static List<APILog.Item> _buffer = new List<APILog.Item>();
private const int _msTimeout = 60000; // Timeout between updates
private static readonly object _updateLock = new object();
static APILog()
{
StartDBUpdateLoopAsync();
}
private static void StartDBUpdateLoopAsync()
{
// check if it has been already and other stuff
Task.Run(() => {
while(true) // Do not use true but some other expression that is telling you if your application is running.
{
Thread.Sleep(_msTimeout);
List<APILog.Item> pending;
lock(_updateLock)
{
// Swap the buffer so the lock is not held during the database writes
pending = _buffer;
_buffer = new List<APILog.Item>();
}
foreach(APILog.Item item in pending)
{
//Import into database here
}
}
});
}
public static void Log(APILog.Item item)
{
lock(_updateLock)
{
if(_buffer == null)
_buffer = new List<APILog.Item>();
_buffer.Add(item);
}
}
}
public partial class APILog
{
public class Item
{
public string Item1 { get; set; }
public DateTime Date { get; set; }
}
}
Also, in this second example I would not call APILog.Log() explicitly each time, but use middleware in combination with an attribute.

.Net Core Async critical section if working on same entity

I need to be sure that a method exposed via a web API cannot be executed by multiple calls at the same time if they work on the same object (the same id).
I understand the use of SemaphoreSlim, but a simple implementation of that will lock the critical section for everyone. I need that section locked only when the calls work on the same entity, not when they work on two different ones.
This is my scenario: a user starts to work, the entity is created and is ready to be modified. Then one or more users can manipulate this entity, but part of this manipulation has to be in a critical section or it will lead to inconsistent data. When the work is finished, the entity is removed from the work status and moved to an archive, where it can only be accessed read-only.
The class which contains that function is injected as transient in the startup of the application:
services.AddTransient<IWorkerService>(f => new WorkerService(connectionString));
public async Task<int> DoStuff(int entityId)
{
//Not Critical Stuff
//Critical Stuff
ReadObjectFromRedis();
ManipulateObject();
UpdateSqlDatabase();
SaveObjectToRedis();
//Not Critical Stuff
}
How can I achieve that?
Try this, I'm not sure if those objects are available in .net-core
class Controller
{
private static ConcurrentDictionary<int, SemaphoreSlim> semaphores = new ConcurrentDictionary<int, SemaphoreSlim>();
public async Task<int> DoStuff(int entityId)
{
// Initial count must be 1; with 0 the first caller would wait forever.
SemaphoreSlim sem = semaphores.GetOrAdd(entityId, ent => new SemaphoreSlim(1, 1));
await sem.WaitAsync();
try
{
//do real stuff
}
finally
{
sem.Release();
}
}
}
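The per-entity semaphore idea can also be wrapped in a small helper so that callers can write `using (await locks.LockAsync(id)) { ... }`. This is a sketch only; `KeyedAsyncLock`, `LockAsync` and the demo class are names invented for this example, and the dictionary entries are never removed, which real code may need to handle for long-running processes with many ids.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public sealed class KeyedAsyncLock
{
    private readonly ConcurrentDictionary<int, SemaphoreSlim> _semaphores =
        new ConcurrentDictionary<int, SemaphoreSlim>();

    // Waits until no other caller holds the lock for this key.
    public async Task<IDisposable> LockAsync(int key)
    {
        SemaphoreSlim sem = _semaphores.GetOrAdd(key, _ => new SemaphoreSlim(1, 1));
        await sem.WaitAsync();
        return new Releaser(sem);
    }

    private sealed class Releaser : IDisposable
    {
        private SemaphoreSlim _sem;
        public Releaser(SemaphoreSlim sem) => _sem = sem;

        // Releases at most once, even if Dispose is called repeatedly.
        public void Dispose() => Interlocked.Exchange(ref _sem, null)?.Release();
    }
}

public static class KeyedLockDemo
{
    // Runs four workers on the same key and reports the highest number
    // that were inside the critical section at the same time.
    public static async Task<int> MaxConcurrencyForSameKeyAsync()
    {
        var locks = new KeyedAsyncLock();
        int current = 0, max = 0;

        async Task Work()
        {
            using (await locks.LockAsync(42))
            {
                int now = Interlocked.Increment(ref current);
                max = Math.Max(max, now); // safe: only one worker is in here at a time
                await Task.Delay(5);
                Interlocked.Decrement(ref current);
            }
        }

        await Task.WhenAll(Work(), Work(), Work(), Work());
        return max;
    }
}
```

Calls with different keys get different semaphores and therefore do not block each other. Note that this only protects callers inside a single process; with several web server instances the lock would have to live somewhere shared.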
This is not an easy problem to solve. I have a similar problem with a cache: when the cache expires, I want only one call to be made to repopulate it. This is a very common approach, e.g. for a token that you have to renew every now and then.
A problem with the ordinary use of a semaphore is that after you exit, all the threads that were waiting will just go in and make the call again; that is why you need double-checked locking to fix it. I am not sure whether you can keep some local state in your case (I suppose you can, since you have a reason for making only one call and most likely have state), but here is how I solved it for a token cache:
private readonly SemaphoreSlim _semaphore = new SemaphoreSlim(1);
public async Task<string> GetOrCreateAsync(Func<Task<TokenResponse>> getToken)
{
string token = Get();
if (token == null)
{
await _semaphore.WaitAsync();
try
{
token = Get();
if (token == null)
{
var data = await getToken();
Set(data);
token = data.AccessToken;
}
}
finally
{
_semaphore.Release();
}
}
return token;
}
Now, I don't really know if it is bulletproof. Ordinary (non-async) double-checked locking is not, though the explanation of why is really hard and comes down to how processors handle multithreading behind the scenes and how they reorder instructions.
But in the cache case, a double call once in a blue moon is not that big of a problem.
I have not found a better way to do this; it follows an example provided by e.g. Scott Hanselman and found in a few places on Stack Overflow as well.
Use of a semaphore is overkill for this. A named mutex will suffice.
class Foo
{
public void Bar(int id)
{
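// Acquire the named mutex; callers using the same id wait here.
// Mutex is thread-affine, so do not await inside this block.
using var mutex = new Mutex(false, id.ToString());
mutex.WaitOne();
try
{
// Business logic here.
}
finally
{
mutex.ReleaseMutex();
}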
}
}

How to execute multiple parallel tasks on completion of a prior task

I have a situation where I need to call a web service and, on successful completion, do multiple things with the results returned from the web service. I have developed code that "works" -- just not as I intended. Specifically, I want to take the results from the call to the web service and pass those results onto multiple successive tasks that are to execute in parallel, but what I have at the moment executes the first successive task before starting the second.
I've put together a much simplified example of what I'm currently doing that'll hopefully help illustrate this situation. First, the implementation:
public interface IConfigurationSettings
{
int? ConfigurationSetting { get; set; }
}
public interface IPrintCommandHandler
{
System.Threading.Tasks.Task<bool> ExecuteAsync(byte[] reportContent);
}
public interface ISaveCommandHandler
{
System.Threading.Tasks.Task<bool> ExecuteAsync(byte[] reportContent);
}
public interface IWebService
{
System.Threading.Tasks.Task<object> RetrieveReportAsync(string searchToken, string reportFormat);
}
public class ReportCommandHandler
{
private readonly IConfigurationSettings _configurationSettings;
private readonly IPrintCommandHandler _printCommandHandler;
private readonly ISaveCommandHandler _saveCommandHandler;
private readonly IWebService _webService;
public ReportCommandHandler(IWebService webService, IPrintCommandHandler printCommandHandler, ISaveCommandHandler saveCommandHandler, IConfigurationSettings configurationSettings)
{
_webService = webService;
_printCommandHandler = printCommandHandler;
_saveCommandHandler = saveCommandHandler;
_configurationSettings = configurationSettings;
}
public async Task<bool> ExecuteAsync(string searchToken)
{
var reportTask = _webService.RetrieveReportAsync(searchToken, "PDF");
var nextStepTasks = new List<Task<bool>>();
// Run "print" task after report task.
var printTask = await reportTask.ContinueWith(task => _printCommandHandler.ExecuteAsync((byte[]) task.Result));
nextStepTasks.Add(printTask);
// Run "save" task after report task.
if (_configurationSettings.ConfigurationSetting.HasValue)
{
var saveTask = await reportTask.ContinueWith(task => _saveCommandHandler.ExecuteAsync((byte[]) task.Result));
nextStepTasks.Add(saveTask);
}
var reportTaskResult = await Task.WhenAll(nextStepTasks);
return reportTaskResult.Aggregate(true, (current, result) => current & result);
}
}
So, the web service (third party, nothing to do with me) has an endpoint for doing a search/lookup that, if successful, returns a reference number (I've called it a search token in my example). This reference number is then used to retrieve the results of the lookup (using a different endpoint) in any of several different formats.
The IWebService interface in this example is representative of an application service I created to manage interaction with the web service. The actual implementation has other methods on it for doing a lookup, ping, etc.
Just to make things more interesting, one of the successive tasks is required (will always execute after the primary task) but the other successive task is optional, execution subject to a configuration setting set elsewhere in the application.
To more easily demonstrate the issue, I created a unit test:
public class RhinoMockRepository : IDisposable
{
private readonly ArrayList _mockObjectRepository;
public RhinoMockRepository()
{
_mockObjectRepository = new ArrayList();
}
public T CreateMock<T>() where T : class
{
var mock = MockRepository.GenerateMock<T>();
_mockObjectRepository.Add(mock);
return mock;
}
public T CreateStub<T>() where T : class
{
return MockRepository.GenerateStub<T>();
}
public void Dispose()
{
foreach (var obj in _mockObjectRepository) obj.VerifyAllExpectations();
_mockObjectRepository.Clear();
}
}
[TestFixture]
public class TapTest
{
private const string SearchToken = "F71C8B50-ECD1-4C02-AD3F-6C24F1AF3D9A";
[Test]
public void ReportCommandExecutesPrintAndSave()
{
using (var repository = new RhinoMockRepository())
{
// Arrange
const string reportContent = "This is a PDF file.";
var reportContentBytes = System.Text.Encoding.Default.GetBytes(reportContent);
var retrieveReportResult = System.Threading.Tasks.Task.FromResult<object>(reportContentBytes);
var webServiceMock = repository.CreateMock<IWebService>();
webServiceMock.Stub(x => x.RetrieveReportAsync(SearchToken, "PDF")).Return(retrieveReportResult);
var printCommandHandlerMock = repository.CreateMock<IPrintCommandHandler>();
var printResult = System.Threading.Tasks.Task.FromResult(true);
printCommandHandlerMock
.Expect(x => x.ExecuteAsync(reportContentBytes))
//.WhenCalled(method => System.Threading.Thread.Sleep(TimeSpan.FromSeconds(2)))
.Return(printResult);
var configurationSettingsStub = repository.CreateStub<IConfigurationSettings>();
configurationSettingsStub.ConfigurationSetting = 10;
var saveCommandHandlerMock = repository.CreateMock<ISaveCommandHandler>();
var saveResult = System.Threading.Tasks.Task.FromResult(true);
saveCommandHandlerMock.Expect(x => x.ExecuteAsync(reportContentBytes)).Return(saveResult);
// Act
var reportCommandHandler = new ReportCommandHandler(webServiceMock, printCommandHandlerMock, saveCommandHandlerMock, configurationSettingsStub);
var result = System.Threading.Tasks.Task
.Run(async () => await reportCommandHandler.ExecuteAsync(SearchToken))
.Result;
// Assert
Assert.That(result, Is.True);
}
}
}
Ideally, on completion of the call to RetrieveReportAsync() on IWebService both the "print" and "save" command handlers should be executed simultaneously, having received a copy of the results from RetrieveReportAsync(). However, if the call to WhenCalled... in the unit test is uncommented, and on stepping through the implementation of ReportCommandHandler.ExecuteAsync(), you can see that the "print" command executes and completes before it gets to the "save" command. Now, I am aware that the whole point of await is to suspend execution of the calling async method until the awaited code completes, but it isn't clear to me how to instantiate both the "print" and "save" commands (tasks) as continuations of the "report" task such that they both execute in parallel when the "report" task completes, and the "report" command is then able to return a result that is based on the results from both the "print" and "save" commands (tasks).
Your question really involves addressing two different goals:
How to wait for a task?
How to execute two other tasks concurrently?
I find the mixing of await and ContinueWith() in your code confusing. It's not clear to me why you did that. One of the key things await does for you is to automatically set up a continuation, so you don't have to call ContinueWith() explicitly. Yet, you do anyway.
On the assumption that's simply a mistake, out of lack of full understanding of how to accomplish your goal, here's how I'd have written your method:
public async Task<bool> ExecuteAsync(string searchToken)
{
var reportContent = (byte[]) await _webService.RetrieveReportAsync(searchToken, "PDF");
var nextStepTasks = new List<Task<bool>>();
// Run "print" task after report task.
var printTask = _printCommandHandler.ExecuteAsync(reportContent);
nextStepTasks.Add(printTask);
// Run "save" task after report task.
if (_configurationSettings.ConfigurationSetting.HasValue)
{
var saveTask = _saveCommandHandler.ExecuteAsync(reportContent);
nextStepTasks.Add(saveTask);
}
var nextStepResults = await Task.WhenAll(nextStepTasks);
// Succeed only if every follow-up task succeeded, as in your original code.
return nextStepResults.Aggregate(true, (current, result) => current & result);
}
In other words, do await the original task first. Then you know it's done and have its result. At that time, go ahead and start the other tasks, adding their Task object to your list, but not awaiting each one individually. Finally, await the entire list of tasks.
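To see the difference in isolation, here is a small self-contained sketch. `HandleAsync` is a made-up stand-in for the print and save handlers; it logs when it starts and ends with a delay in between. Because neither task is awaited individually, the second handler starts while the first is still waiting.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class FanOutDemo
{
    // Hypothetical handler: records its start and end around an asynchronous delay.
    private static async Task<bool> HandleAsync(string name, ConcurrentQueue<string> log)
    {
        log.Enqueue(name + " start");
        await Task.Delay(50);
        log.Enqueue(name + " end");
        return true;
    }

    public static async Task<string[]> RunAsync()
    {
        var log = new ConcurrentQueue<string>();

        // Start both handlers without awaiting either one individually.
        Task<bool> print = HandleAsync("print", log);
        Task<bool> save = HandleAsync("save", log);

        // Wait for both; the log shows that both started before either finished.
        await Task.WhenAll(print, save);
        return log.ToArray();
    }
}
```

This only gives real concurrency if the handlers are genuinely asynchronous. A handler that blocks before its first `await` (like the `Thread.Sleep` in the commented-out `WhenCalled` of the test) would still hold up the caller until it returns its task.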

Share queue with two or more stateful services within Service Fabric

Is it possible to share a queue between 2 or more stateful services, or do I need to directly call it via tcp/http to put a message on its own internal queue?
For example; say I have my first service that puts an order on a queue based on a condition:
public sealed class Service1 : StatefulService
{
public Service1(StatefulServiceContext context, IReliableStateManagerReplica reliableStateManagerReplica)
: base(context, reliableStateManagerReplica)
{ }
protected override async Task RunAsync(CancellationToken cancellationToken)
{
var customerQueue = await this.StateManager.GetOrAddAsync<IReliableQueue<Order>>("orders");
while (true)
{
cancellationToken.ThrowIfCancellationRequested();
using (var tx = this.StateManager.CreateTransaction())
{
if (true /* some logic here */)
{
await customerQueue.EnqueueAsync(tx, new Order());
}
await tx.CommitAsync();
}
}
}
}
Then my second service reads from that queue and then continues the processing.
public sealed class Service2 : StatefulService
{
public Service2(StatefulServiceContext context, IReliableStateManagerReplica reliableStateManagerReplica)
: base(context, reliableStateManagerReplica)
{ }
protected override async Task RunAsync(CancellationToken cancellationToken)
{
var customerQueue = await this.StateManager.GetOrAddAsync<IReliableQueue<Order>>("orders");
while (true)
{
cancellationToken.ThrowIfCancellationRequested();
using (var tx = this.StateManager.CreateTransaction())
{
var value = await customerQueue.TryDequeueAsync(tx);
if (value.HasValue)
{
// Continue processing the order.
}
await tx.CommitAsync();
}
}
}
}
I can't see much in the documentation on this. I can see that the GetOrAddAsync method can take a URI, but I've seen no examples of how this works, or whether it can even work across services.
The idea behind this is to split up the processing onto separate queues so that we don't end up in an inconsistent state when we retry a message.
There's no way to share state across services. The state manager acts at the service partition level.
You could use an external queue for this purpose, like Service Bus.
You could also invert control by using an event-driven approach. Service 1 would raise an event that Service 2 would use as a trigger to continue processing. The data to process could be inside the event, or stored in another location and referenced from the event.

Task chaining without TaskCompletionSource?

I'm converting some async/await code to chained tasks, so I can use it in the released framework. The await code looks like this
public async Task<TraumMessage> Get() {
var message = await Invoke("GET");
var memorized = await message.Memorize();
return memorized;
}
where
Task<TraumMessage> Invoke(string verb) {}
Task<TraumMessage> Memorize() {}
I was hoping to chain Invoke and Memorize to return the task produced by Memorize, but that results in a Task<Task<TraumMessage>>. The solution I've ended up with is a TaskCompletionSource<TraumMessage> as my signal:
public Task<TraumMessage> Get() {
var completion = new TaskCompletionSource<TraumMessage>();
Invoke("GET").ContinueWith( t1 => {
if(t1.IsFaulted) {
completion.SetException(t1.Exception);
return;
}
t1.Result.Memorize().ContinueWith( t2 => {
if(t2.IsFaulted) {
completion.SetException(t2.Exception);
return;
}
completion.SetResult(t2.Result);
});
});
return completion.Task;
}
Is there a way to accomplish this without the TaskCompletionSource?
Yes, the framework comes with a handy Unwrap() extension method for exactly what you want.
Invoke("GET").ContinueWith( t => t.Result.Memorize() ).Unwrap();
If you're doing cancellation then you'll need to pass cancel tokens into the appropriate places, obviously.
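As a runnable illustration of the same shape, with `Task<int>` stand-ins for the question's `Invoke` and `Memorize` (the demo names and values are made up):

```csharp
using System.Threading.Tasks;

public static class UnwrapDemo
{
    // Hypothetical stand-ins for Invoke("GET") and Memorize().
    private static Task<int> Invoke() => Task.FromResult(20);
    private static Task<int> Memorize(int value) => Task.Run(() => value + 1);

    public static Task<int> Get()
    {
        // ContinueWith yields Task<Task<int>>; Unwrap flattens it to Task<int>.
        Task<Task<int>> nested = Invoke().ContinueWith(t => Memorize(t.Result));
        return nested.Unwrap();
    }
}
```

The unwrapped task completes only when the inner task does, and it faults if either the continuation or the inner task throws, which covers what the TaskCompletionSource version handled by hand.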
I think that's pretty much the only way to accomplish what you want. Chaining disparate Tasks together isn't supported by the continuation APIs, so you have to resort to using a TaskCompletionSource like you have to coordinate the work.
I don't have the Async CTP installed on this machine, but why don't you take a look at the code with a decompiler (or ILDASM if you know how to read IL) to see what it's doing. I bet it does something very similar to your TCS code under the covers.
You can use attached child tasks. The parent task will only transition into the completed status when all child tasks are complete. Exceptions are propagated to the parent task.
You will need a result holder, because the result is assigned after the parent task's delegate has finished, but it will be set by the time the parent task's continuations run.
Like this:
public class Holder<T> where T: class
{
public T Value { get; set; }
}
public Task<Holder<TraumMessage>> Get() {
var invokeTask = Invoke("GET");
var result = invokeTask.ContinueWith<Holder<TraumMessage>>(t1 => {
var holder = new Holder<TraumMessage>();
var memorizeTask = t1.Result.Memorize();
memorizeTask.ContinueWith(t2 => {
holder.Value = t2.Result;
}, TaskContinuationOptions.AttachedToParent);
return holder;
});
return result;
}
