send multiple responses using task based approach instead of events/callbacks - c#

Recently, in C# 4.0, the task-based approach for async programming was unveiled, so we have been trying to convert some of our functions that previously used callbacks.
The problem we are facing is implementing multiple responses for those functions using tasks. E.g. we have a function which fetches some data from a third-party API. But before fetching the data from the API, we first check whether we already have it in our in-memory cache or in the DB, and only then go to the API. The main client application sends a request for a list of symbols for which to fetch data. If we find data for some symbols in the cache or in the DB, we send it immediately via the callback. For the remaining symbols we request the API.
This gives a feeling of real-time processing on the client application for some symbols, and for the other symbols the user gets to know that it will take time. If I do not send responses to the client instantly, but first collect all the data and only then send a response for the whole list, the user will be stuck waiting on 99 symbols even if only 1 symbol has to be fetched from the API.
How can I send multiple responses using the task based approach?

It seems like you want to have an async method that returns more than once. The answer is you can't.
What you can do is:
Call two different methods with the same "symbols". The first only checks the cache and DB and returns what it can; the second only calls the remote API. This way you get what you can quickly from the cache, and the rest more slowly.
Keep using callbacks as a "mini producer-consumer" design, so you can invoke the callback as many times as you like.
I could try for a more concrete answer if you post the code you're using.
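In the absence of your code, here is a minimal, self-contained sketch of the first option. The in-memory cache, the simulated API delay, and the method names (`GetCachedAsync`, `FetchRemainingAsync`) are all placeholders, not your actual implementation:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class SymbolFetcher
{
    // Hypothetical in-memory cache standing in for your cache/DB layer.
    static readonly Dictionary<string, string> Cache = new Dictionary<string, string>
    {
        ["MSFT"] = "cached-msft-data"
    };

    // Fast path: returns only what is already available locally.
    public static Task<Dictionary<string, string>> GetCachedAsync(IEnumerable<string> symbols)
    {
        var found = symbols.Where(Cache.ContainsKey)
                           .ToDictionary(s => s, s => Cache[s]);
        return Task.FromResult(found);
    }

    // Slow path: fetches the remaining symbols from the (simulated) third-party API.
    public static async Task<Dictionary<string, string>> FetchRemainingAsync(IEnumerable<string> symbols)
    {
        var result = new Dictionary<string, string>();
        foreach (var s in symbols.Where(s => !Cache.ContainsKey(s)))
        {
            await Task.Delay(100);           // simulate API latency
            result[s] = "api-" + s.ToLower();
        }
        return result;
    }

    static async Task Main()
    {
        var symbols = new[] { "MSFT", "AAPL" };

        // The first await completes immediately with whatever is cached...
        var fast = await GetCachedAsync(symbols);
        Console.WriteLine($"fast: {string.Join(",", fast.Keys)}");

        // ...and the second completes later with the rest.
        var slow = await FetchRemainingAsync(symbols);
        Console.WriteLine($"slow: {string.Join(",", slow.Keys)}");
    }
}
```

The client awaits both tasks independently, so the cached symbols render right away while the API-bound ones trickle in.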


Limited Scaling Azure Function

I have a scenario where I have a stream of URLs against which I need to make an HTTP request. I'll then download the data received and save it in Blob storage. I have to do this using Azure Functions so that I'm only paying for the service when there are actually URLs to process.
However, the difficulty I'm having is conceiving of a way of triggering downloads through a limited number of proxies. Although I'm happy for the download function to scale out to the number of proxies I have available, I want each proxy to deal with each URL it receives in series. In other words, each proxy must be limited to downloading data from one URL at a time.
I considered having URLs in one queue and proxies in another queue and triggering a function when one of each is available, then pushing the used proxy back into the proxy queue, but functions can only take one trigger.
I also considered creating as many queues as there are proxies and distributing URLs between the queues, but I'm not sure how to limit the concurrency on each triggered function to one.
Anybody got an idea how to do this?
Okay, I found a way to do this via this post:
https://medium.com/@yuka1984/azure-functions-%E3%81%AE-singletonattribute%E3%81%A8mode%E3%83%97%E3%83%AD%E3%83%91%E3%83%86%E3%82%A3-bb728062198e
The answer is to add a [Singleton] attribute to the function.
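For reference, the attribute goes on the function itself. A sketch only (the function and queue names are placeholders, and the proxy logic is elided):

```csharp
using Microsoft.Azure.WebJobs;

public static class DownloadFunction
{
    // [Singleton] serializes invocations of this function, so each instance
    // processes one queued URL at a time instead of fanning out concurrently.
    [Singleton]
    [FunctionName("DownloadUrl")]
    public static void Run([QueueTrigger("url-queue")] string url)
    {
        // download the URL through the assigned proxy, then save to Blob storage
    }
}
```

Note that the default singleton scope applies per function; check the SDK docs for `SingletonScope` if you want one lock per proxy rather than one global lock.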
However, according to this comment, you are spending money while your entities are awaiting processing:
https://github.com/Azure/azure-functions-host/issues/912#issuecomment-419608830

Invoke NServiceBus Saga as a single awaitable request-response

Consider a web application that implements every database action except querying (i.e. add, update, remove) as an NServiceBus message, so that whenever a user calls a web API, the back end maps it to an await endpointInstance.Request call and returns the response over the same HTTP connection.
The challenge is when a message handler needs to send some other messages and wait for their responses to finish its job. NServiceBus does not allow calling Request inside a message handler.
I ended up using a Saga to implement message handlers that rely on responses from other message handlers. But the problem with a Saga is that I can't send back the result in the same HTTP request, because a Saga uses the publish/subscribe pattern.
All our web APIs must be answered within the same HTTP request (the connection should be kept open until the result is received or a timeout exception occurs).
Is there any clean solution (preferably without using Saga)?
An example scenario:
the user calls http://test.com/purchase?itemId=5&paymentId=133
the web server calls await endpointInstance.Request<PurchaseResult>(new PurchaseMessage(itemId, paymentId));
the PurchaseMessage handler should call await endpointInstance.Request<AddPaymentResult>(new AddPaymentMessage(paymentId));
if the AddPaymentResult was successful, store the purchase details in the database and return true as the PurchaseResult; otherwise return false
You're trying to achieve something that we (at Particular Software) are trying to actively prevent. Let me explain.
With Remote Procedure Calls (RPC) you call another component out-of-process. That is what makes the procedure call 'remote'. Whereas with regular programming you do everything in-process and it is blazing fast, with RPC you have the overhead of serialization, latency and more. Basically, you have to deal with the fallacies of distributed computing.
Still, people do it for various reasons. Sometimes it's because you want to use a Web API (or an 'old-fashioned' web service) that offers functionality you don't want to develop yourself. The oldest example in the book is searching for an address by postal code, or deducting money from someone's bank account. If you're building a CRM, you can use these remote components. These days a lot of people build distributed monoliths because they are taught at conferences that this is a good thing. In an architecture diagram it looks really nice, but there's still temporal coupling that can cause a lot of headaches.
Some of these headaches come from the fact that you're trying to do stuff in an atomic action. Back in the day, with in-process calling of code/classes/etc., this was easy and fast. Until you hit limitations, like tons of locks on a database.
A solution to this is asynchronous communication. You send some information via fire-and-forget. This solves the temporal coupling. Instead of having a database that is getting dozens and dozens of requests to update data, with your website grinding to a halt as a result, you have various options to make sure this doesn't happen. This is a really good thing, because instead of a single atomic operation, you have various smaller operations and many ways to distribute work, scale your system, etc.
It also brings additional challenges, because not everything is able to work with fire-and-forget. Some systems that were already built try to introduce asynchronous communication via messaging (and hopefully NServiceBus). Some parts can work flawlessly with this. But other parts can't, mainly the user interface (UI), because it was built to get an immediate result. So when you send a message from the UI, you expect a result!
With NServiceBus we've built a package called "Client-Side Callbacks" to make exactly this a possibility. We highly recommend our customers not to use it, except for this specific scenario that I just described. It is much better to migrate your entire UI to be able to deal with the fact that you don't receive an immediate answer, but we understand this is so much work, that not many will be able to achieve this.
However, once that first message has been sent and the UI has received a result, there is no need to use callbacks anymore. As a result, I'd like to propose this scenario:
the user calls http://test.com/purchase?itemId=5&paymentId=133
the web server calls await endpointInstance.Request<PurchaseResult>(new PurchaseMessage(itemId, paymentId));
PurchaseMessage handler retrieves info it needs and sends or publishes a message to (an)other component(s) and then replies back to the web server with an answer.
The next handler works with the send/published message and continues the process
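A minimal sketch of that proposal using the NServiceBus handler API. The message classes and their properties are placeholders for your own contracts, not an actual implementation:

```csharp
using System.Threading.Tasks;
using NServiceBus;

// Placeholder message contracts.
public class PurchaseMessage : ICommand { public int ItemId { get; set; } public int PaymentId { get; set; } }
public class AddPaymentMessage : ICommand { public int PaymentId { get; set; } }
public class PurchaseResult : IMessage { public bool Accepted { get; set; } }

public class PurchaseHandler : IHandleMessages<PurchaseMessage>
{
    public async Task Handle(PurchaseMessage message, IMessageHandlerContext context)
    {
        // Hand the rest of the process off to the payment component...
        await context.Send(new AddPaymentMessage { PaymentId = message.PaymentId });

        // ...and reply to the web server immediately, completing the callback.
        await context.Reply(new PurchaseResult { Accepted = true });
    }
}
```

The handler for AddPaymentMessage then continues the process on its own, with no open HTTP connection waiting on it.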
Let us know if you need more information. You can always contact us by sending an email to support@particular.net

Switch from up-to-date historical data observable to live data observable without duplicates

An application (Saver) receives live data over a websocket from a remote server and stores it in a database. It exposes a REST endpoint to clients that returns all data stored in the database so far.
A client application subscribes to live data on the remote server's websocket on start up. It then makes the request to Saver's REST endpoint, and receives all data so far.
Both data sources are exposed as IObservable<AType> in the client application.
AType includes a Timestamp property.
How can I combine these two Observables so that they are sequential (by timestamp) without duplicates?
Update: duplicates are not possible within either one of the data sources/Observables, but are possible when they are combined, since the websocket is subscribed to before the REST endpoint is called. They are subscribed to in that order to avoid loss of data.
Ordering a real-time stream by a value without buffering/windowing is not possible. You need to explicitly buffer the merged streams (Observable.Merge) using the Observable.Buffer/Observable.Window operators, and then you can use Enumerable.OrderBy to sort by timestamp.
Here is the relevant issue on the Rx.NET GitHub page: https://github.com/Reactive-Extensions/Rx.NET/issues/122
There is an Observable.Distinct operator, but be careful using it on long-running/high-throughput streams, as it stores the hashes of the values to detect duplicates.
My working solution:
Subscribe to websocket
Retrieve all data so far over REST
If anything has arrived over websocket, get the timestamp of the first item and remove this and anything later from the REST results
Combine the two streams like this:
combinedObservable = truncatedRestData.ToObservable().Concat(socketObservable);
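The truncation step (3) can be sketched with plain LINQ. The AType shape below matches the question's description; the helper name is hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class AType
{
    public DateTime Timestamp { get; set; }
    public string Value { get; set; }
}

static class StreamCombiner
{
    // Drops every REST item at or after the first buffered websocket item's
    // timestamp, so that Concat(restData, socketData) yields no duplicates.
    public static List<AType> TruncateRestData(List<AType> restData, List<AType> socketBuffer)
    {
        if (socketBuffer.Count == 0)
            return restData;

        var cutoff = socketBuffer[0].Timestamp;
        return restData.Where(x => x.Timestamp < cutoff).ToList();
    }
}
```

The combined observable is then built exactly as above: truncatedRestData.ToObservable().Concat(socketObservable).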

ASP.NET Web API Async Responses & Writes

I have an application that runs ASP.NET Web API on the back end. There's a specific controller that looks something like
public class MyController : ApiController
{
    public dynamic Post(dynamic postParams)
    {
        var data = retrieveData(postParams.foo, postParams.bar);
        writeDataToDb(data);
        return data;
    }
}
So there are three operations, the retrieval of data based on the post parameters, writing the retrieved data to a DB, and returning the retrieved data as JSON to the client.
This could be greatly sped up if the DB write step was not blocking returning the JSON to the client, as they are not dependent on one another.
My concern is that if I spin off the DB write off in another thread that the Post method returning will cause the process to tear-down and could potentially kill the DB write. What's the correct way to handle this scenario?
Should I have a separate handler that I shoot an async web request to so that the DB write will occur in its own process?
I would create a separation of concerns: let one method query the data from the DB (suitable for a GET method), and another update the data in your DB (suitable for a POST or PUT method). That way, retrieving your data would be a lighter and less time-consuming operation.
As a side note, spinning off new threads without registering them with the ASP.NET ThreadPool is dangerous, as IIS may decide to recycle your app at certain times. Therefore, if you're on .NET 4.5.2, make sure to use HostingEnvironment.QueueBackgroundWorkItem to queue work on the ThreadPool. If not, you can use Stephen Cleary's BackgroundTaskManager. Also, I suggest reading more in Stephen's article Fire and Forget on ASP.NET.
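If you stay inside the Web API process, the smallest change to the question's controller is to queue the write with the hosting environment rather than a raw thread. A sketch reusing the question's hypothetical retrieveData/writeDataToDb methods (stubbed here, and assumed synchronous):

```csharp
using System.Web.Hosting;
using System.Web.Http;

public class MyController : ApiController
{
    public dynamic Post(dynamic postParams)
    {
        var data = retrieveData(postParams.foo, postParams.bar);

        // ASP.NET tracks this work item, so IIS gives queued work a grace
        // period to finish before recycling the app domain; the response
        // returns without waiting on the DB write.
        HostingEnvironment.QueueBackgroundWorkItem(ct => writeDataToDb(data));

        return data;
    }

    // Placeholders standing in for the question's methods.
    dynamic retrieveData(dynamic foo, dynamic bar) { return null; }
    void writeDataToDb(dynamic data) { }
}
```

There is also an overload taking a `Func<CancellationToken, Task>` if the write is async.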

Proper way to handle thousands of calls to external service from asp.net (mvc)

I'm tasked with creating a web application. I'm currently using C# and ASP.NET (MVC, but I doubt it's relevant to the question). I am a rookie developer and somewhat new to .NET.
Part of the logic in the application I'm building is to make requests to an external SMS gateway by hitting a particular URL with a request, either as part of a user-initiated action in the web app (could be a couple of messages sent) or as part of a scheduled task run daily (could and will be several thousand messages sent).
In relation to the daily task, I am afraid that looping, say, 10,000 times in one thread (especially if I'm also to take action depending on the response of each request, like writing to a DB) is not the best strategy, and that I could gain some performance/time savings from some parallelization.
Ultimately I'm more afraid that thousands of users at the same time (very likely) will perform the action that triggers a request. With a naive implementation that spawns some kind of background thread (whatever it's called) for each request, I fear a scenario with hundreds/thousands of requests at once.
So if my assumptions are correct, how do I deal with this? Do I have to manually spawn some appropriate number of new Thread()s and coordinate their work via a producer/consumer-like queue, or is there some easier way?
Cheers
If you have to make 10,000 requests to a service then it means that the service's API is anemic - probably CRUD-based, designed as a thin wrapper over a database instead of an actual service.
A single "request" to a well-designed service should convey all of the information required to perform a single "unit of work" - in other words, those 10,000 requests could very likely be consolidated into one request, or at least a small handful of requests. This is especially important if requests are going to a remote server or may take a long time to complete (and 2-3 seconds is an extremely long time in computing).
If you do not have control over the service, if you do not have the ability to change the specification or the API - then I think you're going to find this very difficult. A single machine simply can't handle 10,000 outgoing connections at once; it will struggle with even a few hundred. You can try to parallelize this, but even if you achieve a tenfold increase in throughput, it's still going to take half an hour to complete, which is the kind of task you probably don't want running on a public-facing web site (but then, maybe you do, I don't know the specifics).
Perhaps you could be more specific about the environment, the architecture, and what it is you're trying to do?
In response to your update (possibly having thousands of users all performing an action at the same time that requires you to send one or two SMS messages for each):
This sounds like exactly the kind of scenario where you should be using Message Queuing. It's actually not too difficult to set up a solution using WCF. Some of the main reasons why one uses a message queue are:
There are a large number of messages to send;
The sending application cannot afford to send them synchronously or wait for any kind of response;
The messages must eventually be delivered.
And your requirements fit this like a glove. Since you're already on the Microsoft stack, I'd definitely recommend an asynchronous WCF service backed by MSMQ.
If you are working with SOAP, or some other type of XML request, you may not have an issue dealing with that level of requests in a loop.
I set up something similar using a SOAP server with 4-5K requests with no problem...
A SOAP request to a web service (assuming .NET 2.0 and superior) looks something like this:
WebServiceProxyClient myclient = new WebServiceProxyClient();
myclient.SomeOperation(parameter1, parameter2);
myclient.Close();
I'm assuming that this code will be embedded in the business logic that is triggered as part of the user-initiated action or as part of the scheduled task.
You don't need to do anything special in your code to cope with a high volume of users. That is really a matter of scaling your platform.
When you say 10,000 requests, what do you mean? 10,000 requests per second/minute/hour? Your page hits per day? Etc.
I'd also look into using an AsyncController, so that your site doesn't quickly become completely unusable.
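If you do end up parallelizing the calls from your own process, you don't need raw Thread objects or a hand-rolled producer/consumer queue; a SemaphoreSlim over Tasks gives you bounded concurrency in a few lines. A sketch where Task.Delay stands in for the HTTP call to the SMS gateway:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class ThrottledSender
{
    // Runs all work items concurrently, but never more than maxConcurrency at once.
    public static async Task<int> SendAllAsync(IEnumerable<int> messageIds, int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var sent = 0;
            var tasks = messageIds.Select(async id =>
            {
                await gate.WaitAsync();      // blocks (asynchronously) when the gate is full
                try
                {
                    await Task.Delay(10);    // stands in for the call to the SMS gateway
                    Interlocked.Increment(ref sent);
                }
                finally
                {
                    gate.Release();
                }
            });
            await Task.WhenAll(tasks);
            return sent;
        }
    }

    static async Task Main()
    {
        var sent = await SendAllAsync(Enumerable.Range(0, 100), 8);
        Console.WriteLine($"sent {sent} messages");
    }
}
```

The same pattern works inside a scheduled task or behind a queue consumer; only maxConcurrency needs tuning.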
