I have an application that runs ASP.NET Web API on the backend. There's a specific controller that looks something like this:
public class MyController : ApiController
{
    // Web API action methods must be public to be routed.
    public dynamic Post(dynamic postParams)
    {
        var data = retrieveData(postParams.foo, postParams.bar);
        writeDataToDb(data);
        return data;
    }
}
So there are three operations: retrieving data based on the post parameters, writing the retrieved data to a DB, and returning the retrieved data as JSON to the client.
This could be greatly sped up if the DB write step did not block returning the JSON to the client, as they are not dependent on one another.
My concern is that if I spin the DB write off on another thread, the Post method returning will cause the process to tear down and could potentially kill the DB write. What's the correct way to handle this scenario?
Should I have a separate handler that I shoot an async web request to so that the DB write will occur in its own process?
I would introduce a separation of concerns. Let one method query the data from the DB (suitable for a GET method), and another update the data in your DB (suitable for a POST or PUT method). That way, retrieving your data would be a lighter, less time-consuming operation.
As a side note, spinning off new threads without registering them with the ASP.NET ThreadPool is dangerous, as IIS may decide to recycle your app at certain times. Therefore, if you're on .NET 4.5.2, make sure to use HostingEnvironment.QueueBackgroundWorkItem to queue work on the ThreadPool. If not, you can use Stephen Cleary's BackgroundTaskManager. I also suggest reading Stephen's article Fire and Forget on ASP.NET.
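For illustration, here is a minimal sketch of the original Post action using HostingEnvironment.QueueBackgroundWorkItem (from System.Web.Hosting); retrieveData and writeDataToDb are the question's own placeholder methods:

using System.Web.Hosting;

public dynamic Post(dynamic postParams)
{
    var data = retrieveData(postParams.foo, postParams.bar);

    // The registered work item delays graceful app shutdown until it finishes
    // (or a grace period elapses), unlike a bare new Thread or Task.Run.
    HostingEnvironment.QueueBackgroundWorkItem(cancellationToken => writeDataToDb(data));

    return data; // the JSON response no longer waits on the DB write
}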
I am using Azure Functions: a Timer Trigger and a Cosmos DB feed trigger.
Timer Trigger Function
I have a Cosmos DB output binding in the Timer Trigger function. I create a connection with a Service Bus client, read messages in a batch, and upload them to Cosmos DB as shown below. My question: will a new Cosmos DB connection be created on every run? If yes, how can I share the connection? How can I improve the Service Bus connection as well, so that a new connection is not created on every run? Am I doing this the right way in terms of performance?
Cosmos DB feed trigger
In this function I have Cosmos DB as a trigger and Service Bus as the output.
Will a new connection be created for every request, or will the function reuse the existing objects for both the Cosmos and SB connections?
AFAIK, there are three recommended ways to share expensive objects (such as connections) across function invocations on a server to improve performance:
Use static client variables: A static client is created once and reused for every function invocation instead of being recreated each time, which saves memory and gives performance benefits. Under light load, only one server instance is created for your functions in the background, so the static variable is reused by every invocation on that instance. If the app scales out to multiple server instances, each instance gets its own static variable, which is still reused by all invocations handled within that instance. Either way, this is much better than creating a new connection for every invocation.
Check out this detailed blog for load-testing proof of this as well.
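As a sketch of this pattern for the timer scenario above (assuming the Azure.Messaging.ServiceBus package, a queue named myqueue, and a connection string in a ServiceBusConnection app setting):

using System;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class TimerFunction
{
    // Created once per host instance and reused by every invocation on it.
    private static readonly ServiceBusClient serviceBusClient =
        new ServiceBusClient(Environment.GetEnvironmentVariable("ServiceBusConnection"));

    [FunctionName("TimerFunction")]
    public static async Task Run([TimerTrigger("0 */5 * * * *")] TimerInfo timer, ILogger log)
    {
        ServiceBusReceiver receiver = serviceBusClient.CreateReceiver("myqueue");
        var messages = await receiver.ReceiveMessagesAsync(maxMessages: 100);
        log.LogInformation($"Received {messages.Count} messages to write to Cosmos DB.");
        // ... hand the batch to the Cosmos DB output binding ...
    }
}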
Use MemoryCache: This would allow you to share a cache between functions. For example:
static MemoryCache memoryCache = MemoryCache.Default;

public static async Task<object> Run(HttpRequestMessage req, TraceWriter log)
{
    var cacheObject = memoryCache["cachedCount"];
    var cachedCount = (cacheObject == null) ? 0 : (int)cacheObject;
    memoryCache.Set("cachedCount", ++cachedCount, DateTimeOffset.Now.AddMinutes(5));
    log.Info($"Webhook triggered memory count {cachedCount}");
    return ...
}
Here the code tries to find the count in the cache, increments it, and saves it with a five-minute expiry. If we copy this same code to two functions within the same Azure Function App, then sure enough each can see the count set by the other one. Note that this cache will lose its contents every time you edit your code.
Check out this blog for more details around this.
Use Dependency Injection: Use DI to create singleton instances and reuse them. Check out Use dependency injection in .NET Azure Functions. ServiceBusClient can be registered for dependency injection with the ServiceBusClientBuilderExtensions, as sketched below.
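A minimal sketch for a .NET isolated-worker Functions app, assuming the Microsoft.Extensions.Azure package and a connection string in a ServiceBusConnection app setting:

// Program.cs
using System;
using Microsoft.Extensions.Azure;
using Microsoft.Extensions.Hosting;

var host = new HostBuilder()
    .ConfigureFunctionsWorkerDefaults()
    .ConfigureServices(services =>
    {
        services.AddAzureClients(clients =>
        {
            // ServiceBusClientBuilderExtensions registers a singleton ServiceBusClient.
            clients.AddServiceBusClient(Environment.GetEnvironmentVariable("ServiceBusConnection"));
        });
    })
    .Build();

host.Run();

Your function classes can then take ServiceBusClient as a constructor parameter and receive the shared singleton.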
Note: Disposal of the static clients is handled automatically by the .NET Core runtime if they implement the IDisposable interface. Don't dispose them manually, otherwise they won't be reused. Further, ensure that the static clients are thread safe. As the MS doc says:

Establishing a connection is an expensive operation that you can avoid by reusing the same factory and client objects for multiple operations. You can safely use these client objects for concurrent asynchronous operations and from multiple threads.
It is safe to instantiate once and share ServiceBusClient.
Additional links:
https://learn.microsoft.com/en-us/azure/azure-functions/manage-connections#static-clients
Let me know if you have any follow-up questions.
Another alternative would be to have a Service Bus trigger and add a document as a message arrives instead of executing a timer and handling a batch. This approach includes the following benefits:
No need to worry about the broker connection as it's handled by Functions.
Less/simpler code. For example, you can receive the message already deserialized into a POCO without going through manual deserialization.
Alignment with an event-driven approach and not a time-based batch.
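A minimal sketch of that alternative (in-process model; the queue name, connection setting names, and the Order POCO are assumptions, and attribute parameter names vary by extension version):

using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class OrderToCosmosFunction
{
    [FunctionName("OrderToCosmos")]
    public static void Run(
        // Functions manages the broker connection and deserializes the body into the POCO.
        [ServiceBusTrigger("orders", Connection = "ServiceBusConnection")] Order order,
        [CosmosDB("MyDatabase", "Orders", ConnectionStringSetting = "CosmosDbConnection")] out object document,
        ILogger log)
    {
        log.LogInformation($"Processing order {order.Id}");
        document = order; // the output binding persists the document
    }
}

public class Order
{
    public string Id { get; set; }
}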
Sorry ahead of time, this is a bit of a lengthy setup/question. I am currently working on an API using C# ASP.NET Core 2.1. I have a POST endpoint which takes about 5-10 seconds to execute (which is fine). I need to add functionality which could take a considerable amount of time to execute; in my current load testing it takes an additional 3 minutes, and to be honest production could take a bit longer, because I can't really get a good answer as to how many of these things we can expect to have to process. From a UX perspective it is not acceptable for the front end to wait this long on the results of the existing POST request, so I need to move this new work off the request path in order to maintain an acceptable UX.
All services are registered as transient using the default ASP.NET Core DI container. This application uses EF Core, which is set up in the same fashion as the services (sorry, I am not at work right now and forget the exact verbiage within the Startup file).
I first tried to just create a background worker, but after the response was sent to the client, internal objects would start to be disposed (i.e. the EF DbContext), and it would eventually throw errors when code continued to execute using said context (which makes sense, since it was being disposed).
I was able to get a background worker mostly working by using the injected IServiceScopeFactory (default ASP.NET Core implementation). All my code executes successfully until I try saving to the DB. We have overridden the SaveChangesAsync() method so that it automatically updates the CreatedByName, CreatedTimestamp, UpdatedByName, and UpdatedTimestamp properties on the currently tracked entities. Since this logic runs in an object created from the IServiceScopeFactory, it does not share the same HttpContext and therefore does not set CreatedByName and UpdatedByName correctly (it tries to set them to null, but the DB columns do not accept null).
Right before I left work, I created something that seemed to work, but it feels very dirty. Instead of using the IServiceScopeFactory within my background worker to create a new scope, I made an impersonated request using the WebClient object, pointed at an endpoint within the same API that was being executed. This did allow the response to be sent back to the client in a timely manner, and it did continue executing the new functionality on the server (updating my entities correctly).
I apologize, I am not currently at work and cannot provide code examples at this moment, but if they are required in order to fully answer this post, I will post some later.
Ideally, I would like to be able to start my request, process the logic within the existing POST, send the response back to the client, and continue executing the new functionality using the same context (including the HttpContext, which contains identity information). My question is: can this be done without creating an impersonated request? Can this be accomplished with a background worker using the same context as the original thread (I know that sounds a bit weird)? Is there another approach that I am completely missing? Thanks ahead of time.
Look into Hangfire, a pretty easy-to-use library for background tasks.
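A minimal sketch of how this could address the question above (MyJobService, DoLongRunningWork, and the routes are hypothetical names; note the user name is captured from HttpContext before enqueuing, since the job runs outside the request):

using Hangfire;
using Microsoft.AspNetCore.Mvc;

// Hypothetical job service; Hangfire resolves it from DI with its own scope.
public class MyJobService
{
    public void DoLongRunningWork(int id, string userName)
    {
        // Long-running logic goes here. userName is passed in explicitly
        // because HttpContext is not available on the background thread.
    }
}

[Route("api/[controller]")]
public class MyController : Controller
{
    [HttpPost]
    public IActionResult Post(int id)
    {
        var userName = User.Identity?.Name; // capture identity while the request is alive

        // Hangfire serializes this call, persists it, and executes it on its
        // background server, so the response returns immediately.
        BackgroundJob.Enqueue<MyJobService>(job => job.DoLongRunningWork(id, userName));

        return Accepted();
    }
}

You would also need to register Hangfire in Startup.ConfigureServices with services.AddHangfire(...) plus services.AddHangfireServer() and a storage provider of your choice.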
I am wondering how to implement a solution that will retrieve data that I have scraped and display it in an ASP.NET MVC web application.
The current implementation scrapes the data and passes it from the controller to the view; however, the request to view the page takes very long, because the scraper runs while the request for the page with scraped data is being processed.
Is there anything I can do to separate the data retrieval from the website?
Currently I have a console application scraper class that scrapes data, and an ASP.NET MVC web application that will display the data. How can I couple them together easily?
Based on system size, I think you can do one of two things:
Periodically scrape data and save it in memory
Periodically scrape data and save it in the database
Obviously, if the scraped data is big you need to store it in the database; otherwise you can keep it in memory and greatly boost performance.
Running tasks periodically in ASP.NET is covered by background workers. One easy way to run tasks periodically is to initiate a thread in Application_Start. I won't go more deeply into the implementation, because it is already answered; you can read it here: Best way to run scheduled tasks
For saving data in memory you can use something like this:
using System.Collections.Concurrent;

public static class Global
{
    // Initialized once; shared by the background scraper and request threads.
    public static ConcurrentBag<ScrapedItem> ScrapedItems = new ConcurrentBag<ScrapedItem>();
}
*Note: it is necessary to use a thread-safe collection, because reading from and adding to this collection will happen on different threads: one from the background worker, one from the request. Alternatively, you can use a lock object when getting/setting a non-thread-safe collection.
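For illustration, a minimal sketch of the Application_Start approach (Scraper.ScrapeAll() is a hypothetical method returning freshly scraped items, and the 15-minute interval is an assumption):

// In Global.asax.cs; requires using System; using System.Threading.Tasks;
protected void Application_Start()
{
    // ... usual MVC route/filter registration ...

    Task.Run(async () =>
    {
        while (true)
        {
            foreach (var item in Scraper.ScrapeAll())
                Global.ScrapedItems.Add(item); // safe: ConcurrentBag is thread-safe

            await Task.Delay(TimeSpan.FromMinutes(15));
        }
    });
}

Keep in mind the caveat mentioned earlier in this thread: IIS can recycle the app pool and tear down unregistered background threads, so the linked "Best way to run scheduled tasks" answer discusses more robust scheduling options.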
Recently, in C# 4.0, the task-based approach to async programming was unveiled, so we have been trying to convert some of our functions which used callbacks earlier.
The problem we are facing is implementing multiple responses for these functions using tasks. For example, we have a function which fetches some data from a third-party API. But before fetching the data from the API, we first check whether we already have it in our in-memory cache or in the DB; only then do we go to the API. The main client application sends a request for a list of symbols for which to fetch data. If we find data for some symbols in the cache or in the DB, we send it immediately via the callback. For the remaining symbols we request the API.
This gives a feeling of real-time processing on the client application for some symbols, and for the other symbols the user gets to know that it will take time. If I do not send responses to the client instantly, and instead first collect all the data and only then send the response for the whole list, the user will be stuck waiting on 99 symbols even if only 1 symbol has to be fetched from the API.
How can I send multiple responses using the task-based approach?
It seems like you want to have an async method that returns more than once. The answer is you can't.
What you can do is:
Call 2 different methods with the same symbols. The first only checks the cache and DB and returns what it can, and the second one only calls the remote API. This way you get what you can quickly from the cache, and the rest arrives more slowly (see the sketch after this list).
Keep using callbacks as a "mini producer-consumer" design, so the callback can be invoked as many times as you like.
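A minimal sketch of the first option, assuming hypothetical GetFromCacheOrDbAsync/FetchMissingFromApiAsync methods, a SymbolData type, and a Render method on the client side:

// Kick both off with the same symbol list; await them in order of expected speed.
public async Task LoadSymbolsAsync(IReadOnlyList<string> symbols)
{
    Task<IReadOnlyList<SymbolData>> fastTask = GetFromCacheOrDbAsync(symbols);    // local data only
    Task<IReadOnlyList<SymbolData>> slowTask = FetchMissingFromApiAsync(symbols); // third-party API

    Render(await fastTask); // show cached/DB symbols immediately
    Render(await slowTask); // fill in the rest when the API answers
}

If you want finer-grained intermediate results than two batches, IProgress<T> (available from .NET 4.5) is the task-friendly replacement for a repeated callback.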
I could try for a more concrete answer if you post the code you're using.
I have an application that, once started, will get some initial data from my database, and after that some functions may update or insert data into it.
Since my database is not on the same computer as the one running the application, and I would like to be able to freely move the application server around, I am looking for a more flexible way to insert/update/query data as needed.
I was thinking of using a web API on a separate thread in my application, with some kind of list where this thread will try to push the updates every X minutes, and if a given entry is updated it will be removed from the list.
This way, instead of being held up by the database queries and such, the application would run freely, queuing whatever has to be updated/inserted, etc.
The main point here is that I can run the functions without worrying about connectivity issues to the database end, or related issues, since all the changes are queued to be applied to it.
Is this approach OK? Bad? Are there better recommendations for this scenario?
On "can access DB through some web server instead of talking directly to DB server": yes this is very common and recommended approach. It is much easier to limit set of operations exposed through custom API (web services, REST services, ...) than restrict direct communication with DB.
On "sync on separate thread..." - you need to figure out what are requirements of the synchronization. Delayed sync may be ok if you don't need to know latest data and not care if updates from client are commited to storage immediately.