First of all, I couldn't come up with a more descriptive title, so I will lay out the problem and then my solution to it.
I'm implementing a backend in ASP.NET Core for our game. We have a few requests that are somewhat large, like requesting the items we offer in the store. Every user who starts the game loads the store info, which makes a database trip to pull the entire store catalog, even though it RARELY changes (less than once a month). So we are making thousands of database trips that aren't needed.
On top of that, we return timestamps for when each item's image last changed. The images are stored in a blob, so I have to query the blob for the change date, which makes the request considerably costlier.
To solve all of this, I implemented a small class to cache the response until we need to update it, for this request and some others, but I'm not sure if I'm looking at this correctly.
here is the base abstract class:
public abstract class CachedModel<T>
{
    protected T Model { get; set; }
    private readonly SemaphoreSlim semaphore = new SemaphoreSlim(1, 1);

    protected abstract Task ThreadSafeUpdateAsync();
    protected abstract bool NeedsUpdate();

    public async Task<T> GetModel()
    {
        if (NeedsUpdate())
        {
            await semaphore.WaitAsync();
            try
            {
                if (NeedsUpdate()) // not sure if this is needed; can other threads enter here after the first one already updated the object?
                    await ThreadSafeUpdateAsync();
            }
            finally
            {
                semaphore.Release();
            }
        }
        return Model;
    }
}
And then I implement this class per request, like this:
public class CachedStoreInfo : CachedModel<DesiredModel>
{
    protected override async Task ThreadSafeUpdateAsync()
    {
        // make the trip to the db and blob service
        Model = someResult;
    }

    protected override bool NeedsUpdate()
    {
        return someLogicToDecideIfNeedsUpdate;
    }
}
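For concreteness, here is a self-contained sketch of one way to fill in those placeholders: a time-based NeedsUpdate with a one-hour TTL. The TTL, the fake load delay, and the string payload are assumptions, not from the question; the base class is repeated so the sample compiles on its own.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public abstract class CachedModel<T>
{
    protected T Model { get; set; }
    private readonly SemaphoreSlim semaphore = new SemaphoreSlim(1, 1);

    protected abstract Task ThreadSafeUpdateAsync();
    protected abstract bool NeedsUpdate();

    public async Task<T> GetModel()
    {
        if (NeedsUpdate())
        {
            await semaphore.WaitAsync();
            try
            {
                // Second check: threads queued on the semaphore re-test,
                // so only the first one actually performs the update.
                if (NeedsUpdate())
                    await ThreadSafeUpdateAsync();
            }
            finally
            {
                semaphore.Release();
            }
        }
        return Model;
    }
}

public class TtlCachedValue : CachedModel<string>
{
    private static readonly TimeSpan Ttl = TimeSpan.FromHours(1);
    private DateTime lastUpdatedUtc = DateTime.MinValue;
    public int UpdateCount; // exposed so the behavior can be observed in a test

    protected override async Task ThreadSafeUpdateAsync()
    {
        await Task.Delay(10);   // stand-in for the db/blob trip
        Model = "store-info";
        UpdateCount++;
        lastUpdatedUtc = DateTime.UtcNow;
    }

    protected override bool NeedsUpdate()
        => DateTime.UtcNow - lastUpdatedUtc > Ttl;
}
```

With this shape, concurrent callers all get the same cached value and the expensive load runs once per TTL window, which matches the double-checked pattern in the base class.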
Finally, in the ASP.NET controller, all I need to do is this:
[HttpGet]
public async Task<DesiredModel> GetStoreInfo()
{
    return await cachedStoreInfo.GetModel();
}
Is this a proper implementation? And is this even necessary, or is there a smarter way to achieve this? Getting the timestamps from the blob was the main reason I thought about caching the result.
Your implementation looks correct. Of course, the instance of CachedStoreInfo should be a singleton in the required scope (as I understand it, in your case it should be a singleton for the whole application).
can other threads enter here after the first one already updated the object?
As Kevin Gosse noted, other threads can enter there. Your second check of NeedsUpdate() is part of the double-checked locking pattern, and it can be a good optimization.
and is this even necessary or there is a smarter way to achieve this?
As for me, your implementation is minimalist and smart enough.
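As a sketch of that singleton wiring, assuming the standard ASP.NET Core dependency injection container (the StoreController name is an assumption; CachedStoreInfo and GetModel come from the question):

```csharp
// In ConfigureServices (or on the WebApplication builder):
// registering the cache as a singleton means every request shares the
// same instance, and the SemaphoreSlim inside it guards the updates.
services.AddSingleton<CachedStoreInfo>();

// The controller then receives the shared instance via constructor injection:
public class StoreController : ControllerBase
{
    private readonly CachedStoreInfo cachedStoreInfo;

    public StoreController(CachedStoreInfo cachedStoreInfo)
    {
        this.cachedStoreInfo = cachedStoreInfo;
    }

    [HttpGet]
    public async Task<DesiredModel> GetStoreInfo()
    {
        return await cachedStoreInfo.GetModel();
    }
}
```

If you registered it as scoped or transient instead, each request would get a fresh, empty cache and the pattern would silently stop working, so the lifetime choice matters here.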
How can I prevent synchronous database access with Entity Framework Core? e.g. how can I make sure we are calling ToListAsync() instead of ToList()?
I've been trying to get an exception thrown when unit testing a method that calls the synchronous API. Are there configuration options or some methods we could override to make this work?
I have tried using a DbCommandInterceptor, but none of the interceptor methods are called when testing with an in-memory database.
The solution is to use a command interceptor.
public class AsyncOnlyInterceptor : DbCommandInterceptor
{
    public bool AllowSynchronous { get; set; } = false;

    public override InterceptionResult<int> NonQueryExecuting(DbCommand command, CommandEventData eventData, InterceptionResult<int> result)
    {
        ThrowIfNotAllowed();
        return result;
    }

    public override InterceptionResult<DbDataReader> ReaderExecuting(DbCommand command, CommandEventData eventData, InterceptionResult<DbDataReader> result)
    {
        ThrowIfNotAllowed();
        return result;
    }

    public override InterceptionResult<object> ScalarExecuting(DbCommand command, CommandEventData eventData, InterceptionResult<object> result)
    {
        ThrowIfNotAllowed();
        return result;
    }

    private void ThrowIfNotAllowed()
    {
        if (!AllowSynchronous)
        {
            // NotAsyncException is a custom exception type
            throw new NotAsyncException("Synchronous database access is not allowed. Use the asynchronous EF Core API instead.");
        }
    }
}
If you want to write some tests for this, you can use a SQLite in-memory database. The Database.EnsureCreatedAsync() method does use synchronous database access under the hood, so you will need an option to allow it for specific cases.
public partial class MyDbContext : DbContext
{
    private readonly AsyncOnlyInterceptor _asyncOnlyInterceptor;

    public MyDbContext(IOptionsBuilder optionsBuilder)
        : base(optionsBuilder.BuildOptions())
    {
        _asyncOnlyInterceptor = new AsyncOnlyInterceptor();
    }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        optionsBuilder.AddInterceptors(_asyncOnlyInterceptor);
        base.OnConfiguring(optionsBuilder);
    }

    public bool AllowSynchronous
    {
        get => _asyncOnlyInterceptor.AllowSynchronous;
        set => _asyncOnlyInterceptor.AllowSynchronous = value;
    }
}
Here are some helpers for testing. Make sure you aren't using sequences (modelBuilder.HasSequence), because they are not supported by SQLite.
public class InMemoryOptionsBuilder<TContext> : IOptionsBuilder
    where TContext : DbContext
{
    public DbContextOptions BuildOptions()
    {
        var optionsBuilder = new DbContextOptionsBuilder<TContext>();
        var connection = new SqliteConnection("Filename=:memory:");
        connection.Open();
        optionsBuilder = optionsBuilder.UseSqlite(connection);
        return optionsBuilder.Options;
    }
}

public class Helpers
{
    public static async Task<MyDbContext> BuildTestDbContextAsync()
    {
        var optionBuilder = new InMemoryOptionsBuilder<MyDbContext>();
        var context = new MyDbContext(optionBuilder)
        {
            AllowSynchronous = true
        };
        await context.Database.EnsureCreatedAsync();
        context.AllowSynchronous = false;
        return context;
    }
}
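A hypothetical xUnit-style test built on these helpers could look like the following. The People DbSet, the xUnit attributes, and the NotAsyncException type are assumptions carried over from the snippets above, not verified against a real project:

```csharp
public class AsyncOnlyTests
{
    [Fact]
    public async Task SynchronousQueries_Throw()
    {
        using var context = await Helpers.BuildTestDbContextAsync();

        // Async access passes through the interceptor untouched
        var people = await context.People.ToListAsync();

        // Sync access trips the interceptor and throws
        Assert.Throws<NotAsyncException>(() => context.People.ToList());
    }
}
```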
How can I prevent synchronous database access with Entity Framework Core?
You cannot. Period. There is also no reason for this, ever. You are basically assuming that programmers using your API are either idiots or malicious - why else would you try to stop them from doing something that is legal in their language?
I have tried using a DbCommandInterceptor, but none of the interceptor methods are
called when testing with an in-memory database
There are a TON of problems with the in-memory database. I would generally suggest not using it - like, at all - unless you are happy with "possibly works" and never exercising any advanced features of the database. It is a dead end. We never do unit testing against an API like this; all our unit tests are actually integration tests and test end to end, against a real database.
The in-memory provider has no serious guarantee of working for anything non-trivial. Details may be wrong, and you end up writing fake tests and hunting for issues when the real issue is that the in-memory database simply behaves a little differently than the real one. And let's not even get into what you can do with a real database that the in-memory provider has no clue about (and that migrations also do not cover): partial and filtered indices and indexed views are tremendous performance tools that cannot be properly exercised. Nor the subtle differences in things like string comparisons.
But the general conclusion is that it is not your job to stop users from calling valid methods on EF Core, and you will not manage to do it anyway - it is not a scenario the team will ever support. There are at times REALLY good reasons to use synchronous calls - in SOME scenarios the async handling breaks down. I have some interceptors (in the HTTP stack) where async calls just do not work - they never return. Nothing I ever tried worked there, so I make sync calls when I have to (thank heaven I have a ton of caching in there).
You can prevent it at compile time, to some degree, by using the Microsoft.CodeAnalysis.BannedApiAnalyzers NuGet package. More information about it here.
Methods that end up executing synchronous queries can then be added to BannedSymbols.txt, and you will get a compiler warning when attempting to use them. For example, adding the following line to BannedSymbols.txt gives a warning when using First() on an IQueryable<T>:
M:System.Linq.Queryable.First`1(System.Linq.IQueryable{``0});Use async overload
These warnings can also be escalated to compiler errors by treating warnings as errors, as explained here:
https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/compiler-options/errors-warnings
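As a sketch, the escalation can be scoped to just the banned-API diagnostic in the project file. RS0030 is the BannedApiAnalyzers "symbol is banned" diagnostic ID; treat that ID as an assumption to verify against your analyzer version:

```xml
<PropertyGroup>
  <!-- Escalate only the banned-API warning to a build error -->
  <WarningsAsErrors>$(WarningsAsErrors);RS0030</WarningsAsErrors>
</PropertyGroup>
```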
Unfortunately, not all synchronous methods can be covered by this approach. For example, since ToList() is an extension on IEnumerable<T> (and not on IQueryable<T>), banning it would disallow any use of ToList() in the same project.
I can't really find a good Google answer for you. So my suggestion in the meantime is that you start doing peer review, aka code reviews, and any time you find a .ToList(), you change it to await .ToListAsync().
It's not the most high-tech solution, but it keeps everyone honest, and it also allows others to become familiar with your work should they ever need to maintain it while you're off sick.
I have an ASP.NET Core 3.1 based project written in C#. I am aware that the best time to use async/await is when accessing external resources: pulling data from the database, accessing files, or making an HTTP request. This frees up the thread so more work gets done, instead of the thread sitting around waiting for code to finish.
However, I am trying to figure out at what point using async/await will hurt performance. Does the process of releasing the thread when await is called, and obtaining a thread when the task completes, have a cost?
The code below is called asynchronously. In reality, it does not need to be called asynchronously, since all the code executes in memory and no external requests are made.
public interface ILocator
{
    Task<Model> ExampleAsync();
}

public class ExampleController : Controller
{
    public ILocator Locator { get; set; }

    public ExampleController(ILocator locator)
    {
        Locator = locator;
    }

    public async Task<IActionResult> Example()
    {
        Model model = await Locator.ExampleAsync();
        return View(model);
    }
}

public class Locator : ILocator
{
    public Task<Model> ExampleAsync()
    {
        Model model = new Model();
        model.Status = "New";
        return Task.FromResult(model);
    }
}
Here is the synchronous version of the same code:
public interface ILocator
{
    Model Example();
}

public class ExampleController : Controller
{
    public ILocator Locator { get; set; }

    public ExampleController(ILocator locator)
    {
        Locator = locator;
    }

    public IActionResult Example()
    {
        Model model = Locator.Example();
        return View(model);
    }
}

public class Locator : ILocator
{
    public Model Example()
    {
        Model model = new Model();
        model.Status = "New";
        return model;
    }
}
Will the asynchronous version of the code have a higher cost/lower performance than the synchronous version, due to the unnecessary async/await usage?
When will async/await do more harm than good?
You typically use async/await when performing I/O-bound tasks, like reading from a stream, reading from a DB, or sending something over the network and waiting for a response.
This makes the thread available to do other (CPU-bound) work.
Technically, async/await is slower in terms of raw performance; however, it increases the scalability of your application, since it leaves threads available for other work while they would otherwise be waiting on I/O-bound operations.
I have an ASP.NET Core 3.1 based project written using C#.
That means that async/await will allow you to increase your capacity (in requests per second) a great deal.
at what point using async/await will hurt performance?
It will add a little bit of overhead, of a fixed amount. It will in general be small compared to the cost of the I/O and not really noticeable.
When will async/await do more harm than good?
In a web site/service that will never see a load above roughly 50 requests per second.
And even then the 'harm' will be very small.
Actual numbers depend on the hardware, amount of I/O work etc.
In reality, that code does not need to be called asynchronously since all the code is executed in memory
In that case it will be faster to handle it synchronously.
But I know teams that prefer to have all actions async, just for uniformity. And since the overhead is so small, I consider that a valid approach too.
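One middle ground, when a method really has no awaits: keep the Task-returning signature but skip the async state machine by returning an already-completed task. A minimal self-contained sketch (the Model type here is a stand-in for the question's model):

```csharp
using System.Threading.Tasks;

public class Model
{
    public string Status { get; set; }
}

public interface ILocator
{
    Task<Model> ExampleAsync();
}

public class Locator : ILocator
{
    // No 'async' keyword: since nothing is awaited, wrapping the
    // in-memory result with Task.FromResult avoids the state-machine
    // allocation while keeping the asynchronous signature callers expect.
    public Task<Model> ExampleAsync()
    {
        var model = new Model { Status = "New" };
        return Task.FromResult(model);
    }
}
```

Callers still write `await locator.ExampleAsync()`, and awaiting an already-completed task continues synchronously, so the per-call overhead stays close to the synchronous version.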
So I have a WPF application using the MVVM pattern (Caliburn.Micro). I have the views and view-models wired up, and basically what is missing is the data. The data is to be retrieved on demand, either from a WCF service, local storage, or memory/cache - the reasons being to allow for an offline mode and to avoid unnecessary server communication. Another requirement is that the data is retrieved asynchronously, so the UI thread is not blocked.
So I was thinking to create some kind of "AssetManager" that the viewmodels use to request data:
_someAssetManager.GetSomeSpecificAsset(assetId, OnGetSomeSpecificAssetCompleted)
Note that it is an asynchronous call. I run into a few different problems, though. If the same asset is requested at (roughly) the same time by different view-models, how do we ensure that we don't do unnecessary work and that they both get the same objects to bind against?
I'm not sure I'm taking the right approach. I've been glancing a bit at the Reactive Framework, but I have no idea how to use it in this scenario. Any suggestions on frameworks/techniques/patterns I can use? This seems like a rather common scenario.
Dictionary<int, IObservable<IAsset>> inflightRequests = new Dictionary<int, IObservable<IAsset>>();

public IObservable<IAsset> GetSomeAsset(int id)
{
    var ret = new AsyncSubject<IAsset>();

    lock (inflightRequests)
    {
        // Callers who ask for an in-flight request just get the existing one
        if (inflightRequests.TryGetValue(id, out var existing))
        {
            return existing;
        }

        // Otherwise register the new AsyncSubject in the dictionary
        inflightRequests[id] = ret;
    }

    // Actually do the request and "play it" onto the subject
    GetSomeAssetForReals(id, result =>
    {
        ret.OnNext(result);
        ret.OnCompleted();

        // We're not in flight anymore, remove the item
        lock (inflightRequests) { inflightRequests.Remove(id); }
    });

    return ret;
}
I've had success with method calls that pass in a delegate that gets called when the data is received. You could layer on the requirement of giving everyone the same data (if a request is currently in flight) by checking a boolean field that indicates whether a request is happening. I would keep a local collection of delegates to call, so that when the data is finally received, the class containing them can iterate over them, passing in the newly received data.
Something along these lines:
public interface IViewModelDataLoader
{
    void LoadData(AssignData callback);
}

public delegate void AssignData(IEnumerable<DataObject> results);
The class that actually implements this interface can then keep a running tally of who to notify when the data is done (assuming a singleton model):
public class ViewModelDataLoader : IViewModelDataLoader
{
    private readonly IList<AssignData> callbacksToCall = new List<AssignData>();
    private bool isLoading;

    public void LoadData(AssignData callback)
    {
        callbacksToCall.Add(callback);
        if (isLoading) { return; }
        isLoading = true;

        // Do some long-running code here
        var data = something;

        // Now iterate the list
        foreach (var item in callbacksToCall)
        {
            item(data);
        }
        callbacksToCall.Clear();
        isLoading = false;
    }
}
Using the proxy pattern and events, you can provide both synchronous and asynchronous data. Have your proxy return cached values for synchronous calls, and also notify view-models via events when you receive asynchronous data. The proxy can also be designed to track data requests and throttle server connections (e.g. "reference counting" calls, data-requested/data-received flags, etc.).
I would set up your AssetManager like this:
public interface IAssetManager
{
    IObservable<IAsset> GetSomeSpecificAsset(int assetId);
}
Internally you would need to return a Subject<IAsset> (for instance an AsyncSubject<IAsset>) that you populate asynchronously. Do it right and there is only a single backend call for each call to GetSomeSpecificAsset.
I need to instantiate a singleton object per web request, so that the data is processed once and is valid throughout the request. I was using HttpContext.Current.Items to share data during the HTTP request. Everything was fine until we needed the singleton instance across multiple threads. The first thing I came up with was to pass the HttpContext instance to the new thread:
HttpContext context = HttpContext.Current;
ThreadPool.QueueUserWorkItem(callback =>
{
    HttpContext.Current = context;
    // blah blah
});
Which I don't think is a thread-safe approach, as noted here.
Using Reflector, I figured out that HttpContext.Current.Items actually uses CallContext to store objects for each logical thread. So I changed the singleton interface to this:
public static SingletonType SingletonInstance
{
    get { return CallContext.GetData(key) as SingletonType; }
    set { CallContext.SetData(key, value); }
}
And I simply overwrite SingletonInstance when starting any new thread! The code works fine; however, it seems that under heavy load, CallContext.GetData(key) returns null and the application crashes with a NullReferenceException!
I was wondering whether CallContext.GetData is atomic. But it just doesn't seem right: CallContext is thread-specific data storage and must be atomic, or I am missing the point!
My other guess is that setting SingletonInstance (CallContext.SetData) happens on one thread while CallContext.GetData executes on another, as noted here, but I don't know how or why.
Update:
We keep an instance of each online user in an array on the server. The singleton object is actually a reference to the object representing the current user. The current user must be unique and available on each thread for database querying, logging, error handling, and more. This is how it is done:
public static ApplicationUser CurrentUser
{
    get { return CallContext.GetData("ApplicationUser") as ApplicationUser; }
    set { CallContext.SetData("ApplicationUser", value); }
}
ASP.NET may migrate a request between threads when it's under load. Once a request is received, the page constructor may execute on one thread and page load on another. During this thread switch, CallContext and [ThreadStatic] data are not migrated, but luckily HttpContext is.
This may be misleading, since HttpContext essentially is call context, but this is a little quirk in ASP.NET, probably due to cutting corners to improve performance.
You'll have to remove the dependencies on CallContext and use HttpContext the entire way through.
You can read more details in this terrific blog post by Piers7.
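A minimal sketch of what that change could look like for the CurrentUser property from the question (HttpContext.Items is per-request storage that ASP.NET carries along with the request, unlike CallContext):

```csharp
public static ApplicationUser CurrentUser
{
    get { return HttpContext.Current.Items["ApplicationUser"] as ApplicationUser; }
    set { HttpContext.Current.Items["ApplicationUser"] = value; }
}
```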
This was resolved during a chat session.
In essence, it involved long-running tasks, and using an external service (a web service, or a regular Windows service) was decided to be the best solution to the problem.
Making your second method thread-safe is the best approach.
Here is a thread-safe version of your singleton:
public sealed class SingletonType
{
    #region thread-safe singleton

    private static readonly object _lock = new object();

    private SingletonType() { }

    public static SingletonType SingletonInstance
    {
        get
        {
            // 'key' is the CallContext key string from the question
            if (CallContext.GetData(key) == null)
            {
                lock (_lock)
                {
                    if (CallContext.GetData(key) == null)
                        CallContext.SetData(key, new SingletonType());
                }
            }
            return CallContext.GetData(key) as SingletonType;
        }
    }

    #endregion

    //
    // SingletonType members
    //
}
NOTE: using a lock { } block is the key.
I was wondering if anyone could help me understand whether what I am doing adds a lot of overhead. It currently works, but I am not sure if it could slow down the site.
I have a WorkFlowObj class in which I set all the session variables. This class is instantiated on the pages that need it:
WorkFlowObj wfo = new WorkFlowObj(this.Session, this.Response);
wfo.VendorRedirect();
I need this because I need to keep track of session variables and, at the same time, keep track of a more complicated page workflow in one place. This solution already works for me; the only problem is that I am not sure whether passing around the Session and Response objects creates a lot of OVERHEAD. Can anyone tell me if this is terribly inefficient? Below is the code for the WorkFlowObj class.
public class WorkFlowObj
{
    private System.Web.SessionState.HttpSessionState _pagesession;
    private HttpResponse _HttpResponse;
    private int _userid;
    private string _vendorname;
    // other private vars here

    public int UserID
    {
        get { return _userid; }
    }

    public WorkFlowObj(System.Web.SessionState.HttpSessionState pagesession, HttpResponse _response)
    {
        _pagesession = pagesession;
        _HttpResponse = _response;
        Initialize();
    }

    private void Initialize()
    {
        // initialize variables from session
        _userid = int.Parse(_pagesession["userid"].ToString());
    }

    public void VendorRedirect()
    {
        switch (this._vendorname)
        {
            case "1":
                this._HttpResponse.Redirect(page1);
                break;
            case "2":
                this._HttpResponse.Redirect(page2);
                break;
            // etc.
            default:
                // do stuff
                break;
        }
    }
}
As Rick says, I wouldn't create dependencies on System.Web in your middle-tier objects if you can avoid it.
But if you can't avoid it, you can avoid passing around the Session object by using the static System.Web.HttpContext class. This lets you do something like:
userid = (String)System.Web.HttpContext.Current.Session["userid"];
As long as it's executing on the same thread (and therefore in the same context) as the request from the browser.
I would not create dependencies on System.Web in your workflow objects; just pass the variables the workflow objects need to make decisions and execute business logic. There is no overhead in passing objects around - they are just pointers under the hood.
One issue I could see happening is accidental use of statics in another layer that get tied to your page state, preventing the GC from cleaning up, i.e. the classic out-of-memory exception or app pool recycle.