how to cache data in a stateful service? - c#

The cluster needs access to a dataset that lives in SQL Server, outside of the cluster.
Rather than forcing remote calls to the database for every request, I would like to create a stateful service that will periodically refresh its cache with data from the remote database.
Would we be looking at something like the following?
using System.Collections.Generic;
using System.Fabric;
using System.IO;
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.ServiceFabric.Data;
using Microsoft.ServiceFabric.Services.Communication.AspNetCore;
using Microsoft.ServiceFabric.Services.Communication.Runtime;
using Microsoft.ServiceFabric.Services.Runtime;

internal sealed class StatefulBackendService : StatefulService
{
    public StatefulBackendService(StatefulServiceContext context)
        : base(context)
    {
    }

    /// <summary>
    /// Optional override to create listeners (like TCP, HTTP) for this service instance.
    /// </summary>
    /// <returns>The collection of listeners.</returns>
    protected override IEnumerable<ServiceReplicaListener> CreateServiceReplicaListeners()
    {
        return new ServiceReplicaListener[]
        {
            new ServiceReplicaListener(
                serviceContext =>
                    new KestrelCommunicationListener(
                        serviceContext,
                        (url, listener) =>
                        {
                            ServiceEventSource.Current.ServiceMessage(serviceContext, $"Starting Kestrel on {url}");

                            return new WebHostBuilder()
                                .UseKestrel()
                                .ConfigureServices(
                                    services => services
                                        // Expose the state manager and context to the controllers via DI.
                                        .AddSingleton<IReliableStateManager>(this.StateManager)
                                        .AddSingleton<StatefulServiceContext>(serviceContext))
                                .UseContentRoot(Directory.GetCurrentDirectory())
                                .UseServiceFabricIntegration(listener, ServiceFabricIntegrationOptions.UseUniqueServiceUrl)
                                .UseStartup<Startup>()
                                .UseUrls(url)
                                .Build();
                        }))
        };
    }
}
Within this stateful service, how would I load data from a remote database and serve it through controllers?
Let's assume we have a simple model:
CREATE TABLE Account ([Name] varchar(50), [Key] int)
I imagine that the operations would be in the following order:
Load the Account table into memory
Respond to requests such as http://statefulservice/account?$top=10
Refresh the data in the service on a time-interval basis
What are the data types that I should be using to cache this data? What would be the process for loading the data into the stateful service from a SQL Server database?

IMHO, even though it's possible to use Stateful services as a cache backed by some database, the real power comes when you keep your data in the reliable collections only. With Service Fabric and Reliable Collections, you can store data directly in your service without the need for an external persistent store. See Application scenarios. Aside from providing high availability and low latency, the state is reliably replicated across multiple nodes, so it can survive a node failure; moreover, there is a Backup and restore feature that lets you deal even with an entire cluster outage.
There are many things you should know about when dealing with Reliable Services: Service partitioning, Transactions and lock modes, Guidelines and recommendations, etc.
As for the data types, explore Reliable Collection object serialization and Serialization and Upgrade.
Another thing you should also be aware of is that Reliable Dictionary periodically removes least recently used values from memory, which could increase read latencies in certain cases. See more here - Service fabric reliable dictionary linq query very slow.
A simple example of integrating controllers and the StateManager can be found in this article.
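If the SQL Server table does have to remain the source of truth, the periodic refresh described in the question could live in the service's RunAsync override. A minimal sketch, assuming the Account table from the question, a connectionString read from your configuration package, and an arbitrary 5-minute refresh interval:

protected override async Task RunAsync(CancellationToken cancellationToken)
{
    // RunAsync executes on the primary replica only.
    var accounts = await this.StateManager
        .GetOrAddAsync<IReliableDictionary<int, string>>("accounts");

    while (true)
    {
        cancellationToken.ThrowIfCancellationRequested();

        // Pull the Account table and upsert the cached entries.
        using (var connection = new SqlConnection(connectionString)) // connectionString: assumed to come from config
        using (var command = new SqlCommand("SELECT [Key], [Name] FROM Account", connection))
        {
            await connection.OpenAsync(cancellationToken);
            using (var reader = await command.ExecuteReaderAsync(cancellationToken))
            using (var tx = this.StateManager.CreateTransaction())
            {
                while (await reader.ReadAsync(cancellationToken))
                {
                    await accounts.SetAsync(tx, reader.GetInt32(0), reader.GetString(1));
                }
                await tx.CommitAsync();
            }
        }

        // Refresh interval is an assumption; tune it to your staleness tolerance.
        await Task.Delay(TimeSpan.FromMinutes(5), cancellationToken);
    }
}

Controllers can then read from the same IReliableDictionary through the IReliableStateManager that the listener above registers as a singleton.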

Here's a little more info related to your comment...
Hey m8... reliable collections are designed to run as multiple instances (they run on more than one node at a time). Within each instance the data is partitioned into one or more groups (how you decide to partition is entirely up to you). So there is load distribution and failover; there is more to say, but I don't want to muddy the waters, so I'm attempting to keep it high level. This type of service keeps its data in reliable collections in memory, and the data can be "backed up". If you want your data formally written to disk, and you want more control over WHEN it is written to disk, you will need to take a look at Actors. This is a good (very simple) collection of examples of Service Fabric, reliable collections, and wiring up internal communications. The only thing funky about this one is that there are a lot of different 'recipes' used to facilitate the back-end and communication from the back-end to the public (stateless) side.
I see you added to your question and changed the intent a little... so I will pointedly tell you what I 'think' you need for what you are really after. You want one or multiple Stateful Services (this is your data service layer). This can be abstracted into 3 components if you want: the stateful service itself, plus 2 class libraries, one for your service interface and one for your contracts, or rather your data models (basically POCOs). You would include the 2 class libraries in your stateful service and use them to create dictionary entries (probably something like new IReliableDictionary...) and bind the interface.
You will want to use (add to) the IService interface; you will need to grab the 'Service Fabric Remoting' NuGet package for the interface project you created, and there is plenty of info out there on how to achieve remoting within Service Fabric, as it is a standard communication method.
There is more, but simply building this would be a viable experiment and would effectively take the place of your database. You can formally persist the data to disk using Actors or the simple backup method that comes canned with Service Fabric. Essentially I suggest you build this in order to firm up the fact that you can completely remove the database from this scenario... you really don't need it. What I have described above takes the place of the db ONLY; without writing a front-end for this (one that uses remoting to communicate with your backend), it would not be accessible to the public... at least not easily.
TL;DR - Basically I'm agreeing with what one of your other contributors is stating... my opinion is less humble, so I'll simply state it: your application will be less complicated, faster and more reliable if you handle your data within Service Fabric. Still TL;DR? - Ditch the db, my man. If you are really nervous about it only existing in memory, use Actors.
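To make the remoting suggestion concrete, here is a hedged sketch; the contract, service name and URI are hypothetical, not taken from the question:

// In a shared interface class library (requires the Service Fabric Remoting package):
public interface IAccountService : IService
{
    Task<IList<Account>> GetAccountsAsync(int top);
}

// Calling the stateful service from elsewhere in the cluster:
var proxy = ServiceProxy.Create<IAccountService>(
    new Uri("fabric:/MyApp/AccountService"),     // hypothetical application/service name
    new ServicePartitionKey(0));                 // depends on your partitioning scheme
IList<Account> accounts = await proxy.GetAccountsAsync(10);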

Related

Microservices design part in WebApi

Hi, I am trying to create a project skeleton that uses the CQRS pattern and some external services. Below is the structure of the solution.
WebApi
Query Handlers
Command Handlers
Repository
ApiGateways (here are the interfaces and implementations of the microservice calls)
We want to keep the controllers thin, so we are using query handlers and command handlers to handle the respective operations.
However, we use external microservices to get the data; we call them from the query handlers.
All the HTTP client construction and calls are abstracted in them. The response is converted to a view model and passed back to the query handler.
We named this part ApiGateways, but it is not composed from multiple services.
What do we call this part of our solution? A proxy or something? Any good example of thin controllers and microservice architecture?
We name it as API Gateways. But it is not composed from multiple services. What do we call this part of our solution? A proxy or something? Any good example of thin controllers and microservice architecture?
Assumption:
From the image you attached, I see the Command Handler and Query Handler are calling "external/micro-services". I guess that by "external/micro-services" you mean you are calling another micro-service from your current micro-service's handler (Command and Query), and that these "external/micro-services" are part of your architecture and deployed on the same cluster, not some external system that just exposes a public API?
If this is correct I will try to answer based on this assumption.
API Gateway would probably be misleading in this case, as the concept of an API Gateway is something different than what you are trying to do here.
API Gateway per definition:
Quote from here:
An API Gateway is a server that is the single entry point into the system. It is similar to the Facade pattern from object-oriented design. The API Gateway encapsulates the internal system architecture and provides an API that is tailored to each client. It might have other responsibilities such as authentication, monitoring, load balancing, caching, request shaping and management, and static response handling.
What you are actually trying to do is call another micro-service B from a Command or Query Handler in your micro-service A. This is internal micro-service communication that should not be done through the API Gateway; that is the approach for outside calls. By "outside calls" I mean, for example, frontend applications or public API clients that are trying to call your micro-services. In that case, you would use API Gateways.
A better name for this component would be something like "CrossMicroServiceGateway" or "InterMicroServiceGateway"; if you want to do it the full CQRS way, you could have it as a direct call to the other Command or Query and then use something like "QueryGate" or "CommandGate" or similar.
Other suggestions:
WebApi
Query Handlers
Command Handlers
Repository
API Gateways (here are the interfaces and implementation of microservice calls)
This sounds reasonable except for the point about the API Gateway, which I described above. Of course, it is hard for me to tell based on the limited information that I have about your project. To give you a more precise suggestion here, I would need to know whether you use DDD or not, how you use CQRS, and other information.
However, we use external microservices to get the data; we call them from the query handlers. All the HTTP client construction and calls are abstracted in them. The response is converted to a view model and passed back to the query handler.
You could extract all this code/logic that handles the cross-micro-service communication over HTTP or other protocols, handling general responses and similar, into some core library and include it in each of your micro-services as a package. This way, you will reuse the solution for all your micro-services. You can extend that and add all core domain-agnostic things (like data access or repository base classes, wrappers around HTTP, unit-test infrastructure setup, and similar) to that or other shared libraries. This way your micro-services will only focus on the part of the Domain they are supposed to handle.
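As a hedged sketch of such a reusable cross-service gateway (all names are hypothetical):

public interface IUserGateway
{
    Task<UserViewModel> GetUserAsync(Guid id);
}

public class UserGateway : IUserGateway
{
    private readonly HttpClient _client;

    public UserGateway(HttpClient client) => _client = client;

    public async Task<UserViewModel> GetUserAsync(Guid id)
    {
        // Base address, routes and error handling live in the shared library,
        // so each micro-service only consumes the typed interface.
        var response = await _client.GetAsync($"api/users/{id}"); // hypothetical route
        response.EnsureSuccessStatusCode();
        var json = await response.Content.ReadAsStringAsync();
        return JsonConvert.DeserializeObject<UserViewModel>(json);
    }
}

The query handler then takes an IUserGateway in its constructor and stays free of HTTP concerns.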
I think CQRS is the right choice to keep the reading and writing operations decoupled.
The integration with third-party systems (if that's the case) needs some attention.
Do not call these services directly from your handlers; this could lead to various performance and/or maintainability issues.
You have to keep these integrations very well separated, because they are outside of your domain. They may be subject to inefficiencies, changes or a number of problems out of your control.
One solution that I could recommend is a "Middleware" service.
In your application context this can be constituted by another service (again REST, for example) that has the task of talking (and only it) with external systems, acting as a single point of integration between your domain and the external environment. This can be realized from scratch or by using a commercial/open-source solution like (just as an example) this.
This leads to many benefits, some of which are:
A middleware is a unique mockable point during integration tests of your application.
You can change the middleware implementation in the future without touching your handlers.
Of course, changing 3pty providers won't affect your domain services.
The middleware is the single point dedicated to managing 3pty service interruptions.
Your services remain agnostic with respect to the outside world.
Focusing on these questions can be useful when designing your integration middleware service:
Which types of 3pty data do they provide? Is it on time? This might help you figure out whether to introduce a cache system into your integration service.
Can the 3pty be subject to frequent interruptions? Then you must ensure that your system tolerates any disruption of external services. In other words, you must ensure a certain resilience of your services; there are many techniques to do that (see the sketch after this list).
Do you really need to interrogate these 3pty services all the time? Maybe a more or less sophisticated cache system could speed up your services a lot.
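On the resilience point, one common technique in .NET is a retry policy with exponential backoff, for example with the Polly library. A small sketch (the endpoint is hypothetical):

// Retry up to 3 times on transient HTTP failures, backing off 2, 4, 8 seconds.
var retryPolicy = Policy
    .Handle<HttpRequestException>()
    .WaitAndRetryAsync(3, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)));

var response = await retryPolicy.ExecuteAsync(
    () => httpClient.GetAsync("api/quotes/latest")); // hypothetical 3pty endpoint

Combine this with a circuit breaker if the third party can be down for longer periods.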
Finally, it is also very important to understand whether the need for a microservices-oriented system is a real and immediate one.
Since these architectures are more expensive and complex than classic ones, it might be reasonable to start by building a monolithic system and then move towards a more segmented solution later.
Thinking of (organizing) your system as many "bounded contexts" does not prevent you from creating a good monolithic system, and at the same time it prepares you for a possible switch to a microservices-oriented one.
As a summary of the advice: start by keeping things as separate as possible, and define a language to speak about your business model. This lets you change a lot later, when the needs arise, without too much effort during the inevitable evolution of your software. "Hexagonal" architecture is a good starting point for both choices (Microservices vs Monolith).
Recently, Netflix posted a nice article about this architecture with a lot of ideas for a fresh start.
I will give my answer from a DDD and clean-architecture perspective. Ideally, your application should have the following layers.
Api (ideally a very thin layer of controllers). The controller will create queries and commands and push them onto a common channel (refer to MediatR).
Application This will be your orchestration layer. It will contain definitions of queries and commands and their handlers. For queries, you will interact directly with your infrastructure layer. For commands, you will interact with the domain and then save through repositories in infrastructure.
Domain Depending on your business logic and complexity, this layer will contain all your business models.
Infrastructure It will contain mostly two types of objects, Providers and Repositories. Providers should be used with queries and will return DAOs. Repositories should be used wherever the domain is involved, ideally with commands in CQRS. Repositories should always receive and return only domain objects.
So after setting the base context about the different layers of clean architecture, the answer to your original question is: I would create third-party interactions in the provider layer. For example, if you need to connect with a user microservice, I would create a UserProvider in the providers folder of the infrastructure layer and consume it through an interface.
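As a hedged sketch of the Api and Application layers described above (all names are illustrative, assuming MediatR as the common channel):

// Api layer: a thin controller that only dispatches.
[ApiController]
[Route("api/users")]
public class UsersController : ControllerBase
{
    private readonly IMediator _mediator;

    public UsersController(IMediator mediator) => _mediator = mediator;

    [HttpGet("{id}")]
    public async Task<IActionResult> Get(Guid id)
        => Ok(await _mediator.Send(new GetUserQuery(id)));
}

// Application layer: the query and its handler.
public record GetUserQuery(Guid Id) : IRequest<UserDto>;

public class GetUserQueryHandler : IRequestHandler<GetUserQuery, UserDto>
{
    private readonly IUserProvider _provider; // infrastructure provider, e.g. the UserProvider above

    public GetUserQueryHandler(IUserProvider provider) => _provider = provider;

    public Task<UserDto> Handle(GetUserQuery request, CancellationToken cancellationToken)
        => _provider.GetUserAsync(request.Id);
}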

Configuration per Service Fabric Instance

I'm designing a Service Fabric stateless service which requires configuration data for each instance. My initial thought was creating named partitions and using PartitionInfo to get the named key, with a shared read-only dictionary to load settings per instance. The problem is that accessing this instance internally (from other services) now requires a partition key. Since all partitions using this method serve the same data internally, it doesn't matter which partition I connect to (I'd want it to be random). So this gives me many possible ways to fix the problem:
Accessing the partitions (in my attempt above) randomly using ServiceProxy.Create.
Or the following solutions, which don't involve partitions:
A configuration per instance. This post doesn't give much help in coming up with a solution; a configuration section unique to each instance would be the most ideal solution.
Create named instances, and use the name as the username (basically attach a string to a non-partitioned instance).
Get an instance by index, and use the index against a shared read-only dictionary to get the username.
Somehow use InitializationData (see this post) to get a username string (if InitializationData can be unique per instance).
All of the above would solve my issue. Are any of these ways possible?
EDIT: An example of a service I'm trying to create:
Let's say we have a Stack Overflow question service (SOQS for short). For the sake of this example, let's say that one user can be connected to Stack Overflow's websocket at any one time. SOQS's internal methods (published to my service fabric) have one method: GetQuestions(). Each SOQS would need to connect to Stack Overflow with a unique username/password, and as new questions are pushed through the websocket, they are added to an internal list of questions. SOQS's GetQuestions() method (called internally from my service fabric) would then give the same question list. I could then load-balance by adding more instances (as long as I have more usernames/passwords), and the load internal to my fabric could be distributed. I could call ServiceProxy.Create<SOQS>() to connect to a random instance to get my question list.
It sounds like what you are looking for is a service type that hosts multiple actors, with each actor having its own configuration. They wouldn't be multiple copies of the same service with unique configurations; it would be one instance of the service (with replicas, of course) as a singleton, and individual actors for each instance.
As an example, you could have the User Service (guessing at what it is, since you mention a username string) read the list of usernames from some external storage mechanism, with longs as instance ids for internal tracking. The service would then create an actor for each, with its own configuration information. The User Service would then be the router for messaging to and from the individual actors.
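A hedged sketch of that actor-per-username idea with Reliable Actors (the contract and names are hypothetical):

// Shared interface library:
public interface IUserActor : IActor
{
    Task<IList<Question>> GetQuestionsAsync();
}

// The router/service addresses an actor by username:
var actor = ActorProxy.Create<IUserActor>(
    new ActorId("some-username"),                  // one ActorId per username
    new Uri("fabric:/MyApp/UserActorService"));    // hypothetical actor service URI
var questions = await actor.GetQuestionsAsync();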
I'm not entirely sure that this is what you're looking for, but one alternative might be to create an additional configuration service that provides the unique configs per instance. On startup of your stateless service, you simply request a random (or non-random) configuration object, such as a JSON string, and bootstrap the service during initialization. That way you don't have to mess with partitions, since each stateless instance fires its own Startup.cs (or equivalent).
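A hedged sketch of that bootstrap step (the config service endpoint and settings type are hypothetical):

// On startup, ask the configuration service for the next unused config blob.
using (var client = new HttpClient())
{
    var json = await client.GetStringAsync(
        "http://localhost:19081/MyApp/ConfigService/api/config/next"); // hypothetical reverse-proxy URL
    var settings = JsonConvert.DeserializeObject<InstanceSettings>(json); // InstanceSettings: your own type
    // Bootstrap this instance with its unique username/password here.
}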

Implementing caching of services in Domain project being used by a Web API

My question is: how do I implement caching in my domain project, which works like a normal stack with the repository pattern?
I have a setup that looks like the following:
ASP.NET MVC website
Web API
Domain project (using IoC, with Windsor)
My domain project, for instance, has:
IOrderRepository.cs
OrderRepository.cs
Order.cs
My ASP.NET MVC website calls the Web API and gets back some DTO classes. My Web API then maps these objects to business objects in my domain project and makes the application work.
Nowhere in my application have I implemented caching.
Where should caching be implemented?
I thought about doing it inside the methods in the OrderRepository, so my Get, GetBySpecification and Update methods would have to call some generic cache handler injected into the OrderRepository.
This obviously gives some very ugly code, and isn't very generic.
How to maintain the cache?
Let's say we have a cache key like "OrderRepository_123". When I call the Update method, should I call cacheHandler.Delete("OrderRepository_123")? Because that seems very ugly as well.
My own thoughts...
I can't really see a decent way to do it besides some of the messy methods I have described. Maybe I could make some cache layer, but I guess that would mean my Web API wouldn't call my OrderRepository anymore, but a CachedOrderRepository or something?
Personally, I am not a fan of including caching directly in repository classes. A class should have a single reason to change, and adding caching often adds a second reason. Given your starting point you have at least two likely reasonable options:
Create a new class that adds caching to the repository and exposes the same interface
Create a new service interface that uses one or more repositories and adds caching
In my experience #2 is often more valuable, since the objects you'd like to cache as a single unit may cross repositories. Of course, this depends on how you have scoped your repositories. A lot may depend on whether your repositories are based on aggregate roots (ala DDD), tables, or something else.
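As a hedged sketch of option #1, a decorator that adds caching behind the same interface (the repository members are assumed from the question):

public class CachedOrderRepository : IOrderRepository
{
    private readonly IOrderRepository _inner;
    private readonly ObjectCache _cache = MemoryCache.Default;

    public CachedOrderRepository(IOrderRepository inner) => _inner = inner;

    public Order Get(int id)
    {
        string key = "Order_" + id;
        var cached = _cache.Get(key) as Order;
        if (cached != null)
            return cached;

        var order = _inner.Get(id);
        _cache.Set(key, order, new CacheItemPolicy
        {
            AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(10) // TTL is arbitrary
        });
        return order;
    }

    public void Update(Order order)
    {
        _inner.Update(order);
        _cache.Remove("Order_" + order.Id); // invalidate on write
    }
}

With Windsor you would register CachedOrderRepository as the IOrderRepository implementation and inject the concrete OrderRepository into it, so the Web API code does not change.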
There are probably a million different ways to do this, but it seems to me (given that the intent of caching is to improve performance) you could implement the cache similarly to a repository pattern: the domain objects interact with the cache instead of the database, a background thread keeps the database and cache in sync, and the initial startup of the app pool fills the cache (assuming eager loading is desired). A whole raft of technical issues then starts to crop up, such as what to do if the cache is modified in a way that violates a database constraint. Code maintenance becomes a concern where any data-structure-related change possibly needs to be implemented in multiple places. Concurrency issues enter the fray. Just some thoughts...
Consider SqlCacheDependency with System.Web.Caching.Cache: http://weblogs.asp.net/andrewrea/archive/2008/07/13/sqlcachedependency-i-think-it-is-absolutely-brilliant.aspx. This will get you caching that is invalidated when other systems apply updates as well.
There are multiple levels of caching depending on the situation. However, if you are looking for generic centralized caching with a low number of changes, I think you will be looking at EF second-level caching; for more details check the following: http://msdn.microsoft.com/en-us/magazine/hh394143.aspx
You can also use caching at the Web API level.
Kindly consider the network traffic between MVC and the Web API if they are hosted in two different data centers.
And for a portal with huge read access you might consider Redis: http://redis.io
It sounds like you want to use a .NET caching mechanism rather than a distributed cache like Redis or Memcached. I would recommend using the System.Runtime.Caching.MemoryCache class instead of the traditional System.Web.Caching.Cache class. Doing this allows you to create your caching layer independently of your MVC/API layer, because MemoryCache has no dependency on System.Web.
Caching your DTO objects would speed up your application greatly. It prevents you from having to wait for data to be assembled from a cache that mirrors your data layer. For example, requesting Order123 would only require a single cache read rather than several reads for any FK data. Your caching layer would of course need to contain the logic to invalidate the cache on the UPDATEs you perform. A recommended way would be to retrieve the cached order object and modify its properties directly, then persist to the DB asynchronously.
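A hedged sketch of that approach with System.Runtime.Caching (key naming and TTL are arbitrary):

public static class DtoCache
{
    private static readonly ObjectCache Cache = MemoryCache.Default;

    public static T GetOrAdd<T>(string key, Func<T> factory, TimeSpan ttl) where T : class
    {
        var hit = Cache.Get(key) as T;
        if (hit != null)
            return hit;

        var value = factory();
        Cache.Set(key, value, new CacheItemPolicy
        {
            AbsoluteExpiration = DateTimeOffset.Now.Add(ttl)
        });
        return value;
    }
}

// Usage: one cache read returns the fully assembled DTO, FK data included.
var order = DtoCache.GetOrAdd(
    "Order_123",
    () => orderService.AssembleOrderDto(123),  // hypothetical assembly call
    TimeSpan.FromMinutes(5));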

wcf decision: one service multiple contracts or many services

I am using .NET 4 to create a small client-server application for a customer. Should I create one giant service that implements many contracts (IInvoice, IPurchase, ISalesOrder, etc.), or should I create many services, each running one contract on its own port? My question is specifically about the pros/cons of either choice. Also, what is the common way of answering this question?
My true dilemma is that I have no experience making this decision, and little enough experience with WCF that I need help understanding the technical implications of such a decision.
Don't create one large service that implements n-number of service contracts. These types of services are easy to create, but will eventually become a maintenance headache and will not scale well. Plus, you'll get all sorts of code merging conflicts if there's a development group competing for check-ins/check-outs.
Don't create too many services either. Avoid the trap of making your services too fine-grained. Try to create services based on a functionality. The methods exposed by these services shouldn't be fine-grained either. You're better off having fewer methods that do more. Avoid creating similar functions like GetUserByID(int ID), GetUserByName(string Name) by creating a GetUser(userObject user). You'll have less code, easier maintenance and better discoverability.
Finally, you're probably only going to need one port no matter what you do.
UPDATE 12/2018
Funny how things have changed since I wrote this. Now with the micro-services pattern, I'm creating a lot of services with chatty APIs :)
You would typically create different services for each main entity like IInvoice, IPurchase, ISalesOrder.
Another option is to separate queries from commands. You could have a command service for each main entity that implements business operations accepting only the data they need in order to perform the operation (avoid CRUD-like operations), and one query service that returns the data in the format required by the client. This means that the command part uses the underlying domain model/business layer, while the query service operates directly on the database (bypassing the business layer, which is not needed for querying). This simplifies your querying a lot and makes it more flexible (return only what the client needs).
In real-world applications you have one service contract for each entity; Invoice, Purchase and SalesOrder will each have a separate ServiceContract.
However, for each service contract there will be heterogeneous clients: Invoice will be called by the back office through a Windows application using netNamedPipeBinding or netTcpBinding, while at the same time a client application needs to call the service using basicHttpBinding or wsHttpBinding. Basically, you need to create multiple endpoints for each service.
It seems that you are mixing up DataContract(s) and ServiceContract(s).
You can have one ServiceContract and many DataContract(s), and that would perfectly suit your needs.
The truth is that splitting up WCF services - or any services - is a balancing act. The principle is that you want to keep downward pressure on complexity while still considering performance.
The more services you create, the more configuration you will have to write. Also, you will increase the number of proxy classes you need to create and maintain on the client side.
Putting too many ServiceContracts on one service will increase the time it takes to generate and use a proxy. But if you only end up with one or two Operations on a contract, you will have added complexity to the system with very little to gain. This is not a scientific prescription, but a good rule of thumb could be about 10-20 OperationContracts per ServiceContract.
Class coupling is of course a consideration, but are you really dealing with separate concerns? It depends on what your system does, but most systems deal with only a few areas of concern, so splitting things up may not actually decrease class coupling that much anyway.
Another thing to remember, and this is ultra important, is to always make your methods as generic as possible. WCF deals in DataContracts for a reason: DataContracts mean that you can send any object to and from the server so long as the DataContracts are known.
So, for example, you might have 3 OperationContracts:

[OperationContract]
Person GetPerson(string id);

[OperationContract]
Dog GetDog(string id);

[OperationContract]
Cat GetCat(string id);

But, so long as these are all known types, you could merge them into one operation like:

[OperationContract]
IDatabaseRecord GetDatabaseRecord(string recordTypeName, string id);
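For the merged operation to deserialize correctly, the concrete types must be made known to WCF. A hedged sketch using ServiceKnownType (the contract name is illustrative):

[ServiceContract]
[ServiceKnownType(typeof(Person))]
[ServiceKnownType(typeof(Dog))]
[ServiceKnownType(typeof(Cat))]
public interface IDatabaseService
{
    [OperationContract]
    IDatabaseRecord GetDatabaseRecord(string recordTypeName, string id);
}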
Ultimately, this is the most important thing to consider when designing service contracts. This applies to REST as well, if you are using a DataContract-like serialization method.
Lastly, go back over your ServiceContracts every few months and delete operations that are not getting used by the clients. This is another big one!
You should make the decision based on the expected load, the extensibility needed, and the future perspective. As you wrote "small client server application for a customer", that does not give a clear idea of the intended use of the development at hand. Mr. Big's answer must be considered too.
You are most welcome to put forward further questions backed with specific data or particulars about the situation at hand. Thanks.

Wcf services and Db settings?

I was asked to build different WCF services, where each does different work against SQL.
We have 5 DBs. All the DBs + connection strings are in one XML file (a file-system file).
The services are hosted under WAS, IIS 7.5.
Since each service should read from a DB, each service references a DAL DLL file.
So here are our components:
I would like to read the XML data into a CACHE (at the first request) and from then on read from the cache (reading the file on each request is out of the question).
Idea #1: the DLL, in its ctor, at the first request, will read the XML file and load it into its cache.
So the DAL will look like this:
So now each service can access the DLL's cache object via a property. (One advantage is when dealing with a cache dependency on a single file: when it changes, we should reload only one location.)
Idea #2: when a service is up, load the XML into its cache.
So now, each service will look like this:
Service #1:
Service #2:
..
The downside is many cache dependencies on the same file.
Question:
From best-practice experience and a design-pattern POV: which is the preferred way?
P.S. The XML file changes about once a month.
First of all, when it comes to the file system on Windows Server OS, there's a built-in cache layer above the disk, so you probably won't feel much difference regarding disk reads. Of course, parsing the same input again and again is not a good practice, so the parsed (tokenized) XML should be cached.
The design needs more clarification:
Is there only a single instance of a DAL class, shared among the 5 services? Or maybe the property described in idea 1 is static?
In idea 2: when the file changes and, say, connection string 4 is changed (and everything else remains the same), should only service 4 be reloaded?
If a specific service is reloaded, does it cause some kind of inconsistency with other (non-fresh) services?
Update:
I'm still not sure I fully understand the scenario, but here's what I'd do as far as I understand:
The DAL should expose an interface for all data-related operations. Let's say it's IDataGateway.
Now, each service should have a reference to an instance that implements IDataGateway. The service should not be aware of the caching mechanism at all; it just consumes data from the interface.
So all of the caching is done outside the service, in terms of classes and code organization.
Now, the caching layer, in turn, implements IDataGateway and also consumes a non-cached instance of IDataGateway. That's called the Decorator pattern. The non-cached instance is to be injected in the constructor.
Now, I suggest each service has its own instance of a cached IDataGateway. It's simpler than a singleton (to me, at least), and since data is not shared between services, we're fine. If, however, data is shared between the services, then a single instance should be used.
Back to those 5 instances, and to the xml file.
We want to monitor this file for changes, right? We could easily write our own file monitor, use the one that comes with the framework, or look at the source code of the CacheDependency class.
The simplest way to do it is to have 5 monitors watching the same file. That's not much of a performance penalty, since timers are quite "cheap".
If, however, you'd like to reduce the resources used by your system, you could use a single monitor that raises a FileChanged event or something like that. Each of the 5 cached implementations (those 5 instances) of IDataGateway should have this monitor injected into its constructor and wire up its own listener to the FileChanged event.
Once this event is triggered, all of the 5 cached instances of IDataGateway would invalidate their inner cache, thus they should clear their in-memory entries.
On the next call, the cached implementation of IDataGateway would try to take the non-existing data from its in-memory cache, but obviously nothing would be there, so it would go on executing the same method on the non-cached implementation of IDataGateway and populate its cache.
That's my design, HTH...
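A hedged sketch of that cached decorator with the injected monitor (IDataGateway's members are hypothetical):

public class CachedDataGateway : IDataGateway
{
    private readonly IDataGateway _inner;
    private readonly ConcurrentDictionary<string, string> _cache
        = new ConcurrentDictionary<string, string>();

    public CachedDataGateway(IDataGateway inner, FileSystemWatcher monitor)
    {
        _inner = inner;
        // Invalidate all in-memory entries when the xml file changes.
        monitor.Changed += (sender, args) => _cache.Clear();
    }

    public string GetConnectionString(string dbName) // hypothetical member
        => _cache.GetOrAdd(dbName, name => _inner.GetConnectionString(name));
}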
For me the question comes down to who really needs to know about the connection strings: the DAL or the service? Obviously it's the DAL. The service doesn't (or shouldn't) care what kind of data store the DAL is using; it could be a bunch of CSVs on the disk (yikes!) for all it cares. So it wouldn't make sense to put the connection strings in the services. The DAL needs the connection info, so the DAL should take care of finding it and caching it.
