I'm using Entity Framework. I have a list of Requests, and each Request has a list of Approvals. When a user is logged in, I need to find the Requests that the user is involved in (i.e. the user is a member of a group whose GroupId appears in one of the Request's Approvals). To figure out which groups a user belongs to, I call CheckGroups(groupIds), where groupIds is a list of strings I want to check; it returns the subset of those strings the user belongs to. This method is relatively slow, as it has to make a network call (it's an Azure Active Directory Graph API call). Also, groupIds has a max size of 20.
public class MyDbContext : DbContext
{
    public virtual DbSet<Request> Requests { get; set; }
    public virtual DbSet<Approval> Approvals { get; set; }
}
public class Request
{
    public int RequestId { get; set; }
    // several irrelevant properties
    public virtual ICollection<Approval> Approvals { get; set; }
}
public class Approval
{
    public int ApprovalId { get; set; }
    public int RequestId { get; set; }
    // several irrelevant properties
    public string GroupId { get; set; }
}
This is what I'm thinking so far:
1. Go through MyDbContext.Approvals and get a list of all unique GroupIds.
2. Pass 20 of them to CheckGroups().
3. Store the returned strings in a list.
4. Repeat steps 2 and 3 until all unique groups have been sent.
5. Go through MyDbContext.Approvals and, if the GroupId matches the list from step 3, add the RequestId to a list.
6. Get a list of all Requests that have a RequestId in the list from step 5.
This seems really inefficient. Is there a better way to do it? I'm trying to minimize time (database calls through Entity Framework and calls to CheckGroups() are the bottlenecks). As the database grows larger (more Requests added, with multiple Approvals per Request), this could get ugly.
Based on my understanding, the network request has the biggest effect on performance, especially since you repeat the request until all the groups have been sent.
I suggest you first fetch all the groups the user belongs to and then compare the groups locally, to see whether performance improves.
You may also consider making your requests asynchronously (and in parallel) to improve the performance of the network calls.
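For example, here is a minimal sketch of that batched, parallel approach, assuming EF6 (ToListAsync comes from System.Data.Entity) and an async wrapper CheckGroupsAsync around the Graph API call (if only the synchronous CheckGroups exists, each batch could be wrapped in Task.Run instead). It needs just two database round trips, plus the AAD batches, which all run concurrently:

public async Task<List<Request>> GetRequestsForCurrentUserAsync(MyDbContext db)
{
    // First DB round trip: only the distinct GroupIds.
    var allGroupIds = await db.Approvals
        .Select(a => a.GroupId)
        .Distinct()
        .ToListAsync();

    // Batch into chunks of 20 (the CheckGroups limit) and query AAD in parallel.
    var batchTasks = allGroupIds
        .Select((id, i) => new { id, i })
        .GroupBy(x => x.i / 20, x => x.id)
        .Select(batch => CheckGroupsAsync(batch.ToList()));

    var userGroupIds = (await Task.WhenAll(batchTasks))
        .SelectMany(groups => groups)
        .ToList();

    // Second DB round trip: requests with a matching approval, which EF
    // translates into a single query with an IN clause.
    return await db.Requests
        .Where(r => r.Approvals.Any(a => userGroupIds.Contains(a.GroupId)))
        .ToListAsync();
}

If the number of groups the user belongs to can get very large, the Contains() call may hit SQL parameter limits, in which case the final query could be batched as well.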
Related
I am not exactly sure which datatype/approach would be best given the following scenarios.
I will be storing full JSON objects, and each of these objects will have multiple properties, one of them being an ID (int field).
public class Event
{
    public int EventId { get; set; }
    public DateTime EventDate { get; set; }
    public string Title { get; set; }
    public int TypeId { get; set; }
}
1. I will need to be able to look up individual objects by this ID. I assume this will just be stored as a key/value pair, the key being "something" + id and the value being the serialized JSON object.
2. I would like to be able to get a list of the above objects in a paged manner, say the first page with a page size of 20. (Hashset or Sorted Set?)
3. Same as number 2 above, but filtered by one of the fields first, then returning the paged results.
I would like to have only one copy of each JSON object to satisfy the above scenarios; from everything I have read so far, it seems that I would be creating multiple copies of each object to satisfy all of them.
So, in short, I'd like my stored objects to be retrievable by:
A single item by an ID (a property of the JSON object)
Paged lists without filters
Paged lists with both paging and filtering of the JSON objects
At any time any of the Event objects can be changed by the user, so the cache needs to be updated (invalidate/update the cache).
I am writing the code in .NET, if that makes any difference.
It seems like, on top of doing simple key/value queries, what you need is some additional logic to run on the server (Redis) side. You can use Lua scripts in Redis to perform such tasks.
If I understand your requirements correctly, here is how I'd approach it:
Store the objects in a sorted set (if you want the returned objects in a specific order).
You can then query single objects and unfiltered pages with native Redis commands (see the sketch below):
Redis - Sorted set, find item by property value
For filtered objects you can look into Lua scripts.
Redis does not provide anything out of the box to invalidate/update cached items stored in lists, so you'll have to write additional code to handle cache updates. Read more here: https://quickleft.com/blog/how-to-create-and-expire-list-items-in-redis
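To make the layout concrete, here is a minimal sketch using StackExchange.Redis and Json.NET (both of those, and the key naming, are my own choices, not anything prescribed by Redis). It keeps a single copy of each JSON object under "event:{id}", plus a sorted set of ids scored by date for paging:

using System;
using Newtonsoft.Json;
using StackExchange.Redis;

public class EventStore
{
    private readonly IDatabase db;
    public EventStore(IDatabase db) { this.db = db; }

    public void Save(Event e)
    {
        // One copy of the JSON, keyed by id...
        db.StringSet("event:" + e.EventId, JsonConvert.SerializeObject(e));
        // ...plus a sorted set holding only ids, scored by date for paging.
        db.SortedSetAdd("events:by-date", e.EventId, e.EventDate.Ticks);
    }

    public Event GetById(int id)
    {
        var json = db.StringGet("event:" + id);
        return json.IsNull ? null : JsonConvert.DeserializeObject<Event>(json);
    }

    public Event[] GetPage(int page, int pageSize)
    {
        // ZRANGE returns the ids for the requested page; the objects
        // themselves come back in one MGET, so nothing is duplicated.
        var ids = db.SortedSetRangeByRank("events:by-date",
            page * pageSize, (page + 1) * pageSize - 1);
        var keys = Array.ConvertAll(ids, id => (RedisKey)("event:" + id));
        var blobs = db.StringGet(keys);
        return Array.ConvertAll(blobs, b => JsonConvert.DeserializeObject<Event>(b));
    }
}

Filtered paging (scenario 3) would still need a Lua script or client-side filtering, as described above. On update, Save simply overwrites the single copy and re-scores the id, which covers the invalidate/update scenario.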
You can also check out Apache Ignite; it has a couple of built-in features that may be of interest to you:
Binary marshaller:
"It enables you to read an arbitrary field from an object's serialized form without full object deserialization."
https://apacheignite.readme.io/docs/binary-marshaller
Read/write-through ability for cache updates: https://ignite.apache.org/use-cases/caching/database-caching.html
I'm using CQRS + ES and I have a modeling problem that I can't find a solution for.
You can skip the below and answer the generic question in the title: Where would you query data needed for business logic?
Sorry if it turned out to be a complex question; my mind is twisted at the moment!
Here's the problem:
I have users that are members of teams. It's a many to many relationship. Each user has an availability status per team.
Teams receive tickets, each with a certain load factor, that should be assigned to one of the team's members depending on their availability and total load.
First issue: I need to query the list of users that are available in a team and select the one with the least load, since he's the one eligible for the assignment. (Note that this is just one of the cases; it might be a different query to run.)
Second issue: the load factor of a ticket might change, so I have to take that into consideration when calculating the total load per user. Note that although a ticket can belong to only one team, the assignment should be based on the user's total load, not his load for that team.
Currently a TicketReceivedEvent is received by this bounded context, and I should trigger a workflow to assign that ticket to a user.
Possible Solutions:
The easiest way would be to queue the events and sequentially send a command AssignTicketToUser, having a service query the read model for the user id, get the user, and call user.AssignTicket(ticket). Once a TicketAssignedEvent is received, send the next assignment command. But it seems to be a red flag to query the read model from within the command handler! And it's a hassle to queue all these tickets!
Have a process manager per user, with his availability/team and the tickets assigned to him. In that case we replace the query to the read side with a "process manager lookup" query, and the command handler would call Ticket.AssignTo(User). The con is that I think too much business logic leaks outside the domain model, specifically that we're pulling all the info/model out of the User aggregate to make it available for querying.
I'm inclined to go with the first solution; it seems easier to maintain, modify/extend, and locate in code, but maybe there's something I'm missing.
Always (well, in 99.99% of cases) in the business/domain layer, i.e. in the "command" part of CQRS. This means that your repositories should have methods for the specific queries, and your persistence model should be 'queryable' enough for this purpose. This also means you have to know more about the use cases of your domain before deciding how to implement persistence.
Using a document db (MongoDB, RavenDB, or Postgres with json documents) might make the work easier. If you're stuck with an RDBMS or a key-value store, create querying tables, i.e. a read model for the write model, acting as an index :) (this assumes you're serializing objects). If you're storing things relationally, with a specific table schema for each entity type (huge overhead, you're complicating your life), then the information is easily queryable automatically.
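As a rough sketch of what "repositories with methods for the specific queries" could look like here (the names are illustrative, not taken from the question):

public interface ITeamRepository
{
    Team GetById(int teamId);
    // A use-case-specific query that lives on the write side; the
    // persistence model must be queryable enough to answer it.
    User GetAvailableUserWithLowestTotalLoad(int teamId);
}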
Why can't you query the aggregates involved?
I took the liberty of rewriting the objective:
Assign team-ticket to user with the lowest total load.
Here we have a Ticket which should be able to calculate a standard load factor, a Team which knows its users, and a User which knows its total load and can accept new tickets:
Update: If it doesn't feel right to pass a repository to an aggregate, it can be wrapped in a service, in this case a locator. Doing it this way makes it easier to enforce that only one aggregate is updated at a time.
public void AssignTicketToUser(int teamId, int ticketId)
{
    var ticket = repository.Get<Ticket>(ticketId);
    var team = repository.Get<Team>(teamId);
    var users = new UserLocator(repository);
    var tickets = new TicketLocator(repository);
    var user = team.GetUserWithLowestLoad(users, tickets);
    user.AssignTicket(ticket);
    repository.Save(user);
}
The idea is that the User is the only aggregate we update.
The Team will know its users:
public User GetUserWithLowestLoad(ILocateUsers users, ILocateTickets tickets)
{
    User lowest = null;
    foreach (var id in userIds)
    {
        var user = users.GetById(id);
        // The first user checked becomes the initial candidate.
        if (lowest == null || user.IsLoadedLowerThan(lowest, tickets))
        {
            lowest = user;
        }
    }
    return lowest;
}
Update: As a ticket may change load over time, the User needs to calculate its current load.
public bool IsLoadedLowerThan(User other, ILocateTickets tickets)
{
    var load = CalculateLoad(tickets);
    var otherLoad = other.CalculateLoad(tickets);
    return load < otherLoad;
}

public int CalculateLoad(ILocateTickets tickets)
{
    // Recalculated from the current tickets, since a ticket's
    // load factor may change over time.
    return ticketIds
        .Select(id => tickets.GetById(id))
        .Sum(ticket => ticket.CalculateLoad());
}
The User then accepts the Ticket:
public void AssignTicket(Ticket ticket)
{
    // Idempotent: assigning the same ticket twice is a no-op.
    if (ticketIds.Contains(ticket.Id)) return;
    Publish(new TicketAssignedToUser
    {
        UserId = id,
        Ticket = new TicketLoad
        {
            Id = ticket.Id,
            Load = ticket.CalculateLoad()
        }
    });
}
public void When(TicketAssignedToUser e)
{
    ticketIds.Add(e.Ticket.Id);
    totalLoad += e.Ticket.Load;
}
I would use a process manager / saga to update any other aggregate.
You can query the data you need in your application service. This seems to be similar to your first solution.
Usually you keep your aggregates cross-referenced, so I am not quite sure where the first issue comes from. Each user should have a list of the teams it belongs to, and each team should have the list of its users. You can complement this data with any attributes you want, including, for example, availability. So, when you read your aggregate, you have the data directly available. Surely you will have lots of data duplication, but this is very common.
In an event-sourced model, domain repositories are never able to provide any querying ability. AggregateSource by Yves Reynhout is a good reference; see the IRepository interface there. You can easily see that there is no "Query" method in this interface whatsoever.
There is also a similar question: Domain queries in CQRS
I'm using the Stack Exchange .NET Redis provider to store and retrieve values. I would like to know how I can search for certain records inside Redis (like in any database, the search needs to be executed in the Redis instance, not in the .NET application).
Example:
public class Employee
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Age { get; set; }
    public int Salary { get; set; }
}
If I have 100,000 employee records stored in the Redis cache server as a .NET "List<Employee> lstEmployee = new List<Employee>();" and would like to fetch only the records where age > 50 and salary > 5000, how should I code that?
Disclosure: I'm just getting started with Redis using this example.
First, a "cache server" is not intended to be used as a queryable store. If we assume instead that you mean simply a nosql backend, then ... well, frankly, that doesn't sound like the sort of query I would try and do via redis. The point of redis is that you build whatever indexes you need yourself. If you want ordered range queries (the age / salary), then a sorted set and ZRANGEBYSCORE is probably a viable option; however, intersecting these two queries is more difficult. You could try asking the same question ib the redisdb google-group, but just as a general redis question - not specific to any client library such as SE.Redis. If the operations exist ib redis, then you can use the client library to invoke them.
I'm wondering, however, whether "elastic" might be a better option for what you describe.
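For illustration, here's a minimal sketch of the hand-built index idea with SE.Redis (the key names emp:{id}, idx:age, and idx:salary are my own choices, and it assumes each employee's id was added to both sorted sets, scored by age and salary, when the record was stored):

using System.Linq;
using Newtonsoft.Json;
using StackExchange.Redis;

public static Employee[] FindByAgeAndSalary(IDatabase db, int minAge, int minSalary)
{
    // ZRANGEBYSCORE on each index; Exclude.Start makes the bound strict (> not >=).
    var byAge = db.SortedSetRangeByScore("idx:age",
        minAge, double.PositiveInfinity, Exclude.Start);
    var bySalary = db.SortedSetRangeByScore("idx:salary",
        minSalary, double.PositiveInfinity, Exclude.Start);

    // ZINTERSTORE combines scores rather than range filters, so the
    // simplest correct approach is to intersect the two id sets client-side.
    var ids = byAge.Intersect(bySalary);

    // One MGET for the matching employee blobs.
    var blobs = db.StringGet(ids.Select(id => (RedisKey)("emp:" + id)).ToArray());
    return blobs.Select(b => JsonConvert.DeserializeObject<Employee>(b)).ToArray();
}

This still pulls both id ranges to the client; doing the intersection fully server-side would need a Lua script, as mentioned above.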
I have a list of products that I have stored in the ASP.NET cache, but I have a problem with refreshing the cache. Per our requirements I want to refresh the cache every 15 minutes, but I want to know: if a user asks for the list of products while the cache is being refreshed, will he get an error, the old list, or will he have to wait until the cache is refreshed?
The sample code is below:
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}
We have a function in the BLL which gives us the list of products:
public List<Product> Products()
{
    // some code
}
Cache.Insert("Products", Products(), null, DateTime.Now.AddMinutes(15), TimeSpan.Zero);
I want to add one more situation here: let's say I use a static object instead of the cache object. What will happen then, and which approach is best if we are on a standalone server and not in a cluster?
Sorry - this might be naive/obvious, but just have a facade-type class which does something like:

public List<Product> GetProducts()
{
    // Read once into a local, so the item can't expire between
    // the null check and the return.
    var products = (List<Product>)Cache["Products"];
    if (products == null)
    {
        products = Products();
        Cache.Insert("Products", products, null, DateTime.Now.AddMinutes(15), TimeSpan.Zero);
    }
    return products;
}
As an alternative, there is also a CacheItemRemovedCallback delegate which you could use to repopulate an expired cache.
Also, use the cache object rather than static objects. It is apparently more efficient (Asp.net - Caching vs Static Variable for storing a Dictionary), and you get all the cache management methods (sliding expiration and so on).
EDIT
If there is a concern about update times, then consider two cache objects plus a controller, e.g.:
Active Cache
Backup Cache - this is the one that will be updated
Cache controller (another cache object?) - this will indicate which object is active
So the update process will be:
Update the backup cache
Update completes; check it is valid
The backup becomes active and vice versa: the controller now flags the old active cache as the backup
There needs to be a method which will fire when the products cache object is populated. I would probably use the CacheItemRemovedCallback delegate to initiate the cache repopulation, or do an async call in the facade-type class - you wouldn't want it blocking the current thread.
I'm sure there are many other variants of this
EDIT 2
Actually, thinking about this, I would make the controller class something like this:
public class CacheController
{
    public StateEnum Cache1State { get; set; }
    public StateEnum Cache2State { get; set; }
    public bool IsUpdating { get; set; }
}
The states would be active, backup, updating, and perhaps inactive and error. You would set the IsUpdating flag when the update is occurring, and then back to false once it completes, to stop multiple threads trying to update at once - i.e. a race condition. The class is just a general principle and could/should be amended as required.
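To make that concrete, here is a rough sketch of the swap-and-guard logic (names like Products1/Products2 and the StateEnum values are my own choices; an Interlocked int stands in for IsUpdating so the guard is atomic):

using System;
using System.Collections.Generic;
using System.Threading;
using System.Web.Caching;

public enum StateEnum { Active, Backup, Updating, Inactive, Error }

public class CacheSwapController
{
    public StateEnum Cache1State { get; set; }
    public StateEnum Cache2State { get; set; }
    private int isUpdating; // 0 = idle, 1 = updating (used with Interlocked)

    public CacheSwapController()
    {
        Cache1State = StateEnum.Active;
        Cache2State = StateEnum.Backup;
    }

    // Readers always use the active slot, so they never wait on a refresh.
    public string ActiveKey { get { return Cache1State == StateEnum.Active ? "Products1" : "Products2"; } }
    public string BackupKey { get { return Cache1State == StateEnum.Active ? "Products2" : "Products1"; } }

    public void RefreshBackup(Cache cache, Func<List<Product>> load)
    {
        // Atomic guard: only one thread may refresh at a time.
        if (Interlocked.CompareExchange(ref isUpdating, 1, 0) != 0) return;
        try
        {
            // Populate the backup slot; no absolute expiration here,
            // because the swap (not expiry) retires the old data.
            cache.Insert(BackupKey, load(), null,
                Cache.NoAbsoluteExpiration, Cache.NoSlidingExpiration);

            // Swap roles: backup becomes active and vice versa.
            var tmp = Cache1State;
            Cache1State = Cache2State;
            Cache2State = tmp;
        }
        finally
        {
            Interlocked.Exchange(ref isUpdating, 0);
        }
    }
}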
Not sure if it's the best title for the question... maybe someone could rename it for me?
My question is regarding the performance of reading and combining data in the C# ServiceStack wrapper for Redis, and how the calls work internally.
I will explain two scenarios that will hopefully yield a final result. One scenario has the list of category ids attached to the Transaction, so that the Category can be stored independently.
Question: My end goal is to retrieve all transactions that have category 'food'.
I have tried to number the points where clarity would help my understanding. Consider there being 10,000 transactions, with each transaction having on average 3 categories.
Note: There is a related question at ServiceStack.Net Redis: Storing Related Objects vs. Related Object Ids; however, it doesn't explain the efficiency.
Example A
public class Transaction
{
    public List<string> CategoryIds;
}
Example B
public class Transaction
{
    public List<string> CategoryNames;
}
Code
var transactionClient = redisClient.GetTypedClient<Transaction>();

// 1. Is this inefficient, returning all transactions?
//    Is there any filtering available at this point?
var allTransactions = transactionClient.GetAll();

// 2. In the case of Example A, where the categories are stored as ids,
//    how would I map the categories to a transaction? Maybe I have a list
//    that holds a container associating each Transaction with a list of
//    Categories; however, this seems inefficient, as I would have to loop
//    through all transactions, make a call to get their Categories, and
//    then populate the container datatype.

// 3. If we take Example B, how can I efficiently retrieve just the
//    transactions that have a category of food?
The efficiency trade-off is fewer network calls vs. more data. Data in Redis just gets blobbed; most of the time a single API call maps 1:1 with a Redis server operation. This means you can think about the perf implications as simply downloading a JSON dataset blob from a remote server's memory and deserializing it on the client - which is effectively all that happens.
Some APIs, such as GetAll(), require 2 calls: 1 to fetch all the ids in the entity set, and another to fetch all the records with those ids. The source code of the Redis client is quite approachable, so I recommend having a look to see exactly what's happening.
Because you've only got 3 categories, it's not that much extra data you're saving by trying to filter on the server.
So your options are basically:
Download the entire entity dataset and filter on the client
Maintain a custom index mapping from Category > Ids (sketched below)
More advanced: use a server-side Lua operation to apply server-side filtering (requires Redis 2.6)
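A rough sketch of the second option with ServiceStack.Redis (the txcategory:{name} key is my own naming, and it assumes Transaction has an Id property for the typed client to key on):

var transactionClient = redisClient.GetTypedClient<Transaction>();

// On write: store the transaction once, then index it under each category.
transactionClient.Store(transaction);
foreach (var category in transaction.CategoryNames)
{
    redisClient.AddItemToSet("txcategory:" + category, transaction.Id.ToString());
}

// On read: two server operations - SMEMBERS for the ids, then a multi-get.
var foodIds = redisClient.GetAllItemsFromSet("txcategory:food");
var foodTransactions = transactionClient.GetByIds(foodIds);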