Loading / lazy loading of related entities - C#

Scenario:
I have a (major) design problem. I have DTO classes that are filled with data from the DB and used in the UI. The scenario is:
I have a HouseObject, which has TenantObjects (one-to-many), and each tenant has AccountObjects (one-to-many again), and so on (example scenario only).
Problem:
My issue is: while retrieving data from the DB for a HouseObject, should I get the list of all TenantObjects, and in turn the list of all AccountObjects, and so on? Because of the one-to-many relationships, for one HouseObject we are potentially retrieving a huge amount of data for tenants, accounts and so on.
Should I retrieve just the HouseObject and fire off individual dependent queries per dependency? Or should I get all the data at once in a single call and bind it on screen? Which is the preferred solution?
Please advise.

If you're looking for performance, and I think you are, you have to think in broad terms, not just lazy/not lazy: how much data you have, how often it gets updated, where it is stored, where the application runs, how it is used, etc.
I see a few scenarios:
1. Lazy loading of small chunks. I like this one because it is useful when your data is modified often; you have fresh data all the time.
2. Caching in the application layer. There are two sub-scenarios here:
   a. Caching ready models: you prepare your models, fully initialized.
   b. Caching separate segments of data (houses, tenants, accounts) and querying them in memory.
3. Preparing your denormalized, joined data and storing it in a materialized (Oracle) or indexed (SQL Server) view. You will still have a trip to the DB, but it will be more efficient than joining the data each time, or making multiple calls each time.
4. A combination of #1 and #2.b; this is the one I like the most. It requires more coding but gets you the best results in performance and concurrency. Even if your data is mutating, you can have a mechanism to dump the cache, and if an Account changes, you don't need to dump the Tenant. A sketch of this combination follows below.
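To make option #4 concrete, here is a minimal sketch, assuming Microsoft.Extensions.Caching.Memory and hypothetical Tenant/ITenantStore types: each segment is cached under its own key, so dumping one house's tenants leaves any cached accounts untouched.

    using System;
    using System.Collections.Generic;
    using Microsoft.Extensions.Caching.Memory;

    public record Tenant(int Id, string Name);

    // Hypothetical data-access interface; only hit on a cache miss (the lazy part).
    public interface ITenantStore
    {
        IReadOnlyList<Tenant> LoadTenants(int houseId);
    }

    public class CachedTenantRepository
    {
        private readonly IMemoryCache _cache;
        private readonly ITenantStore _store;

        public CachedTenantRepository(IMemoryCache cache, ITenantStore store)
        {
            _cache = cache;
            _store = store;
        }

        public IReadOnlyList<Tenant> GetTenantsForHouse(int houseId)
        {
            // One cache key per segment: lazily load a small chunk, then reuse it.
            return _cache.GetOrCreate($"tenants:{houseId}", entry =>
            {
                entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5);
                return _store.LoadTenants(houseId);
            });
        }

        // If a tenant changes, dump only this segment; cached accounts stay valid.
        public void OnTenantsChanged(int houseId) => _cache.Remove($"tenants:{houseId}");
    }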
And one more thing: if you need to update your data, remember to use different models. Your display models should be view models only, while you have a separate model for your save. For example, your save model may have an updatedBy field while your view model doesn't.
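For example, the split might look like this (hypothetical names, following the updatedBy example above):

    using System;

    // Display model: only what the screen needs to render.
    public class TenantViewModel
    {
        public string Name { get; set; }
        public string AccountSummary { get; set; }
    }

    // Save model: carries fields the view never shows, such as updatedBy.
    public class TenantSaveModel
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string UpdatedBy { get; set; }
        public DateTime UpdatedAt { get; set; }
    }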
You really don't have a "major" problem; it is a normal, daily problem for developers. You just need to think about all aspects of your system.

Related

Domain Driven Design query between aggregates

I'm new to DDD and I would like your advice.
In my UI I need to view data from two aggregates. I'm using EF Core, and as I have read, it's better to keep only one navigation between entities, so as not to mix two aggregates and to avoid serialization issues due to circular references.
How should I make the query?
Do I need to create a new view whenever I need data from two aggregates?
If I need to create views, in which layer should they live? In the infrastructure/persistence layer or the domain?
Thank you
How should I make the query?
With the simplest and fastest technology you can use. I mean: if building the query with EF Core requires several steps and a lot of extra objects, change approach and try a direct SQL request. It's a query, something you can test fast and change equally fast, whenever you need to.
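A sketch of that direct approach, assuming Dapper is available (plain ADO.NET would work too) and hypothetical Orders/Customers tables spanning two aggregates:

    using System.Collections.Generic;
    using System.Data;
    using Dapper;

    // Read model shaped for the UI, not an entity of either aggregate.
    public class OrderWithCustomerRow
    {
        public int OrderId { get; set; }
        public string CustomerName { get; set; }
        public decimal Total { get; set; }
    }

    public class OrderQueries
    {
        private readonly IDbConnection _connection;
        public OrderQueries(IDbConnection connection) => _connection = connection;

        public IEnumerable<OrderWithCustomerRow> GetOrdersWithCustomer(int customerId)
        {
            // One direct, fast-to-test SQL statement instead of navigating
            // across aggregate boundaries with EF Core.
            const string sql = @"
                SELECT o.Id AS OrderId, c.Name AS CustomerName, o.Total
                FROM Orders o
                JOIN Customers c ON c.Id = o.CustomerId
                WHERE c.Id = @customerId";
            return _connection.Query<OrderWithCustomerRow>(sql, new { customerId });
        }
    }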
Do I need to create a new view whenever I need data from two aggregates?
You don't. With a view you hide the complexity of the data read away in the view (at the cost of changing the DB every time the data to show changes), with the illusion/feeling that you're managing an entity. Of course, it should be clear that the data comes from a view. A query, on the other side, is more code-related (to change the data shown you just change the query), but you also show directly that the data comes from several sources.
Note: I used EF Core years ago, and for a really simple project. If by "view" you instead mean an EF Core view, then I would say yes, but only if building it doesn't require several steps/joins to gather the information. I would always think about a direct approach when it looks like the code is starting to get a bit too complex just to show some data.
Here, anyway, things can go really deep: do you have all your (root) entities in the same project? Or do you have several microservices? With microservices, how do you share the data and how do you store it? Maybe a query is not viable, or it reads partially stale data. As you can see, there are several things to take into account when you have to read the data.
If I need to create views, in which layer should they live? In the infrastructure/persistence layer or the domain?
As stated before, if you mean a view within EF Core, I would put it really close to the layer where you're going to use it. But it could depend. You could have a look here.
Personally, I use 3 layers: domain, application and infrastructure. My views are in the application layer, because I have several queries that I reuse for different purposes. But before going into the infrastructure (where the requests are), I transform the results into the format required by the UI.
DDD is about putting together all the business logic that would otherwise be spread around several entities, services and even controllers. With this solution, all the actions that the domain offers can be performed without requiring extra logic outside the domain itself. Of course you need to implement the services that the domain is going to use; that much is obvious.
On the other side it is clear, at least to me, that reuse is limited to the domain itself. I mean:
I can build a big query that collects a lot of information from different sources and reuse it for several UI views, but I have to be ready to pay the price of touching it every time something in the UI changes (and I still need to transform the result into a view-related object);
I can build small, specialized queries that I use for one or two (identical) UI views, paying the price of more code to maintain, but code that is simple, specialized, and really fast to test! Here the query can produce something close or equal to the view-related object.
The second approach is the basis of CQRS, and it is the one I prefer; a sketch follows below. Remember, you can do CQRS even without an event store and eventual consistency: you just take part of it, not the whole. We design to simplify our lives, not to make them harder.
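A minimal sketch of the second approach (hypothetical invoice names; the in-memory implementation is only there to keep the example self-contained):

    using System.Collections.Generic;
    using System.Linq;

    // One small, specialized query per UI view; its result is already
    // (close to) the view-related object.
    public record InvoiceListItem(int Id, string Number, decimal Amount);

    public interface IInvoiceListQuery
    {
        IReadOnlyList<InvoiceListItem> ForCustomer(int customerId);
    }

    // In-memory stand-in; a real implementation would live in the application
    // layer and run a direct SQL or EF Core query.
    public class FakeInvoiceListQuery : IInvoiceListQuery
    {
        private readonly List<(int CustomerId, InvoiceListItem Item)> _rows = new()
        {
            (1, new InvoiceListItem(10, "INV-10", 99.50m)),
            (1, new InvoiceListItem(11, "INV-11", 12.00m)),
        };

        public IReadOnlyList<InvoiceListItem> ForCustomer(int customerId) =>
            _rows.Where(r => r.CustomerId == customerId)
                 .Select(r => r.Item)
                 .ToList();
    }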

What is the best way to ensure there is only a single ORM model instance tied to a row in a database?

The Problem
We have an app that stores hierarchical data in a database. We have defined a POCO object which represents a row of data.
The problem is we need certain properties to be dependent on the item's children and others on their ancestors. As an example, if a ((great)grand)child has incomplete state, then implicitly all of its parents are also incomplete. Similarly, if a parent has a status of disabled, then all children should be implicitly disabled as well.
On the database side of things, everything works thanks to triggers. However, the issue we're having is syncing those changes to any in-memory ORM objects that may have been affected.
That's why we're thinking that, to make all of this work, we need to ensure there is only ever one model instance in memory for any specific row in the database. That's the crux of the entire problem.
We're currently doing that with triggers in the DB, plus one giant hash set of weak references to the in-memory ORM objects, keyed on the database IDs (see the sketch below), but we're not sure that's the proper way to go.
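For reference, a minimal sketch of such a weak-reference identity map (names hypothetical), which guarantees at most one live instance per database ID:

    using System;
    using System.Collections.Generic;

    public class IdentityMap<T> where T : class
    {
        private readonly Dictionary<long, WeakReference<T>> _map = new();
        private readonly object _gate = new();

        public T GetOrAdd(long id, Func<long, T> loadFromDb)
        {
            lock (_gate)
            {
                // Reuse the single in-memory instance if it is still alive.
                if (_map.TryGetValue(id, out var weak) && weak.TryGetTarget(out var existing))
                    return existing;

                // Otherwise materialize it, and hold it weakly so the GC can
                // still collect models the UI no longer references.
                var created = loadFromDb(id);
                _map[id] = new WeakReference<T>(created);
                return created;
            }
        }
    }

Note that dead weak references still leave entries in the dictionary, so a real implementation would also prune entries whose targets have been collected.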
Initial Design
Our 'rookie' design started by loading all objects from the database, which quickly blew out the memory, and it also spent a lot of time loading data that might never be displayed, since the user may never navigate to it.
Attempt 2
Our next attempt expanded on the former by dynamically loading only the levels needed for actual display in the UI. This greatly sped up loading, but now the state of the hierarchy can't be polled without several calls to the database.
Attempt 2B
Similar to the above, but we added persistent 'implicit status' fields which were updated via triggers in the database. That way, if a parent was disabled, a trigger updated all children accordingly, and the model objects simply refreshed themselves with the latest values from the database. This has the downside of splitting the business logic between the model layer and the database triggers, as well as requiring both database writes and reads for every operation.
Fully Dynamic
This time we tried to make our models 'dumb' and removed our business layer from the code completely, moving that logic entirely to the database. That way there was single ownership of the business rules. Plus, this guaranteed bad data couldn't be inserted into the database in the first place. However, here too we needed to constantly poll the database for the 'current' values, meaning some logic did have to be built in to know which objects needed to be refreshed.
Fully Dynamic with Metadata
Similar to above, but all write calls to the database returned an update token that told the models if they had to refresh any loaded parents or children.
I'm hoping to get some feedback from the SO community on how to solve this issue.

Query DB with the little things, or store some "bigger chunks" of results and filter them in code?

I'm working on an application that imports video files and lets the user browse them and filter them based on various conditions. By importing I mean creating instances of my VideoFile model class and storing them in a DB table. Once hundreds of files are there, the user wants to browse them.
Now, the first choice they have in the UI is to select a DateRecorded, which calls a GetFilesByDate(Date date) method on my data access class. This method will query the SQL database, asking only for files with the given date.
On top of that, I need to filter files by, let's say, FrameRate, Resolution or UserRating. This would place additional criteria on the files already filtered by their date. I'm deciding which road to take:
1. Only query the DB for a new set of files when the desired DateRecorded changes. Handle all subsequent filtering manually in C# code, by iterating over the stored collection of _filesForSelectedDay and testing the files against the current additional rules.
2. Query the DB each time any little filter changes, asking for a smaller and very specific set of files more often.
Which one would you choose, or even better, any thoughts on pros and cons of either of those?
Some additional points:
A query in GetFilesByDate is expected to return tens of items, so it's not very expensive to store the result in a collection always sitting in memory.
Later down the road I might want to select files not just for a specific day, but let's say for the entire month. This may give hundreds or thousands of items. This actually makes me lean towards option two.
The data access layer is not yet implemented. I just have a dummy class implementing the required interface but storing the data in an in-memory collection instead of working with any kind of DB.
Once I'm there, I'll almost certainly use SQLite and store the database in a local file.
Personally, I'd always go to the DB every time until it proves impractical. If it's a small amount of data then the overhead is also small; when it gets larger, the DB comes into its own. It's unlikely you will be able to write code better than the DB, although the round trip can cost you. Using the DB, your data will always be consistent and up to date.
If you find you are hitting the DB too hard, you can try caching your data and working out whether you already have some or all of the data being requested, to save time. However, then you have aging and consistency problems to deal with. You also end up with servers whose memory is stuffed full of data that could be used for other things!
Basically, until it becomes an issue, just use the DB and use your energy on the actual problems you encounter, not the maybes.
If you've already fetched a bunch of data to begin with, there's no need to query the DB again for a subset of that set. Just store it in an object which you can query in memory as the user refines the search (see the sketch below).
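A sketch of that refinement, assuming a VideoFile shaped roughly as the question describes:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class VideoFile
    {
        public DateTime DateRecorded { get; set; }
        public double FrameRate { get; set; }
        public string Resolution { get; set; }
        public int UserRating { get; set; }
    }

    public class FileBrowser
    {
        private List<VideoFile> _filesForSelectedDay = new();

        // One DB round trip when the date changes (option 1)...
        public void SelectDate(DateTime date, Func<DateTime, List<VideoFile>> getFilesByDate)
            => _filesForSelectedDay = getFilesByDate(date);

        // ...then refine the cached set in memory as the smaller filters change.
        public IEnumerable<VideoFile> Refine(double? minFrameRate, int? minRating) =>
            _filesForSelectedDay
                .Where(f => minFrameRate is null || f.FrameRate >= minFrameRate.Value)
                .Where(f => minRating is null || f.UserRating >= minRating.Value);
    }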

Implementing object change tracking in an N-Tier WCF MVC application

Most of the examples I've seen online show object change tracking in a WinForms/WPF context. Or, if it's on the web, connected objects are used, so the changes made to each object can be tracked.
In my scenario, the objects are disconnected once they leave the data layer (mapped into business objects in WCF, and mapped into DTOs in the MVC application).
When a user makes a change to an object in MVC (e.g., changing one field property), how do I send that change from the view all the way down to the DB?
I would like to have an audit table that saves the changes made to a particular object. What I would like to save are the before & after values, but only for the properties that were modified.
I can think of a few ways to do this:
1) Implement an IsDirty flag for each property for all models in the MVC layer (or in the JavaScript?), and propagate that information all the way back down to the service layer and finally the data layer.
2) Having this change-tracking mechanism within the service layer would be great, but how would I then keep track of the "original" values after the modified values have been passed back from MVC?
3) Database triggers? But I'm not sure how to get started. Is this even possible?
Are there any known object change tracking implementations out there for an n-tier mvc-wcf solution?
Example of the audit table:

Id   Object     Property   OldValue   NewValue
----------------------------------------------
1    Customer   Name       Bob        Joe
2    Customer   Age        21         22
Possible solutions to this problem will depend in large part on what changes you allow in the database while the user is editing the data.
In other words, once it "leaves" the database, is it locked exclusively for that user, or can other users or processes update it in the meantime?
For example, if the user can get the data and sit on it for a couple of hours or days, but the database continues to allow updates to the data, then you really want to track the changes the user has made to the version currently in the database, not the changes that the user made to the data they are viewing.
The way that we handle this scenario is to start a transaction, read the entire existing object, and then use reflection to compare the old and new values, logging the changes into an audit log. This gets a little complex when dealing with nested records, but is well worth the time spent to implement.
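A simplified sketch of that reflection-based comparison (flat properties only; nested records would need recursion, as noted):

    using System.Collections.Generic;
    using System.Reflection;

    public record AuditEntry(string Object, string Property, string OldValue, string NewValue);

    public static class AuditDiffer
    {
        // Compares the freshly-read database object against the incoming one and
        // returns one audit entry per property that actually changed.
        public static List<AuditEntry> Diff<T>(T oldObj, T newObj)
        {
            var changes = new List<AuditEntry>();
            foreach (PropertyInfo prop in typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance))
            {
                if (!prop.CanRead) continue;
                object oldValue = prop.GetValue(oldObj);
                object newValue = prop.GetValue(newObj);
                if (!Equals(oldValue, newValue))
                    changes.Add(new AuditEntry(typeof(T).Name, prop.Name,
                        oldValue?.ToString(), newValue?.ToString()));
            }
            return changes;
        }
    }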
If, on the other hand, no other users or processes are allowed to alter the data, then you have a couple of different options that vary in complexity, data storage, and impact to existing data structures.
For example, you could modify each property in each of your classes to record when it has changed and keep a running tally of these changes in the class; a base class implementation helps substantially here (see the sketch after the next paragraph).
However, depending on the point at which you capture the user's changes (every time they update the field in the form, for example), this could generate a substantial amount of non-useful log information because you probably only want to know what changed from the database perspective, not from the UI perspective.
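A minimal sketch of such a base class (hypothetical names); resetting the tally after load keeps the log at the database perspective rather than the UI perspective:

    using System.Collections.Generic;
    using System.Runtime.CompilerServices;

    public abstract class ChangeTrackingBase
    {
        private readonly HashSet<string> _changed = new();

        public bool IsDirty => _changed.Count > 0;
        public IReadOnlyCollection<string> ChangedProperties => _changed;

        // Call after loading from the DB so only post-load edits are recorded.
        public void AcceptChanges() => _changed.Clear();

        protected void Set<T>(ref T field, T value, [CallerMemberName] string property = null)
        {
            if (EqualityComparer<T>.Default.Equals(field, value)) return;
            field = value;
            _changed.Add(property);
        }
    }

    public class Customer : ChangeTrackingBase
    {
        private string _name;
        public string Name { get => _name; set => Set(ref _name, value); }
    }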
You could also deep clone the object and pass that around the layers. Then, when it is time to determine what has changed, you can again use reflection. However, depending on the size of your business objects, this approach can impose a hefty performance penalty since a complete copy has to be moved over the wire and retained with the original record.
You could also implement the same approach as the "updates allowed while editing" approach. This, in my mind, is the cleanest solution because the original data doesn't have to travel with the edited data, there is no possibility of tampering with the original data and it supports numerous clients without having to support the change tracking in the UI level.
There are two parts to your question:
How to do it in MVC:
The usual way: you send the changes back to the server, a controller handles them, etc.
There is nothing unusual in your use case that mandates a change to the way MVC usually works.
It is better for your scenario if the changes are encoded as individual change operations, rather than as a modified object where you need to use reflection to find out what changes, if any, the user made.
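For instance, the payload posted back from the view could be a list of operations like this (a hypothetical shape, mirroring the audit table above):

    // One record per edited property, posted back to the controller.
    public record ChangeOperation(string Entity, int Id, string Property, string NewValue);

    // e.g. editing the customer produces:
    //   new ChangeOperation("Customer", 1, "Name", "Joe")
    //   new ChangeOperation("Customer", 1, "Age",  "22")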
How to do it on the database:
This is probably your intended question:
First of all, stay away from ORM frameworks; life is complex enough as it is.
On the last step of the save operation you should have the following information:
The objects and fields that need to change and their new values.
You need to keep track of the following information:
What the last change was to the object you intend to modify in the database.
This can be obtained from the Audit table and needs to be saved in a Session (or Session-like object).
Then you need to do the following in a transaction:
Obtain the last change to the object(s) being modified from the database.
If the objects have changed, abort and inform the user of the collision.
If not obtain the current values of the fields being changed.
Save the new values.
Update the Audit table.
I would use a stored procedure for this, to make the process less chatty and for greater separation of concerns between the database code and the application code.
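As a sketch only, the same transactional steps in application code might look like this (hypothetical Customer/Audit schema with a Version column, using Dapper; the stored-procedure version would run the same statements server-side):

    using System.Data;
    using Dapper;

    public static class CustomerSaver
    {
        public static bool TrySave(IDbConnection conn, int id, string newName, long expectedVersion)
        {
            if (conn.State != ConnectionState.Open) conn.Open();
            using IDbTransaction tx = conn.BeginTransaction();

            // 1. Obtain the last change to the object being modified.
            var current = conn.QuerySingle<(long Version, string Name)>(
                "SELECT Version, Name FROM Customer WHERE Id = @id", new { id }, tx);

            // 2. If it changed since the user loaded it, abort and report the collision.
            if (current.Version != expectedVersion)
            {
                tx.Rollback();
                return false;
            }

            // 3-4. Save the new values (current.Name holds the old value).
            conn.Execute(
                "UPDATE Customer SET Name = @newName, Version = Version + 1 WHERE Id = @id",
                new { newName, id }, tx);

            // 5. Update the Audit table.
            conn.Execute(
                @"INSERT INTO Audit (Object, Property, OldValue, NewValue)
                  VALUES ('Customer', 'Name', @old, @new)",
                new { old = current.Name, @new = newName }, tx);

            tx.Commit();
            return true;
        }
    }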

How to deal with large objects?

I have 5 types of objects: place info (14 properties), owner company info (5 properties), pictures, ratings (storing multiple vote results), and comments.
All those 5 objects come together to make one object (Place), which has all the properties and information about the place's info, pictures, comments, etc.
What I'm trying to achieve is a page that displays the Place object and all its properties. Another issue: if I want to display the owner companies' profiles, I'll have an object for each owner company (but with a sixth property added, which is a list of all the places they own).
I've been practicing for a while, but I haven't yet had implementation and performance experience; still, I sense that this is too much!
What do you think ?
You have to examine the use case scenarios for your solution. Do you need to always show all of the data, or are you starting off with displaying only a portion of it? Are users likely to expand any collapsed items as part of regular usage or is this information only used in less common usages?
Depending on your answers it may be best to fetch and populate the entire page with all of the data at once, or it may be the case that only some data is needed to render the initial screen and the rest can be fetched on-demand.
In most cases the best solution is likely to involve fetching only the required data and to update the page dynamically using ajax queries as needed.
As for optimizing data access, you need to strike a balance between the number of database requests and the complexity of each individual request. Because of network latency it is often important to fetch as much as possible using as few queries as possible, even if this means you'll sometimes be fetching data that you do not always need. But if you include too much data in a single query, then computing all the joins may also be costly. It is quite rare to see a solution in which it is better to first fetch all root objects and then for every element go fetch some additional objects associated with that element. As such, design your solution to fetch all data at once, but include only what you really need and try to keep the number of involved tables to a minimum.
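For example, with EF Core (an assumption; the entities are hypothetical), a single round trip can eagerly load just the relations the page renders:

    using System.Collections.Generic;
    using System.Linq;
    using Microsoft.EntityFrameworkCore;

    public class Owner { public int Id { get; set; } public string Name { get; set; } }
    public class Picture { public int Id { get; set; } public string Url { get; set; } }

    public class Place
    {
        public int Id { get; set; }
        public Owner Owner { get; set; }
        public List<Picture> Pictures { get; set; }
    }

    public class PlacesContext : DbContext
    {
        public DbSet<Place> Places => Set<Place>();
    }

    public static class PlacePages
    {
        public static Place LoadPlacePage(PlacesContext db, int placeId) =>
            db.Places
              .Include(p => p.Owner)      // small: 5 owner properties
              .Include(p => p.Pictures)   // needed for the initial render
              .AsNoTracking()             // display only, no change tracking
              .Single(p => p.Id == placeId);
        // Ratings and comments could instead be fetched on demand via ajax.
    }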
You really have 3 issues to deal with, and they are often split into DAL, BLL and UI.
Your objects obviously belong in the BLL, and if you're considering performance then you need to consider how your objects will be created and how they interface with the DAL. I have many objects with 50-200 properties, so 14 properties is really no issue.
The UI side of it is separate; if you're considering the performance of displaying a lot of information on a single page, you'll consider tabbed content, grids, etc.
Tackle it one thing at a time and see where your problems lie.
