Audit history for changes in Db Row


Scenario:
I have a database table, and any change to the data in any of its columns needs to be audit logged for comparison purposes.
What I have tried:
I have a history table with the same columns as the parent table. Any change to the parent table gets recorded in the history table using triggers, so I eventually achieve what I want.
Problem
The issue is multi-fold:
I am using triggers which I do not want to use.
If I have to have audit comparison for n more tables, then I need to have one history table per parent table and which just swells my database and makes it bulky with so many tables.
Is there a better approach to achieving this? Please suggest.

The answer strongly relates to where your business or domain logic sits.
If your business logic is within your database (stored procedures and triggers), then I feel your approach is correct: have the database triggers write to the relevant audit tables.
I am using triggers which I do not want to use.
Audit approach:
If your domain logic is in a domain layer within your C# code, then you're quite right in saying you don't want your audits to be in triggers. Mixing business logic between your domain layer and database may lead to a maintenance nightmare o_O
Assuming your logic is in a domain layer: an idea would be to have a base class for your domains or services which handles writing to audit trails:
public class DomainBase<T>
{
    public DomainBase(bool isAuditEnabled)
    {
        this.IsAuditEnabled = isAuditEnabled;
    }
    public bool IsAuditEnabled { get; set; }
    public void AddNew(T newEntity)
    {
        // default code for adding an entity
        this.Audit_Create(newEntity);
    }
    public void Audit_Create(T newEntity)
    {
        if (IsAuditEnabled)
        {
            // ...
        }
    }
}
Your base class can have standard AddNew, Update, and Delete methods, each of which calls the relevant Audit method. You might also want to consider an IsAuditEnabled switch that lets you easily turn specific audits on or off. This way you only audit the changes you care about and nothing else.
Each custom domain method can opt to write to the audit trail or not. This is also why it's not a good idea in your scenario to put audit trails in the DAL (Data Access Layer): business logic decides whether and what must be audited, and the DAL should not have to make these kinds of logic decisions.
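As a rough sketch of how a derived domain class might plug into such a base class: the IAuditSink interface, InMemoryAuditSink, Customer, and CustomerDomain are hypothetical names introduced only for illustration, and the actual persistence code is left out.

```csharp
using System.Collections.Generic;

// Hypothetical destination for audit records (a table, a file, a queue...).
public interface IAuditSink
{
    void Write(string action, object entity);
}

// Illustrative sink that keeps entries in memory instead of a database.
public class InMemoryAuditSink : IAuditSink
{
    public List<string> Entries { get; } = new List<string>();

    public void Write(string action, object entity)
    {
        Entries.Add($"{action}: {entity}");
    }
}

public abstract class DomainBase<T>
{
    private readonly IAuditSink _sink;

    protected DomainBase(IAuditSink sink, bool isAuditEnabled)
    {
        _sink = sink;
        IsAuditEnabled = isAuditEnabled;
    }

    public bool IsAuditEnabled { get; set; }

    public void AddNew(T newEntity)
    {
        // ... default code for adding an entity would go here ...
        Audit("Create", newEntity);
    }

    // Writes an audit record only when auditing is switched on.
    protected void Audit(string action, T entity)
    {
        if (IsAuditEnabled)
        {
            _sink.Write(action, entity);
        }
    }
}

public class Customer
{
    public string Name { get; set; } = "";
    public override string ToString() => $"Customer({Name})";
}

// This domain opts in to auditing; another domain could pass false.
public class CustomerDomain : DomainBase<Customer>
{
    public CustomerDomain(IAuditSink sink) : base(sink, isAuditEnabled: true) { }
}
```

Calling AddNew on a CustomerDomain writes one "Create" entry to the sink; setting IsAuditEnabled to false afterwards stops further entries without touching the calling code.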
If I have to have audit comparison for n more tables, then I need to
have one history table per parent table and which just swells my
database and makes it bulky with so many tables.
Size of database concern
As mentioned already, only auditing what you need will decrease the amount of
audit data written.
If the audit data is too much or grows too fast, you could opt for a separate audit database. That way your production database stays lightweight and optimized, and you only query the audit database in (hopefully) rare cases. This way you can go BIG and audit the life out of everything that moves without being concerned about performance (you could even leave indexes out of the audit database to keep writes quick and efficient).
Also, if you don't really care about ALL the data in a table, you can create audit tables with only the fields that are important to you. So for a table of 50 columns, you may end up auditing only the 5 or 10 columns that are crucial for historical purposes.
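One way to audit only selected fields from the domain layer is to compare the old and new versions of an entity over a list of property names. This is only a sketch under assumptions: AuditEntry, AuditDiff, and Product are hypothetical names, the comparison uses reflection, and how the entries get stored is not shown.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shape of one audit record; only changed, audited fields are stored.
public class AuditEntry
{
    public string EntityName { get; set; } = "";
    public string FieldName { get; set; } = "";
    public string OldValue { get; set; } = "";
    public string NewValue { get; set; } = "";
}

public static class AuditDiff
{
    // Compares two versions of an entity, looking only at the named properties.
    public static List<AuditEntry> Compare<T>(T before, T after, params string[] auditedFields)
    {
        var entries = new List<AuditEntry>();
        foreach (var field in auditedFields)
        {
            var property = typeof(T).GetProperty(field);
            if (property == null)
            {
                throw new ArgumentException("Unknown property: " + field);
            }

            var oldValue = property.GetValue(before);
            var newValue = property.GetValue(after);
            if (!object.Equals(oldValue, newValue))
            {
                entries.Add(new AuditEntry
                {
                    EntityName = typeof(T).Name,
                    FieldName = field,
                    OldValue = Convert.ToString(oldValue) ?? "",
                    NewValue = Convert.ToString(newValue) ?? "",
                });
            }
        }
        return entries;
    }
}

// Illustrative entity: only Name and Stock are considered worth auditing.
public class Product
{
    public string Name { get; set; } = "";
    public int Stock { get; set; }
    public string Notes { get; set; } = "";
}
```

With this shape, a change to Notes produces no audit entry as long as "Notes" is not in the audited list, which keeps the audit data limited to the columns that matter.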

Related

Logic place in Domain Driven Design

I have read a lot about DDD, but I still can't see how to use it in real life. I have made some examples (with C# and Entity Framework) of the things I can't understand.
Add money to a User. The User just has an amount of money.
OK, it's simple. Here is an example of the User model:
class User
{
    public decimal Balance { get; private set; }
    public void AddMoney(decimal sum)
    {
        if (sum > 0)
            Balance += sum;
    }
}
But how can I use it?
Fetch user from database
Update user - performed by domain model
Save changes
So, the first question is where I should perform fetching and saving data from database (repository)? I can't do this inside my domain model.
User has history of transaction, not just simple sum
class User
{
    public ICollection<Transaction> Transactions { get; private set; }
    public void AddMoney(decimal sum)
    {
        if (sum > 0)
            Transactions.Add(new Transaction(sum));
    }
}
In this case I should fetch user from database and then EF will add new Entity to collection. But it isn't efficient, more efficient is do something like this:
transactionsRepository.Add(new Transaction(sum, userId));
but it isn't DDD-way.
Get money from one user and transfer to another
In this case operation affects multiple models. Where I should put logic which works with multiple models? (Maybe this example isn't good enough).
Get users' current balance
User's balance is a sum of all transactions
decimal Balance() => transactionsRepository.Get().Sum(x=>x.TransactionSum);
In this case query contains logic - how I should fetch data to do something, not simple fetch\save entities like in other examples. Where I should place queries with logic? Get total balance, get last unread messages etc.

So, the first question is where I should perform fetching and saving data from database (repository)? I can't do this inside my domain model.
You do this in an application service in a layered architecture, or in a command handler in a CQRS architecture.
But it isn't efficient, more efficient is do something like this
It is more efficient but indeed not the DDD way. Aggregates should have no dependency on repositories. They work only with state that is stored in memory. An application service is responsible for loading and storing an aggregate.
Where I should put logic which works with multiple models?
In Sagas/Process managers.
Where I should place queries with logic?
It depends on what you query.
If not using CQRS:
If you query data from a single Aggregate, put the logic in a method of that Aggregate.
If you query a specific list of Aggregates, put that logic in the repository.
If using CQRS, any query is done on a read model/projection.
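The fetch, update, save sequence handled by an application service could be sketched roughly as follows. IUserRepository, InMemoryUserRepository, and UserApplicationService are hypothetical names; a real repository would sit on top of Entity Framework or another persistence mechanism rather than a dictionary.

```csharp
using System.Collections.Generic;

// Domain model: works only with in-memory state, no repository dependency.
public class User
{
    public User(int id) { Id = id; }

    public int Id { get; }
    public decimal Balance { get; private set; }

    public void AddMoney(decimal sum)
    {
        if (sum > 0)
        {
            Balance += sum;
        }
    }
}

// Hypothetical repository abstraction for the User aggregate.
public interface IUserRepository
{
    User GetById(int id);
    void Save(User user);
}

// Illustrative stand-in for a database-backed repository.
public class InMemoryUserRepository : IUserRepository
{
    private readonly Dictionary<int, User> _users = new Dictionary<int, User>();

    public User GetById(int id) => _users[id];

    public void Save(User user) => _users[user.Id] = user;
}

// Application service: loads the aggregate, invokes behavior, stores it.
public class UserApplicationService
{
    private readonly IUserRepository _repository;

    public UserApplicationService(IUserRepository repository)
    {
        _repository = repository;
    }

    public void AddMoney(int userId, decimal sum)
    {
        var user = _repository.GetById(userId); // 1. fetch the aggregate
        user.AddMoney(sum);                     // 2. update performed by the domain model
        _repository.Save(user);                 // 3. save changes
    }
}
```

The domain model never sees the repository; only the application service knows that loading and saving happen at all.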

How to design system around state of object to not duplicate mechanisms in code and back-end?

The system I am working with at the moment uses C# and Oracle; however, the problem I am having is system-agnostic (it could happen with Java and MySQL or any other front-end and back-end combination):
I have a TransactionDetail object that can have 9 statuses:
Open,
Complete,
Cancelled,
No Quote,
Quoted,
Instructed,
Declined,
Refunded,
Removed
In my experience, when dealing with statuses in front-end code, one should do everything possible to avoid giving the object's status a setter. Status is an inherent quality and has to be determined at the moment it is needed; in other words, status should always be determined by a method or a get-only property, not set.
So statuses are retrieved with mechanisms like this (this is only a fragment of the code, but it should give you an indication of how it works):
public TransactionStatus TransactionStatus()
{
    if (db.DeclinedTransactions.Any(o => o.TransactionId == this.TransactionId))
        return TransactionStatus.Declined;
    // ... checks for the other statuses are omitted from this fragment ...
}
MI is asking for these transaction statuses in a SQL view that would also contain all the data related to the transaction.
If an object's status can be determined from the object's own data alone, computed columns can solve this problem in the database. But what about objects like TransactionDetail that span multiple tables? There is no computed-column mechanism that would allow 'peeking' into other tables.
The only solution I can think of is adding a SQL function that determines the state and then creating a SQL view that contains the function plus data from the table. What I don't like about this approach is that it requires duplicating logic in code and in the database.
How should one design a system around an object's state when determining that state requires information from more than one table, in a way that does not require duplicating mechanisms in code and back-end?

If this were a project I was working on, I would not be looking to create a View to calculate this data.
I would be looking at my application business logic.
Whilst a fully normalised database makes perfect sense to the DBAs, there are cases where application performance and scalability can benefit greatly from a little de-normalization.
If you have a reliable framework of business logic (i.e. well defined business objects, good encapsulation, reliable unit tests) then I would personally be looking to add this to the business objects.
This then allows you to define your Status behaviour in code and update an explicit Status. For example, if a change is made to a business object that puts it into a different TransactionStatus, then you can explicitly make the change to that status on the business object and persist the entire change to your database.
The usual response to this kind of design suggestion is that you then have the burden of keeping the two things in sync (explicit status vs. state of the object). The answer to that is making sure there is only one piece of logic that carries out these changes and that your business logic is watertight, as described before.
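A minimal sketch of the explicit-status idea, assuming a trimmed-down status list: the Decline method, the DeclineReason property, and the rule that a completed transaction cannot be declined are hypothetical and only illustrate keeping the status change in one place.

```csharp
using System;

public enum TransactionStatus
{
    Open,
    Complete,
    Declined
}

public class TransactionDetail
{
    // The status is stored explicitly and can only change through behavior methods.
    public TransactionStatus Status { get; private set; } = TransactionStatus.Open;
    public string DeclineReason { get; private set; } = "";

    public void Complete()
    {
        Status = TransactionStatus.Complete;
    }

    public void Decline(string reason)
    {
        // Hypothetical business rule, shown only to illustrate a guarded transition.
        if (Status == TransactionStatus.Complete)
        {
            throw new InvalidOperationException("A completed transaction cannot be declined.");
        }

        DeclineReason = reason;
        Status = TransactionStatus.Declined;
    }
}
```

Because the status is persisted with the object, a reporting view can read it as an ordinary column instead of re-deriving it from several tables.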
An example:
Invoice contains one or more InvoiceItem
InvoiceItem has a value.
Invoice, when displayed, needs an invoice value total
The usual way this is done is to use SUM() to calculate the Invoice total "on the fly" in the database to populate an Invoice.Total value.
But if my business logic is well defined - perhaps I add InvoiceItem to an Invoice object in code, and the Add logic also takes the value from InvoiceItem and adds it to an Invoice.Total value - then when I commit the changes, I can also commit that Invoice.Total value.
When I want to display the total, I have a single value, rather than having to aggregate in the database.
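The invoice example could be sketched like this. The class and member names follow the description above, but the exact shape (constructor, read-only item list) is an assumption.

```csharp
using System.Collections.Generic;

public class InvoiceItem
{
    public InvoiceItem(decimal value) { Value = value; }

    public decimal Value { get; }
}

public class Invoice
{
    private readonly List<InvoiceItem> _items = new List<InvoiceItem>();

    public IReadOnlyList<InvoiceItem> Items => _items;

    // Stored total, kept in sync by Add, so no SUM() is needed at read time.
    public decimal Total { get; private set; }

    public void Add(InvoiceItem item)
    {
        _items.Add(item);
        Total += item.Value;
    }
}
```

Since Add is the only way to attach an item, Total cannot drift from the items as long as every change goes through the Invoice object.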

MVC Business logic helper

I have some common functions that apply throughout my application to update particular parts of the database when actions happen (Audit trail, modified dates etc). I'll use AuditTrail as an example.
Where should I be storing these functions?
Currently I am storing them in dbcontext classes
//... my db context class ...
public bool AddAuditEntry(int ID, string objectName)
{
    // Here I create a new AuditTrail object, assign values then insert into db.
    // This model doesn't have a controller.
}
// We also have a table that keeps track of modified state for
// client side caching (nothing I have control over)
public bool ModifyObject(int ID)
{
    // Here I mark the object id with modified date then save to db
    // This particular model doesn't have a controller either.
}
I think they should belong in the model but I'm not quite sure how to implement it. Putting them in the controller isn't the best option as some of these relate only to a particular model class that may have no controller.
My problem with them being in the model is what then is the best way to update entities?

I'm not sure if this is the way other people do it, but I actually have two models. My business models contain these kinds of functions and validations, such as an account balance not being allowed to drop below zero. Once all that's done, the models are translated into database models, which are responsible for database-level validations if necessary and for database operations.
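A small sketch of the two-model split, with hypothetical names throughout (Account, AccountRecord, AccountMapper): the business model enforces the rule, and a mapper translates it into a plain database model once validation has passed.

```csharp
using System;

// Business model: owns the validation rules.
public class Account
{
    public Account(int id, decimal balance)
    {
        if (balance < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(balance), "Balance cannot be less than zero.");
        }

        Id = id;
        Balance = balance;
    }

    public int Id { get; }
    public decimal Balance { get; private set; }

    public void Withdraw(decimal amount)
    {
        if (amount > Balance)
        {
            throw new InvalidOperationException("Balance cannot be less than zero.");
        }

        Balance -= amount;
    }
}

// Database model: a plain data holder that maps to a table row.
public class AccountRecord
{
    public int Id { get; set; }
    public decimal Balance { get; set; }
    public DateTime ModifiedUtc { get; set; }
}

public static class AccountMapper
{
    // Translates a validated business model into its database model.
    public static AccountRecord ToRecord(Account account, DateTime modifiedUtc)
    {
        return new AccountRecord
        {
            Id = account.Id,
            Balance = account.Balance,
            ModifiedUtc = modifiedUtc,
        };
    }
}
```

Cross-cutting values such as the modified date are set during translation, so the business model does not need to know they exist.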

Static vs. Instance Write Methods in Data Access Layer

I am creating a Data Access Layer in C# for an SQL Server database table. The data access layer contains a property for each column in the table, as well as methods to read and write the data from the database. It seems to make sense to have the read methods be instance based. The question I have is regarding handling the database generated primary key property getter/setter and the write method. As far as I know I have three options...
Option 1: Using a static method while only allowing a getter on the primary key would allow me to enforce writing all of the correct values into the database, but is unwieldy as a developer.
Option 2: Using an instance-based write method would be more maintainable, but I am not sure how I would handle the get/set on the primary key, and I would probably have to implement some kind of validation of the instance prior to writing to the database.
Option 3: Something else, but I am wary of LINQ and drag/drop stuff, they have burned me before.
Is there a standard practice here? Maybe I just need a link to a solid tutorial?

You might want to read up on active record patterns and some examples of them, and then implement your own class/classes.
Here's a rough sketch of a simple class that contains some basic concepts (below).
Following this approach you can expand on the pattern to meet your needs. You might be OK with retrieving a record from the DB as an object, altering its values, then updating the record (Option 2). Or, if that is too much overhead, you could use a static method that directly updates the record in the database (Option 1). For an insert, the database (SP/query) should validate the natural/unique key on the table if you need to, and probably return a specific value/code indicating a unique constraint error. For updates, the same check would need to be performed if natural key fields are allowed to be updated.
A lot of this depends on what functionality your application will allow for the specific table.
I tend to prefer retrieving an object from the DB then altering values and saving, over static methods. For me, it's easier to use from calling code and can handle arcane business logic inside the class easier.
public class MyEntityClass
{
    // State flags are booleans: they are assigned true/false below.
    private bool _isNew;
    private bool _isDirty;
    private int _pkValue;
    private string _colValue;
    public MyEntityClass()
    {
        _isNew = true;
    }
    public int PKValue
    {
        get { return _pkValue; }
    }
    public string ColValue
    {
        get { return _colValue; }
        set
        {
            if (value != _colValue)
            {
                _colValue = value;
                _isDirty = true;
            }
        }
    }
    public void Load(int pkValue)
    {
        _pkValue = pkValue;
        //TODO: query database and set member vars based on results (_colValue)
        // if data found
        _isNew = false;
        _isDirty = false;
    }
    public void Save()
    {
        if (_isNew)
        {
            //TODO: insert record into DB
            //TODO: return DB generated PK ID value from SP/query, and set to _pkValue
        }
        else if (_isDirty)
        {
            //TODO: update record in DB
        }
    }
}

Have you had a look at the Entity Framework? I know you said you are wary of LINQ, but EF4 takes care of a lot of the things you mentioned and is a fairly standard practice for DALs.

I would stick with an ORM Tool (EF, OpenAccess by Telerik, etc) unless you need a customized dal that you need (not want) total control over. For side projects I use an ORM - at work however we have our own custom DAL with provider abstractions and with custom mappings between objects and the database.

NHibernate is also a very solid, tried-and-true ORM with a large community backing it.

Entity Framework is the way to go for your initial DAL, then optimize where you need it. Our company actually did some benchmarking comparing EF with a SQL reader, and found that when querying the database for one or two tables' worth of information, the speed is about even (neither being appreciably faster than the other). After two tables there is a performance hit, but it's not terribly significant. The one place where writing your own SQL statements became worthwhile was in batch commit operations, and at that point EF allows you to write the SQL queries directly. So save yourself some time and use EF for the basic heavy lifting, and then use its direct connection for the more complicated operations. (It's the best of both worlds.)

Reducing Repositories to Aggregate Roots

I currently have a repository for just about every table in the database and would like to further align myself with DDD by reducing them to aggregate roots only.
Let’s assume that I have the following tables, User and Phone. Each user might have one or more phones. Without the notion of aggregate root I might do something like this:
//assuming I have the userId in session for example and I want to update a phone number
List<Phone> phones = PhoneRepository.GetPhoneNumberByUserId(userId);
phones[0].Number = "911";
PhoneRepository.Update(phones[0]);
The concept of aggregate roots is easier to understand on paper than in practice. I will never have phone numbers that do not belong to a User, so would it make sense to do away with the PhoneRepository and incorporate phone related methods into the UserRepository? Assuming the answer is yes, I’m going to rewrite the prior code sample.
Am I allowed to have a method on the UserRepository that returns phone numbers? Or should it always return a reference to a User, and then traverse the relationship through the User to get to the phone numbers:
List<Phone> phones = UserRepository.GetPhoneNumbers(userId);
// Or
User user = UserRepository.GetUserWithPhoneNumbers(userId); //this method will join to Phone
Regardless of which way I acquire the phones, assuming I modified one of them, how do I go about updating them? My limited understanding is that objects under the root should be updated through the root, which would steer me towards choice #1 below. Although this will work perfectly well with Entity Framework, it seems extremely un-descriptive, because when reading the code I have no idea what I’m actually updating, even though Entity Framework is keeping tabs on changed objects within the graph.
UserRepository.Update(user);
// Or
UserRepository.UpdatePhone(phone);
Lastly, assume I have several lookup tables that are not really tied to anything, such as CountryCodes, ColorsCodes, SomethingElseCodes. I might use them to populate drop-downs or for whatever other reason. Are these standalone repositories? Can they be combined into some sort of logical grouping/repository such as CodesRepository? Or is that against best practices?

You are allowed to have any method you want in your repository :) In both of the cases you mention, it makes sense to return the user with the phone list populated. Normally the user object would not be fully populated with all the sub-information (say, all addresses and phone numbers), and we may have different methods for getting the user object populated with different kinds of information. Loading the related data only when it is needed is referred to as lazy loading.
User GetUserDetailsWithPhones()
{
    // Populate User along with Phones
}
For updating, in this case, the user is being updated, not the phone number itself. The storage model may store the phones in a different table, which may make you think that only the phones are being updated, but that is not the case from a DDD perspective. As far as readability is concerned, while the line
UserRepository.Update(user)
alone doesn't convey what is being updated, the code above it would make that clear. It would also most likely be part of a front-end method call that signifies what is being updated.
For the lookup tables, and actually even otherwise, it is useful to have a GenericRepository and use that. The custom repository can inherit from the GenericRepository.
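public class UserRepository : GenericRepository<User>
{
    public IEnumerable<User> GetUserByCustomCriteria()
    {
        // Query users matching some custom criteria
        throw new NotImplementedException();
    }
    public User GetUserDetailsWithPhones()
    {
        // Populate User along with Phones
        throw new NotImplementedException();
    }
    public User GetUserDetailsWithAllSubInfo()
    {
        // Populate User along with all sub information e.g. phones, addresses etc.
        throw new NotImplementedException();
    }
}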
Search for "Generic Repository Entity Framework" and you will find many nice implementations. Use one of those or write your own.

Your example of the Aggregate Root repository is perfectly fine, i.e. any entity that cannot reasonably exist without depending on another shouldn't have its own repository (in your case, Phone). Without this consideration you can quickly find yourself with an explosion of repositories in a 1-1 mapping to db tables.
You should look at using the Unit of Work pattern for data changes rather than the repositories themselves as I think they're causing you some confusion around intent when it comes to persisting changes back to the db. In an EF solution the Unit of Work is essentially an interface wrapper around your EF Context.
With regards to your repository for lookup data we simply create a ReferenceDataRepository that becomes responsible for data that doesn't specifically belong to a domain entity (Countries, Colours etc).

If a phone makes no sense w/o a user, it's an entity (if you care about its identity) or a value object, and it should always be modified through the user and retrieved/updated together with it.
Think about aggregate roots as context definers: they draw local contexts but are themselves in the global context (your application).
If you follow domain-driven design, repositories are supposed to be 1:1 with aggregate roots.
No excuses.
I bet these are the problems you are facing:
technical difficulties - object-relational impedance mismatch. You are struggling to persist whole object graphs with ease, and Entity Framework kind of fails to help.
the domain model is data-centric (as opposed to behavior-centric). Because of that, you lose knowledge about the object hierarchy (the previously mentioned contexts) and magically everything becomes an aggregate root.
I'm not sure how to fix the first problem, but I've noticed that fixing the second one fixes the first well enough. To understand what I mean by behavior-centric, give this paper a try.
P.s. Reducing repository to aggregate root makes no sense.
P.p.s. Avoid "CodeRepositories". That leads to data centric -> procedural code.
P.p.p.s Avoid unit of work pattern. Aggregate roots should define transaction boundaries.

This is an old question, but thought worth posting a simple solution.
EF Context is already giving you both Unit of Work (tracks changes) and Repositories (in-memory reference to stuff from DB). Further abstraction is not mandatory.
Remove the DbSet<Phone> from your context class, as Phone is not an aggregate root.
Use the 'Phones' navigation property on User instead.
static void updateNumber(int userId, string oldNumber, string newNumber)
{
    using (MyContext uow = new MyContext()) // Unit of Work
    {
        DbSet<User> repo = uow.Users; // Repository
        User user = repo.Find(userId);
        // Find and SingleOrDefault return null when nothing matches, so guard before updating.
        Phone oldPhone = user?.Phones.SingleOrDefault(x => x.Number.Trim() == oldNumber);
        if (oldPhone == null)
        {
            throw new InvalidOperationException("User or phone number not found.");
        }
        oldPhone.Number = newNumber;
        uow.SaveChanges();
    }
}

If a Phone entity only makes sense together with an aggregate root User, then I think it also makes sense for adding a new Phone record to be the responsibility of the User domain object, through a specific method (DDD behavior). That makes sense for several reasons. The immediate one is that we should check that the User object exists, since the Phone entity depends on its existence, and perhaps keep a transaction lock on it while doing further validation checks, to ensure no other process has deleted the root aggregate before we are done validating the operation. In other cases, with other kinds of root aggregates, you might want to aggregate or calculate some value and persist it in columns of the root aggregate for more efficient processing by other operations later on. Note that although I suggest the User domain object has a method that adds the Phone, that doesn't mean it should know about the existence of the database or EF. One of the great features of EF and Hibernate is that they can track changes made to entity classes transparently, and that includes new related entities added through their navigation collection properties.
Also, if you want methods that retrieve all phones regardless of the users owning them, you could still do it through the User repository: you only need one method that returns all users as IQueryable, and then you can map them to get all user phones and run a refined query on that. So you don't even need a PhoneRepository in this case. Besides, if I wanted to abstract queries behind methods, I would rather use a class with extension methods for IQueryable that I can use anywhere, not just from a Repository class.
Just one caveat: to be able to delete Phone entities using only the domain object and not a Phone repository, you need to make sure the UserId is part of the Phone primary key. In other words, the primary key of a Phone record is a composite key made up of UserId and some other property of the Phone entity (I suggest an auto-generated identity). This makes sense intuitively, as the Phone record is "owned" by the User record, and its removal from the User navigation collection would equal its complete removal from the database.
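A domain-only sketch of a User aggregate that owns its phones might look like the following. The AddPhone and RemovePhone methods and the duplicate-number rule are hypothetical; the composite key and change tracking would be configured in the ORM mapping, which is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Phone
{
    public Phone(int userId, string number)
    {
        UserId = userId;
        Number = number;
    }

    // UserId is meant to be part of the Phone's composite key in the ORM mapping.
    public int UserId { get; }
    public string Number { get; }
}

public class User
{
    private readonly List<Phone> _phones = new List<Phone>();

    public User(int id) { Id = id; }

    public int Id { get; }

    // Read-only view: phones can only change through the methods below.
    public IReadOnlyCollection<Phone> Phones => _phones;

    public Phone AddPhone(string number)
    {
        if (string.IsNullOrWhiteSpace(number))
        {
            throw new ArgumentException("A phone number is required.", nameof(number));
        }

        // Hypothetical rule: a user cannot hold the same number twice.
        if (_phones.Any(p => p.Number == number))
        {
            throw new InvalidOperationException("The user already has this number.");
        }

        var phone = new Phone(Id, number);
        _phones.Add(phone);
        return phone;
    }

    // Returns true when a phone with that number was found and removed.
    public bool RemovePhone(string number)
    {
        var phone = _phones.FirstOrDefault(p => p.Number == number);
        return phone != null && _phones.Remove(phone);
    }
}
```

The User class has no reference to a database or context; an ORM with change tracking would pick up additions to and removals from the collection when the aggregate is saved.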
