Tracking Changes done to a Class without using Entity Framework - c#

I am in need of tracking any changes done to a complex model (a very complex model, I must say, with all kinds of relationships). Once I have identified these changes, I must save them into a separate table, in order to be approved by an administrator at a later stage.
I've tried using the change tracker of Entity Framework and have even tried to customize it, but it has just been giving me problem after problem.
What do you suggest I could use in order to track these changes, which does not involve Entity Framework?
UPDATE: I ended up solving this by creating my own custom checker. It took more time, but in the end it was worth it as I had total control over the changes.
Thanks for your opinions,
Steve :)

Sorry for not providing a code example. As commented, this is more of an idea (too broad for this Exchange), but it is a high-level approach I have used before. Back when "reflection" was highly frowned upon we called it "meta data", but we essentially employed reflection - and for that reason, today it is known as meta programming.
Your problem is a lovely use case for meta programming. Reflection used to be very slow back in the '80s, mostly due to low memory and restricted CPUs.
Serializers, such as those for JSON or the infamously slow XML (not so slow anymore), use reflection.
Dependency Injection is the mother of meta programming.
Helpers like AutoMapper are mostly reflection too.
Today it has been highly optimised and works extremely well thanks to excellent computational power. As long as you do not write hacky code or try to optimise it further, you will be OK. You should trust the framework and the compilers for that.
You can do some fancy things such as intercepting changes, but that can get quite complex. To keep it a bit simpler, all you have to do is follow a bit of DDD:
Your classes should only allow changes via the properties you expose. Each setter or operation that mutates state can then be sent to your lovely state-tracking code.
In .NET 4.5 reflection is really fast, and meta programming is already used in Dependency Injection all over the show.
To remember changes, use an optimised collection such as a Dictionary or HashSet, depending on your needs. Using GetType, store the type as the key; the value can be the new value, or a class that holds metadata such as old value, new value, version (for rolling back), and so on.
Once you get that going in your class, you then move all the logic into a singleton and define some generic methods that you will reuse on all your "entities".
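A minimal sketch of that idea, assuming a singleton tracker keyed by type and property name (the ChangeTracker and ChangeRecord names are mine, not an existing library):

using System;
using System.Collections.Generic;

// Illustrative metadata for a single change.
public class ChangeRecord
{
    public object OldValue { get; set; }
    public object NewValue { get; set; }
    public int Version { get; set; }                      // for rolling back
    public DateTime ChangedOn { get; set; } = DateTime.UtcNow;
}

// Singleton tracker: keyed by "TypeName.PropertyName".
public sealed class ChangeTracker
{
    public static ChangeTracker Instance { get; } = new ChangeTracker();

    private readonly Dictionary<string, List<ChangeRecord>> _changes =
        new Dictionary<string, List<ChangeRecord>>();

    public void Record(Type owner, string property, object oldValue, object newValue)
    {
        var key = owner.Name + "." + property;
        if (!_changes.TryGetValue(key, out var history))
            _changes[key] = history = new List<ChangeRecord>();

        history.Add(new ChangeRecord { OldValue = oldValue, NewValue = newValue, Version = history.Count + 1 });
    }

    public IReadOnlyDictionary<string, List<ChangeRecord>> Changes => _changes;
}

// An "entity" that only mutates state through the properties it exposes.
public class Customer
{
    private string _name;
    public string Name
    {
        get => _name;
        set
        {
            ChangeTracker.Instance.Record(GetType(), nameof(Name), _name, value);
            _name = value;
        }
    }
}

From there, the contents of ChangeTracker.Changes can be serialized into your approval table.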

Related

Why we need Reflection at all?

I was studying Reflection; I understood some of it, but I am not getting everything related to this concept. Why do we need Reflection? What things could we not achieve without it?
There are many, many scenarios that reflection enables, but I group them primarily into two buckets.
Reflection enables us to write code that analyzes other code.
Consider for example the most basic question about an assembly: what types are in it? Assemblies are self-describing and reflection is the mechanism by which that description is surfaced to other code.
Suppose for example you wanted to write a program which took an assembly and did a graphical display of the relationships between the various classes in that assembly, to help you understand that code. There are such tools. They're in Visual Studio. Someone wrote those tools. They did not appear by magic. Reflection is the mechanism designed into the .NET framework that enables you or me or anyone else to write tools that understand code.
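For example, a few lines of reflection are enough to ask an assembly what classes it contains and what they look like:

using System;
using System.Linq;
using System.Reflection;

class Program
{
    static void Main()
    {
        // The currently executing assembly describes itself.
        Assembly assembly = Assembly.GetExecutingAssembly();

        foreach (Type type in assembly.GetTypes().Where(t => t.IsClass))
        {
            Console.WriteLine(type.FullName + " : " + type.BaseType);

            foreach (MethodInfo method in type.GetMethods(
                BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly))
            {
                Console.WriteLine("  " + method.Name);
            }
        }
    }
}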
Reflection enables us to move compile time bindings to runtime.
Suppose you have a static method Foo.Bar(). When you put a call to Foo.Bar() in your program, you know with 100% certainty that the method you think is going to be called is actually going to be called. We call static methods "static" because the binding from the name Bar to the code that gets called can be understood statically -- that is, without running the program.
Now consider a virtual method Blah() on a base class. When you call whatever.Blah() you don't know exactly which Blah() will be called at compile time, but you know that some method Blah() with no arguments will be called on some type that is the runtime type of whatever, and that type is equal to or derived from the type which declares Blah(). (In fact you know more: you know that it is equal to or derived from the compile time type of whatever.) Virtual binding is a form of dynamic binding, but it is not fully dynamic. There's no way for the user to decide that this call should be to a different method on a different type hierarchy.
Reflection enables us to make calls that are bound entirely at runtime, based entirely on user choices if we like. We pay a performance penalty, and we lose compile-time type safety, but we gain the flexibility to decide 100% at runtime what code we call. There are scenarios where that's a reasonable tradeoff.
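For example, the type and method names below are ordinary strings; they could just as easily have come from a configuration file or user input, and the binding happens entirely at runtime:

using System;
using System.Reflection;

public class Greeter
{
    public string Hello(string name) => "Hello, " + name;
}

class Program
{
    static void Main()
    {
        string typeName = "Greeter";   // imagine these came from user input
        string methodName = "Hello";

        // Bind to the type and method entirely at runtime.
        Type type = Type.GetType(typeName);
        object instance = Activator.CreateInstance(type);
        MethodInfo method = type.GetMethod(methodName);

        object result = method.Invoke(instance, new object[] { "world" });
        Console.WriteLine(result);     // Hello, world
    }
}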
Reflection is such a deep part of the .NET framework that you often don't know that you're doing it (see Attributes and LINQ for instance). And when you do know you're doing it, even if it feels wrong, it might be the only way to achieve a particular objective.
Apart from the two broad areas that Eric mentioned, here are a few others. There are lots more; these are just some that come to mind immediately.
Serialization (and similar)
Whether you're using XML or JSON or rolling your own, serializing objects is much easier when you don't have to write specific code for each class to enable serialization. Reflection enables you to enumerate the properties in your object that have been flagged for (or not flagged against) serialization and write them to the output.
This isn't about saving state though. Reflection allows us to write generic methods that can produce business output too, like CSV or XLSX files from an arbitrary collection. I get a lot of mileage out of my ToCSV(...) and ToExcel(...) extensions for things like producing downloadable versions of data sets on my web-based reporting.
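A stripped-down sketch of such an extension (no escaping or formatting, purely to show the reflection part; a real implementation would need more care):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class CsvExtensions
{
    // One header row from the property names, then one row per item.
    public static string ToCsv<T>(this IEnumerable<T> items)
    {
        PropertyInfo[] props = typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance);

        var lines = new List<string> { string.Join(",", props.Select(p => p.Name)) };

        lines.AddRange(items.Select(item =>
            string.Join(",", props.Select(p => p.GetValue(item)?.ToString() ?? ""))));

        return string.Join(Environment.NewLine, lines);
    }
}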
Accessing Hidden Data
Yes, I know, this is a dodgy one. And yeah, Eric is probably going to slap me for this, but...
There's a lot of code out there - I'm looking at you, ASP.NET - that hides interesting and useful stuff behind private or protected. Sometimes the only way to get them out is to use reflection. Sometimes it's not the only way, but it can be the simpler way.
Attributes
Every time you tag an Attribute onto one of your classes, methods, etc. you are implicitly providing data that is going to be accessed through reflection. Want to use those attributes yourself? Reflection is the only way you can get at them.
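For example (the AuditableAttribute here is invented for illustration):

using System;
using System.Reflection;

[AttributeUsage(AttributeTargets.Class)]
public class AuditableAttribute : Attribute
{
    public string Owner { get; }
    public AuditableAttribute(string owner) => Owner = owner;
}

[Auditable("finance-team")]
public class Invoice { }

class Program
{
    static void Main()
    {
        // Reflection is how the attribute data comes back out at runtime.
        var attr = typeof(Invoice).GetCustomAttribute<AuditableAttribute>();
        if (attr != null)
            Console.WriteLine("Invoice is audited by: " + attr.Owner);
    }
}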
LINQ and Other Expressions
This is really important stuff these days. If you've ever used LINQ to SQL, Entity Framework, etc. then you've used Expression in some way. You write a simple little POCO to represent a row in your database table and everything else gets handled by reflection. When you write a predicate expression the system is using the reflection model to build structures that are then processed (visited) to build an SQL statement.
Expressions aren't just for LINQ either; you can do some really interesting things yourself, once you know what you're doing. I have code to generate line parsers for CSV import that run pretty damn quickly when compiled to Func<string, TRecord>. These days I tend to use a mapper somebody else wrote, but at the time I needed to slice a few more % off the total import time for a file with 20K records that was uploaded to a website periodically.
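A small illustration of both sides of that coin - an expression tree you can inspect as data (as a LINQ provider does), or compile into a fast delegate:

using System;
using System.Linq.Expressions;

class Program
{
    static void Main()
    {
        // Build the expression tree for: s => s.Length > 3
        ParameterExpression param = Expression.Parameter(typeof(string), "s");
        Expression body = Expression.GreaterThan(
            Expression.Property(param, "Length"),
            Expression.Constant(3));

        var lambda = Expression.Lambda<Func<string, bool>>(body, param);

        // Inspect it as data ...
        Console.WriteLine(lambda);                 // s => (s.Length > 3)

        // ... or compile it and run it at full speed.
        Func<string, bool> predicate = lambda.Compile();
        Console.WriteLine(predicate("hello"));     // True
    }
}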
P/Invoke Marshalling
This one is a big deal behind the scenes and occasionally in the foreground too. When you want to call a Windows API function or use a native DLL, P/Invoke gives you ways to achieve this without having to mess about with building memory buffers in both directions. The marshalling methods use reflection to do translation of certain things - strings and so on being the obvious example - so that you don't have to get your hands dirty. All based on the Type object that is the foundation of reflection.
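For instance, calling the Win32 MessageBoxW function: the marshaller converts the managed strings into the native wide-character buffers the API expects, with no manual memory management on your part:

using System;
using System.Runtime.InteropServices;

class Program
{
    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    static extern int MessageBoxW(IntPtr hWnd, string text, string caption, uint type);

    static void Main()
    {
        MessageBoxW(IntPtr.Zero, "Hello from P/Invoke", "Demo", 0);
    }
}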
Fact is, without reflection the .NET framework wouldn't be what it is. No Attributes, no Expressions, probably a lot less interop between the languages. No automatic marshalling. No LINQ... at least in the way we often use it now.

Which serialization to use for my c# objects to save them in a SQL database

I'm looking for some advice; it may be that there is no hard and fast answer, but any thoughts are appreciated. The application is very typical. It's C# code, currently using VS2010. This code is used, amongst other things, for providing data from a model to end users. The model itself is based on some known data at fixed points (a "cube") plus various parameters and settings. In standard programming practice the user accesses the data via public "get" functions which in turn rely on private member variables such as the known data and the settings. So far so standard. Now I want to save the class providing this data to the users into an SQL database - primarily so all users can share the same data (or more precisely, model-generated data).
Of course I could just take each member variable of the class, write these into the db, and reinstantiate the class from them. But I don't want to miss out on all the goodies .NET & C# have to offer. So what I'm thinking of doing is serializing the object and using LINQ to SQL to squirt this into the db. The LINQ to SQL section is straightforward, but I'm a total newbie when it comes to serialization and I'm a bit concerned/confused about it. It seems the obvious thing is to format the object into XML and write this into the database as a column in the table with SQL datatype "xml". But that leaves me with a few questions.
To my understanding the standard XmlSerializer will only write the public members of the class into the XML. That looks like a non-starter to me since my class design is predicated on keeping much of the class private (writing classes with all public members is outside my experience - who does that?). This has led me to consider the DataContractSerializer, in which you can opt in variables for serialization. However, this seems to have some WCF dependencies and I'm wondering what the potential drawbacks of using it are. Additionally there is the SoapFormatter, which seems to be more prevalent for web applications, and also JSON. I'm not really considering these, but maybe I should? Maybe there are better ways of solving the problem? Perhaps a bit verbose, but hopefully all the context helps so people can shoot me some advice.
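(For reference, a minimal sketch of the opt-in model being described: DataContractSerializer will serialize private fields as long as they are marked [DataMember]; the class and member names below are invented.)

using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;

[DataContract]
public class PricingModel
{
    [DataMember] private double[] knownData;   // private members can opt in
    [DataMember] private int settings;

    public PricingModel(double[] knownData, int settings)
    {
        this.knownData = knownData;
        this.settings = settings;
    }

    public double GetValue(int i) { return knownData[i] * settings; }
}

class Program
{
    static void Main()
    {
        var model = new PricingModel(new[] { 1.5, 2.5 }, 10);
        var serializer = new DataContractSerializer(typeof(PricingModel));

        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, model);
            Console.WriteLine(Encoding.UTF8.GetString(stream.ToArray()));
        }
    }
}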
I have had requirements similar to yours and have done quite a bit of research in this area. I did a number of proof-of-concept projects using XML serialization, JSON, BinaryFormatter, and not to forget some home-grown hacks. I had almost decided to go with JSON (JSON.NET) until I found protobuf-net! It is fast, small in size, version independent, supports inheritance, and is easy to implement without many changes to your code. Recommend heavily.
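A minimal sketch of what that looks like with protobuf-net (the type and property names are invented; the resulting bytes could go into a varbinary column instead of xml):

using System;
using System.IO;
using ProtoBuf;   // protobuf-net package

[ProtoContract]
public class CubePoint
{
    [ProtoMember(1)] public double X { get; set; }
    [ProtoMember(2)] public double Value { get; set; }
}

class Program
{
    static void Main()
    {
        var point = new CubePoint { X = 1.0, Value = 42.5 };

        byte[] blob;
        using (var stream = new MemoryStream())
        {
            Serializer.Serialize(stream, point);        // object -> compact binary
            blob = stream.ToArray();                    // store this blob in the database
        }

        using (var stream = new MemoryStream(blob))
        {
            var roundTripped = Serializer.Deserialize<CubePoint>(stream);
            Console.WriteLine(roundTripped.Value);      // 42.5
        }
    }
}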
If you store an object as XML, it will be very hard to use from the database. For example, if you store customer objects as XML, how would you write the following?
select * from Customers where LastName = 'Obama'
It can be done, but it's not easy.
How to map objects to a database is a subject of some controversy. Frameworks that are easy to get started with can become overly complex in the application's later life. Since most applications spend more time in maintenance than in initial development, I'd use the simplest model that works.
Plain ADO.NET or Dapper are good contenders. You'll write a bit more boilerplate code, but the decrease in complexity more than makes up for that.
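As a rough illustration of the Dapper route (the Customers table and connection string are assumed; Dapper maps result columns onto properties by name):

using System.Collections.Generic;
using System.Data.SqlClient;
using Dapper;   // Dapper package

public class Customer
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class CustomerRepository
{
    private readonly string _connectionString;

    public CustomerRepository(string connectionString)
    {
        _connectionString = connectionString;
    }

    public IEnumerable<Customer> GetByLastName(string lastName)
    {
        using (var connection = new SqlConnection(_connectionString))
        {
            // The boilerplate is a plain SQL string plus a parameter object.
            return connection.Query<Customer>(
                "select * from Customers where LastName = @LastName",
                new { LastName = lastName });
        }
    }
}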

Undo Redo in WPF/C# in an already functional application

I have done some research already as to how I can achieve the title of this question. The app I am working on has been under development for a couple of years or so (slow progress though, you all know how it is in the real world). It is now a requirement for me to put in Undo/Redo multiple level functionality. It's a bit late to say "you should have thought about this before you started" ... well, we did think about it - and we did nothing about it and now here it is. From searching around SO (and external links) I can see that the two most common methods appear to be ...
Command Pattern
Memento Pattern
The Command pattern looks like it would be a hell of a lot of work, and I can only imagine it throwing up thousands of bugs in the process, so I don't really fancy that one.
The Memento pattern is actually a lot like what I had in my head for this. I was thinking that if there was some way to quickly take a snapshot of the object model currently in memory, then I would be able to store it somewhere (maybe also in memory, maybe in a file). It seems like a great idea; the only problem I can see is how it will integrate with what we have already written. You see, the app as we have it draws images in a big panel (potentially hundreds) and then allows the user to manipulate them either via the UI or via a custom-built properties grid. The entire app is linked up with a big observer pattern. The second anything changes, events are fired and everything that needs to update does. This is nice, but I can't help thinking that if a user is entering text into a text field on the properties grid there will be a bit of delay before the UI catches up (seeing as every time the user presses a key, a new snapshot will be added to the undo list). So my question to you is ....
Do you know of any good alternatives to the Memento pattern that might work?
Do you think the Memento pattern will fit in here, or will it slow the app down too much?
If the Memento pattern is the way to go, what is the most efficient way to make a snapshot of the object model (I was thinking of serialising it or something)?
Should the snapshots be stored in memory, or is it possible to put them into files?
If you have got this far, thank you kindly for reading. Any input you have will be valuable and very much appreciated.
Well, here are my thoughts on this problem.
1- You need multi-level undo/redo functionality, so you need to store the user actions performed, which can be kept in a stack.
2- Your second problem is how to identify what has been changed by an operation; doing that through the Memento pattern is, I think, quite a challenge. Memento is all about storing the initial object state in memory.
Either way, you need to store what is changed by an operation so that you can use this information to undo the operations.
The Command pattern is designed for undo/redo functionality, and I would say that although it's late, it's worthwhile to implement a design that has been used for years and works for most applications.
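A minimal sketch of the Command pattern for this kind of app (ImageElement and the move operation are invented stand-ins for your drawing model):

using System.Collections.Generic;

// Every user action knows how to do and undo itself.
public interface IUndoableCommand
{
    void Execute();
    void Undo();
}

public class ImageElement
{
    public double X { get; set; }
    public double Y { get; set; }
}

public class MoveImageCommand : IUndoableCommand
{
    private readonly ImageElement _image;
    private readonly double _dx, _dy;

    public MoveImageCommand(ImageElement image, double dx, double dy)
    {
        _image = image; _dx = dx; _dy = dy;
    }

    public void Execute() { _image.X += _dx; _image.Y += _dy; }
    public void Undo()    { _image.X -= _dx; _image.Y -= _dy; }
}

public class UndoManager
{
    private readonly Stack<IUndoableCommand> _undo = new Stack<IUndoableCommand>();
    private readonly Stack<IUndoableCommand> _redo = new Stack<IUndoableCommand>();

    public void Do(IUndoableCommand command)
    {
        command.Execute();
        _undo.Push(command);
        _redo.Clear();   // a brand-new action invalidates the redo history
    }

    public void Undo()
    {
        if (_undo.Count == 0) return;
        var command = _undo.Pop();
        command.Undo();
        _redo.Push(command);
    }

    public void Redo()
    {
        if (_redo.Count == 0) return;
        var command = _redo.Pop();
        command.Execute();
        _undo.Push(command);
    }
}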
If performance allows it you could serialize your domain before each action. A few hundred objects is not much if the objects aren't big themselves.
Since your object graph is probably non-trivial (i.e. uses inheritance, cycles, ...), the integrated XmlSerializer and JSON serializers are out of the question. Json.NET supports these, but does some lossy conversions on some types (local DateTimes, numbers, ...), so it's bad too.
I think the protobuf serializers need either some form of DTD (.proto file) or decoration of all properties with attributes mapping their name to a number, so it might not be optimal.
BinaryFormatter can serialize most stuff, you just need to decorate all classes with the [Serializable] attribute. But I haven't used it myself, so there might be pitfalls I'm not aware of. Perhaps related to Singletons or events.
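For illustration, a naive memento-style snapshot stack along those lines might look like this (it assumes the whole document graph is marked [Serializable]; BinaryFormatter is just one way to take a deep copy, and the byte arrays could equally be written to temp files):

using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

public class SnapshotUndoStack<TDocument>
{
    private readonly Stack<byte[]> _snapshots = new Stack<byte[]>();

    // Call this just before applying a user action.
    public void TakeSnapshot(TDocument document)
    {
        using (var stream = new MemoryStream())
        {
            new BinaryFormatter().Serialize(stream, document);
            _snapshots.Push(stream.ToArray());
        }
    }

    // Returns the previous state of the document.
    public TDocument Undo()
    {
        if (_snapshots.Count == 0)
            throw new InvalidOperationException("Nothing to undo.");

        using (var stream = new MemoryStream(_snapshots.Pop()))
        {
            return (TDocument)new BinaryFormatter().Deserialize(stream);
        }
    }
}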
The critical things for undo/redo are
knowing what state you need to save and restore
knowing when you need to save the state
Adding undo/redo after the fact is always a painful thing to do - (I know this comment is of no use to you now, but it's always best to design support into the application framework before you start, as it helps people use undo-friendly patterns throughout development).
Possibly the simplest approach will be a memento-based one:
Locate all the data that makes up your "document". Can you unify this data in some way so that it forms a coherent whole? Usually, if you can serialise your document structure to a file, the logic you need is in the serialisation system, so that gives you a way in. The downside to using this directly is that you will usually have to serialise everything, so your undo will be huge and slow. If possible, refactor the code so that:
(a) there is a common serialisation interface used throughout the application, so any and every part of your data can be saved/restored using a generic call;
(b) every sub-system is encapsulated so that modifications to the data have to go through a common interface (rather than lots of callers modifying member variables directly, they should all call an API provided by the object to request that it makes changes to itself); and
(c) every sub-portion of the data keeps a "version number", incremented every time an alteration is made through the interface in (b).
This approach means you can now scan your entire document and use the version numbers to find just the parts that have changed since you last looked, and then serialise the minimal amount needed to save and restore the changed state.
Provide a mechanism whereby a single undo step can be recorded. This means allowing multiple systems to make changes to the data structure, and then, when everything has been updated, triggering an undo recording. Working out when to do this may be tricky, but it can usually be accomplished by scanning your document for changes (see above) in your message loop, when your UI has finished processing each input event.
Beyond that, I'd advise going for a command based approach, because there are many benefits to it besides undo/redo.
You may find the Monitored Undo Framework to be useful. http://muf.codeplex.com/
It uses something similar to the memento pattern, by monitoring for changes as they happen and allows you to put delegates on the undo stack that will reverse / redo the change.
I considered an approach that would serialize / deserialize the document but was concerned about the overhead. Instead, I monitor for changes in the model (or view model) on a property-by-property basis. Then, as needed, I use the MUF library to "batch" related changes so that they undo / redo as a unit of change.
The fact that you have your UI setup to react to changes in the underlying model is good. It sounds like you could inject the undo / redo logic there and the changes would bubble up to the UI.
I don't think that you'd see much lag or performance degradation. I have a similar application, with a diagram that we render based on the data in the model. We've had good results with this so far.
You can find more info and documentation on the codeplex site at http://muf.codeplex.com/. The library is also available via NuGet, with support for .NET 3.5, 4.0, SL4 and WP7.

Is my idea for an object persistence library useful?

First, I apologize if this is not an appropriate venue to ask this question, but I wasn't really sure where else to get input from.
I have created an early version of a .NET object persistence library. Its features are:
A very simple interface for persistence of POCOs.
The main thing: support for just about every conceivable storage medium. This would be everything from plain text files on the local filesystem, to embedded databases like SQLite, to any standard SQL server (MySQL, Postgres, Oracle, SQL Server, whatever), to various NoSQL databases (Mongo, Couch, Redis, whatever). Drivers could be written for nearly anything, so for instance you could fairly easily write a driver where the actual backing store is a web service.
When I first had this idea I was convinced it was totally awesome. I quickly created an initial prototype. Now, I'm at the 'hard part' where I am debating issues like connection pooling, thread safety, and debating whether to try to support IQueryable for LINQ, etc. And I'm taking a harder look at whether it is worthwhile to develop this library beyond my own requirements for it.
Here is a basic example of usage:
var to1 = new TestObject { id = "fignewton", number = 100, FruitType = FruitType.Apple };
ObjectStore db = new SQLiteObjectStore("d:/objstore.sqlite");
db.Write(to1);
var readback = db.Read<TestObject>("fignewton");
var readmultiple = db.ReadObjects<TestObject>(collectionOfKeys);
The querying interface that works right now looks like:
var appleQuery = new Query<TestObject>().Eq("FruitType", FruitType.Apple).Gt("number",50);
var results = db.Find<TestObject>(appleQuery);
I am also working on an alternative query interface that lets you just pass in something very like a SQL WHERE clause. And obviously, in the .NET world it would be great to support IQueryable / expression trees.
Because the library supports many storage mediums with disparate capabilities, it uses attributes to help the system make the best use of each driver.
[TableName("AttributeTest")]
[CompositeIndex("AutoProperty", "CreatedOn")]
public class ComplexTypesObject
{
    [Id]
    public string id;
    [QueryableIndexed]
    public FruitType FruitType;
    public SimpleTypesObject EmbeddedObject;
    public string[] Array;
    public int AutoProperty { get; set; }
    public DateTime CreatedOn = DateTime.Now;
}
All of the attributes are optional, and are basically all about performance. In a simple case you don't need any of them.
In a SQL environment, the system will by default take care of creating tables and indexes for you, though there is a DbaSafe option that will prevent the system from executing DDLs.
It is also fun to be able to migrate your data from, say, a SQL engine to MongoDB in one line of code. Or to a zip file. And back again.
OK, The Question:
The root question is "Is this useful?" Is it worth taking the time to really polish, make thread-safe or connection pooled, write a better query interface, and upload somewhere?
Is there another library already out there that already does something like this, NAMELY, providing a single interface that works across multiple data sources (beyond just different varieties of SQL)?
Is it solving a problem that needs to be solved, or has someone else already solved it better?
If I proceed, how do you go about trying to make your project visible?
Obviously this isn't a replacement for ORMs (and it can co-exist with ORMs, and coexist with your traditional SQL server). I guess its main use cases are for simple persistence where an ORM is overkill, or for NoSQL type scenarios and where a document-store type interface is preferable.
My advice: Write it for your own requirements and then open-source it. You'll soon find out if there's a market for it. And, as a bonus, you'll find that other people will tell you which bits need polishing; there's a very high chance they'll polish it for you.
Ben, I think it's awesome. At the very least post it to CodePlex and share with the rest of the world. I'm quite sure there are developers out there who can use an object persistence framework (or help polish it up).
For what it's worth, I think it's a great idea.
But more importantly, you've chosen a project (in my opinion) that will undoubtedly improve your code construction and design chops. It is often quite difficult to find projects that both add value while improving your skills.
At least complete it to your initial requirements and then open source it. Anything after that is a bonus!
While I think the idea is intriguing, and could be useful, I am not sure what long-term value it may hold. Given the considerable advances with EF v4 recently, including things like Code-Only, true POCO support, etc. achieving what you are talking about is actually not that difficult with EF. I am a true believer in Code-Only these days, as it is simple, powerful, and best of all, compile-time checked.
The idea about supporting any kind of data store is intriguing, and something that is worth looking into. However, I think it might be more useful, and reach a considerably broader audience, if you implemented store providers for EF v4, rather than trying to reinvent the wheel that Microsoft has now spent years on. Small projects more often than not grow...and things like pooling, thread safety, LINQ/IQueryable support, etc. become more important...little by little, over time.
By developing EF data store providers for things like SqLite, MongoDB, Xml files or flat files, etc. you add to the capabilities of an existing, familiar, accessible framework, without requiring people to learn an additional one.

What is the best approach to versioning the tracked data within workflows?

This is a general question concerning the Workflow Foundation (.NET 3.5) and versioning the data that it works with. We have a lot of custom activities that work with some data and this data may be interesting also for the future analysis of the already completed workflows (provided that we configure the tracking in such a way that it stores it in a serialized form).
It may be necessary to show the data from the past in the UI, but the data inevitably changes the structure (class definition / internal structure if it's dynamic) and the redeployed version of our library will contain the new data definition while the serialized data in the tracking database will be still in the old structure.
Is it better to use dynamic structures that don't change from the beginning (like a property bag), or rather to deal with the redeployment later and somehow transform the serialized BLOB into the new structure? Have you ever used some approach in a similar scenario?
A lot depends on how you deploy your application. If you use a strong name and deploy to the GAC or multiple private assembly paths, deserializing a workflow will deserialize the exact version of your class. That means that your code must be able to work with multiple versions, and that can be a bit of a pain. Storing data in a property bag is not going to help you there. Using assembly redirects to point to the current version of an activity solves that part, and I suppose using a property bag would make life simpler then. That said, I tend to stick with dependency properties and regular serializable classes so far.
I did a series of blog posts about long running workflows and versioning where you run into exactly the same problem. Check here for more details.
