Implement list of objects to be deleted in database - c#

I have a form with a few tabs, each containing a grid control. When the user selects a row to be deleted, I want to remove it from the grid, and if the object exists in the database remove it there too - but not permanently, only if and when the user clicks Save on the form.
For now, if the object doesn't exist in the DB I remove it from the list, and if it does exist in the DB I delete it from the DB and remove it from the list. But if the user clicks the Cancel button, he expects the rows not to be deleted from the database.
I have two possible solutions in mind: 1) remove the object from the list, and if it exists in the DB, add it to a list of objects to be deleted; 2) implement another list, whose getter returns only objects with state != ToBeDeleted (performance?).
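A rough sketch of option 1 (a sketch only - `Item`, `ItemCollection`, and the `deleteFromDb` callback are illustrative names, not my real framework):

```csharp
using System;
using System.Collections.Generic;

// Sketch of option 1: deletions are queued and only executed on Save.
public class Item
{
    public int? Id { get; set; }          // null => not yet in the database
    public string Name { get; set; }
}

public class ItemCollection
{
    private readonly List<Item> items = new List<Item>();
    private readonly List<Item> toBeDeleted = new List<Item>();

    public IReadOnlyList<Item> Items => items;
    public IReadOnlyList<Item> PendingDeletes => toBeDeleted;

    public void Add(Item item) => items.Add(item);

    public void Remove(Item item)
    {
        items.Remove(item);               // always disappears from the grid
        if (item.Id.HasValue)             // only DB-backed objects are queued
            toBeDeleted.Add(item);
    }

    public void Save(Action<Item> deleteFromDb)
    {
        foreach (var item in toBeDeleted)
            deleteFromDb(item);           // the real DELETE happens only here
        toBeDeleted.Clear();
    }

    public void Cancel() => toBeDeleted.Clear();   // nothing ever hit the DB
}
```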
Note: I'm not using an ORM tool; I'm working with my own ADO.NET-based data access framework.

I think the case you are describing pretty much asks for a Transaction.
ADO.NET handles them easily, provided you are using a reasonable database engine (so: no SQL Server CE, for example :))
See, for example, the TransactionScope class. You construct such an object before interacting with the database, and the changes will be committed if and only if you call Complete(). If you just leave it alone, or if you Dispose() it, the transaction will be cancelled and all changes on the DB will be rolled back, i.e. reverted.
So, in your case, you could open the transaction in the Form's constructor or OnLoad(), call Complete() at Save, and Dispose() on any other window-closing path.
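In outline, the pattern might look like this (a sketch; `EditFormController` and the method names stand in for your own form code - here no connection is enlisted, so the empty transaction commits trivially):

```csharp
using System;
using System.Transactions;

// One ambient transaction per form: Complete() at Save, Dispose() otherwise.
public class EditFormController : IDisposable
{
    private TransactionScope scope;

    public void OnLoad()
    {
        // ADO.NET commands executed from now on enlist automatically.
        scope = new TransactionScope();
    }

    public void OnSave()
    {
        scope.Complete();   // mark the transaction as OK to commit
        scope.Dispose();    // the commit actually happens here
        scope = null;
    }

    public void Dispose()
    {
        // Closing the window any other way: not Complete()d => rollback.
        scope?.Dispose();
        scope = null;
    }
}
```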
While this is the normal way of handling such things in small systems, especially single-user ones, be careful: if your system has to handle many concurrent users, you may not be able to use it this way. The transaction locks rows and tables until it is completed or cancelled, and therefore other users may see large delays.
So, how many users do you have to support, and how often will they try to edit the same things?
-- edit: (10 users)
With that many users, you will want to avoid long-running transactions. Opening a transaction at form-load would be unacceptable - it would lock many users out until the current user closes the window. But using a transaction at Save() to push all the changes in one batch is OK.
Of course, if you can eliminate transactions altogether - great! But that is very hard to do if you also need to preserve data integrity. To eliminate the need for transactions, you almost always have to redesign both the data structure on the DB side and the way you obtain and work with the data. If you are going to redesign both, then I'd really recommend first trying to redesign around some existing data-access framework, as even basic .NET ADO has really nice features for online editing of databases held in SqlClient-compliant engines.
So, assuming you don't want to rewrite/rethink most of your code, you just need to buffer the data and delay all of the actual operations on the database.
You may want to do it in a "simple" form: when you display your form, instead of binding it directly to database-driven data sources, download all required data into BindingList<T>s, DataTables, etc. - whatever container you like - and bind your form to those instead. You probably have something like that set up already. The important thing is that all those data containers must be offline, or at least read-only and delay-loaded.
Next, you've got to intercept all operations that the user performs in the UI. Surely you have that done already, since the application works :) As your forms are bound to those offline cached items, your application should perform the operations on the cached data and not touch the database at all. But there's more: along with performing them on cached data, you should record what happens to which table.
Then, when the user finally stops playing around and presses CANCEL :) - you just throw everything away and close the form. The database is unchanged.
On Save, you open a fresh transaction, iterate over the list of changes, effectively replay your recorded changes against the database, and then commit the transaction.
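A skeleton of such a recorder might look like this (the `ChangeRecord` shape and the `applyToDatabase` callback are illustrative; real code would run `Replay` inside the Save transaction):

```csharp
using System;
using System.Collections.Generic;

public enum ChangeKind { Insert, Update, Delete }

// One recorded UI operation: which table, which key, which values.
public class ChangeRecord
{
    public ChangeKind Kind;
    public string Table;
    public int Key;
    public Dictionary<string, object> NewValues;
}

public class ChangeRecorder
{
    private readonly List<ChangeRecord> log = new List<ChangeRecord>();

    public void Record(ChangeRecord change) => log.Add(change);

    public void Cancel() => log.Clear();   // CANCEL: trash everything

    // On Save: replay in original order (inside one transaction in real code).
    public void Replay(Action<ChangeRecord> applyToDatabase)
    {
        foreach (var change in log)
            applyToDatabase(change);
        log.Clear();
    }
}
```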
Please note two things, though: the database could have changed between the time the user cached the data and the time he pressed Save. You have to detect this and abort, or resolve the conflicts. You should do that inside the transaction, either during or before executing the recorded changes. You may detect it by simply comparing the online data with the offline cached data (the unchanged original values, not those modified by the user), or you may use some other mechanism, like optimistic locking, and just compare version tags on the rows.
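Detecting such a collision by comparing cached originals against the online data can be sketched like this (an in-memory illustration; with raw SQL you would instead put the original values into the UPDATE's WHERE clause):

```csharp
using System;
using System.Collections.Generic;

public static class ConflictDetector
{
    // Applies the user's changes only if the online row still matches the
    // cached original, i.e. nobody else changed it since we loaded it.
    public static bool TryApplyUpdate(
        IDictionary<string, object> onlineRow,
        IDictionary<string, object> cachedOriginal,
        IDictionary<string, object> userModified)
    {
        foreach (var kv in cachedOriginal)
            if (!Equals(onlineRow[kv.Key], kv.Value))
                return false;              // collision: abort or merge

        foreach (var kv in userModified)
            onlineRow[kv.Key] = kv.Value;  // safe to apply
        return true;
    }
}
```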
If you don't like record-replay, you may implement a "diffing" utility that takes the modified offline data and compares it, in a generic way, with the current online tables. This is somewhat harder, but has a bonus: with such a utility you can initially double-cache the data: one copy for offline reference (just stored and never touched by the user) and one copy for offline editing (the one bound to the forms). Upon Save, you open a transaction and diff the reference data against the online database. If there are any differences, you've just detected a collision - solve/merge/abort/etc. If there are no differences, you diff the modified data against the online data, apply all differences found to the database, and commit the transaction.
Each of those methods has its pros and cons: aside from implementation difficulty, there are memory issues with caching, latency issues if you dare to copy overly large tables, etc.
But once solved, it will work pretty nicely.
And when you finish, you can go and boast that you have just implemented a smaller sister of DataSet+DataTable. I'm not joking, and I'm not laughing at you. I'm just trying to show you why everyone is telling you to revise your DAO layer and try understanding and using the hard work that has already been done for you by the platform designers/developers :)
Anyway, I said you can avoid the clashes and transactions entirely if you rethink your data structure. For example: why do you DELETE the rows at all? I know there's a nifty DELETE statement in SQL, but do you really need to delete that row? Can't you just add a 'bool IsDeleted' column, and when the user deletes a row from the grid, set that cell to true and have the application filter out any IsDeleted=true rows - not show them, and not include them in views and aggregations? Bonus: sys/DB admins now have a magic tool: undelete.
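The soft-delete idea in miniature (class and column names are made up):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Row
{
    public int Id;
    public string Name;
    public bool IsDeleted;    // the soft-delete flag
}

public static class SoftDelete
{
    // "Deleting" just flips the flag; nothing ever leaves the table.
    public static void Delete(Row row) => row.IsDeleted = true;

    // The admins' magic tool.
    public static void Undelete(Row row) => row.IsDeleted = false;

    // Every view/aggregation filters the flag out.
    public static IEnumerable<Row> Visible(IEnumerable<Row> rows) =>
        rows.Where(r => !r.IsDeleted);
}
```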
Let's take it further: do you need to UPDATE the rows? Maybe you can just APPEND a record saying that from (this date) the row should have a new price? Of course, the structure must be greatly altered: entities no longer have properties, but logs of timestamped property changes (or the rows must carry version numbers and be duplicated), queries must run against only the newest version of the data, etc. Pros: the database is now append-only, and transactions, if needed at all, are hyper-short. Cons: SELECT queries get complicated and may be slow, especially when joining many tables.
Pro/con: your DB actually starts looking very meta- instead of data-base...
Con: it is a really hard task to "upgrade" an existing application to such a DB structure. Writing a new app from scratch and importing the data from the old system may be several times faster.
Now, to summarise:
I do not recommend any of the ways described.
First, I recommend you take an ORM framework like NHibernate, Entity Framework, or XPO from DevExpress, or whatever else. Any of them will save you lots of time. The three I list here even have optimistic-locking collision detection built in. Why use a self-written SQL framework when such tools exist?
If not, then I next recommend using the tools already found in the framework: you use SqlClient, so why not use DataSet and DataTable? They ship alongside SqlClient and have many useful mechanisms built in, which you would otherwise spend weeks implementing and testing all by yourself. Learn to use DataSets, their collision detection, and their merging algorithms, and use them. You will lose a bit of time experimenting and learning, but you will save a huge amount of time by not reinventing the wheel.
If you really want to do it manually, start with data caching and record-replay. It is easy to comprehend, quite easy to introduce anywhere you currently use plain SQL queries, and it will quickly introduce you to all kinds of cache-syncing and version-checking problems. You will soon learn in detail why all those strange mechanisms in the above-mentioned frameworks were implemented, how they work, and what pros and cons they have.
As for the doubly-cached diffing approach: it will be more tempting to write than record-replay, but please, use it only if you know very well how to detect/solve/merge collisions. Have at least one record-replay approach implemented before you try it.
...and of course you may use long-lasting transactions. Dumb-easy to introduce, and they "just irritate" the users... or even make the system unusable when >90% of the users constantly collide and hit the locks, heh. No, that was a joke. Don't use long-lasting transactions. They are OK for 1-4 users, or for very sparse databases.

Related

How should I handle a potentially large number of edits in entity framework?

I'm using .NET 4.5.1 with EF 6.0.2 and db-first.
The use case is something like this:
1. Roughly 50k entities are loaded.
2. A set of these entities is displayed to the user; others are required for displaying the items correctly.
3. The user may perform heavy actions on the entities, meaning the user chooses to perform one action which cascades to affect potentially hundreds of entities.
4. The changes are saved back to the database.
The question, then, is what is the best way to handle this? So far I've come up with 2 different solutions, but don't really like either:
Create a DbContext at step 1 and keep it around during the whole process, then finally save the changes. The reason I don't necessarily like this is that the process might take hours, and as far as I know, DbContexts should not be kept alive that long.
Create a DbContext at step 1 and discard it right after. At step 4, create a new DbContext, attach the modified entities to it, and save the changes. The big problem I see with this approach is: how do I figure out which entities have actually been changed? Do I need to build a change tracker of my own to do this?
So is there a better alternative for handling this, or should I use one of the solutions above (perhaps with some changes)?
I would go with option number 1 - use a DbContext for the entire process.
The problem I have is with the assertion that the process might take hours. I don't think this is something you want. Imagine what happens when your user has been editing the data for 3 hours and then faces a power blackout before clicking the final Save. You'll have users running after you with pitchforks.
You're also facing a lot of concurrency issues - what if two users perform the same lengthy process at once? Handling collisions after a few hours of work is going to be a problem, especially if you have to tell users that changes they made hours ago can't be saved. Pitchforks again.
So I think you should go with option number 3 - save incremental changes during the editing process, so the user's work isn't lost if something bad happens, and so that you can handle collisions if two users are updating the data at the same time.
You would probably want to keep the incremental changes in a separate place, not in your main tables, because the business change hasn't been finalized yet.
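A sketch of what "incremental changes in a separate place" could look like (all names invented; a real implementation would persist the journal, e.g. to a staging table, rather than keep it in memory):

```csharp
using System;
using System.Collections.Generic;

// A pending, not-yet-finalized edit, kept apart from the main tables.
public class PendingEdit
{
    public int EntityId;
    public string Property;
    public object NewValue;
    public DateTime RecordedAt;
}

public class EditJournal
{
    private readonly List<PendingEdit> journal = new List<PendingEdit>();

    // Called frequently while the user works, so little is lost on a crash
    // (provided the journal itself is persisted somewhere durable).
    public void Append(PendingEdit edit) => journal.Add(edit);

    public int PendingCount => journal.Count;

    // The final business Save: fold everything into the main tables.
    public void Commit(Action<PendingEdit> applyToMainTables)
    {
        foreach (var edit in journal)
            applyToMainTables(edit);
        journal.Clear();
    }
}
```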
and as far as I know, DbContexts should not be preserved for this long.
Huh?
There is nothing about a DbContext that says you can't preserve it. You may get problems with other people having already edited the item, but that is an inherent architectural problem - generally, neither optimistic nor pessimistic locking is advisable in a "multi-hour edit marathon".
The only sensible approach if you have editing spread over hours is to use your own change tracker and apply proper logic when changes collide - and/or use a logical locking mechanism (a flag in the database).
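A logical lock is just a claim recorded in the data. An in-memory sketch of the idea (the SQL equivalent would be an atomic `UPDATE ... SET LockedBy = @user WHERE Id = @id AND LockedBy IS NULL`; the table and column names are invented):

```csharp
using System;
using System.Collections.Generic;

// Logical (application-level) locking: a "LockedBy" flag per record.
public class LogicalLockTable
{
    private readonly Dictionary<int, string> lockedBy =
        new Dictionary<int, string>();

    // In the DB this is one atomic UPDATE guarded by "LockedBy IS NULL".
    public bool TryAcquire(int recordId, string user)
    {
        if (lockedBy.ContainsKey(recordId))
            return false;           // someone else holds the edit lock
        lockedBy[recordId] = user;
        return true;
    }

    public void Release(int recordId, string user)
    {
        if (lockedBy.TryGetValue(recordId, out var holder) && holder == user)
            lockedBy.Remove(recordId);
    }
}
```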

Entity Framework POCO long-term change tracking

I'm using .NET entity framework 4.1 with code-first approach to effectively solve the following problem, here simplified.
There's a database table with tens of thousands of entries.
Several users of my program need to be able to
View the (entire) table in a grid, which implies that the entire table has to be downloaded.
Modify values of any random row; changes are frequent but need not be persisted immediately. It's expected that different users will modify different rows, but this is not always true. Some loss of changes is permitted, as users will most likely update the same rows to the same values.
On occasion, add new rows.
Sounds simple enough. My initial approach was to use a long-running DbContext instance. This one DbContext was supposed to track changes to the entities, so that when SaveChanges() is called, most of the legwork is done automatically. However, many have pointed out that this is not an optimal solution in the long run, notably here. I'm still not sure I understand the reasons, and I don't see what a unit of work is in my scenario either. The user chooses herself when to persist changes, and let's say the client always wins, for simplicity. It's also important to note that objects that have not been touched don't overwrite any data in the database.
Another approach would be to track changes manually or use objects that track changes for me, however I'm not too familiar with such techniques, and I would welcome a nudge in the right direction.
What's the correct way to solve this problem?
I understand that this question is a bit wishy-washy, but think of it as more fundamental. I lack fundamental understanding about how to solve this class of problems. It seems to me that long living DbContext is the right way, but knowledgeable people tell me otherwise, which leads me to confusion and imprecise questions.
EDIT1
Another point of confusion is the existence of the Local property on the DbSet<> object. It invites me to use a long-running context, as another user has posted here.
The problem with a long-running context is that it doesn't refresh data - I discussed the problems in more detail here. So if your user opens the list and modifies data for half an hour, she doesn't know about other users' changes. But in the case of WPF, if your business action is:
Open the list
Do as many actions as you want
Trigger saving changes
Then this whole thing is a unit of work, and you can use a single context instance for it. If you have a last-edit-wins scenario, you should not have problems with this unless somebody else deletes the record the current user is editing. Additionally, after saving or cancelling changes you should dispose of the current context and load the data again - this ensures that you really have fresh data for the next unit of work.
The context offers some features to refresh data, but it only refreshes data previously loaded (without relations), so, for example, new unsaved records will still be included.
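The dispose-and-reload rhythm can be sketched like this (a `FakeContext` stands in for a real `DbContext` so the sketch runs standalone; only the lifecycle matters here):

```csharp
using System;

// Stand-in for a DbContext: the point is only the lifecycle below.
public class FakeContext : IDisposable
{
    public bool Loaded { get; private set; }
    public bool Disposed { get; private set; }

    public void LoadFreshData() => Loaded = true;   // SELECTs would go here
    public void SaveChanges() { /* UPDATEs/INSERTs would go here */ }
    public void Dispose() => Disposed = true;
}

public static class UnitOfWorkRunner
{
    // One context per unit of work: load fresh, edit, save, dispose.
    public static FakeContext RunUnitOfWork(Action<FakeContext> edit)
    {
        var context = new FakeContext();
        try
        {
            context.LoadFreshData();
            edit(context);
            context.SaveChanges();
        }
        finally
        {
            context.Dispose();   // never reuse it for the next unit of work
        }
        return context;
    }
}
```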
Perhaps you can also read about MS Sync framework and local data cache.
It sounds to me like your users could have a (cached) copy of the data for an indefinite period of time. The longer they work with cached data, the greater the odds that they become disconnected from the database connection in the DbContext. My guess is EF doesn't handle this well, and you probably want to deal with it (e.g., an occasionally-connected architecture). I would expect implementing that to solve many of your issues.

Auditing record changes in sql server databases

Using only Microsoft-based technologies (MS SQL Server, C#, EAB, etc.), if you needed to keep track of changes made to a record in a database, which strategy would you use? Triggers, AOP on the DAL, something else? And how would you display the collected data? Is there a pattern for this? Is there a tool or framework that helps implement this kind of solution?
The problem with Change Data Capture is that it isn't flexible enough for real auditing. You can't add the columns you need. It also dumps the records every three days by default (you can change this, but I don't think you can store them forever), so you have to have a job dumping the records into a real audit table if you need to keep the data for a long time - which is typical of the need to audit records (we never purge our audit records).
I prefer the trigger approach. You have to be careful when you write the triggers to ensure that they will capture the data if multiple records are changed. We have two tables for each table audited, one to store the datetime and id of the user or process that took the action and one to store the old and new data. Since we do a lot of multiple record processes this is critical for us. If someone reports one bad record, we want to be able to see if it was a process that made the change and if so, what other records might have been affected as well.
At the time you create the audit process, create the scripts to restore a set of audited data to the old values. It's a lot easier to do this when under the gun to fix things, if you already have this set up.
SQL Server 2008 R2 has this built in - look up Change Data Capture in Books Online.
This is probably not a popular opinion, but I'm going to throw it out there anyhow.
I prefer stored procedures for all database writes. If auditing is required, it's right there in the stored procedure. There's no magic happening outside the code, everything that happens is documented right at the point where writes occur.
If, in the future, a table needs to change, one has to go to the stored procedure to make the change. The need to update the audit is documented right there. And because we used a stored procedure, it's simpler to "version" both the table and its audit table.

Nhibernate, Domain Model, Changes, Disconnected, Cloned (Need better title - but can't express it clearly!)

Sorry about the title - hopefully the question will make clear what I want to know, then maybe someone can suggest a better title and I'll edit it!
We have a domain model. Part of this model is a collection of "Assets" that the user currently has. The user can then create "Actions" that are possible future changes to the state of these "Assets". At present, these actions have an "Apply" method and a reference to their associated "Asset". This "Apply" method makes a modification to the "Asset" and returns it.
At various points in the code, we need to pull back a list of assets with any future-dated actions applied. However, we often need to do this within the scope of an NHibernate transaction, and therefore when the transaction is committed the changes to the "Asset" will be saved as well - but we don't want them to be.
We've been through various ways of doing this:
Cloning a version of the "Asset" (so that it is disconnected from Nhibernate) and then applying the "Action" and returning this cloned copy.
Actually using Nhibernate to disconnect the object before returning it
Obviously these each have various (massive!) downsides.
Any ideas? Let me know if this question requires further explanation and what on earth I should change the title to!
It's been a while since I had any NHibernate fun, but could you retrieve the Assets using a second NHibernate session? Changes made to the Asset would then not be saved when the transaction on the first session commits.
You could manage this with NHibernate using ISession.Evict(obj) or similar techniques, but honestly it sounds like you're missing a domain concept. I would model this as:
var asset = FetchAsset();
var proposedAsset = asset.ApplyActionsToClone();
The proposedAsset would be a clone of the original asset with the actions applied to it. This cloned object would be disconnected from NHibernate and therefore not persisted when the Unit of Work commits. If applying the actions is expensive, you could even do the following:
asset.ApplyProposedChanges(proposedAsset);
I have been working around a similar problem where performance was also an issue, so it was not possible to re-load the aggregate using a secondary (perhaps stateless) session. And because the entities that needed to be changed "temporarily" were very complex, I could not easily clone them.
What I ended up with was "manually" rolling back the changes to what would be the Assets in your case. It turned out to work well. We stored each action applied to the entity as a list of events (in memory, that is). Afterwards the events could be re-read, and each change could be rolled back by a counter-action.
If only a small variety of actions can be applied, I'd say it's easily manageable to create a counter-action for each; otherwise it might be possible to create a more generic mechanism.
We had only four actions, so we went with the manual version.
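The counter-action idea, reduced to essentials (the `Asset`/`ActionLog` shapes are illustrative, not the poster's actual model):

```csharp
using System;
using System.Collections.Generic;

public class Asset { public decimal Price; }

// Each applied action remembers how to undo itself.
public class AppliedAction
{
    public Action Apply;
    public Action CounterAction;
}

public class ActionLog
{
    private readonly List<AppliedAction> applied = new List<AppliedAction>();

    public void Apply(AppliedAction action)
    {
        action.Apply();
        applied.Add(action);
    }

    // Roll back in reverse order, like unwinding a stack.
    public void RollBack()
    {
        for (int i = applied.Count - 1; i >= 0; i--)
            applied[i].CounterAction();
        applied.Clear();
    }
}
```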
Sounds like you want to use a Unit of Work wrapper, so you can commit or revert changes as needed.

Sometimes Connected CRUD application DAL

I am working on a sometimes-connected CRUD application that will be used primarily by teams (2-4) of social workers and nurses to track patient information in the form of a plan. The application is a re-visualization of an ASP.NET app that was created before my time. There are approx. 200 tables across 4 databases. The web app version relied heavily on stored procedures, but since this version is a WinForms app that will point to a local DB, I see no reason to continue with them. Also of note: I had planned to use merge replication to handle the syncing portion, and there seem to be some issues with those two together.
I am trying to decide what approach to use for the DAL. I originally planned to use LINQ to SQL, but I have read tidbits stating that it doesn't work well in a sometimes-connected setting. I have therefore been reading about and experimenting with numerous solutions: SubSonic, NHibernate, Entity Framework. This is a relatively simple application, and due to a "looming" version 3 redesign this effort can be borderline throwaway. The emphasis here is on getting a desktop version up and running ASAP.
What I am asking is for anyone with experience using any of these technologies (or one I didn't list) to lend me your hard-earned wisdom. What is, in your opinion, my best approach? Any other insights on creating this kind of app? I am really struggling with the DAL portion of this program.
Thank you!
If the stored procedures do what you want them to, I would have to say I'm dubious that you will get benefits by throwing them away and reimplementing them. Moreover, it shouldn't matter if you use stored procedures or LINQ to SQL style data access when it comes time to replicate your data back to the master database, so worrying about which DAL you use seems to be a red herring.
The tricky part about sometimes connected applications is coming up with a good conflict resolution system. My suggestions:
Always use RowGuids as your primary keys to tables. Merge replication works best if you always have new records uniquely keyed.
Realize that merge replication can only do so much: it is great for bringing new data in disparate systems together. It can even figure out one sided updates. It can't magically determine that your new record and my new record are actually the same nor can it really deal with changes on both sides without human intervention or priority rules.
Because of this, you will need "matching" rules to resolve records that are claiming to be new, but actually aren't. Note that this is a fuzzy step: rarely can you rely on a unique key to actually be entered exactly the same on both sides and without error. This means giving weighted matches where many of your indicators are the same or similar.
The user interface for resolving conflicts and matching up "new" records with the original needs to be easy to operate. I use something that looks similar to the classic three way merge that many source control systems use: Record A, Record B, Merged Record. They can default the Merged Record to A or B by clicking a header button, and can select each field by clicking against them as well. Finally, Merged Records fields are open for edit, because sometimes you need to take parts of the address (say) from A and B.
None of this should affect your data access layer in the slightest: this is all either lower level (merge replication, provided by the database itself) or higher level (conflict resolution, provided by your business rules for resolution) than your DAL.
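A weighted match score of the kind described might look like this (the fields, weights, and threshold are arbitrary examples, not a recommendation):

```csharp
using System;

public class PatientRecord
{
    public string LastName;
    public string FirstName;
    public string Postcode;
}

public static class RecordMatcher
{
    // Weighted similarity: exact key equality is unreliable, so several
    // weaker indicators are combined. Weights and threshold are examples.
    public static double MatchScore(PatientRecord a, PatientRecord b)
    {
        double score = 0;
        if (string.Equals(a.LastName, b.LastName, StringComparison.OrdinalIgnoreCase))
            score += 0.5;
        if (string.Equals(a.FirstName, b.FirstName, StringComparison.OrdinalIgnoreCase))
            score += 0.3;
        if (string.Equals(a.Postcode, b.Postcode, StringComparison.OrdinalIgnoreCase))
            score += 0.2;
        return score;   // e.g. treat >= 0.7 as "probably the same record"
    }
}
```

Real matching would use fuzzy string comparison per field rather than case-insensitive equality, but the weighting structure is the same.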
If you can install a DB system locally, go with something you feel familiar with. The greatest problem, I think, will be the syncing and merging part. You must think through several possibilities - for example: someone changed something that someone else deleted on the server. Who decides?
I've never used the Sync Framework myself, just read an article, but it may give you a solid foundation to build on. Whichever way you go with data access, the solution for the business logic will probably have a much wider impact...
There is a sample app called IssueVision that Microsoft put out back in 2004.
http://windowsclient.net/downloads/folders/starterkits/entry1268.aspx
Found the link in an old thread on joelonsoftware.com: http://discuss.joelonsoftware.com/default.asp?joel.3.25830.10
Other ideas...
What about mobile broadband? A couple of 3G cellular cards would work tomorrow, and your app would need no changes, apart from large pages/graphics.
Or an Excel spreadsheet used in the field, with DTS or SSIS to import the data into the application, while a "better" solution is created.
Good luck!
If by SPs you mean stored procedures... I'm not sure I understand your reasoning for trying to move away from them, considering that they're fast, proven, and already written for you (i.e. tested).
Surely, if you're making an app that will mimic the original, there are definite merits to keeping as much of the original (working) codebase as possible - not the least of which is speed.
I'd try installing a local copy of the db, and then pushing all affected records since the last connected period to the master db when it does get connected.
