Best approach to incrementally update application data (C#)

I have been working on an application for a couple of years that I update using a back-end database. The key point is that everything is cached on the client, so it never requires a network connection to operate, but when it does have a connection it always picks up the latest updates. Every application update ships with the latest version of the database, and I want the client to download only the minimum amount of data when the database has been updated.
I currently use a table with a timestamp to check for updates. It looks something like this.
ID  Name       Description  Severity  LastUpdated
0   test.exe   KnownVirus   Critical  2009-09-11 13:38
1   test2.exe  Firewall     None      2009-09-12 14:38
This approach was fine for what I previously needed, but I am looking to extend more of the application's functionality to use this type of dynamic approach. All the data is currently stored as XML, but I do not want to store complete XML files in the database; I want to transmit only the changed data.
So how would you go about allowing a fairly simple approach to storing dynamic content (text/xml/json/xaml) in a database, and have the client download only new updates? I was thinking of having logic that can handle XML inserted directly:
ID - Data - Revision
15 - XXX - 15
XXX would be something like <Content><File>Test.dll</File><Description>New DLL to load.</Description></Content> and would be inserted into the cache, but this would obviously be complicated, as I would need to load the revisions in sequence.
Another approach that has been mentioned is to base it on something similar to source control: store the version in the root of the file and calculate the delta to figure out the minimal amount of data that needs to be sent to the client.
Does anyone have suggestions on how to approach this with no risk of data corruption? I would also like to add features that allow me to revert possibly bad revisions and replace them with new working ones.

It really depends on the tools you are using and the architecture you already have. Is there already a server with some logic and a data access layer?
Dynamic approaches might get complicated, slow and limit the number of solutions. Why do you need a dynamic structure? Would it be feasible to just add data by using a name-value pair approach in a relational database? Static and uniform data structures are much easier to handle.
Before going into detail, you should consider the different scenarios.
Items can be added
Items can be changed
Items can be removed (I assume)
Adding is not a big problem. The client needs to remember the last revision number it got from the server, and you write a query that gets everything added since then.
Changing is basically the same. You need to take care with the identification of the items: you need an unchangeable surrogate key, which the ID you already have seems to be. (GUIDs may be useful here.)
Removing is tricky. You need to either flag items as deleted instead of actually removing them, or keep a list of removed IDs together with the revision number at which they were removed.
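The three cases above can be sketched as a single "what changed since revision N" query. This is only an illustration in Python with the standard sqlite3 module, not the poster's code; the table name items, its columns, and the deleted flag are assumptions made for the example.

```python
import sqlite3

def changes_since(conn, last_rev):
    """Return (upserts, deleted_ids, server_rev) for everything newer than last_rev."""
    # Added and changed rows look the same to the client: both are upserts.
    upserts = conn.execute(
        "SELECT id, data, revision FROM items "
        "WHERE revision > ? AND deleted = 0 ORDER BY revision",
        (last_rev,),
    ).fetchall()
    # Removed rows are kept as flagged tombstones so the client can learn about them.
    deleted_ids = [row[0] for row in conn.execute(
        "SELECT id FROM items WHERE revision > ? AND deleted = 1 ORDER BY revision",
        (last_rev,),
    )]
    server_rev = conn.execute(
        "SELECT COALESCE(MAX(revision), 0) FROM items"
    ).fetchone()[0]
    return upserts, deleted_ids, server_rev

def demo():
    # Hypothetical server table: 'revision' holds the revision of the last change.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, data TEXT, "
        "revision INTEGER NOT NULL, deleted INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany("INSERT INTO items VALUES (?, ?, ?, ?)", [
        (1, "a", 1, 0),
        (2, "b", 2, 0),
        (3, "c", 5, 1),  # soft-deleted in revision 5
        (4, "d", 4, 0),
    ])
    return changes_since(conn, 2)  # client last synced at revision 2
```

A client that last synced at revision 2 would receive item 4 as an upsert, item 3 as a deletion, and 5 as the new revision to remember.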
Storing the data in the client: Consider using a relational database like SQLite in the client. (It needs no installation; it simply stores everything in a file. Firefox, for instance, stores quite a lot in SQLite databases.) If you use the same database on the server, you can probably reuse some code. It is also transaction based, which helps to keep the data consistent (rollback in case of an error during synchronization).
XML - if you really need it - can be stored just as a string in the database.
When using an abstraction layer or ORM that supports SQLite (e.g. NHibernate), you may also reuse some code even when the server uses a different database. Note that the learning curve for such an ORM might be rather steep. If you have no experience with one, it could be too much.
You don't need to force reuse of code in the client and server.
Synchronization itself shouldn't be very complicated. You have a revision number in the client and a last revision in the server. The client gets all new, changed and deleted items since its revision and applies them to the local store. Update the local revision number. Commit. Done.
I would never update only a part of a revision, because then you can't really know what changed since the last synchronization. Because you do differential updates, it is essential to have a well defined state of the client.
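A minimal sketch of the "apply, update revision, commit" step, assuming a local SQLite cache. It is written in Python with the standard sqlite3 module purely for illustration; the table names cache and sync_state are hypothetical. The point is that the whole change set and the new revision number are written in one transaction, so a failure leaves the client at its previous, well-defined revision.

```python
import sqlite3

def apply_changes(conn, upserts, deleted_ids, server_rev):
    """Apply one complete change set atomically, or nothing at all."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.executemany(
            "INSERT OR REPLACE INTO cache (id, data) VALUES (?, ?)", upserts
        )
        conn.executemany(
            "DELETE FROM cache WHERE id = ?", [(i,) for i in deleted_ids]
        )
        # The revision number moves only together with the data it describes.
        conn.execute("UPDATE sync_state SET revision = ?", (server_rev,))

def local_revision(conn):
    return conn.execute("SELECT revision FROM sync_state").fetchone()[0]

def make_client():
    # Hypothetical client store with two cached rows at revision 0.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cache (id INTEGER PRIMARY KEY, data TEXT NOT NULL)")
    conn.execute("CREATE TABLE sync_state (revision INTEGER NOT NULL)")
    conn.execute("INSERT INTO sync_state VALUES (0)")
    conn.executemany("INSERT INTO cache VALUES (?, ?)", [(1, "old"), (3, "stale")])
    conn.commit()
    return conn
```

If any row in the change set is rejected (for example by a constraint), the exception propagates, the transaction is rolled back, and local_revision still returns the old value, so the next sync simply asks for the same range again.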

I would go with a solution using Sync Framework.
Quote from Microsoft:
Microsoft Sync Framework is a comprehensive synchronization platform enabling collaboration and offline for applications, services and devices. Developers can build synchronization ecosystems that integrate any application, any data from any store using any protocol over any network. Sync Framework features technologies and tools that enable roaming, sharing, and taking data offline.
A key aspect of Sync Framework is the ability to create custom providers. Providers enable any data sources to participate in the Sync Framework synchronization process, allowing peer-to-peer synchronization to occur.

I have just built an application pretty much exactly as you described. I built it on top of the Microsoft Sync Framework that DjSol mentioned.
I use a C# front end application with a SqlCe database, and a SQL 2005 Server at the other end.
The following articles were extremely useful for me:
Tutorial: Synchronizing SQL Server and SQL Server Compact
Walkthrough: Creating a Sync service
Step by step N-tier configuration of Sync services for ADO.NET 2.0
How to Sync schema changed database using sync framework?

You don't say what your back-end database is, but if it's SQL Server you can use SqlCE (SQL Server Compact Edition) as the client DB and then use RDA or merge replication to update the client DB as desired. This will handle all your requirements for sure; there is no need to reinvent the wheel for such a common requirement.

Related

.net windows application store data offline and store to db when there is network

I am developing a Windows application for agricultural purposes. It will be used by multiple users to maintain data. The main issue is that there won't be network connectivity at the work location. However, at the end of the day the users can go somewhere with a connection and synchronize, if there is an option to do so.
I want to know how we can import and store all the data locally, and update the central database when there is a network connection.
The option I thought of is to have SQL on every machine that runs this application and store the data in the local database when there is no network,
with a separate button to export the local data to the centralized database when there is a network connection.
This looks complicated. Is there a better and easier option?
I prefer using C# and Visual Studio.
Thanks.

You can use SQLite for storing data locally. It's fast, lightweight, and public domain.
You can use whatever the database of choice for the centralized server.

Well, this is quite a broad question, as it has many options and scenarios. The questions you should ask yourself are:
Does each user handle only their own new information, or also information entered by other users since the previous sync?
Do you have to handle update conflicts?
Do you handle text information only, or do you have complex types and binary files?
As for the solution, the easiest way, from my point of view, would be to use SQLite on the portable devices. It is a lightweight embedded SQL database that will let you handle the information easily. On the server you can use whatever you want: SQL Server, MySQL or any other SQL flavor you like. Just make sure there is a connector for your portable device's OS.
If you are still thinking of using SQL Server on the portable device (it's a battery hog!), you might want to check out Microsoft Sync Framework, as it covers almost all scenarios for syncing data, managing conflicts, etc.

Thanks for the answers. Please find the below solution that we implemented.
1) Installed SQL express on all the local machines
2) Used Microsoft Sync framework to sync the data. The sync is configured on demand.
Issues faced:
1) We were using the geometry datatype on a few tables, and this was not supported by Sync Framework.
2) Changes to the database schema are not reflected on the client machine. We have to delete all the system-generated procedures used to track table changes and regenerate them. I am sure there is a much better way to do this.
Cheers,
Jebli

Sync Framework v.2.1 and Change Tracking on SQL Express for N-Tier with WCF proxy

I have the pleasure of using SyncFx v.2.1 on an application. The client side presently uses SQLCE and the server side uses SQL Server 2008 R2. I am using a SyncFx proxy and host the server SyncAdapterBuilder code in the WCF service. The client has the SyncAgent and SyncTables and it works fine. I am using the integrated SQL Change Tracking in lieu of the coupled (aka custom / scoped) change tracking because I am not permitted to modify the existing schema.
So my issue is that the requirements for the system have changed and I am required to use SQL Express on the client in order to support stored procedures.
Why not merge replication? The requirements also prohibit modification of the schema or the use of triggers. In fact the original version of the app used merge replication with SQLCE before moving to SyncFx for SQLCE.
So how is this done? I've read a lot of conflicting information, and I can only assume that this is due to the ever-evolving versions of SyncFx. There is no direct example of how SQL Server to SQL Express sync with Change Tracking on both sides is accomplished. Plus, I am trying to transition from a functional SQLCE implementation to Express with as few changes as possible. The client is already capable of using either type of DB; it is just the current sync process that needs to change.
Here is what I've found, but have not had success. I've read every StackOverflow response on the matter and am still not finding a way to do this that actually works.
Database Sync: SQL Server and SQL Express N-Tier with WCF : This MS example works fine with the SyncOrchestrator but provisions side tracking tables and triggers. I was not able to modify it in such a way that change tracking could be used on the client and server.
Sync framework with SQL Server 2008 Change Tracking : StephaneT suggests here that simply by using the normal SQLCE approach with the SQL Express sample sync provider and SyncFx 2.0 techniques only client side table modification would be required. Unfortunately all links to this sample SQL Express provider seem to be removed and other posts from JuneT and even Liam Cavanagh on MSDN suggest moving forward with the new official SqlServerProvider instead of a customized version of the DbServerProvider. Problem is there are no sample implementations of this anywhere and I haven't been able to figure it out through trial and error.
Syncing SQL Server 2008 Databases over HTTP using WCF & Sync Framework : Raj gives the best example (simple and easily translated to SQLCE processes); unfortunately, it also uses the SqlExpressClientSyncProvider that seems to have evaporated from the internet. It also requires an anchor table to track the clients. I think I can get away with that, as it is only "existing" tables whose schema I am not allowed to modify.
So, are there any examples out there that can help me? Essentially I want to port the existing, functioning SQLCE SyncFx setup (via proxy, with integrated SQL Change Tracking, using a SyncAgent) to a version that works for SQL Express without changing the existing schema or using triggers. I should also mention that I use filter parameters heavily, as there are 150+ tables in the replication and they would be extremely large without filters. I had read some references that said the SqlExpressClientSyncProvider didn't support filters, but this is impossible for me to verify since I can't find a reference to that code that is still good.
Maybe there is a refresh of Raj's example that uses SqlServerSyncProvider.
Thanks in advance to anyone that can point me in the right direction!

Check out this link; you might still find some of the download links in the comments area working: http://www.8bit.rs/blog/2009/05/debugging-sql-express-client-sync-provider/
Take note that even the sample SqlExpressClientSyncProvider uses triggers to track deletes in the tombstone tables. Likewise, you need columns in your tables to track when a row has been inserted or updated (datetime or timestamp columns).
With regard to filtering, you can easily modify the queries in the adapter to include a filter clause.
The newer SqlSyncProvider does not support SQL Change Tracking, as it implements its own tracking mechanism. The newer providers work in a peer-to-peer scenario, so they also track which replica a particular change came from.

How to sync two databases for disconnected systems from different companies

Are there standard messaging protocols or APIs available to keep databases in sync? Or, alternatively, APIs for creating and parsing messages?
Our company is working with another company to provide two different software packages to two different kinds of users. The data sits in two separate databases but parts of it have to remain in sync.
Their system is pretty much a black box to us. And vice versa.
So what would be required is to track updates, turn them into messages, send them to a web service, map them back to the destination database fields, and commit them.
The database schemas do not match.
I am aware that we are going to have to roll most of this ourselves, but some ideas around messaging or techniques would be good.

One solution: SQL Server Integration Services (SSIS), available since SQL Server 2005. This is exactly what you need. In SQL Server 2000 it was called DTS, for Data Transformation Services. It was created to import, export and transform data from one point to another. It is really easy to use from SQL Server 2005 onwards (DTS is quite horrible).
So basically, you will have to write packages that import data from their database and transform, filter, etc. it exactly as you need in order to insert it into your database. And vice versa.
Regarding the black box issue, you should generate the relational design of the database to make this easier.
EDIT
In case you need to install it: I remember bugs in the SQL Server 2005 installer where SSIS was not installed at all. I had to satisfy all warnings in the installer's system requirements step to get it.

You have two problems:
track the changes that have to be synced
apply the changes to the peer
There is a solution that addresses both issues, and I'm sure you are aware of it: replication. Merge Replication would allow both sites to update the data and would also provide merge conflict resolution. But replication only works when the table schemas are similar, and it puts a big constraint on development, as schema changes have to be carefully coordinated between the sites. In practice, when the sites are operated by independent companies, it is quite difficult to maintain over the long term.
If you want to roll your own the change tracking part has built in support in SQL Server:
Change Tracking
Change Data Capture
Both can be used in a sync solution as a means to detect what changed.
Applying the changes can be handled by a web service, but there are also built-in solutions in SQL Server that allow for far higher scalability and throughput: Service Broker. Relying on a message-defined API for sync allows the two sites to evolve at their own pace and change their schemas almost at will, as long as the communication API (the message protocol) remains unchanged.
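To make the idea of a message-defined API concrete, here is a small sketch in Python (json and sqlite3 from the standard library, used only for illustration). Everything in it is an assumption: the message layout, the version field, the customers table and the FIELD_MAP names are invented for the example and are not part of Service Broker or any other product. Each side keeps its own mapping between the shared message fields and its private schema, so either schema can change as long as the message format does not.

```python
import json
import sqlite3

# Hypothetical mapping from shared message field names to this side's columns.
FIELD_MAP = {"customerId": "cust_id", "fullName": "name", "email": "email_addr"}

def build_message(entity, op, fields):
    """Sender side: wrap a tracked change in a versioned, schema-neutral message."""
    return json.dumps({"version": 1, "entity": entity, "op": op, "fields": fields})

def apply_message(conn, raw):
    """Receiver side: validate the message, map its fields, commit in one transaction."""
    msg = json.loads(raw)
    if msg.get("version") != 1:
        raise ValueError("unsupported message version")
    if msg.get("entity") != "customer" or msg.get("op") != "upsert":
        raise ValueError("unsupported message")
    # Fields this side does not know about are ignored rather than rejected.
    row = {FIELD_MAP[k]: v for k, v in msg["fields"].items() if k in FIELD_MAP}
    columns = ", ".join(row)  # column names come only from FIELD_MAP, never from input
    marks = ", ".join("?" for _ in row)
    with conn:
        conn.execute(
            f"INSERT OR REPLACE INTO customers ({columns}) VALUES ({marks})",
            list(row.values()),
        )

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, name TEXT, email_addr TEXT)"
    )
    raw = build_message("customer", "upsert", {
        "customerId": 7, "fullName": "Ada", "email": "ada@example.com",
        "extra": "ignored",
    })
    apply_message(conn, raw)
    return conn.execute("SELECT cust_id, name, email_addr FROM customers").fetchall()
```

Note that INSERT OR REPLACE rewrites the whole row, so this sketch assumes every message carries all mapped fields; a real protocol would also need to define deletes, ordering and retries.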

The answers provided give me some good ideas, but I think we are going to end up doing something a bit different.
We are using MSMQ, and defining a standard messaging system which we will roll ourselves.
As to how we will know what things have changed I am not sure at the moment.

Any ORMs that work with MS-Access (for prototyping)?

I'm in the early stages of a project, and it's not clear yet whether we'll need a "real" database (i.e. SQL Server et al). So I've been doing some prototyping using MS-Access, which is working fine so far. (developing in C#/VS2008/.Net 3.5/MS-Access 2000).
However, the object-relational impedance mismatch is already becoming annoying, and will only get worse as the project evolves.
I have not been able to find an ORM that will work with MS-Access. Any suggestions?
Edit - Follow Up
We ended up using Fluent NHibernate, mainly because it Automaps our object model to a relational database, which has been a huge win for us. Most of the FNH code samples we found used SQLite, and this worked so well that we intend to use it for our production database. (The app is a desktop scientific data collection and analysis package).

MSAccess files can be set up as an ODBC source on Windows machines. Almost any ORM will allow you to use ODBC. Here is a quick tutorial on how to set that up; it's outlined for Win2k but the process is the same for XP+. You also need to have MDAC installed on your box.
NHibernate seems to have native support for MSAccess as well, see here. I've never used it though. It also has an ODBC driver. Many others support ODBC as well.
And again, as others are saying: MSAccess does not scale, period. Installing a real database server is fairly easy, so I'd recommend SQL Server Express as others have, or even MySQL or PostgreSQL, whichever is easier to set up.
If this is an application that you intend to deploy to clients, with each client having their own unique database, I would recommend another solution entirely, SQLite. SQLite gives you database power on an app by app basis. If you have a central database server, one of the previously mentioned solutions would be best.

There's only one scenario when choosing the Access Database Engine is a good choice: when building a self-contained Access application using Access Forms (though choosing to use Access in the first place is a questionable choice ;)
The database engine that VS2008 plays nicest with is SQL Server and you will have no problem finding an ORM that plays nice with SQL Server.

Can't give you an answer to your question, but instead of Access you might want to consider one of the following options:
SQL Server Express: is free and compatible with the full SQL Server
SQL Server Compact: also free, does not require any deployment/installation, does not support all features (e.g. no stored procedures).

At this stage, if you are unsure whether you need a "real" database or not, I'd skip MS Access and go straight to sql server express. It's free and still allows you to do everything you need to.
Plus, if you later decide you need to scale up, then you can without any pain.

I recommend using something like Microsoft SQL Server or PostgreSQL for prototyping. If you don't want to learn a specific SQL dialect or install special tools for designing the database schema, you can use an ORM that automatically generates the database schema from the declarations of your persistent classes. This approach is very effective for prototyping.
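As a rough illustration of what "generating the schema from class declarations" means, the sketch below derives a CREATE TABLE statement from a class definition. It is plain Python with the standard library, not the API of any particular ORM; the Sample class, the SQL_TYPES table and the rule that a field named id becomes the primary key are all assumptions made for the example. Real ORMs do far more (relationships, migrations, dialects).

```python
import sqlite3
from dataclasses import dataclass, fields

# Assumed mapping from Python field types to SQLite column types.
SQL_TYPES = {int: "INTEGER", float: "REAL", str: "TEXT", bytes: "BLOB"}

@dataclass
class Sample:
    """Hypothetical persistent class used only for this example."""
    id: int
    label: str
    reading: float

def create_table_sql(cls):
    """Build a CREATE TABLE statement from a dataclass declaration."""
    columns = []
    for f in fields(cls):
        column = f"{f.name} {SQL_TYPES[f.type]}"
        if f.name == "id":  # convention assumed here: 'id' is the primary key
            column += " PRIMARY KEY"
        columns.append(column)
    return f"CREATE TABLE {cls.__name__.lower()} ({', '.join(columns)})"

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(create_table_sql(Sample))
    conn.execute("INSERT INTO sample VALUES (?, ?, ?)", (1, "probe-a", 3.5))
    return conn.execute("SELECT id, label, reading FROM sample").fetchall()
```

Changing the class and regenerating the table is what makes this style convenient while the object model is still in flux.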

LLBLGen works with Access

Access is just a bad, bad idea. I believe MS only includes Access in Office to keep legacy users happy.
Even if you find an ORM that will work with an Access database, with few exceptions you're locking yourself into a niche tool that likely will not work out-of-the box with a real database engine. If you decide to switch to a real database engine later on, you'll not only have to deal with migrating the database, but switching to a different ORM.
See this comparison between SQL Server Express and SQL Server Compact. The comparison document also mentions some problems with other data stores, including Access.
If you are REALLY concerned about being able to install SQL Server Express, consider SQL Server Compact:
it can be linked into your redistributable app. No need to install a service (which may require admin rights during install of your application); everything is taken care of when you install your app. This makes the most sense if you need the data to reside on the user's machine instead of a server, and is most analogous to using Access.
It's less powerful than Express (doesn't support views, triggers, stored procedures, which I consider a requirement)
Can be scaled up to Express or other SQL Server versions very easily
Suitable for small-footprint installs like tablets, mobile devices, etc.
Always keep scalability in mind when designing any application. You don't want to wind up having to write a PHP->C++ compiler if/when your app becomes successful just because you picked the wrong tool up front.
While we're at it:
The big issue with Access (or, in this case, the Jet engine, which is the part you'd really be using when integrating an Access database with a .NET app) is that there is no "server" that handles database requests. The engine, hosted in your app, must read and write directly to a file on disk that contains the database. Whenever this happens, the file must be locked to prevent concurrent writes. Dirty reads become more common as the number of users grows, as does the potential for database corruption.
Imagine having every customer at a large restaurant trying to simultaneously enter the kitchen to write down their orders or retrieve their food. Chaos would result. There'd be a lot of broken dishes, the kitchen would be a mess, and you'd be lucky to get what you ordered in any sort of edible condition. With one customer, this probably works fine. With 5, eh, maybe. With 20, 50, 1000? Not so much.
So, the restaurant industry introduced waiters and managers that buffer IO to the kitchen. The database server application does something roughly analogous to this by restricting access to the files on disk. Everyone gets what they want, faster and in a much more reliable way, and the data store is protected.

Sometimes Connected CRUD application DAL

I am working on a Sometimes Connected CRUD application that will be primarily used by teams (2-4) of Social Workers and Nurses to track patient information in the form of a plan. The application is a revisualization of an ASP.NET app that was created before my time. There are approx. 200 tables across 4 databases. The Web App version relied heavily on SP's, but since this version is a WinForms app that will be pointing to a local db, I see no reason to continue with SP's. Also of note, I had planned to use Merge Replication to handle the syncing portion, and there seem to be some issues with those two together.
I am trying to understand what approach to use for the DAL. I originally had planned to use LINQ to SQL, but I have read tidbits that state it doesn't work in a Sometimes Connected setting. I have therefore been trying to read about and experiment with numerous solutions: SubSonic, NHibernate, Entity Framework. This is a relatively simple application, and due to a "looming" version 3 redesign this effort can be borderline "throwaway." The emphasis here is on getting a desktop version up and running ASAP.
What I am asking here is for anyone with experience using any of these technologies (or one I didn't list) to lend me their hard-earned wisdom. What is the best approach, in your opinion, for me to pursue? Any other insights on creating this kind of app? I am really struggling with the DAL portion of this program.
Thank you!

If the stored procedures do what you want them to, I would have to say I'm dubious that you will get benefits by throwing them away and reimplementing them. Moreover, it shouldn't matter if you use stored procedures or LINQ to SQL style data access when it comes time to replicate your data back to the master database, so worrying about which DAL you use seems to be a red herring.
The tricky part about sometimes connected applications is coming up with a good conflict resolution system. My suggestions:
Always use RowGuids as your primary keys to tables. Merge replication works best if you always have new records uniquely keyed.
Realize that merge replication can only do so much: it is great for bringing new data in disparate systems together. It can even figure out one sided updates. It can't magically determine that your new record and my new record are actually the same nor can it really deal with changes on both sides without human intervention or priority rules.
Because of this, you will need "matching" rules to resolve records that are claiming to be new, but actually aren't. Note that this is a fuzzy step: rarely can you rely on a unique key to actually be entered exactly the same on both sides and without error. This means giving weighted matches where many of your indicators are the same or similar.
The user interface for resolving conflicts and matching up "new" records with the original needs to be easy to operate. I use something that looks similar to the classic three way merge that many source control systems use: Record A, Record B, Merged Record. They can default the Merged Record to A or B by clicking a header button, and can select each field by clicking against them as well. Finally, Merged Records fields are open for edit, because sometimes you need to take parts of the address (say) from A and B.
None of this should affect your data access layer in the slightest: this is all either lower level (merge replication, provided by the database itself) or higher level (conflict resolution, provided by your business rules for resolution) than your DAL.
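The "weighted matches" idea from the list above can be sketched as follows. This is an illustration in Python using difflib from the standard library, not the answerer's implementation; the field names, the weights and the 0.85 threshold are invented for the example and would need tuning against real data.

```python
from difflib import SequenceMatcher

# Hypothetical indicators and weights; the weights sum to 1.0.
WEIGHTS = {"last_name": 0.4, "first_name": 0.2, "birth_date": 0.3, "postcode": 0.1}

def similarity(a, b):
    """Fuzzy similarity in [0, 1], ignoring case and surrounding whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()

def match_score(rec_a, rec_b):
    """Weighted sum of per-field similarities between two 'new' records."""
    return sum(w * similarity(rec_a[f], rec_b[f]) for f, w in WEIGHTS.items())

def is_probable_match(rec_a, rec_b, threshold=0.85):
    """True when the records are similar enough to offer to the user for merging."""
    return match_score(rec_a, rec_b) >= threshold
```

For instance, two records that differ only in "Smith" versus "Smyth" score above the threshold and would be shown side by side in the merge screen, while records for clearly different people fall well below it. The score only proposes candidates; the final decision stays with the user, as described above.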

If you can install a db system locally, go for something you feel familiar with. The greatest problem, I think, will be the syncing and merging part. You must think through several possibilities, for example: you changed something that someone else deleted on the server. Who decides?
I have never used the Sync Framework myself, just read an article, but it may give you a solid foundation to build on. Whichever way you go with data access, the solution to the business logic will probably have a much wider impact...

There is a sample app called issueVision Microsoft put out back in 2004.
http://windowsclient.net/downloads/folders/starterkits/entry1268.aspx
Found link on old thread in joelonsoftware.com. http://discuss.joelonsoftware.com/default.asp?joel.3.25830.10
Other ideas...
What about mobile broadband? A couple 3G cellular cards will work tomorrow and your app will need no changes sans large pages/graphics.
Excel spreadsheet used in the field. DTS or SSIS to import data into application. While a "better" solution is created.
Good luck!

If by SP's you mean stored procedures... I'm not sure I understand your reasoning for trying to move away from them, considering that they're fast, proven, and already written for you (i.e. tested).
Surely, if you're making an app that will mimic the original, there are definite merits to keeping as much of the original (working) codebase as possible - the least of which is speed.
I'd try installing a local copy of the db, and then pushing all affected records since the last connected period to the master db when it does get connected.
