C#/SQL Cloud application - clients on different versions

I'm designing a small application for a group of users at different physical locations. The application will connect to a central database in the cloud (well, on a central server - think cloud, but not really cloud). The database is held centrally to facilitate backups in a central location. I'm a seasoned developer, and the connection methods, code and other factors really aren't the issue.
However, I need to allow the application to be upgraded to a newer version when the user sees fit - not on any kind of schedule. In a new update, the database schema could possibly change. So I'm going to run into the problem of User A downloading the new version and upgrading the database. Users B, C and D will then get errors when they try to hit the database, as tables/views may not be there.
I've thought about maintaining different databases on the same server. When User A upgrades, we'll "push" their database values to DB_V2 from DB_V1 and they'll use that one. Users B, C & D will still be able to use DB_V1 until they decide to upgrade. Eventually, DB_V1 can be removed when all users have upgraded away from that database.
Can I get some thoughts on the best way to handle this in a cloud-esque application? How are DB updates normally done/handled when clients might be on different versions?

Unfortunately, it is hard and there is no silver bullet. The name of the game is versioning, and you must support overlapping versions. Your access API must be versioned. E.g. consider that the client communicates with the 'cloud' over a REST interface. The root URL for the API could be something like http://example.com/api/v1.1/. When you deploy a new version, you also move the API to http://example.com/api/v1.2/ and expose the new features in this new API, but continue to support the old v1.1 too. You give clients a grace period to upgrade to v1.2, and retire v1.1 sometime in the future, after a sufficient number of clients have upgraded. The REST API Design Handbook is a good resource on the topic.
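To make the idea concrete, here is a minimal sketch of URL-based versioning, assuming ASP.NET Web API 2 with attribute routing enabled; the controller and route names are illustrative, not part of the question:

```csharp
// Sketch only: two versions of the same resource exposed side by side under
// versioned URL prefixes (ASP.NET Web API 2 attribute routing is assumed).
using System.Web.Http;

[RoutePrefix("api/v1.1/products")]
public class ProductsV11Controller : ApiController
{
    [HttpGet, Route("")]
    public IHttpActionResult Get()
    {
        // Old shape, kept alive for not-yet-upgraded clients.
        return Ok(new[] { new { Id = 1, Name = "Widget" } });
    }
}

[RoutePrefix("api/v1.2/products")]
public class ProductsV12Controller : ApiController
{
    [HttpGet, Route("")]
    public IHttpActionResult Get()
    {
        // New shape adds Category; v1.1 clients never hit this route.
        return Ok(new[] { new { Id = 1, Name = "Widget", Category = "Hardware" } });
    }
}
```

When enough clients have moved on, retiring v1.1 amounts to deleting (or redirecting) the old controller.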
Now the real problem is your back end, the code that sits behind the URL service. You have to carefully design each upgrade step so as to maintain backward compatibility with the previous version. It is very likely that both overlapping versions will use the same storage (same DB). If a user adds an item using the v1.2 API, he usually expects to find it later using the v1.1 API, albeit possibly missing some attributes specific to v1.2. Your back end has to decide how to handle the v1.2 attributes for items added/edited through the v1.1 API (e.g. default values, NULLs etc.). As I said, it is hard and there is no silver bullet. Sometimes you may have to bite the bullet and provide no backward compatibility (items added with v1.2 are not visible to v1.1 API clients, i.e. use separate storage, different DBs).
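As a rough illustration of that decision, here is a sketch of one possible policy over shared storage: items written through v1.1 get back-end-chosen defaults for v1.2-only attributes, and those attributes are stripped from v1.1 responses. The "Category" attribute and the "Uncategorized" default are invented for this example.

```csharp
// Sketch only: one shared store, two API shapes. "Category" is a hypothetical
// attribute introduced in v1.2; "Uncategorized" is an invented default.
public class Item
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Category { get; set; }   // exists only since v1.2
}

public static class ItemCompatibility
{
    // A v1.1 client cannot supply Category, so the back end picks a default.
    public static Item FromV11(string name)
    {
        return new Item { Name = name, Category = "Uncategorized" };
    }

    // A v1.1 client gets the old shape back; v1.2-only fields are dropped.
    public static object ToV11(Item item)
    {
        return new { item.Id, item.Name };
    }
}
```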
How about the case when your client accesses the DB directly? I.e. there is no explicit API; the client connects straight to the database. Your chances of success have diminished significantly... You can still put an API between your code and the DB, e.g. use only stored procedures for everything. By using schemas as namespaces you can provide versioning, e.g. exec [v1.1].[getProducts] vs. exec [v1.2].[getProducts]. But it is cumbersome, hard, error prone, and you can kiss goodbye to most of the dev tool wizards, ORMs and other bells and whistles.
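For what it's worth, a sketch of what calling such schema-versioned procedures might look like from ADO.NET; the schema and procedure names mirror the example above and would have to exist on the server:

```csharp
// Sketch only: the client-side call for schema-versioned stored procedures.
// [v1.1].[getProducts] / [v1.2].[getProducts] are assumed to exist on the server.
using System.Data;
using System.Data.SqlClient;

public static class ProductRepository
{
    public static DataTable GetProducts(string connectionString, string apiVersion)
    {
        // Map the client version to a schema; the set of versions is fixed,
        // so no user input ever reaches the command text.
        string schema = apiVersion == "1.2" ? "v1.2" : "v1.1";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("[" + schema + "].[getProducts]", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            var table = new DataTable();
            new SqlDataAdapter(command).Fill(table);
            return table;
        }
    }
}
```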

Related

Best approach to incrementally update application data

I have been working for a couple of years on an application that I update via a back-end database. The whole key is that everything is cached on the client, so it never requires a network connection to operate, but when it does have a connection it will always pick up the latest updates. Every application update ships with the latest version of the database, and I want the client to download only the minimum amount of data when the database has been updated.
I currently use a table with a timestamp to check for updates. It looks something like this.
ID - Name      - Description - Severity - LastUpdated
0  - test.exe  - KnownVirus  - Critical - 2009-09-11 13:38
1  - test2.exe - Firewall    - None     - 2009-09-12 14:38
This approach was fine for what I previously needed, but I am looking to extend more functions of the application to use this kind of dynamic approach. All the data is currently stored as XML, but I do not want to store complete XML files in the database; I only want to transmit changed data.
So how would you go about allowing a fairly simple approach to storing dynamic content (text/XML/JSON/XAML) in a database, and having the client download only new updates? I was thinking of having logic that can handle XML inserted directly:
ID - Data - Revision
15 - XXX - 15
XXX would be something like <Content><File>Test.dll</File><Description>New DLL to load.</Description></Content> and would be inserted into the cache, but this would obviously be complicated as I would need to load them in sequence.
Another approach that has been mentioned was to base it on something similar to source control, storing the version in the root of the file and calculating the delta to figure out the minimal amount of data that needs to be sent to the client.
Does anyone have suggestions on how to approach this with no risk of data corruption? I would also like to add features that allow me to revert possibly bad revisions and replace them with new working ones.
It really depends on the tools you are using and the architecture you already have. Is there already a server with some logic and a data access layer?
Dynamic approaches might get complicated, slow and limit the number of solutions. Why do you need a dynamic structure? Would it be feasible to just add data by using a name-value pair approach in a relational database? Static and uniform data structures are much easier to handle.
Before going into detail, you should consider the different scenarios.
Items can be added
Items can be changed
Items can be removed (I assume)
Adding is not a big problem. The client needs to remember the last revision number it got from the server, and you write a query that gets everything since then.
Changing is basically the same. You should take care with the identification of items: you need an immutable surrogate key, which seems to be the ID you already have. (GUIDs may be useful here.)
Removing is tricky. You need to either flag items as deleted instead of actually removing them, or keep a list of removed IDs together with the revision number at which they were removed.
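A rough sketch of the server-side "give me everything since revision X" query, assuming a tombstone table for deletions; the table and column names are invented for illustration:

```csharp
// Sketch only: pull everything added/changed since the client's last revision,
// plus tombstones for deletions. "Items", "DeletedItems" and "Revision" are
// invented names.
using System.Data.SqlClient;

public static class SyncQueries
{
    public static SqlDataReader ReadChangesSince(SqlConnection openConnection, long lastRevision)
    {
        const string sql =
            @"SELECT Id, Name, Description, Severity, Revision
                FROM Items
               WHERE Revision > @rev;

              SELECT Id, Revision
                FROM DeletedItems      -- tombstones (or use an IsDeleted flag)
               WHERE Revision > @rev;";

        var command = new SqlCommand(sql, openConnection);
        command.Parameters.AddWithValue("@rev", lastRevision);

        // First result set: new/changed rows; second result set: deleted ids.
        return command.ExecuteReader();
    }
}
```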
Storing the data in the client: consider using a relational database like SQLite in the client. (It doesn't need installation; it just stores everything in a file. Firefox, for instance, stores quite a lot in SQLite databases.) When using the same database on the server, you can probably reuse some code. It is also transaction based, which helps keep it consistent (rollback in case of an error during synchronization).
XML - if you really need it - can be stored just as a string in the database.
When using an abstraction layer or ORM that supports SQLite (e.g. NHibernate), you may also be able to reuse some code even when the server uses a different database. Note that the learning curve for such an ORM can be rather steep. If you don't know anything like this already, it could be too much.
You don't need to force reuse of code in the client and server.
Synchronization itself shouldn't be very complicated. You have a revision number on the client and a latest revision on the server. The client fetches all new, changed and deleted items since its revision and applies them to the local store. Update the local revision number. Commit. Done.
I would never update only part of a revision, because then you can't really know what changed since the last synchronization. Because you do differential updates, it is essential to have a well-defined state on the client.
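A sketch of that apply step against a local SQLite cache using System.Data.SQLite, wrapping the whole batch and the revision bump in one transaction; the schema (Items, SyncState) is assumed:

```csharp
// Sketch only: apply one downloaded batch to the local SQLite cache and bump
// the stored revision, all inside a single transaction so a failed sync rolls
// back cleanly. Uses System.Data.SQLite; Items/SyncState are invented names.
using System.Collections.Generic;
using System.Data.SQLite;

public class ItemChange { public int Id; public string Name; public long Revision; }

public static class LocalStore
{
    public static void ApplyChanges(string dbFile, IEnumerable<ItemChange> upserts,
                                    IEnumerable<int> deletedIds, long newRevision)
    {
        using (var connection = new SQLiteConnection("Data Source=" + dbFile))
        {
            connection.Open();
            using (var tx = connection.BeginTransaction())
            {
                foreach (var item in upserts)
                using (var cmd = new SQLiteCommand(
                    "INSERT OR REPLACE INTO Items (Id, Name, Revision) VALUES (@id, @name, @rev)",
                    connection, tx))
                {
                    cmd.Parameters.AddWithValue("@id", item.Id);
                    cmd.Parameters.AddWithValue("@name", item.Name);
                    cmd.Parameters.AddWithValue("@rev", item.Revision);
                    cmd.ExecuteNonQuery();
                }

                foreach (var id in deletedIds)
                using (var cmd = new SQLiteCommand(
                    "DELETE FROM Items WHERE Id = @id", connection, tx))
                {
                    cmd.Parameters.AddWithValue("@id", id);
                    cmd.ExecuteNonQuery();
                }

                using (var cmd = new SQLiteCommand(
                    "UPDATE SyncState SET LastRevision = @rev", connection, tx))
                {
                    cmd.Parameters.AddWithValue("@rev", newRevision);
                    cmd.ExecuteNonQuery();
                }

                tx.Commit(); // all or nothing
            }
        }
    }
}
```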
I would go with a solution using Sync Framework.
Quote from Microsoft:
Microsoft Sync Framework is a comprehensive synchronization platform enabling collaboration and offline access for applications, services and devices. Developers can build synchronization ecosystems that integrate any application, any data from any store using any protocol over any network. Sync Framework features technologies and tools that enable roaming, sharing, and taking data offline.
A key aspect of Sync Framework is the ability to create custom providers. Providers enable any data sources to participate in the Sync Framework synchronization process, allowing peer-to-peer synchronization to occur.
I have just built an application pretty much exactly as you described. I built it on top of the Microsoft Sync Framework that DjSol mentioned.
I use a C# front end application with a SqlCe database, and a SQL 2005 Server at the other end.
The following articles were extremely useful for me:
Tutorial: Synchronizing SQL Server and SQL Server Compact
Walkthrough: Creating a Sync service
Step by step N-tier configuration of Sync services for ADO.NET 2.0
How to Sync schema changed database using sync framework?
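For orientation, here is a minimal sketch of what the sync call itself looks like with the Sync Framework 2.x database providers, assuming a scope (called "AppScope" here) has already been provisioned on both databases as the walkthroughs above describe; connection strings are placeholders.

```csharp
// Sketch only: a Sync Framework 2.x two-way sync between a local SQL CE file
// and the central SQL Server, for a scope ("AppScope") that has already been
// provisioned on both ends. Connection strings are placeholders.
using Microsoft.Synchronization;
using Microsoft.Synchronization.Data.SqlServer;
using Microsoft.Synchronization.Data.SqlServerCe;
using System.Data.SqlClient;
using System.Data.SqlServerCe;

public static class Synchronizer
{
    public static SyncOperationStatistics Run()
    {
        using (var local = new SqlCeConnection("Data Source=local.sdf"))
        using (var server = new SqlConnection(
            "Data Source=CENTRAL;Initial Catalog=AppDb;Integrated Security=True"))
        {
            var orchestrator = new SyncOrchestrator
            {
                LocalProvider  = new SqlCeSyncProvider("AppScope", local),
                RemoteProvider = new SqlSyncProvider("AppScope", server),
                Direction      = SyncDirectionOrder.UploadAndDownload
            };
            return orchestrator.Synchronize(); // returns upload/download counts
        }
    }
}
```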
You don't say what your back-end database is, but if it's SQL Server you can use SQL CE (SQL Server Compact Edition) as the client DB and then use RDA or merge replication to update the client DB as desired. This will handle all your requirements for sure; there is no need to reinvent the wheel for such a common requirement.

For securing a database that is meant to be accessed by several clients, is using a web service as a proxy overkill?

We're going to have a database and a client application that will be installed on several machines in a local network, and they all must be able to access the DB.
Some of them must be able to edit and modify the DB, and some of them will only read it. Each of these two groups is divided into several subgroups, based on who must be able to access which table/field.
To create this application, we were advised to deploy a web service to act as a proxy between the clients and the DB, in order to secure the DB.
But we're not transferring any sensitive data (such as credit card numbers or...), and we're only worried about an unauthorized person being able to modify the DB.
Isn't just using the integrated security option in the app.config sufficient?
Do we really need to hide and secure the connection string?
It could be overkill, but it might not be. Deciding to go to a Service-Oriented Architecture could be based on several factors, among which:
How long are you expecting to maintain this application?
How many client deployments are you expecting?
Do you expect your database to change often?
What are your SLA requirements?
Do you expect the database to eventually be used for other applications?
etc...
The long and short of it is, if you want to be able to change things in the middle tier or database, and you don't want to have to upgrade every client when you do so, adding a Service layer might be the way to go. You also have the advantage of providing a rich API for other client developers (internal or external) while controlling business rules and security in one, centralized location.
SOA definitely adds to the complexity of the project, but in many cases, it can save you a lot of headaches in the future.
For further reading, look at http://en.wikipedia.org/wiki/Service-oriented_architecture, http://www.soapatterns.org/, or Google.
Sounds way overboard to me. If the application is only going to be used internally, and Windows authentication is an option, certainly use it. Building a web service is only going to slow down development and add an unnecessary layer of complexity. The read/write users could be members of a Windows group that has read/write access to the database, and the read-only users could be members of a Windows group that only has read access to the database. Then, if the user is able to gain direct access to the database (without using your front-end) they would only be able to either read or read/write based on their Windows rights.
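A sketch of what that looks like in practice: the app.config connection string uses integrated security (no credentials stored), and permissions are granted server-side to Windows groups. The group, server and database names below are illustrative.

```csharp
// Sketch only: integrated security end to end. The connection string in
// app.config carries no credentials, and access rights live on the server,
// granted to Windows groups (all names below are illustrative).
//
//   <connectionStrings>
//     <add name="AppDb"
//          connectionString="Data Source=DBSERVER;Initial Catalog=AppDb;Integrated Security=SSPI"
//          providerName="System.Data.SqlClient" />
//   </connectionStrings>
//
// On the server, roughly:
//   CREATE LOGIN [DOMAIN\App_Readers] FROM WINDOWS;
//   CREATE USER  [DOMAIN\App_Readers] FOR LOGIN [DOMAIN\App_Readers];
//   EXEC sp_addrolemember 'db_datareader', 'DOMAIN\App_Readers';
//   -- and db_datareader + db_datawriter for the read/write group
using System.Configuration;
using System.Data.SqlClient;

public static class Db
{
    public static SqlConnection Open()
    {
        string cs = ConfigurationManager.ConnectionStrings["AppDb"].ConnectionString;
        var connection = new SqlConnection(cs);
        connection.Open(); // authenticates as the current Windows user
        return connection;
    }
}
```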

How to sync two databases for disconnected systems from different companies

Are there standard messaging protocols/APIs available to keep databases in sync? Or, alternatively, APIs for creating and parsing messages?
Our company is working with another company to provide two different software packages to two different kinds of users. The data sits in two separate databases but parts of it have to remain in sync.
Their system is pretty much a black box to us. And vice versa.
So what would be required is to track updates, turn them into messages, send them to a web service, map them back to the destination database fields, and commit them.
The database schemas do not match.
I am aware that we are going to have to roll most of this ourself, but some ideas around messaging or techniques would be good.
One solution: SQL Server Integration Services (SSIS). It was introduced with SQL Server 2005 and is exactly what you need. In SQL Server 2000 it was called DTS, for Data Transformation Services. It was created to import/export/transform data from one point to another, and it is really easy to use from SQL Server 2005 onwards (DTS is quite horrible).
So basically, you will have to write packages that import data from their database and transform, filter, etc. it exactly how you need it before inserting it into your database. And vice versa.
Regarding the black-box aspect, you should generate the database's relational design to make this easier.
EDIT
Just in case you need to install it: I remember bugs where the SQL Server 2005 installer did not install SSIS at all. I had to satisfy all the warnings in the installer's system requirements step to get it.
You have two problems:
track the changes that have to be synced
apply the changes to the peer
There is one solution that addresses both issues, and I'm sure you are aware of it: replication. Merge replication would allow both sites to update the data and would also provide merge conflict resolution. But replication only works when the table schemas are similar, and it puts a big constraint on development since schema changes have to be carefully coordinated between the sites. In practice, when the sites are operated by independent companies, it is quite difficult to maintain long term.
If you want to roll your own, the change tracking part has built-in support in SQL Server:
Change Tracking
Change Data Capture
Both can be used in a sync solution as a means to detect what changed.
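As an illustration, a sketch of reading changes via Change Tracking from C#, assuming it has already been enabled on the database and on a dbo.Items table (all names are invented):

```csharp
// Sketch only: reading changes via SQL Server Change Tracking (2008+),
// assuming it is enabled on the database and on dbo.Items, e.g.:
//   ALTER DATABASE AppDb SET CHANGE_TRACKING = ON
//       (CHANGE_RETENTION = 7 DAYS, AUTO_CLEANUP = ON);
//   ALTER TABLE dbo.Items ENABLE CHANGE_TRACKING;
using System.Data.SqlClient;

public static class ChangeReader
{
    public static void ReadItemChanges(string connectionString, long lastSyncVersion)
    {
        const string sql =
            @"SELECT CT.Id, CT.SYS_CHANGE_OPERATION, I.Name
                FROM CHANGETABLE(CHANGES dbo.Items, @since) AS CT
                LEFT JOIN dbo.Items AS I ON I.Id = CT.Id;";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            command.Parameters.AddWithValue("@since", lastSyncVersion);
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                // SYS_CHANGE_OPERATION is 'I', 'U' or 'D'; deleted rows have no
                // match in dbo.Items, hence the LEFT JOIN.
                while (reader.Read()) { /* turn each row into a sync message */ }
            }
        }
    }
}
```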
Applying the changes can be handled by a web service, but there is also a built-in solution in SQL Server that allows for far higher scalability and throughput: Service Broker. Relying on a message-defined API for sync allows the two sites to evolve at their own pace and change their schemas almost at will, as long as the communication API (the message protocol) remains unchanged.
The answers provided give me some good ideas, but I think we are going to end up doing something a bit different.
We are using MSMQ, and defining a standard messaging system which we will roll ourselves.
As to how we will know what things have changed, I am not sure at the moment.

Tools for Building an OCA (Occasionally Connected Application)

I will be building an in-house, occasionally connected app (OCA). What technologies would you suggest I employ?
Here are my parameters:
.NET shop (3.5 SP1)
C# for code-behind (WinForms, WPF, Silverlight)
SQL Server back end (2005, or possibly 2008 pending approval)
Solo developer
Solo SQL administrator
Low-tech end users
Low bandwidth to 5 branch offices
This is a LOB app, but not a POS.
Majority of users have laptops that they take to members' homes
The data for this app is stored in 5 separate databases, though in one SQL instance.
I am looking for specific recommendations on which path to choose. Merge Replication or Sync Framework database synchronization providers? SQL Express or SQL CE at the Subscriber? Can I use LINQ to SQL for the DAL?
Is a Silverlight 'Offline/Out of Browser' app (example here) feasible?
This is my first LARGE business application so any experienced comments are welcome.
As requested, here is some additional info on the type of data. My users are nurses and social workers who go to members' homes and create "Plans" or "Health Assessment Reviews" for them. These are things like a medication list, a list of their current "Providers", steps to achieve the member's goals, or a list of their current/past diagnoses. Things like that.
Also the typical member's name, address, phone number, etc. Mostly this is a data storage and retrieval app that facilitates reporting. Very little "processing" takes place, and nurses and social workers work in teams that are assigned members, so I usually have very little crossover or potential for data conflicts. Nurses and SWs are also responsible for different areas of the MCP (Member Centered Plan).
Additional question: is the Sync Framework really only a viable option if I can use SQL 2008? It seems that way due to the change tracking requirements, etc. Thoughts?
Once you solve the problem of change detection and data movement, everything else is trivial. In other words, technologies like WPF, Silverlight, WinForms and even WCF are orthogonal to your main problem, and your choice should be based on your personal preferences and experience. The really hard nut to crack is working disconnected and synchronizing changes, which leaves two out-of-the-box avenues: Sync Framework or replication.
I would say, for your scenario, definitely Sync Framework. Merge replication, like all forms of replication, is designed for systems that are connected continuously with intermittent disconnects. Most critically, replication can only work over static names. Laptops connecting from various hot-spots and ISPs have a nasty habit of changing fully qualified names with each connection. Replication can overcome this only if a VPN of some sort is used, and VPN is usually a major support issue. Replication is just not designed for the high mobility of OCA systems.
Sync Framework will pretty much force you to a SQL 2008 back end because of the need for Change Data Capture or Change Tracking, both being SQL 2008-only features.
You will still have plenty of hard problems to solve ahead (authentication, versioning and upgrade, data conflict resolution policies, securing data on the client against accidental media loss, etc.).
Personally, I would say:
.NET 3.5
WCF Data Services (for communication between the client app and your data)
SQL Server 2k5/2k8 (whichever you can use)
Silverlight w/ Out of Browser Functionality
VistaDB (to store data locally on the client until you can push to the server)
Use a uniqueidentifier for the key if you are creating records while offline and then updating the database when you do connect.
This is going to be way easier than using an auto-increment key.
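A sketch of that approach: the client generates the key itself with Guid.NewGuid(), so rows created offline can be uploaded later without key collisions or fix-ups. The Assessments table and its columns are assumptions.

```csharp
// Sketch only: client-generated GUID keys so offline-created rows never clash
// on upload. "Assessments" and its columns are invented for illustration.
using System;
using System.Data.SqlClient;

public class Assessment
{
    public Guid Id { get; set; }
    public string MemberName { get; set; }
}

public static class OfflineStore
{
    public static Assessment CreateOffline(string memberName)
    {
        // The key is valid immediately, with no round trip to the server
        // (unlike an IDENTITY column).
        return new Assessment { Id = Guid.NewGuid(), MemberName = memberName };
    }

    public static void Upload(SqlConnection openConnection, Assessment a)
    {
        using (var cmd = new SqlCommand(
            "INSERT INTO Assessments (Id, MemberName) VALUES (@id, @name)", openConnection))
        {
            cmd.Parameters.AddWithValue("@id", a.Id);
            cmd.Parameters.AddWithValue("@name", a.MemberName);
            cmd.ExecuteNonQuery();
        }
    }
}
```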
Having worked on an occasionally connected application, I'd encourage you to look into SQL Server CE for the client machines, with Sync Services to handle the connections. Here is a good tutorial.
You could create this stuff from the ground up, it seems.
However, this seems an awful lot like a CRM application, and it wouldn't surprise me if you could find an enterprise software package to do this without starting from scratch and instead modify one of the configurations to meet your business rules.
In a previous life, I was a configuration developer for this thing called Siebel that might be close to what you're looking for. They even have a built-in synchronization tool called Siebel Remote.
It might be a cheaper route to go than rolling your own from scratch.
I wrote an order-taking program for wine sales reps. Here is the video. The client software is installed using ClickOnce, which also installs SQL Server Express and loads the database. I used the Microsoft Sync Framework to sync the local database with the one on the server (see the last section of the video).
With today's powerful clients I don't see any reason not to use SQL Server Express; it is free, with a 4 GB limit.
SQL CE had too many limitations - no stored procs being a major one.
You will need to use GUIDs everywhere as the primary key - see the new NewSequentialID().
I love ClickOnce; it is a big time saver.
I'm looking forward to Silverlight, but just haven't had time to look into it. Not sure if I would have done it with Silverlight if doing it now or not.
Having said all this, this is not a project for anyone inexperienced. So I would also get some very experienced help.

Any ORMs that work with MS-Access (for prototyping)?

I'm in the early stages of a project, and it's not clear yet whether we'll need a "real" database (i.e. SQL Server et al). So I've been doing some prototyping using MS-Access, which is working fine so far. (developing in C#/VS2008/.Net 3.5/MS-Access 2000).
However, the object-relational impedance mismatch is already becoming annoying, and will only get worse as the project evolves.
I have not been able to find an ORM that will work with MS-Access. Any suggestions?
Edit - Follow Up
We ended up using Fluent NHibernate, mainly because it automaps our object model to a relational database, which has been a huge win for us. Most of the FNH code samples we found used SQLite, and this worked so well that we intend to use it for our production database. (The app is a desktop scientific data collection and analysis package.)
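For anyone curious what that setup looks like, here is a rough sketch of a Fluent NHibernate automapping configuration against a SQLite file; the entity and file names are illustrative, and the exact API varies a little between FNH versions.

```csharp
// Sketch only: Fluent NHibernate automapping over a SQLite file. Entity and
// file names are invented; the exact API differs slightly between FNH versions.
using FluentNHibernate.Automapping;
using FluentNHibernate.Cfg;
using FluentNHibernate.Cfg.Db;
using NHibernate;

public class Sample
{
    public virtual int Id { get; set; }       // automapping expects an Id property
    public virtual string Name { get; set; }  // members must be virtual for NHibernate proxies
}

public static class SessionFactoryBuilder
{
    public static ISessionFactory Build()
    {
        return Fluently.Configure()
            .Database(SQLiteConfiguration.Standard.UsingFile("data.db"))
            .Mappings(m => m.AutoMappings.Add(AutoMap.AssemblyOf<Sample>()))
            .BuildSessionFactory();
    }
}
```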
MS Access files can be set up as an ODBC source on Windows machines. Almost any ORM will allow you to use ODBC. Here is a quick tutorial on how to set that up; it's outlined for Win2k, but the process is the same for XP and later. You also need to have MDAC installed on your box.
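As a quick illustration, connecting to an .mdb through the ODBC driver from C# looks roughly like this (DSN-less connection string; the file path and table name are placeholders):

```csharp
// Sketch only: hitting an .mdb through the Jet ODBC driver from C#; most ORMs
// with "ODBC support" sit on top of exactly this. File path and table name are
// placeholders.
using System.Data.Odbc;

public static class AccessViaOdbc
{
    public static int CountRows()
    {
        const string cs = @"Driver={Microsoft Access Driver (*.mdb)};Dbq=C:\data\prototype.mdb;";

        using (var connection = new OdbcConnection(cs))
        using (var command = new OdbcCommand("SELECT COUNT(*) FROM Samples", connection))
        {
            connection.Open();
            return (int)command.ExecuteScalar();
        }
    }
}
```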
NHibernate seems to have native support for MS Access as well; see here. I've never used it, though. It also has an ODBC driver. Many other ORMs support ODBC as well.
And again, as others are saying: MS Access does not scale, period. Installing a real database server is fairly easy, so I'd recommend SQL Server Express as others have, or even MySQL or PostgreSQL, whichever is easier to set up.
If this is an application that you intend to deploy to clients, with each client having their own unique database, I would recommend another solution entirely, SQLite. SQLite gives you database power on an app by app basis. If you have a central database server, one of the previously mentioned solutions would be best.
There's only one scenario when choosing the Access Database Engine is a good choice: when building a self-contained Access application using Access Forms (though choosing to use Access in the first place is a questionable choice ;)
The database engine that VS2008 plays nicest with is SQL Server and you will have no problem finding an ORM that plays nice with SQL Server.
Can't give you an answer to your question, but instead of Access you might want to consider one of the following options:
SQL Server Express: is free and compatible with the full SQL Server
SQL Server Compact: also free, does not require any deployment/installation, does not support all features (e.g. no stored procedures).
At this stage, if you are unsure whether you need a "real" database or not, I'd skip MS Access and go straight to SQL Server Express. It's free and still allows you to do everything you need to.
Plus, if you later decide you need to scale up, then you can without any pain.
I recommend using something like Microsoft SQL Server or PostgreSQL for prototyping. If you don't want to learn a specific SQL syntax and install special tools for designing the database schema, you can use an ORM that automatically generates the database schema from your persistent class declarations. In any case, this approach is very effective for prototyping.
LLBLGen works with Access
Access is just a bad, bad idea. I believe MS only includes Access in Office to keep legacy users happy.
Even if you find an ORM that will work with an Access database, with few exceptions you're locking yourself into a niche tool that likely will not work out of the box with a real database engine. If you decide to switch to a real database engine later on, you'll not only have to deal with migrating the database but also with switching to a different ORM.
See this comparison between SQL Server Express and SQL Server Compact. The comparison document also mentions some problems with other data stores, including Access.
If you are REALLY concerned about being able to install SQL Server Express, consider SQL Server Compact:
it can be linked into your redistributable app. No need to install a service (which may require admin rights during install of your application); everything is taken care of when you install your app. This makes the most sense if you need the data to reside on the user's machine instead of a server, and is most analogous to using Access.
It's less powerful than Express (doesn't support views, triggers, stored procedures, which I consider a requirement)
Can be scaled up to Express or other SQL Server versions very easily
Suitable for small-footprint installs like tablets, mobile devices, etc.
Always keep scalability in mind when designing any application. You don't want to wind up having to write a PHP->C++ compiler if/when your app becomes successful just because you picked the wrong tool up front.
While we're at it:
The big issue with Access (or, in this case, the Jet engine, which is the part you'd really be using when integrating an Access database with a .NET app) is that there is no "server" that handles database requests. The engine, hosted in your app, must read and write directly to a file on disk that contains the database. Whenever this happens, the file must be locked to prevent concurrent writes. Dirty reads become more common as the number of users grows, as does the potential for database corruption.
Imagine having every customer at a large restaurant trying to simultaneously enter the kitchen to write down their orders or retrieve their food. Chaos would result. There'd be a lot of broken dishes, the kitchen would be a mess, you'd be lucky to get what you ordered in any sort of edible condition. With one customer, this probably works fine. With 5, eh, maybe. With 20,50,1000? Not so much.
So, the restaurant industry introduced waiters and managers that buffer IO to the kitchen. The database server application does something roughly analogous to this by restricting access to the files on disk. Everyone gets what they want, faster and in a much more reliable way, and the data store is protected.
