C# - XML vs MySQL


The program I'm writing needs frequent database communication, and at the moment I'm using just XML files. Is there really a benefit to using MySQL, or SQL in general, over XML? Note that I'm using C#, so MySQL is not much fun to deal with (from what little experience I have).

In terms of maintaining data stored in XML files vs. a relational database (MySQL, in your case), the database is far more robust than simple XML files. But this is simply an exercise in determining the needs of your application.
MySQL, like many other RDBMSs, provides much more than just a place to park your data. The biggest advantage of using a modern database such as MySQL is ACID support. This means you get all-or-nothing transactions, which keep your data consistent.
You also get referential integrity to ensure that related records stay intact and don't leave you with abandoned references to other data records. We could go on and on to discuss the value of locking or the power of stored procedures.
But really, you should consider the needs of your application. If you do significant gymnastics to keep your data in order, or you care about shared access and file locks while trying to read and write data, you need to punt on your XML files. There is no need to find ways around these issues when a basic MySQL database will solve them.

If the data is truly relational, you'll almost always benefit from using an RDBMS. Retrieving data will be faster with the backing of a query engine than by tying together XML nodes. You'll also get referential integrity when inserting data into the structure.
There is an ADO.NET provider for MySQL, so you shouldn't have any more difficulty dealing with a MySQL database than MS SQL Server.
You could even download DbLinq and give their LINQ to MySQL functionality a shot. Could make things even easier (or you could use Entity Framework with the MySQL ADO.NET provider).

The size of XML documents can be a large factor. With XML you either produce large and complicated text files with a huge amount of additional data, or your data is split up across several files. Managing these files can be a headache. Using a SQL database will allow you to waste less disk space.
Querying a SQL database is generally faster than searching through XML.
Any SQL database will give you access to a whole set of permissions and role capabilities that may be difficult to enforce using XML.

If you have relational data, a database would work. As an alternative to MySQL, if you aren't looking for a centralized solution, you can use SQLite. SQLite runs in-process (meaning the program running it is its own "database server") and requires no installation other than distributing the DLL file containing it.
Robert Simpson has written System.Data.SQLite, a SQLite Data Provider for the .NET Framework. It's free and open source (like SQLite) and works and feels as native as System.Data.SqlClient does. It supports standard ADO.NET conventions, LINQ, and the Entity Framework.
I've used System.Data.SQLite for projects at work for applications that need to run fast and cache data locally for comparison between multiple runs (data processing and job scheduling). Firefox is a good example of an application using SQLite: Firefox 3 uses SQLite for its cookies, the downloads history, form autocomplete, and most importantly your web browsing history.
Again, SQLite is meant for direct application use and lacks features like user authentication and schema permissions. It has issues if multiple programs try to write to the same database (those can be worked around, but nothing like what a real RDBMS can do). Its biggest advantage is that it doesn't need to be installed and set up to work like MySQL does. In the C# case all you have to do is reference System.Data.SQLite and copy the .dll file along with your program and it'll work.

Related

Best approach to incrementally update application data

I have been working on an application for a couple of years that I update using a back-end database. The whole key is that everything is cached on the client, so that it never requires a network connection to operate, but when it does have a connection it will always pick up the latest updates. Every application update ships with the latest version of the database, and I want it to download only the minimum amount of data when the database has been updated.
I currently use a table with a timestamp to check for updates. It looks something like this.
ID - Name - Description - Severity - LastUpdated
0 - test.exe - KnownVirus - Critical - 2009-09-11 13:38
1 - test2.exe - Firewall - None - 2009-09-12 14:38
This approach was fine for what I previously needed, but I am looking to expand more of the application's functionality to use this type of dynamic approach. All the data is currently stored as XML, but I do not want to store complete XML files in the database; I want to transmit only the changed data.
So how would you go about allowing a fairly simple approach to storing dynamic content (text/xml/json/xaml) in a database, and have the client only download new updates? I was thinking of having logic that can handle XML inserted directly:
ID - Data - Revision
15 - XXX - 15
XXX would be something like <Content><File>Test.dll</File><Description>New DLL to load.</Description></Content> and would be inserted into the cache, but this would obviously be complicated as I would need to load them in sequence.
Another approach that has been mentioned was to base it on something similar to Source Control, storing the version in the root of the file and calculating the delta to figure out the minimal amount of data that need to be sent to the client.
Does anyone have suggestions on how to approach this with no risk of data corruption? I would also like to expand with features that allow me to revert possibly bad revisions and replace them with new working ones.

It really depends on the tools you are using and the architecture you already have. Is there already a server with some logic and a data access layer?
Dynamic approaches might get complicated, slow and limit the number of solutions. Why do you need a dynamic structure? Would it be feasible to just add data by using a name-value pair approach in a relational database? Static and uniform data structures are much easier to handle.
Before going into detail, you should consider the different scenarios.
Items can be added
Items can be changed
Items can be removed (I assume)
Adding is not a big problem. The client needs to remember the last revision number it got from the server, and you write a query which gets everything since then.
Changing is basically the same. You should care about identification of the items. You need an unchangeable surrogate key, as it seems to be the ID you already have. (Guids may be useful here.)
Removing is tricky. You need to either flag items as deleted instead of actually removing them, or have a list of removed IDs with the revision number when they had been removed.
Storing the data in the client: Consider using a relational database like SQLite in the client. (It doesn't need installation, it is just storing in a file. Firefox for instance stores quite a lot in SQLite databases.) When using the same in the server, you can probably reuse some code. It is also transaction based, which helps to keep it consistent (rollback in case of error during synchronization).
XML - if you really need it - can be stored just as a string in the database.
When using an abstraction layer or ORM that supports SQLite (e.g. NHibernate), you may also reuse some code even when the server uses a different database. Note that the learning curve for such an ORM might be rather steep. If you haven't used anything like this before, it could be too much.
You don't need to force reuse of code in the client and server.
Synchronization itself shouldn't be very complicated. You have a revision number in the client and a last revision in the server. You get all new / changed and deleted items since then in the client and apply it to the local store. Update the local revision number. Commit. Done.
I would never update only a part of a revision, because then you can't really know what changed since the last synchronization. Because you do differential updates, it is essential to have a well defined state of the client.
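As a rough illustration of the synchronization steps described above, here is a minimal in-memory sketch. The Change and ClientCache types and their members are hypothetical names made up for this example; in a real client the dictionary would be a local database (such as SQLite) and the "commit" would be a database transaction.

```csharp
using System.Collections.Generic;
using System.Linq;

// One row as the server might return it; Deleted marks a removed item (tombstone).
public sealed class Change
{
    public int Id;
    public string Data;
    public int Revision;
    public bool Deleted;
}

public sealed class ClientCache
{
    public int Revision { get; private set; }
    public Dictionary<int, string> Items { get; private set; }

    public ClientCache()
    {
        Items = new Dictionary<int, string>();
    }

    // Applies every change newer than the local revision as one unit.
    // The work happens on a copy, so an exception midway leaves the cache untouched.
    public void Apply(IEnumerable<Change> serverChanges)
    {
        var pending = serverChanges.Where(c => c.Revision > Revision)
                                   .OrderBy(c => c.Revision)
                                   .ToList();
        if (pending.Count == 0) return;

        var copy = new Dictionary<int, string>(Items);
        foreach (var change in pending)
        {
            if (change.Deleted) copy.Remove(change.Id);
            else copy[change.Id] = change.Data; // covers both add and update
        }

        // "Commit": swap in the new state and advance the revision together.
        Items = copy;
        Revision = pending[pending.Count - 1].Revision;
    }
}
```

Because only changes with a revision greater than the stored one are applied, calling Apply twice with the same list is harmless, and the client always ends up at a well-defined revision.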

I would go with a solution using Sync Framework.
Quote from Microsoft:
Microsoft Sync Framework is a comprehensive synchronization platform enabling collaboration and offline for applications, services and devices. Developers can build synchronization ecosystems that integrate any application, any data from any store using any protocol over any network. Sync Framework features technologies and tools that enable roaming, sharing, and taking data offline.
A key aspect of Sync Framework is the ability to create custom providers. Providers enable any data sources to participate in the Sync Framework synchronization process, allowing peer-to-peer synchronization to occur.

I have just built an application pretty much exactly as you described. I built it on top of the Microsoft Sync Framework that DjSol mentioned.
I use a C# front end application with a SqlCe database, and a SQL 2005 Server at the other end.
The following articles were extremely useful for me:
Tutorial: Synchronizing SQL Server and SQL Server Compact
Walkthrough: Creating a Sync service
Step by step N-tier configuration of Sync services for ADO.NET 2.0
How to Sync schema changed database using sync framework?

You don't say what your back-end database is, but if it's SQL Server you can use SqlCE (SQL Server Compact Edition) as the client DB and then use RDA merge replication to update the client DB as desired. This will handle all your requirements for sure; there is no need to reinvent the wheel for such a common requirement.

creating a backend data storage for quick retrieval

I am writing software which stores all the information about a user's interaction in a global session object/class. I would like to store the collected values in persistent storage. However, I cannot use heavy databases such as SQL Server or MySQL on the target PC, as I need to keep the installer as small as possible.
I also need to retrieve values from the storage using simple LINQ queries, etc.
My question is: what is the next best thing to a database that can be manipulated from C# code?

Probably either SQLite or SQL Server Compact Edition. These are both fairly full-featured database systems that run entirely in-process and are frequently used for this sort of thing (for example, Firefox uses SQLite to store bookmarks).
The next rung down the ladder of complexity would probably be either XML (using LINQ to XML) or just serialisable objects (using LINQ to Objects). You would of course incur performance penalties compared with a "proper" compact database like SQLite if you started storing a lot of data. However, you would probably need to store more than you think before it became noticeable, and for small data sets the simplicity might even make this faster than SQLite (for example, you could restrict your application to storing the last 100 or so actions).
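To give an idea of what the LINQ to XML route could look like, here is a small sketch that keeps only the newest 100 actions and answers a simple query. The ActionLog class and the "Actions"/"Action" element names are invented for this example and are not from the question.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ActionLog
{
    // Hypothetical root element; adjust the names to your own session data.
    public static XElement Create()
    {
        return new XElement("Actions");
    }

    // Appends an action and trims the log to the newest maxEntries items.
    public static void Add(XElement log, string name, DateTime when, int maxEntries = 100)
    {
        log.Add(new XElement("Action",
            new XAttribute("name", name),
            new XAttribute("when", when)));

        int surplus = log.Elements("Action").Count() - maxEntries;
        if (surplus > 0)
        {
            // Elements stay in insertion order, so the oldest come first.
            // ToList() materializes the selection before anything is removed.
            log.Elements("Action").Take(surplus).ToList().ForEach(e => e.Remove());
        }
    }

    // A simple LINQ query: names of all actions recorded at or after a cutoff.
    public static string[] NamesSince(XElement log, DateTime cutoff)
    {
        return log.Elements("Action")
                  .Where(e => (DateTime)e.Attribute("when") >= cutoff)
                  .Select(e => (string)e.Attribute("name"))
                  .ToArray();
    }
}
```

The log can be written with log.Save(path) and read back with XElement.Load(path), so no database engine has to ship with the installer.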

SQL Server CE and SQLite are popular for the scenario you are describing. XML is as well.

You could connect to Access MDB files. You don't need an SQL server for this, and it uses the same syntax.
Just need to use OleDb.
Example: DataEasy: Connect to MS Access (.mdb) Files Easily using C#

Which data store for a management service like IIS

I have a C# Windows service which manages some stuff for my server application. This is not the main application, but a helper process used to control my actual application. The user connects to this application via WCF using a WinForms application. It all looks a bit like the IIS manager.
I need a data store for this application.
Currently, I use separate XML files which are loaded at start up, are updated in memory and flushed to disk on every change. I like this because:
We can simply edit the XML files in notepad when issues arise;
I do not have external dependencies to e.g. MSSQL express;
I do not have to update a database schema when the format changes.
However, I find that this is not stable and that the in memory management is very fragile.
What should I use instead that is not overkill (as, e.g., MSSQL Express would be) without losing too many of the above advantages?

SQLite is made for occasions like this where you need a solid data store, but do not require the power or scalability of a full database server.
If you do not want to worry about schema changes, you may be best off with your xml method or some variety of NoSQL database. What exactly is unstable about your xml setup?
If you have multiple concurrent processes accessing the xml file, you will have to load it quite often to ensure it remains synchronized. If this is a multiuser situation, xml files may not be feasible past a very very small scale. This is the problem database systems solve fairly effectively.

Try SQL CE or SQLite.

db4o
One solution would be to use an object database like db4o. It has an extremely small footprint, is fast as hell, and you can add properties to your persisted objects without needing to make schema changes. Also, you don't have to write any SQL.
Storing objects is as easy as:
using (IObjectContainer db = Db4oEmbedded.OpenFile(YapFileName))
{
    Pilot pilot1 = new Pilot("Michael Schumacher", 100);
    db.Store(pilot1);
}
XML in Database
Another way to do it is to use something like SQLite or SQL CE (as mentioned by other posters) in conjunction with XML data.
Data Contract Serializer
If you're not already using the DataContractSerializer / DataContracts to generate / load your xml files, it's worth considering. It's the same robust framework that you're already using for WCF. It handles versioning pretty well. You could use this to deal with xml files on disk, or use it with a database.
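As a sketch of how that could look for XML files on disk: the ServiceSettings type and its members below are hypothetical, invented purely for illustration. The point is that a member added later (Description here) can be marked as not required, so files written before it existed should still load.

```csharp
using System.IO;
using System.Runtime.Serialization;

// Hypothetical settings type; the names are illustrative only.
[DataContract(Name = "ServiceSettings", Namespace = "")]
public class ServiceSettings
{
    [DataMember(Order = 1)]
    public string Name { get; set; }

    [DataMember(Order = 2)]
    public int Port { get; set; }

    // Imagine this was added in a later version: older files that lack the
    // element still load, and the property is left at its default (null).
    [DataMember(Order = 3, IsRequired = false)]
    public string Description { get; set; }
}

public static class SettingsStore
{
    public static void Save(ServiceSettings settings, string path)
    {
        var serializer = new DataContractSerializer(typeof(ServiceSettings));
        using (var stream = File.Create(path))
        {
            serializer.WriteObject(stream, settings);
        }
    }

    public static ServiceSettings Load(string path)
    {
        var serializer = new DataContractSerializer(typeof(ServiceSettings));
        using (var stream = File.OpenRead(path))
        {
            return (ServiceSettings)serializer.ReadObject(stream);
        }
    }
}
```

The resulting files remain plain XML, so they can still be opened and edited in Notepad when issues arise.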

Any ORMs that work with MS-Access (for prototyping)?

I'm in the early stages of a project, and it's not clear yet whether we'll need a "real" database (i.e. SQL Server et al). So I've been doing some prototyping using MS-Access, which is working fine so far. (developing in C#/VS2008/.Net 3.5/MS-Access 2000).
However, the object-relational impedance mismatch is already becoming annoying, and will only get worse as the project evolves.
I have not been able to find an ORM that will work with MS-Access. Any suggestions?
Edit - Follow Up
We ended up using Fluent NHibernate, mainly because it Automaps our object model to a relational database, which has been a huge win for us. Most of the FNH code samples we found used SQLite, and this worked so well that we intend to use it for our production database. (The app is a desktop scientific data collection and analysis package).

MSAccess files can be set up as an ODBC source on Windows machines. Almost any ORM will allow you to use ODBC. Here is a quick tutorial on how to set that up; it's outlined for Win2k, but the process is the same for XP+. You also need to have MDAC installed on your box.
NHibernate seems to have native support for MSAccess as well, see here. I've never used it though. It also has an ODBC driver. Many others support ODBC as well.
And again, as others are saying, MSAccess does not scale, period. Installing a real database server is fairly easy, so I'd recommend SQL Server Express as others have, or even MySQL or PostgreSQL, whichever is easier to set up.
If this is an application that you intend to deploy to clients, with each client having their own unique database, I would recommend another solution entirely, SQLite. SQLite gives you database power on an app by app basis. If you have a central database server, one of the previously mentioned solutions would be best.

There's only one scenario when choosing the Access Database Engine is a good choice: when building a self-contained Access application using Access Forms (though choosing to use Access in the first place is a questionable choice ;)
The database engine that VS2008 plays nicest with is SQL Server and you will have no problem finding an ORM that plays nice with SQL Server.

Can't give you an answer to your question, but instead of Access you might want to consider one of the following options:
SQL Server Express: is free and compatible with the full SQL Server
SQL Server Compact: also free, does not require any deployment/installation, does not support all features (e.g. no stored procedures).

At this stage, if you are unsure whether you need a "real" database or not, I'd skip MS Access and go straight to sql server express. It's free and still allows you to do everything you need to.
Plus, if you later decide you need to scale up, then you can without any pain.

I recommend using something like Microsoft SQL Server or PostgreSQL for prototyping. If you don't want to learn a specific SQL syntax and install special tools for designing the database schema, you can use an ORM that automatically generates the database schema from your persistent class declarations. In any case, this approach is very effective for prototyping.

LLBLGen works with Access

Access is just a bad, bad idea. I believe MS only includes Access in Office to keep legacy users happy.
Even if you find an ORM that will work with an Access database, with few exceptions you're locking yourself into a niche tool that likely will not work out-of-the box with a real database engine. If you decide to switch to a real database engine later on, you'll not only have to deal with migrating the database, but switching to a different ORM.
See this comparison between SQL Server Express and SQL Server Compact. The comparison document also mentions some problems with other data stores, including Access.
If you are REALLY concerned about being able to install SQL Server Express, consider SQL Server Compact:
it can be linked into your redistributable app. No need to install a service (which may require admin rights during install of your application); everything is taken care of when you install your app. This makes the most sense if you need the data to reside on the user's machine instead of a server, and is most analogous to using Access.
It's less powerful than Express (doesn't support views, triggers, stored procedures, which I consider a requirement)
Can be scaled up to Express or other SQL Server versions very easily
Suitable for small-footprint installs like tablets, mobile devices, etc.
Always keep scalability in mind when designing any application. You don't want to wind up having to write a PHP->C++ compiler if/when your app becomes successful just because you picked the wrong tool up front.
While we're at it:
The big issue with Access (or, in this case, the Jet engine, which is the part you'd really be using when integrating an Access database with a .NET app) is that there is no "server" that handles database requests. The engine, hosted in your app, must read and write directly to a file on disk that contains the database. Whenever this happens, the file must be locked to prevent concurrent writes. Dirty reads become more common as the number of users grows, as does the potential for database corruption.
Imagine having every customer at a large restaurant trying to simultaneously enter the kitchen to write down their orders or retrieve their food. Chaos would result. There'd be a lot of broken dishes, the kitchen would be a mess, and you'd be lucky to get what you ordered in any sort of edible condition. With one customer, this probably works fine. With 5, eh, maybe. With 20, 50, 1000? Not so much.
So, the restaurant industry introduced waiters and managers that buffer IO to the kitchen. The database server application does something roughly analogous to this by restricting access to the files on disk. Everyone gets what they want, faster and in a much more reliable way, and the data store is protected.

Pros and cons of the Access database engine. Life after SQLite

I asked a question a while ago about which local DB was right for my situation. I needed to access the DB from both .NET code and VB6. The overwhelming response was SQLite. However, I decided to pass on SQLite, because the only OLE DB provider for it charges royalties for every deployed copy of my software. It also requires an activation procedure to be run on every single PC.
After evaluating other options (SQL Server Compact edition - barely functional OLE DB provider, Firebird - don't want to have to pay for another driver, etc...), I've come to conclusion that the only viable choice is using .MDB files created by Microsoft Access (or the Jet engine).
I haven't used it since late 90s, so I have the following questions to those who have experience with it.
Have they resolved the problem where the database would corrupt every now and then?
Is access to the MDB from C# accomplished via the ADO.NET OLEDB Provider, or is there a native solution (I can't seem to find it)?
Is there a viable alternative to the really crappy SQL Editor in Access?
Thanks.

Rather than going "back" to Access, I'd stick with SQLite and use the System.Data.SQLite provider for SQLite data access within the .NET code.
Then I'd just create a simple COM interop .NET class for use by VB6 that wraps any required SQLite data access functionality. Finally, just reference and use it like a standard COM object from your VB6 projects.
My knowledge of Access is probably a bit dated and biased by bad experiences, but within reason I would try most other options before resorting to the Access route.

Have you considered SQL Server 2008 Express Edition (as opposed to SQL Server CE)?
1) Personally, I found that most times Access DBs got corrupted it was due to code that didn't clean up after itself, or there was a faulty network card involved.
2)
// Requires: using System.Data.OleDb;
string connectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;" +
    @"Data Source=C:\data\northwind.mdb;" +
    @"User Id=guest;Password=abc123";
using (OleDbConnection oleDbConnection = new OleDbConnection())
{
    oleDbConnection.ConnectionString = connectionString;
    oleDbConnection.Open();
    ...
}
3) SQL Server 2008 Express Edition

MDB corruption is largely due to failures that occur in client machines, file servers, and networks while the database is open. If you put the MDB on a file share this is always a risk; if it is on a local hard drive and used by one user, the problems are much rarer.
I would not expect SQLite to be any different, and if anything worse.
Periodically running JetComp.exe (a Microsoft download) will fix many problems and compact index tables and such. Backups are important no matter what you use.
You don't need MS Access at all to use Jet MDBs. There are some 3rd party tools for designing the database schema and doing interactive queries, both command line and GUI.

Since the MDB format is more or less deprecated, your late 90s knowledge is quite up to date. See this MSDN page

You could also try SQL Anywhere. It runs on various operating systems and has a small footprint. Works for me :)

AngryHacker asked:
Q1. Have they resolved the problem where the database would corrupt every now and then?
Er, what?
There was never any corruption problem in properly engineered apps properly deployed in properly maintained environments. I haven't seen a corrupted MDB in 3 or 4 years, and I have dozens of my apps in full-time production use by many clients in many different types of operating environments.
I think that most people who experience corruption are those who try to share an MDB file among many users (whether split or unsplit). Since you're not contemplating using Access, that's not really an issue.
Q2. Is access to the MDB from C# accomplished via the ADO.NET OLEDB Provider, or is there a native solution (I can't seem to find it)?
The native solution would be DAO, but that's COM, so you might not want to use that. From C#, I'd say OLEDB is your best bet, but that's not my area of expertise so take it with a grain of salt. I believe that Michael Kaplan reported that the Jet ADO/OLEDB provider was thread-safe, while DAO is not. This doesn't mean he recommended ADO/OLEDB over DAO, though, but his comments also came in an Access context, and not C#.
Q3. Is there a viable alternative to the really crappy SQL Editor in Access?
Why would you be using that when you're not actually using Access? You could use any SQL editor you like as long as you test that the SQL you write is compatible with Jet's SQL dialect.
I, for one, don't see what the issue is with Access's SQL editor (other than the inability to set the font size), but then, I write a lot of my SQL using the QBE and don't ever even look at the SQL view.

To answer your question regarding the really crappy SQL editor in Access - I wholeheartedly agree. The font stinks, MSAccess always badly reformats the query, it sometimes adds in metacharacters that break my SQL, and lastly but worstly, if it can't parse the SQL, it won't let you have access to it!
My solution is to use external code. I use DAO to instantiate MSAccess and can then directly edit the queries using the QueryDefs collection. It lets you do most things - create, rename, edit, etc. There are a couple of things you cannot do this way though - for example, you do not have access to the query metadata (description, hidden, etc).
External code is also great because you can build a suite of test cases, specifying expected return values, etc.
