Which is faster in C# XML or SQL?

Which is faster in C# XML or SQL? - c#

Which is faster in C#:to read tiny XML files or to read tiny SQL tables with small amount of data ?
I wonder if is really necessary to create a table in SQL then establish a connection just to read 10 or 11 parameters.
What would you reccomend?

It's really depends on what You need. Nothing stops You from even combining the two worlds as XML can easily be stored in SQL Server.
If You want to actually have the authentication of the SQL Server, have it backed up, versioned or whatever, You can easily design a mixed XML, SQL, table solution. If You really need some propertyBag persistence area files are ok, but they still require care ie. access control, taking care when it is not present etc (reading a file still does throw a lot of exceptions and IT does it with some good reason).
Ask Yourself questions like: do I need restricted access, how will I report changes (if any),
do I need version history,
do I read all the parameters or only part of
it?
what Do i need to do if someone
changes an entry?
what should I do when there is no entry?
does it need to be extensible (new parameters added/removed)?
should it be encrypted?
does the database layer needs to know about it?
Just some thought from the top of my head.
Luke

If you just have a handful of 'settings' that you want to read, I would definitely go with a small XML file. I can't say definitively that it would be faster, but given that you would eliminate the over head of establishing the connection, authenticating, etc it would definitely be simpler.
And if you can use LINQ to XML, its really easy to do.

Speed is not the only consideration. You don't have as much admin overhead with XML files as you would with SQL Server.
If the file is local, it will certainly be faster to read using direct file than networked SQL access. Far less between you and the data. No impact on your process from other SQL usage.

reading a lot of files is slow so if you have tons of xml files i would vote for SQL especially if we consider the fact that you have to parse the xml files as well which is way more complicated and more time consuming then making a connection to a DB especially if the DB is on the localhost :)

SQL based method: pros
Easy to migrate, configure
SQL based method: cons
Connection can be down, connection takes time to establish, DB admin will wonder why there is a tiny table that has no meaning, codebase become unnecessarily complex
File based method: pros
Fast, no overhead on DB
File based method:cons
Migration is an issue. Configuration is an issue. Can easily get corrupted.

Related

ADO and Microsoft SQL database backup and archival

I am working on re-engineering/upgrade of a tool. The database communication is in C++(unmanaged ADO) and connects to SQL server 2005.
I had a few queries regarding archiving and backup/restore techniques.
Generally archiving is different than backup/restore . can someone provide any link which explains me that .Presently the solution uses bcp tool for archival.I see lot of dependency on table names in the code. what are the things i have to consider in choosing the design(considering i have to take up the backup/archival on a button click, database size of 100mb at max)
Will moving the entire communication to .net will be of any help? considering lot of ORM tools. also all the bussiness logic and UI is in C#
What s the best method to verify the archival data ?
PS: the questionmight be too high level, but i did not get any proper link to understand this. It will be really helpful if someone can answer. I can provide more details!
Thanks in advance!

At 100 MB, I would say you should probably not spend too much time on archiving, and just use traditional backup strategies. The size of your database is so small that archiving would be quite an elaborate operation with very little gain, as the archiving process would typically only be relevant in the case of huge databases.
Generally speaking, a backup in database terms is a way to provide recoverability in case of a disaster (accidental data deletion, server crash, etc). Archiving mostly means you partition your data.
A possible goal with archiving is to keep specific data available for querying, but without the ability to alter it. When dealing with high volume databases, this is an excellent way to increase performance, as read-only data can be indexed much more densely than "hot" data. It also allows you to move the read-only data to an isolated RAID partition that is optimized for READ operations, and will not have to bother with the typical RDBMS IO. Also, by removing the non-active data from the regular database means the size of the data contained in your tables will decrease, which should boost performance of the overall system.
Archiving is typically done for legal reasons. The data in question might not be important for the business anymore, but the IRS or banking rules require it to be available for a certain amount of time.
Using SQL Server, you can archive your data using partitioning strategies. This normally involves figuring out the criteria based on which you will split the data. An example of this could be a date (i.e. data older than 3 years will be moved to the archive-part of the database). In case of huge systems, it might also make sense to split data based on geographical criteria (I.e. Americas on one server, Europe on another).
To answer your questions:
1) See the explanation written above
2) It really depends on what the goal of upgrading is. Moving it to .NET will get the code to be managed, but how important is that for the business?
3) If you do decide to partition, verifying it works could include issuing a query on the original database for data that contains both values before and after the threshold you will be using for partitioning, then splitting the data, and re-issuing the query afterwards to verify it still returns the same record-set. If you configure the system to use an automatic sliding window, you could also keep an eye on the system to ensure that data will automatically be moved to the archive partition.
Again, if the 100MB is not a typo, I would think your database is too small to really benefit from archiving. If your goal is to speed things up, put the system on a server that is able to load the whole database into RAM, or use SSD drives.
If you need to establish a data archive for legal or administrative reasons, give horizontal table partitioning a look. It's a pretty straight-forward process that is mostly handled by SQL Server automatically.
Hope this helps you out!

SQL Database VS. Multiple Flat Files (Thousands of small CSV's)

We are designing an update to a current system (C++\CLI and C#).
The system will gather small (~1Mb) amounts of data from ~10K devices (in the near future). Currently, they are used to save device data in a CSV (a table) and store all these in a wide folder structure.
Data is only inserted (create / append to a file, create folder) never updated / removed.
Data processing is done by reading many CSV's to an external program (like Matlab). Mainly be used for statistical analysis.
There is an option to start saving this data to an MS-SQL database.
Process time (reading the CSV's to external program) could be up to a few minutes.
How should we choose which method to use?
Does one of the methods take significantly more storage than the other?
Roughly, when does reading the raw data from a database becomes quicker than reading the CSV's? (10 files, 100 files? ...)
I'd appreciate your answers, Pros and Cons are welcome.
Thank you for your time.

Well if you are using data in one CSV to get data in another CSV I would guess that SQL Server is going to be faster than whatever you have come up with. I suspect SQL Server would be faster in most cases, but I can't say for sure. Microsoft has put a lot of resources into make a DBMS that does exactly what you are trying to do.
Based on your description it sounds like you have almost created your own DBMS based on table data and folder structure. I suspect that if you switched to using SQL Server you would probably find a number of areas where things are faster and easier.
Possible Pros:
Faster access
Easier to manage
Easier to expand should you need to
Easier to enforce data integrity
Easier to design more complex relationships
Possible Cons:
You would have to rewrite your existing code to use SQL Server instead of your current system
You may have to pay for SQL Server, you would have to check to see if you can use Express
Good luck!

I'd like to try hitting those questions a bit out of order.
Roughly, when does reading the raw data from a database becomes
quicker than reading the CSV's? (10 files, 100 files? ...)
Immediately. The database is optimized (assuming you've done your homework) to read data out at incredible rates.
Does one of the methods take significantly more storage than the
other?
Until you're up in the tens of thousands of files, it probably won't make too much of a difference. Space is cheap, right? However, once you get into the big leagues, you'll notice that the DB is taking up much, much less space.
How should we choose which method to use?
Great question. Everything in the database always comes back to scalability. If you had only a single CSV file to read, you'd be good to go. No DB required. Even dozens, no problem.
It looks like you could end up in a position where you scale up to levels where you'll definitely want the DB engine behind your data pretty quickly. When in doubt, creating a database is the safe bet, since you'll still be able to query that 100 GB worth of data in a second.

This is a question many of our customers have where I work. Unless you need flat files for an existing infrastructure, or you just don't think you can figure out SQL Server, or if you will only have a few files with small amounts of data to manage, you will be better off with SQL Server.

If you have the option to use a ms-sql database, I would do that.
Maintaining data in a wide folder structure is never a good idea. Reading your data would involve reading several files. These could be stored anywhere on your disk. Your file-io time would be quite high. SQL server being a production database has these problems already taken care of.
You are reinventing the wheel here. This is how foxpro manages data, one file per table. It is usually a good idea to use proven technology unless you are actually making a database server.
I do not have any test statistics here, but reading several files will almost always be slower than a database if you are dealing with any significant amount of data. Given your about 10k devices, you should consider using a standard database.

Programmatically saving a SQL Server database to xml files and restoring it again

I want to save a whole MS SQL 2008 Database into XML files... using asp.net.
Now I am bit lost here.. what would be the best method to achieve this? Datasets?
And I need to restore the database later again.. using these XML files. I am thinking about using datasets for reading the tables and writing to xml and using the SQLBulkCopy class to restore the database again. But I am not sure whether this would be the right approach..
Any clues and tips for me?

If you will need to restore it on the same server type (I mean SQL Server 2008 or higher) and don't care about ability to see actual data inside the XML do the following:
Programmatically backup the DB using "BACKUP DATABASE" T-SQL
Compress the backup
Convert the backup to Base64
Place the backup as the content of the XML file (like: <database name="..." compressionmethod="..." compressionlevel="...">the Base64 content here</database>
On the server where you need to restore it, download the XML, extract the Base64 content, use the attributes to know what compression was used. Decompress and restore using T-SQL "RESTORE" command.
Would that approach work?
For sure, if you need to see the content of the database, you would need to develop the XML scheme, go through each table etc. But, you won't have SPs/Views and other items backed up.

Because you are talking about a CMS, I'm going to assume you are deploying into hosted environments where you might not have command line access.
Now, before I give you the link I want to state that this is a BAD idea. XML is way too verbose to transfer large amounts of data. Further, although it is relatively easy to pull data out, putting it back in will be difficult and a very time consuming development project in itself.
Next alert: as Denis suggested, you are going to miss all of your stored procedures, functions, etc. Your best bet is to use the normal sql server backup / restore process. (Incidentally, I upvoted his answer).
Finally, the last time I dealt with XML and SQL Server we noticed interesting issues that cropped up when data exceeded a 64KB boundary. Basically, at 63.5KB, the queries ran very quickly (200ms). At 64KB, the query times jumped to over a minute and sometimes quite a bit longer. We didn't bother testing anything over 100KB as that was taking 5 minutes on a fast/dedicated server with zero load.
http://msdn.microsoft.com/en-us/library/ms188273.aspx
See this for putting it back in:
How to insert FOR AUTO XML result into table?
For kicks, here is a link talking about pulling the data out as json objects: http://weblogs.asp.net/thiagosantos/archive/2008/11/17/get-json-from-sql-server.aspx
you should also read (not for the faint of heart): http://www.simple-talk.com/sql/t-sql-programming/consuming-json-strings-in-sql-server/
Of course, the commentors all recommend building something using a CLR approach, but that's probably not available to you in a shared database hosting environment.
At the end of the day, if you are truly insistent on this madness, you might be better served by simply iterating through your table list and exporting all the data to standard CSV files. Then, iterating the CSV files to load the data back in ala C# - is there a way to stream a csv file into database?
Bear in mind that ALL of the above methods suffer from
long processing times due to the data overhead; which leads to
a high potential for failure due to the various time outs (page processing, command, connection, etc); and,
if your data model changes between the time it was exported and reimported then you're back to writing custom translation code and ultimately screwed anyway.
So, only do this if you really really have to and are at least somewhat of a masochist at heart. If the purpose is simply to transfer some data from one installation to another, you might consider using one of the tools like SQL Compare and SQL Data Compare from RedGate to handle the transfer.
I don't care how much (or little) you make, the $1500 investment in their developer bundle is much cheaper than the months of time you are going to spend doing this, fixing it, redoing it, fixing it again, etc. (for the record I do NOT work for them. Their products are just top notch.)

Red Gate's SQL Packager lets you package a database into an exe or to a VS project, so you might want to take a look at that. You can specify which tables you want to consider for data.
Is there any specific reason you want to do this using xml?

How to store my data (C#.net)

I'm having a bit of a problem deciding how to store some data. To see it from a simple perspective, it will be a simple table of data but there will be many tables. There will be about 7 columns in each table, but again there will be a lot of tables (and they will be created at runtime, whenever the customer wants a clean grid)
The data has to be stored locally in a file (and there will not be multiple instances of the software running).
I'm using C# 4.0 and I have been looking at using XML files(one file per table, or storing multiple tables in a file), sqlite, sql server CE, access etc. I will be happy if someone here has some comments or suggestions on how to do/not to do. Stability and reliability(e.g. no trashed databases because of unstable third party software) is probably my biggest concern.

If you are looking to store the data locally in a file, I would recommend the sqlite option since it seems your data is created in the form of a database table already. Sqlite is already built to handle multiple tables and columns so it means less mental overhead for you, the developer.
http://web.archive.org/web/20100208133236/http://www.mikeduncan.com/sqlite-on-dotnet-in-3-mins/ is a decent tutorial to give a quick overview on how to set it up and get going.
As for what NOT to do: don't try to make your own scheme to save the data to a file, it's a well understood problem that has been solved many times over, why re-invent the wheel?

XML wont be a good choice if you are planning to make several queries, since loading text files may be painful when they grow (talking about files over 1mb). If you plan to mantain the data low, the xml would be good to keep it simple. I still won't use it, but if you have a background, then the benefits will be heavier than the learning curve.
If you have no expertise in any of them, and the data is light my suggestion is SQLite, I beleive is the best lightweight DB for .Net and the prvider is very good. you can find it easily on Google.
I would tell you that Access is not recommendable, but this is a personal oppinion. Many people use it and I think is for some reason. So you should check it out and try it.
Again, my final recommendation is SQLite, unless you know very well another one, in which case you'll have to think how much your data is going to grow. If you plan to have a DB around 100mb, any of them, except xml would do; If you think it'll grow bigger than that, consider SQLite heavily

.Net Data Handling Suggestions

I am just beginning to write an application. Part of what it needs to do is to run queries on a database of nutritional information. What I have is the USDA's SR21 Datasets in the form of flat delimited ASCII files.
What I need is advice. I am looking for the best way to import this data into the app and have it easily and quickly queryable at run time. I'll be using it for all the standard things. Populating controls dynamically, Datagrids, calculations, etc. I will also need to do user specific persistent data storage as well. This will not be a commercial app, so hopefully that opens up the possibilities. I am fine with .Net Framework 3.5 so Linq is a possibility when accessing the data (just don't know if it would be the best solution or not). So, what are some suggestions for persistent storage in this scenario? What sort of gotchas should I be watching for? Links to examples are always appreciated of course.

It looks pretty small, so I'd work out an appropriate object model, load the whole lot into memory, and then use LINQ to Objects.
I'm not quite sure what you're asking about in terms of "persistent storage" - aren't you just reading the data? Don't you already have that in the text files? I'm not sure why you'd want to introduce anything else.

I would import the flat files into SQL Server and access via standard ADO.NET functionality. Not only is DB access always better (more robust and powerful) than file I/O as far as data querying and manipulation goes, but you can also take advantage of SQL Server's caching capabilities, especially since this nutritional data won't be changing too often.
If you need to download updated flat files periodically, then look into developing a service that polls for these files and imports into SQL Server automatically.
EDIT: I refer to SQL Server, but feel free to use any DBMS.

My temptation would be to import the data into SQL Server (Express if you aren't looking to deploy the app) as it's a familiar source for me. Alternatively you can probably create an ODBC data source using the text file handler to get you a database-like connection.

I agree that you would benefit from a database, especially for rapid querying, and even more so if you are saving user changes to the data. In order to load the flat file data into a SQL Server (including Express), you can use SSIS.

Use Linq or text data to list method
1.create a list.
2.Read the text file line by line (or all lines).
3.process the line - get required data and attach to the list.
4.process the list for any further use.
the persistence storage will be files and List is volatile.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.