What's better: DataSet or DataReader? - c#

I just saw this topic: Datatable vs Dataset
but it didn't answer my question. Let me explain: I was connecting to a database and needed to show the results in a GridView. (I used Recordset when I worked with VB6 a while ago, and DataSet is pretty similar to it, so DataSet was much easier to use.)
Then someone told me DataSet wasn't the best way to do it.
So, should I learn DataReader or keep using DataSet? Or DataTable?
What are the pros and cons?

That is essentially: "which is better: a bucket or a hose?"
A DataSet is the bucket here; it allows you to carry around a disconnected set of data and work with it - but you will incur the cost of carrying the bucket (so best to keep it to a size you are comfortable with).
A data-reader is the hose: it provides one-way/once-only access to data as it flies past you; you don't have to carry all of the available water at once, but it needs to be connected to the tap/database.
And in the same way that you can fill a bucket with a hose, you can fill the DataSet with the data-reader.
The point I'm trying to make is that they do different things...
I don't personally use DataSet very often - but some people love them. I do, however, make use of data-readers for BLOB access etc.
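As a minimal sketch of "filling the bucket with the hose": the example below uses DataTable.CreateDataReader() as a stand-in for a real database reader, so it runs without any database. The table, column names and the FillFrom helper are made up for illustration; with a real provider the reader would typically come from a command's ExecuteReader() instead.

```csharp
using System;
using System.Data;

static class BucketAndHose
{
    // Hypothetical in-memory table standing in for a database result.
    public static DataTable BuildSource()
    {
        var source = new DataTable("Customers");
        source.Columns.Add("Id", typeof(int));
        source.Columns.Add("Name", typeof(string));
        source.Rows.Add(1, "Alice");
        source.Rows.Add(2, "Bob");
        return source;
    }

    // Drains any IDataReader (the hose) into a DataTable (the bucket).
    public static DataTable FillFrom(IDataReader reader)
    {
        var bucket = new DataTable("Copy");
        bucket.Load(reader); // reads every remaining row, forward-only
        return bucket;
    }

    static void Main()
    {
        using (IDataReader hose = BuildSource().CreateDataReader())
        {
            DataTable bucket = FillFrom(hose);
            // The bucket is now disconnected and can be passed around freely.
            Console.WriteLine(bucket.Rows.Count);
        }
    }
}
```

Once Load returns, the reader is no longer needed; everything that follows works on the in-memory copy.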

It depends on your needs. One of the most important differences is that a DataReader will retain an open connection to your database until you're done with it while a DataSet will be an in-memory object. If you bind a control to a DataReader then it's still open. In addition, a DataReader is a forward only approach to reading data that can't be manipulated. With a DataSet you can move back and forth and manipulate the data as you see fit.
Some additional features: DataSets can be serialized and represented in XML and, therefore, easily passed around to other tiers. DataReaders can't be serialized.
On the other hand if you have a large amount of rows to read from the database that you hand off to some process for a business rule a DataReader may make more sense rather than loading a DataSet with all the rows, taking up memory and possibly affecting scalability.
Here's a link that's a little dated but still useful: Contrasting the ADO.NET DataReader and DataSet.

Further to Marc's point: you can use a DataSet with no database at all.
You can fill it from an XML file, or just from a program. Fill it with rows from one database, then turn around and write it out to a different database.
A DataSet is a totally in-memory representation of a relational schema. Whether or not you ever use it with an actual relational database is up to you.
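To illustrate, here is a small sketch of a DataSet that never touches a database: it is built in code, written out as XML (schema included), and read back into a fresh DataSet. The "Shop" and "Products" names and the helper methods are assumptions chosen for the example.

```csharp
using System;
using System.Data;
using System.IO;

static class DataSetWithoutDatabase
{
    // Builds a DataSet purely from code; the names are illustrative.
    public static DataSet Build()
    {
        var ds = new DataSet("Shop");
        DataTable products = ds.Tables.Add("Products");
        products.Columns.Add("Id", typeof(int));
        products.Columns.Add("Name", typeof(string));
        products.Rows.Add(1, "Widget");
        products.Rows.Add(2, "Gadget");
        return ds;
    }

    // Serializes the data together with its schema so column types survive.
    public static string ToXml(DataSet ds)
    {
        using (var writer = new StringWriter())
        {
            ds.WriteXml(writer, XmlWriteMode.WriteSchema);
            return writer.ToString();
        }
    }

    // Rebuilds a DataSet from XML produced by ToXml.
    public static DataSet FromXml(string xml)
    {
        var ds = new DataSet();
        using (var reader = new StringReader(xml))
        {
            ds.ReadXml(reader, XmlReadMode.ReadSchema);
        }
        return ds;
    }

    static void Main()
    {
        string xml = ToXml(Build());
        DataSet copy = FromXml(xml);
        Console.WriteLine(copy.Tables["Products"].Rows.Count);
    }
}
```

The same XML string could just as well be written to a file or sent to another tier, which is the serialization advantage mentioned in the earlier answers.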

Different needs, different solutions.
As you said, dataset is most similar to VB6 Recordset. That is, pull down the data you need, pass it around, do with it what you will. Oh, and then eventually get rid of it when you're done.
Datareader is more limited, but it gives MUCH better performance when all you need is to read through the data once. For instance, if you're filling a grid yourself (i.e. pull the data, run through it, populate the grid for each row, then throw out the data), datareader is much better than dataset. On the other hand, don't even try using datareader if you have any intention of updating the data...
So, yes, learn it - but only use it when appropriate. Dataset gives you much more flexibility.

DataReader vs Dataset
1) - DataReader is designed in the connection-oriented architecture
- DataSet is designed in the disconnected architecture
2) - DataReader gives forward-only access to the data
- DataSet gives scrollable navigation to the data
3) - DataReader is read-only; we can't make changes to the data it exposes
- DataSet is updatable; we can make changes to the data it holds and send those changes back to the data source
4) - DataReader does not provide options like searching and sorting of data
- DataSet provides options like searching and sorting of data
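Points 2 and 4 can be sketched on the DataSet side with a plain in-memory DataTable. This is only an illustration with made-up data: DataTable.Select handles the searching, and a DataView gives a sorted window over the same rows. A DataReader has no equivalent; you would have to ask the database to filter and sort in the query itself.

```csharp
using System;
using System.Data;

static class SearchAndSort
{
    // Hypothetical sample data.
    public static DataTable BuildPeople()
    {
        var people = new DataTable("People");
        people.Columns.Add("Name", typeof(string));
        people.Columns.Add("Age", typeof(int));
        people.Rows.Add("Carol", 41);
        people.Rows.Add("Alice", 29);
        people.Rows.Add("Bob", 35);
        return people;
    }

    // Searching: a filter expression plus a sort order, evaluated in memory.
    public static DataRow[] OlderThan(DataTable table, int age)
    {
        return table.Select("Age > " + age, "Age DESC");
    }

    // Sorting: a DataView is a sorted window over the same rows.
    public static DataView SortedByName(DataTable table)
    {
        var view = new DataView(table);
        view.Sort = "Name ASC";
        return view;
    }

    static void Main()
    {
        DataTable people = BuildPeople();

        // Scrollable: rows can be visited in any order, as often as needed.
        Console.WriteLine(people.Rows[2]["Name"]);
        Console.WriteLine(people.Rows[0]["Name"]);

        foreach (DataRow row in OlderThan(people, 30))
            Console.WriteLine(row["Name"] + " " + row["Age"]);

        foreach (DataRowView row in SortedByName(people))
            Console.WriteLine(row["Name"]);
    }
}
```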

To answer your second question: yes, you should learn about DataReaders, if only so that you understand how to use them.
I think you're better off using DataSets in this situation, since you're doing data binding and all (I'm thinking CPU cycles vs. human effort).
As to which one will give better performance, it very much depends on your situation. For example, if you're editing the data you're binding and batching up the changes, then you will be better off with DataSets.

DataReader is used to retrieve read-only, forward-only data from a database. It reads one row at a time and only moves forward; it cannot read backward or at random. A DataReader cannot write changes back to the database. It reads the results of a single command (which can itself be a join across tables, or several result sets read one after another). As it is a connected architecture, data is available only as long as the connection exists.
DataSet is a set of in-memory tables. It is a disconnected architecture: the DataAdapter that fills it opens the connection if needed, retrieves the data into memory, and closes the connection when done. It fetches all the data at once from the data source into memory. A DataSet can hold data from multiple tables and can be read backward, forward, or at random. A DataSet can update, insert, and otherwise manipulate data.

Related

Create an Excel PivotTable from a DataTable

I have a webservice that is hosting a large mass of pricing data, and returns the data relevant to some prescribed query parameters. The data comes back as a Datatable object (in C#) - the object type itself doesn't matter so much as the fact that the data goes directly into memory and is not on a spreadsheet in the host Excel object.
Now, I want to create a pivottable off of this data.
I've been looking high and low on the web, and I can't see anyone explaining how to do this. Is it impossible? It seems foolish to suggest VSTO as the only supported way of consuming webservice data going forward, but make pivottables off of that data impossible.
The only solutions I have are kludges, and I want to make sure there isn't a graceful solution before I do one of these ugly things:
Dump datatable to excel sheet and point pivottable to excel range. This is far from ideal because I'm either doing rowwise deletion over the entire dataset (slow as heck) or peaking at 2x memory consumption.
Dump datatable to filesystem and point pivottable to flatfile. This is even worse but at least doesn't have the memory drawback.
Are these really the only ways to do this operation? There has to be something more graceful.
DataTable: http://msdn.microsoft.com/en-us/library/system.data.datatable.aspx
PivotCache: http://msdn.microsoft.com/en-us/library/microsoft.office.interop.excel.pivottable.pivotcache(v=office.11).aspx
Excel has to be able to see and access the data to make a PivotTable from it. So you have to make sure that the data is someplace that the PivotTable loader can read. Further, Excel is COM-based and can neither see nor process .NET objects.
It's pretty much just that simple.
Your choices are:
Load the data into an Excel range
Save the data to a file
Store the data into a database (Access, SQL Server, etc.)
Store the data in a data warehouse (SSAS, offline Cube, etc.)
That's it. The only other remotely possible way would be to implement the COM interfaces necessary to present as an OLE DB or an ODBC data source, but that would be one heck of a lot of work.

Is it possible to keep DataSet automatically synced with a SQLite database?

I'm trying to learn to use SQLite, but I'm very frustrated and confused. I've gotten as far as finding System.Data.SQLite, which is apparently the thing to use for SQLite in C#.
The website has no documentation whatsoever. The "original website", which is apparently obsolete from 2010 onwards, has no documentation either. I could find a few blog tutorials, but from what I can tell their method of operation is basically:
Initialize a database connection.
Feed SQL statements into the connection.
Take out stuff that comes out of the connection.
Close connection.
I don't want to write SQL statements in my C# code, they're ugly and I get no assistance from the IDE because I have to put the SQL code in strings.
Can't I just:
Create a DataSet.
Tell the DataSet that it should correspond to the SQLite database MyDB.sqlite.
Manipulate the DataSet using its member functions.
Not worry about SQLite because the DataSet automatically keeps itself in sync with the SQLite database on disc.
I know that I can fill a DataSet with the contents of a database, but if I want access to the entire database I will have to fill the DataSet with all of its contents. If my database is 1 GB, I have just used up 1 GB of RAM (not to mention the time needed to write all of it at once).
Can't I simply take a SQLite database connection and pretend it's just an ordinary DataSet (that perhaps needs to be asked occasionally if it's done syncing yet)?
The answer to the question is no.
No, you cannot simply take a SQLite connection and pretend it's just a DataSet.
If you don't want to code SQL statements then consider Entity Framework.
Using SQLite Embedded Database with Entity Framework and Linq-to-SQL
You shouldn't treat a DataSet as a database. It's just a result of a query.
You query the database to get a subset of data (you never want ALL the data from your DB) and this subset is used to populate your DataSet.
You are required to synchronize your changes manually because DataSet doesn't know which updates should be a part of which transaction. This is your system knowledge.
The DataSet is an in-memory cache and will only synchronize to the underlying data store when the developer allows it. You could put a timer wrapper around it and do it on a schedule, but you still need to keep the DataSet and data store synchronized manually.
Storing 1GB+ of data is really not recommended as the memory usage would be very high and the performance very low. You also don't want to be sending that amount of data over a network or god forbid an internet connection.
Why would you want to keep 1GB of data in memory?

If SqlDataAdapter uses a data reader internally, why do people say that using a SqlDataReader is faster?

I keep reading that SqlDataReaders are much faster than SqlDataAdapters because of their fast-forward, read-only, one-row-at-a-time connected nature, and that they are specifically faster than SqlDataAdapters when used to populate a DataTable object (SqlDataAdapter.Fill(dataTable)).
However, here and there somebody will mention "it probably won't make a difference what you use because SqlDataAdapter uses a data reader internally to fill its table." If this is true, how exactly can the adapter be so much slower if it's communicating with the database by using an internal data reader anyway?
I know I could set up some tests and profile the performance of each one, but what I'd really like is for someone to shed some light on the alleged performance discrepancies if we're essentially dealing with the same process either way.
I understand that you'd typically use a reader to create a list of strongly-typed POCOs unlike the data adapter that just fills a table. However, my question is strictly about the details of the performance difference between the two and not O/RM concerns...
If you are using a DataReader, you can react to some information when reading the first row and even disregard the rest of the reading.
If you are using a DataAdapter, you have to first load the entire table and then read the first row in order to react to that same information.
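A rough sketch of that difference, using an in-memory DataTableReader as a stand-in for a real database reader (the table, the row counts and the helper names are invented for the example). The reader loop stops as soon as it has seen what it needs, while the load-everything path always materializes every row first. How much work this actually saves against a real database depends on the provider and on how much data has already been sent.

```csharp
using System;
using System.Data;

static class EarlyExit
{
    // Hypothetical source table with ids 1..rowCount.
    public static DataTable BuildSource(int rowCount)
    {
        var source = new DataTable("Items");
        source.Columns.Add("Id", typeof(int));
        for (int id = 1; id <= rowCount; id++)
            source.Rows.Add(id);
        return source;
    }

    // Reader style: stop reading as soon as the wanted row shows up.
    // Returns how many rows had to be read.
    public static int RowsReadUntilFound(IDataReader reader, int wantedId)
    {
        int rowsRead = 0;
        while (reader.Read())
        {
            rowsRead++;
            if (reader.GetInt32(0) == wantedId)
                break;
        }
        return rowsRead;
    }

    // Adapter style: load everything first, then look at the rows.
    // Returns how many rows were materialized in memory.
    public static int RowsLoadedBeforeLooking(IDataReader reader)
    {
        var table = new DataTable();
        table.Load(reader);
        return table.Rows.Count;
    }

    static void Main()
    {
        DataTable source = BuildSource(1000);

        using (IDataReader reader = source.CreateDataReader())
            Console.WriteLine(RowsReadUntilFound(reader, 3));

        using (IDataReader reader = source.CreateDataReader())
            Console.WriteLine(RowsLoadedBeforeLooking(reader));
    }
}
```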

Save a DataSet to a database

When I load from the database I use one stored procedure which loads the DataItem and any Data associated with it. This comes back in one DataSet with two tables: the first table has one row and describes the DataItem, and each row in the other table describes a related Data item.
This DataSet is then used to populate my objects.
My problem comes when I have to save the objects back to the database. I am currently saving the DataItem and then looping through all of my Data and performing a save on each one. Completely horrible way to go about doing it, I know. It's both slow and it's not transactional.
So what I'd ideally like to do is convert my objects back into my DataSet and then save it all back to the database in one efficient transactional operation. What code do I need on the C# side to make this transactional and to allow me to pass back a DataSet? I presume this will involve using a TableAdapter, but given that I have two tables, how will this work? What do I use on the SQL side: can I use stored procedures? (I would like to avoid having SQL in my C# project.) Would I need to write something that will handle cycling through a DataTable to save each record?
What's the best way to go about doing all this? This will form the lynchpin of a project I'm working on so I want it to be as fast and efficient as it can be!
(.NET 4.0 and SQL 2005)
Did not use TableAdapter in the end as it was more effort than it was worth.
From the comments:
http://msdn.microsoft.com/en-us/library/4esb49b4.aspx

DataSet or Reader or What?

When using a Class to get one row of Data from the Database what is best to use:
A DataSet?
A Reader, and then do what: store the data in a Structure?
What else?
Thanks for your time, Nathan
A DataReader is always your best choice--provided that it is compatible with your usage. DataReaders are very fast, efficient, and lightweight--but they carry the requirement that you maintain an active/open db connection for their lifecycle, this means they can't be marshalled across AppDomains (or across webservices, etc).
DataSets are actually populated by DataReaders--they are eager-loaded (all data is populated before any is accessed) and are therefore less performant, but they have the added benefit of being serializable (they're essentially just a DTO) and that means they're easy to carry across AppDomains or webservices.
The difference is sometimes summed up by saying "DataReaders are ideal for ADO.NET ONLINE" (implying that it's fine to keep the db connection open) "whereas DataSets are ideal for ADO.NET OFFLINE" (where the consumer can't necessarily connect directly to the database).
DataAdapter (which fills a DataSet) uses a DataReader to do so.
So, DataReader is always more lightweight and easier to use than a DataAdapter. DataSets and DataTables always have a huge overhead in terms of memory usage. Makes no difference if you are fetching a single row, but makes a huge difference for bigger result sets.
If you are fetching a fixed number of items, in MS SQL Server, output variables from a stored proc (or parameterized command) usually perform best.
If you use a reader, you must have an open connection to your database. Generally a DataReader is used to fill a combo box or a DataGrid, but if you want to keep your data in memory after you close your database connection, you must use a DataTable.
Note : excuse my english level
If you just want read-only access to the data, then go with a raw DataReader; it's the fastest and most lightweight data access method.
However, if you intend to alter the data and save back to the database, then I would recommend using a DataAdapter and a DataSet (even a typed DataSet) because the DataSet class takes care of tracking changes, additions and deletions to the set which makes saves much easier. Additionally, if you have multiple tables in the dataset, you can model the referential constraints between them in the dataset.
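As a small illustration of that change tracking and of modelling a relation, here is a sketch that works entirely in memory. The "Order" and "OrderLine" tables and their columns are invented for the example, and AcceptChanges() is used to pretend the rows were just loaded from a database. A DataAdapter's Update call would use the same RowState information to decide which insert, update or delete command to run for each row.

```csharp
using System;
using System.Data;

static class ChangeTracking
{
    // Builds a two-table DataSet with a parent/child relation (illustrative schema).
    public static DataSet Build()
    {
        var ds = new DataSet("Sales");

        DataTable orders = ds.Tables.Add("Order");
        orders.Columns.Add("Id", typeof(int));

        DataTable lines = ds.Tables.Add("OrderLine");
        lines.Columns.Add("Id", typeof(int));
        lines.Columns.Add("OrderId", typeof(int));
        lines.Columns.Add("Product", typeof(string));

        // By default this also creates a foreign key constraint on OrderLine.OrderId.
        ds.Relations.Add("Order_Lines", orders.Columns["Id"], lines.Columns["OrderId"]);

        orders.Rows.Add(1);
        lines.Rows.Add(10, 1, "Widget");
        lines.Rows.Add(11, 1, "Gadget");

        ds.AcceptChanges(); // pretend this state came from the database
        return ds;
    }

    // Makes one edit, one delete and one insert on the child table.
    public static void Edit(DataSet ds)
    {
        DataTable lines = ds.Tables["OrderLine"];
        lines.Rows[0]["Product"] = "Sprocket"; // becomes Modified
        lines.Rows[1].Delete();                // becomes Deleted
        lines.Rows.Add(12, 1, "Gizmo");        // becomes Added
    }

    static void Main()
    {
        DataSet ds = Build();
        Edit(ds);

        foreach (DataRow row in ds.Tables["OrderLine"].Rows)
            Console.WriteLine(row.RowState);

        try
        {
            // No order with Id 99 exists, so the constraint rejects this row.
            ds.Tables["OrderLine"].Rows.Add(13, 99, "Orphan");
        }
        catch (InvalidConstraintException ex)
        {
            Console.WriteLine("Rejected: " + ex.Message);
        }
    }
}
```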
