DataTable vs DataSet - C#

I currently use a DataTable to get results from a database which I can use in my code.
However, many examples on the web use a DataSet instead and access the table(s) through its Tables collection.
Is there any advantage, performance-wise or otherwise, of using DataSets or DataTables as a storage method for SQL results?

It really depends on the sort of data you're bringing back. Since a DataSet is (in effect) just a collection of DataTable objects, you can return multiple distinct sets of data into a single, and therefore more manageable, object.
Performance-wise, you're more likely to get inefficiency from unoptimized queries than from the "wrong" choice of .NET construct. At least, that's been my experience.

One major difference is that DataSets can hold multiple tables and you can define relationships between those tables.
If you are only returning a single result set, though, I would think a DataTable would be more optimized. There has to be some overhead (granted, small) for a DataSet to offer its extra functionality and keep track of multiple DataTables.

In 1.x there used to be things DataTables couldn't do which DataSets could (I don't remember exactly what). All that was changed in 2.x. My guess is that's why a lot of examples still use DataSets. DataTables should be quicker as they are more lightweight. If you're only pulling a single result set, a DataTable is your best choice between the two.

One feature of the DataSet is that if your stored procedure runs multiple SELECT statements, the DataSet will have one DataTable for each result set.

There are some optimizations you can use when filling a DataTable, such as calling BeginLoadData(), inserting the data, then calling EndLoadData(). This turns off some internal behavior within the DataTable, such as index maintenance, etc. See this article for further details.
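As a rough illustration of the BeginLoadData()/EndLoadData() pattern described above, here is a minimal sketch. The table name, the two columns (Id, Name) and the generated values are made up for the example; it is not a benchmark and makes no claim about how much time is saved.

```csharp
using System;
using System.Data;

public static class BulkLoadSketch
{
    // Builds a hypothetical two-column table and fills it inside a
    // BeginLoadData/EndLoadData pair.
    public static DataTable BuildTable(int rowCount)
    {
        var table = new DataTable("Items");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));

        // Turns off notifications, index maintenance and constraint checks.
        table.BeginLoadData();
        try
        {
            for (int i = 0; i < rowCount; i++)
            {
                // The second argument (true) accepts each row as it is loaded,
                // so the rows end up in the Unchanged state.
                table.LoadDataRow(new object[] { i, "Item " + i }, true);
            }
        }
        finally
        {
            // Turns the suspended behavior back on.
            table.EndLoadData();
        }
        return table;
    }
}
```

Calling BulkLoadSketch.BuildTable(1000) returns a table with 1000 rows. The try/finally is there so that EndLoadData() still runs if a row fails to load.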

When you are only dealing with a single table anyway, the biggest practical difference I have found is that DataSet has a "HasChanges" method but DataTable does not. Both have a "GetChanges" however, so you can use that and test for null.
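The null test mentioned above could look something like this. The helper name TableHasChanges is invented for the example.

```csharp
using System.Data;

public static class ChangeCheckSketch
{
    // DataTable has no HasChanges() method, but GetChanges() returns null
    // when no rows have been added, modified or deleted since the last
    // AcceptChanges() call.
    public static bool TableHasChanges(DataTable table)
    {
        return table.GetChanges() != null;
    }
}
```

Note that GetChanges() builds a copy of the changed rows, so this check does more work than DataSet.HasChanges() when there are many pending changes.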

A DataTable object represents tabular data as an in-memory, tabular cache of rows, columns, and constraints.
The DataSet consists of a collection of DataTable objects that you can relate to each other with DataRelation objects.
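To make the DataRelation part concrete, here is a small sketch with two hypothetical tables (Customers and Orders) related on a CustomerId column. All names and sample rows are assumptions for illustration.

```csharp
using System.Data;

public static class RelationSketch
{
    // Builds a DataSet holding two related tables.
    public static DataSet BuildCustomersAndOrders()
    {
        var customers = new DataTable("Customers");
        customers.Columns.Add("CustomerId", typeof(int));
        customers.Columns.Add("Name", typeof(string));
        customers.Rows.Add(1, "Alice");
        customers.Rows.Add(2, "Bob");

        var orders = new DataTable("Orders");
        orders.Columns.Add("OrderId", typeof(int));
        orders.Columns.Add("CustomerId", typeof(int));
        orders.Rows.Add(100, 1);
        orders.Rows.Add(101, 1);
        orders.Rows.Add(102, 2);

        var ds = new DataSet("Shop");
        ds.Tables.Add(customers);
        ds.Tables.Add(orders);

        // Parent column first, child column second.
        ds.Relations.Add(
            "CustomerOrders",
            customers.Columns["CustomerId"],
            orders.Columns["CustomerId"]);
        return ds;
    }

    // Counts the child Orders rows for one customer by following the relation.
    public static int CountOrders(DataSet ds, int customerId)
    {
        DataRow[] match = ds.Tables["Customers"].Select("CustomerId = " + customerId);
        return match.Length == 0 ? 0 : match[0].GetChildRows("CustomerOrders").Length;
    }
}
```

This kind of parent-to-child navigation is what a standalone DataTable cannot give you.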

Related

Best way to query a typed DataSet

I have to optimize my code. I am using a typed DataSet. What is the best method for querying a typed DataSet?
LINQ, or something else?
It depends on what entity you want to get at the end of the query.
If you want to get types created on the fly, use LINQ queries.
If you just want a code analog of SQL statements, use the methods of the DataSet, DataTable and so on.
What do you define as best?
If you mean best = flexible, I would use DataViews on DataTables, where you can set a filter (similar to SQL WHERE) and a sort (similar to SQL ORDER BY). These values are simple strings that can be stored in settings files.
However, if performance is an issue for you, then the database should do the filtering and sorting for you, independent of DataSets.
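A DataView with a string filter and sort might be set up roughly as follows. The helper name and the example filter and sort strings are assumptions; the strings could just as well come from a settings file.

```csharp
using System.Data;

public static class FilterSortSketch
{
    // Creates a filtered, sorted view over a table. Passing the filter and
    // sort to the constructor builds the view's index once, instead of once
    // at construction and again when the properties are set afterwards.
    public static DataView FilterAndSort(DataTable table, string filter, string sort)
    {
        return new DataView(
            table,
            filter,                        // e.g. "Price > 10"
            sort,                          // e.g. "Name DESC"
            DataViewRowState.CurrentRows);
    }
}
```

The view stays attached to the table, so rows added later that match the filter show up in it without re-querying.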
If you think about performance then take a look at this comparison
http://www.devtoolshed.com/content/performance-benchmarks-linq-vs-sqldatareader-dataset-linq-compiled-queries-part-2

Should I use dataset or datatable?

If I need to fetch one whole column from Table1 in the DB, should I fetch it using a DataTable or a DataSet? I can do it both ways. OK, so I should use a DataTable. Why is that? What would happen if I used a DataSet?
OK, that's what I wanted to know. So there's a memory issue. Now I am confused. Whatever I use, DataTable or DataSet, both will fetch only ONE column from my table in the DB. How is a DataSet going to use more memory, then?
Use a DataTable.
A DataSet is an in-memory database while DataTable is an in-memory table.
DataSets are more complicated and heavier-weight; they can contain multiple DataTables and relations between DataTables.
You are better off using a DataTable (it uses less memory).
Or you can try user-created value objects or DTOs.
To answer your edited question, there's more overhead to a dataset. DataTables are better for what you need. If you're doing a lot of data fetching, though, it's easier (and way more maintainable!) to use an ORM of sorts.

which Data object should i use

I have a query that returns only one row (always), and I want to convert this row to a class object (let's say obi).
I have a feeling that using a DataTable for this kind of query is too much,
but I don't really know which other data object to use.
A DataReader?
Is there a way to execute a SQL command into a DataRow?
DataReader is the best choice here - DataAdapters and DataSets may be overkill for a single row, although, that said, if performance is not critical then keeping-it-simple isn't a bad thing. You don't need to go from DataReader -> DataRow -> your object, just read the values off of the DataReader and you're done.
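Reading the values straight off the reader could be sketched like this. The Customer class and the Id and Name columns are hypothetical stand-ins for whatever the real query returns.

```csharp
using System;
using System.Data;

// Hypothetical target class for the single row.
public sealed class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class SingleRowSketch
{
    // Maps the first row of the reader to a Customer, or returns null when
    // the query produced no rows. Assumes columns named "Id" and "Name".
    public static Customer ReadSingle(IDataReader reader)
    {
        if (!reader.Read())
        {
            return null;
        }
        return new Customer
        {
            Id = reader.GetInt32(reader.GetOrdinal("Id")),
            Name = reader.GetString(reader.GetOrdinal("Name"))
        };
    }
}
```

Against a real database you would pass in the result of command.ExecuteReader(CommandBehavior.SingleRow) inside a using block. To try the mapping without a database, DataTable.CreateDataReader() gives you a reader over an in-memory table.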
A datareader lets you query individual fields. If you want the row as a single object, I believe the DataTable/DataRowView family of objects is in fact the way to go.
You might seriously consider taking a look at Linq-to-Sql or Linq-to-Entities.
The appeal of these frameworks is they provide automatic serialization of your database data into objects, abstract away many of the mundane details of connection management, and have better compile-time support by providing strongly-typed properties which you can use without string keys or column ordinals.
When using Linq, the difference between retrieving a single row vs. retrieving multiple rows often only involves appending .Single() or .First() to your query.
At any rate, if you already use or are willing to learn one of these frameworks, you may see the bulk and difficulty of data access code reduce substantially.
With respect to DataReader vs. DataSet/DataTable, it is correct that it takes more cycles to allocate and populate a data table; however, I highly doubt you will notice the difference unless creating an extremely high volume of database calls.
In case it is helpful, here are documentation examples of data access using data readers and data sets.
DataReader
DataSet

Do ADO.Net DataTables have indexes?

I am using VSTS 2008 + C# + .NET 3.5 + SQL Server 2008 + ADO.NET. Suppose I load a table from a database into an ADO.NET DataTable, and I have defined a couple of indexes on the database table. My question is whether the ADO.NET DataTable has a related index (the same as the indexes I created on the physical database table) to improve the performance of certain operations on the DataTable.
thanks in advance,
George
Actually George's question is not so "bad" as some people insist it is. (I am more and more convinced that there's no such thing as, "a bad question").
I have a rather big table which I load into memory, in a DataTable object. A lot of processing is done on lines from this table, a lot of times, on various (and different) subsets which I can easily describe as "WHERE ..." of SELECT clauses. Now with this DataTable I can run Select() - a method of the DataTable class - but it is quite inefficient.
In the end, I decided to load the DataTable sorted by specific columns and implemented my own quick search, instead of using the Select() function. It proved to be much faster, but of course it works only on those sorted columns. The trouble would have been avoided had a DataTable had indexes.
No, but possibly yes.
You can set up your own indices on a DataTable, using a DataView. As you change the table, the DataView will be rebuilt, so the index should always be up to date.
I did some bench tests for my own app. I use a DataTable to approximate a Boost MultiIndexContainer. To create an index on a column called "Author", I initialise the DataTable, and then the DataView...
    _dvChangesByAuthor =
        new DataView(
            _dtChanges,
            string.Empty,
            "Author ASC",
            DataViewRowState.CurrentRows);
To then pull data by Author from the table, you use the view's FindRows function...
    // FindRows returns every row whose sort key ("Author") matches.
    DataRowView[] dataRowViews = _dvChangesByAuthor.FindRows(author);
    List<DataRow> returnRows = new List<DataRow>();
    foreach (DataRowView drv in dataRowViews)
    {
        returnRows.Add(drv.Row);
    }
I made a random large DataTable, and ran queries using DataTable.Select(), Linq-To-DataSet (with forced execution by exporting to list) and the above DataView method. The DataView method won easily. Linq took 5000 ticks, Select took over 26000 ticks, DataView took 192 ticks...
LOC=20141121-14:46:32.863,UTC=20141121-14:46:32.863,DELTA=72718,THR=9,DEBUG,LOG=Program,volumeTest() - Running queries for author >TFYN_AUTHOR_047<
LOC=20141121-14:46:32.863,UTC=20141121-14:46:32.863,DELTA=72718,THR=9,DEBUG,LOG=RightsChangeTracker,GetChangesByAuthorUsingLinqToDataset() - Query elapsed time: 2 ms, 4934 ticks; Rows=65
LOC=20141121-14:46:32.879,UTC=20141121-14:46:32.879,DELTA=72733,THR=9,DEBUG,LOG=RightsChangeTracker,GetChangesByAuthorUsingSelect() - Query elapsed time: 11 ms, 26575 ticks; Rows=65
LOC=20141121-14:46:32.879,UTC=20141121-14:46:32.879,DELTA=72733,THR=9,DEBUG,LOG=RightsChangeTracker,GetChangesByAuthorUsingDataview() - Query elapsed time: 0 ms, 192 ticks; Rows=65
So, if you want indices on a DataTable, I would suggest DataView, if you can deal with the fact that the index is re-built when the data changes.
You can create a primary key for the datatable. Filter operations get a big boost if you are searching in the primary key field. Check out this link: here
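A primary-key lookup on a DataTable might look like the sketch below. The table, its Id and Author columns and the sample rows are all made up for the example.

```csharp
using System.Data;

public static class PrimaryKeySketch
{
    // Builds a hypothetical table whose Id column is the primary key.
    public static DataTable BuildKeyedTable()
    {
        var table = new DataTable("Changes");
        DataColumn id = table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Author", typeof(string));

        // The key must be set in code; it is not copied from the database.
        table.PrimaryKey = new[] { id };

        table.Rows.Add(1, "alice");
        table.Rows.Add(2, "bob");
        return table;
    }

    // Looks a row up by primary key. Rows.Find returns null when no row
    // has the given key.
    public static string FindAuthor(DataTable table, int id)
    {
        DataRow row = table.Rows.Find(id);
        return row == null ? null : (string)row["Author"];
    }
}
```

Rows.Find only works on the primary key columns and throws if the table has no primary key, so lookups on other columns still need a different approach.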
I had the same problem with many queries from a large datatable that are not according to the primary key.
The solution I found was to create a DataView for each index I wanted to use, and then use its Find and FindRows methods to extract the data.
DataView creates an internal index on the DataTable and behaves virtually as an index for this purpose.
In my case I was able to reduce 10,000 queries from 40 Seconds to ONE!!!
John above is correct. DataTables are disconnected in memory structures. They do not map to the physical implementation of the database.
The indexes on disk are used to speed up lookups because you don't have all the rows. If you have to load every row and scan them it is slow, so an index makes sense. In a DataTable you already have all the rows, so a comparison is fast already.
The correct answer here to the implicit question of creating an index on a DataTable is that you can't do that, but you can create one or more DataViews for the DataTable, which according to the doc will create an index based on the sorting the DataView specifies:
DataView constructs an index. An index contains keys built from one or more columns in the table or view. These keys are stored in a structure that enables the DataView to find the row or rows associated with the key values quickly and efficiently. Operations that use the index, such as filtering and sorting, see significant performance increases. The index for a DataView is built both when the DataView is created and when any of the sorting or filtering information is modified. Creating a DataView and then setting the sorting or filtering information later causes the index to be built at least twice: once when the DataView is created, and again when any of the sort or filter properties are modified.
If you need to do a large number of lookups to an in-memory DataTable, it may be the most straightforward and performant to use a DataView with the Find() or FindRows() method to do indexed key lookups. In particular, if you need to do a number of lookups and modifications to the data this would prevent needing to transform your DataTable into another indexed class like a Dictionary and then transforming it back into a DataTable again.
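For reference, a self-contained version of the Find()/FindRows() lookup could be sketched like this. The Author column and the helper names are assumptions for illustration.

```csharp
using System.Data;

public static class DataViewLookupSketch
{
    // Builds a view sorted on the hypothetical "Author" column. The sort is
    // passed to the constructor so the index is built only once.
    public static DataView BuildAuthorIndex(DataTable table)
    {
        return new DataView(table, string.Empty, "Author ASC", DataViewRowState.CurrentRows);
    }

    // FindRows returns all rows whose sort key equals the given value.
    public static int CountByAuthor(DataView byAuthor, string author)
    {
        return byAuthor.FindRows(author).Length;
    }

    // Find returns the view index of a matching row, or -1 when none matches.
    public static DataRow AnyByAuthor(DataView byAuthor, string author)
    {
        int index = byAuthor.Find(author);
        return index < 0 ? null : byAuthor[index].Row;
    }
}
```

Build the view once and reuse it for all lookups, since creating a new view for each query would rebuild the index every time.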
Others have made the point that a DataSet is not intended to serve as a database system--just a representation of data. If you are working under the impression that a DataSet is a database then you are mistaken and might need to reconsider your implementation.
If you need a client-side database, consider using SQL Server Compact or SQLite; both are free, redistributable database systems which can be used without requiring separate installations or services. If you need something more full-featured, SQL Server Express is the next step up.
To help clarify though, DataSets/Tables are used in .NET development to temporarily hold data as needed. Think of them as the results of a SELECT query against a database; they are roughly similar to CSV files or other forms of tabular data--you can pull data into them from a database, work with the data, and then push the changes back to a database--but they, on their own, are not databases.
If you have a large collection of items which you need to keep in memory for one reason or another, then you might consider building a lightweight DTO (data transfer object, Google it, they're very simple) and loading them into a Hashtable. Hashtables won't give you any form of relational data, but they are very efficient at look-ups.
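A rough sketch of the DTO idea follows. It uses the generic Dictionary<TKey, TValue> rather than the non-generic hash table class, which is a substitution on my part to avoid casts. The ItemDto class and the Id and Name columns are hypothetical.

```csharp
using System.Collections.Generic;
using System.Data;

// Hypothetical lightweight DTO for one row.
public sealed class ItemDto
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class DtoLookupSketch
{
    // Copies each row into a DTO and keys it by Id for fast look-ups.
    // Assumes the table has an int "Id" column and a string "Name" column.
    public static Dictionary<int, ItemDto> ToLookup(DataTable table)
    {
        var lookup = new Dictionary<int, ItemDto>(table.Rows.Count);
        foreach (DataRow row in table.Rows)
        {
            var dto = new ItemDto { Id = (int)row["Id"], Name = (string)row["Name"] };
            lookup[dto.Id] = dto; // a later row with the same Id replaces the earlier one
        }
        return lookup;
    }
}
```

After the copy, lookup.TryGetValue(id, out var item) finds an item without touching the DataTable, but changes made to the DTOs are not tracked or pushed back to the table.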
DataTables have a PrimaryKey field that can serve as an index (they are fast already anyway). This field is not copied from the Primary Keys of the database (although that might be nice).
My reading of the docs is that the correct way to achieve this (if needed) is to use AsDataView to produce a DataView (or LinqDataView) that's bound to the underlying table. If your DataTable is invariant then the DataView can be static to avoid redundant re-indexing.
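The AsDataView route might look something like the sketch below. It assumes the LINQ to DataSet extension methods (System.Data.DataSetExtensions) are available in your target framework, and the Author column is a made-up example.

```csharp
using System.Data;

public static class LinqDataViewSketch
{
    // Produces a DataView that is bound to the underlying table and ordered
    // by the hypothetical "Author" column. AsEnumerable, OrderBy, Field and
    // AsDataView here are the LINQ to DataSet extension methods from the
    // System.Data namespace.
    public static DataView ByAuthor(DataTable table)
    {
        return table.AsEnumerable()
                    .OrderBy(row => row.Field<string>("Author"))
                    .AsDataView();
    }
}
```

If the table never changes, the returned view can be kept in a static field and reused, as suggested above.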
I am currently investigating Linq to DataSet, and this q was helpful to me, so thanks.
DataTables are indexed if you (the coder) specify one or more DataColumns as the Primary Key. Internally ADO.NET uses a Red-Black tree to form this index, giving log-time lookups. This Primary Key is not set automatically based on any underlying keying from the data provider.
George,
The answer is no.
Actually, some sort of indexing may be used internally, but only as an implementation detail. For instance, if you create a foreign key constraint, maybe that's assisted by an index. But it doesn't matter to a developer.

IMultipleResults: how do I deal with multiple result sets from a stored proc when they don't map to types?

This post on SO answers most of the questions I have (much thanks to Pure.Krome for the thorough response) about how to build a query that returns multiple results. However, in the case that I'm working with my tables that are coming back are sort of dependent on how the proc behaves. Can't change the proc. The results that are coming back are a set of datatables that don't map to types at all (for example, the first table is a mish mash of parts of the Customers table and the Orders table, the second table, if present, will be debugging output, then there might be a third table and so on).
Do I have to do this with a DataSet/DataAdapter etc.? Or is this possible with LINQ?
LINQ is an ORM (albeit a fairly simple one), with the "O" being (importantly) "object". If you can't predict the layout of the object returned in each grid, then it isn't a good fit for ORM.
Personally I wouldn't jump from LINQ to DataTable (but maybe I'm just biased against DataTable ;-p) - I would use SqlCommand.ExecuteReader and do my own object (etc) mapping. But maybe it might save time to just use a DataSet... YMMV etc.
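If you go the ExecuteReader route, walking an unknown number of result sets could be sketched like this. Since the layout of each grid is not known in advance, the sketch reads rows into column-name dictionaries rather than typed objects. That representation is just one option, and the helper name is invented.

```csharp
using System.Collections.Generic;
using System.Data;

public static class MultiResultSketch
{
    // Reads every result set from the reader. Each result set becomes a list
    // of rows, and each row a dictionary of column name to value. If a grid
    // has duplicate column names, the last one wins.
    public static List<List<Dictionary<string, object>>> ReadAll(IDataReader reader)
    {
        var resultSets = new List<List<Dictionary<string, object>>>();
        do
        {
            var rows = new List<Dictionary<string, object>>();
            while (reader.Read())
            {
                var row = new Dictionary<string, object>(reader.FieldCount);
                for (int i = 0; i < reader.FieldCount; i++)
                {
                    row[reader.GetName(i)] = reader.GetValue(i);
                }
                rows.Add(row);
            }
            resultSets.Add(rows);
        } while (reader.NextResult()); // moves on to the next grid, if any
        return resultSets;
    }
}
```

With a stored procedure you would pass in the result of SqlCommand.ExecuteReader(). Once the grids are in memory you can inspect the column names of each one to decide which is the data, which is the debugging output, and so on.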
