GridView Databinding and Paging - C#

What is the best way to retrieve records from the database?
Currently we grab all of them, cache them, and bind them to our GridView control, which handles the paging.
So which would be better: retrieving all the records as we do now, or retrieving only the records needed using a start index and row count?

That depends somewhat on how much data you are talking about. From a few dozen to a few hundred rows, your current solution will likely suffice. Once you get into several hundred to thousands, you may want to look into paging with the features new in SQL Server 2005, such as ROW_NUMBER and SET ROWCOUNT.
Here's a small run-through on it:
http://www.asp.net/LEARN/data-access/tutorial-25-cs.aspx
There are several ways to do it but this should get you started at least on considering what you should do.
You could even consider just capping how many records are returned by using the TOP syntax, if of course you are using SQL Server. We have done that before and told users to refine their search when the maximum result count was reached.
You could throw together a quick test using the above SQL 2005 functionality to see how your performance does and decide from there.
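A quick test along those lines might look like this minimal sketch of ROW_NUMBER paging; the Products table, column names, and paging math are assumptions, not from the question:

    using System.Data;
    using System.Data.SqlClient;

    public static class PagedCatalog
    {
        // Fetch a single page of rows; only pageSize rows cross the wire.
        public static DataTable GetProductPage(string connectionString, int pageIndex, int pageSize)
        {
            const string sql = @"
                WITH Numbered AS (
                    SELECT ProductID, Name, Price,
                           ROW_NUMBER() OVER (ORDER BY Name) AS RowNum
                    FROM dbo.Products
                )
                SELECT ProductID, Name, Price
                FROM Numbered
                WHERE RowNum BETWEEN @StartRow AND @EndRow;";

            using (var conn = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand(sql, conn))
            {
                cmd.Parameters.AddWithValue("@StartRow", pageIndex * pageSize + 1);
                cmd.Parameters.AddWithValue("@EndRow", (pageIndex + 1) * pageSize);

                var table = new DataTable();
                new SqlDataAdapter(cmd).Fill(table);  // Fill opens and closes the connection itself
                return table;
            }
        }
    }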

Like klabranche said, it depends on the number of rows you're talking about. For up to a couple of hundred, your approach is probably fine.
If you're talking about thousands, one option is the ASP.NET ObjectDataSource. It lets you specify separate methods for getting the row count and the actual rows for the current page:
http://msdn.microsoft.com/en-us/library/system.web.ui.webcontrols.objectdatasource.aspx
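A rough sketch of what those two methods might look like; the class, table, and column names are hypothetical. The ObjectDataSource markup would set TypeName, SelectMethod, SelectCountMethod, and EnablePaging, and the parameter names below match the ObjectDataSource defaults:

    using System.Collections.Generic;
    using System.Data.SqlClient;

    public class Product
    {
        public int ProductID { get; set; }
        public string Name { get; set; }
    }

    public class ProductRepository
    {
        const string ConnStr = "...";  // replace with your connection string

        // Matches the default StartRowIndexParameterName / MaximumRowsParameterName.
        public List<Product> GetProducts(int startRowIndex, int maximumRows)
        {
            var products = new List<Product>();
            const string sql = @"
                SELECT ProductID, Name FROM (
                    SELECT ProductID, Name,
                           ROW_NUMBER() OVER (ORDER BY Name) AS RowNum
                    FROM dbo.Products) AS Numbered
                WHERE RowNum > @Start AND RowNum <= @Start + @Max;";

            using (var conn = new SqlConnection(ConnStr))
            using (var cmd = new SqlCommand(sql, conn))
            {
                cmd.Parameters.AddWithValue("@Start", startRowIndex);
                cmd.Parameters.AddWithValue("@Max", maximumRows);
                conn.Open();
                using (var reader = cmd.ExecuteReader())
                    while (reader.Read())
                        products.Add(new Product
                        {
                            ProductID = reader.GetInt32(0),
                            Name = reader.GetString(1)
                        });
            }
            return products;
        }

        public int GetProductCount()
        {
            using (var conn = new SqlConnection(ConnStr))
            using (var cmd = new SqlCommand("SELECT COUNT(*) FROM dbo.Products", conn))
            {
                conn.Open();
                return (int)cmd.ExecuteScalar();
            }
        }
    }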

Related

Ways to speed up queries SQL Server 2008 R2 without SqlDataSource object

I'm trying to build a product catalog application in ASP.NET and C# that will allow a user to select product attributes from a series of drop-down menus, with a list of relevant products appearing in a gridview.
On page load, the options for each of the drop-downs are queried from the database, as well as the entire product catalog for the gridview. Currently this catalog stands at over 6000 items, but we're looking at perhaps five or six times that when the application goes live.
The query that pulls this catalog runs in less than a second when executed in SQL Server Management Studio, but takes upwards of ten seconds to render on the web page. We've refined the query as much as we know how: pulling only the columns that will show in our gridview (as opposed to select * from ...) and adding the WITH (NOLOCK) hint to the query to pull data without waiting for updates, but it's still too slow.
I've looked into SqlCacheDependency, but all the directions I can find assume I'm using a SqlDataSource object. I can't do this because every time the user makes a selection from the menu, a new query is constructed and sent to the database to refine the list of displayed products.
I'm out of my depth here, so I'm hoping someone can offer some insight. Please let me know if you need further information, and I'll update as I can.
EDIT: FYI, paging is not an option here. The people I'm building this for are standing firm on that point. The best I can do is wrap the gridview in a div with overflow: auto set in the CSS.
The tables I'm dealing with aren't going to update more than once every few months, if that; is there any way to cache this information client-side and work with it that way?
Most of your solution will come in a few forms (none of which have anything to do with a GridView):
Good indexes. Create good indexes for the tables that pull this data; good indexes are defined as:
Indexes that store as little information as is actually needed to display the product. The less data stored, the more rows fit on each 8K page in SQL Server.
Covering indexes: your SQL query should select exactly what you need (not SELECT *), and your index should be built to cover that query (hence why it's called a 'covering index'). There's a sketch of this at the end of this answer.
Good table structure: this goes along with the index. The fewer joins needed to pull the information, the faster you can pull it.
Paging. You shouldn't ever pull all 6000+ objects at once -- what user can view 6000 objects at once? Even if a theoretical superhuman could process that much data, that's never going to be your median use case. Pull 50 or so at a time (if you really even need that many), or structure your site so that you're always pulling what's relevant to the user instead of everything (keep in mind this is not a trivial problem to solve).
The beautiful part of paging is that your clients don't even need to know you've implemented paging. One such technique is called "Infinite Scrolling". With it, you can go ahead and fetch the next N rows while the customer is scrolling to them.
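To make the first two points concrete, here's a hedged sketch; the table, columns, and index are hypothetical. The index is created once in SQL Server and covers the query entirely, so the optimizer never has to touch the base table:

    using System.Data;
    using System.Data.SqlClient;

    public static class Catalog
    {
        // The index below covers this query entirely:
        //   CREATE NONCLUSTERED INDEX IX_Products_CategoryId
        //       ON dbo.Products (CategoryId)
        //       INCLUDE (Sku, Name, Price);
        public static DataTable GetByCategory(string connectionString, int categoryId)
        {
            const string sql = @"
                SELECT Sku, Name, Price
                FROM dbo.Products WITH (NOLOCK)
                WHERE CategoryId = @CategoryId;";

            using (var conn = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand(sql, conn))
            {
                cmd.Parameters.AddWithValue("@CategoryId", categoryId);
                var table = new DataTable();
                new SqlDataAdapter(cmd).Fill(table);
                return table;
            }
        }
    }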
If, as you say, paging really is not an option (although I really doubt it; please explain why you think it isn't, and I'm pretty sure someone will find a solution), there's really no way to speed up this kind of operation.
As you noticed, it's not the query that's taking long, it's the data transfer. Copying the data from one memory space (SQL Server's) to another (your application's) is not that fast, and displaying that data is orders of magnitude slower.
Edit: why are your clients "firm on that point"? Why do they think it isn't possible otherwise? Why do they think it's the best solution?
There are many options for showing a large set of data in a grid, but most involve third-party software.
Try jQuery/JavaScript grids with AJAX calls. They will help you render a large number of rows on the client, and you can even use caching so you don't query the database repeatedly.
These are good grids that will help you show thousands of rows in a web browser:
http://www.trirand.com/blog/
https://github.com/mleibman/SlickGrid
http://demos.telerik.com/aspnet-ajax/grid/examples/overview/defaultcs.aspx
http://w2ui.com/web/blog/7/JavaScript-Grid-with-One-Million-Records
I hope it helps.
You can load all the rows into a DataTable on the client using a background thread when the application (web page) starts, then use only that DataTable to populate your grids and so on. That way you don't have to hit SQL Server again until you need to read or write different data. (The other answers cover the remaining options.)
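A minimal sketch of that idea; all names are hypothetical, and production code would need to guard against reading the table before the background load finishes:

    using System.Data;
    using System.Data.SqlClient;
    using System.Threading;

    // Hypothetical cache loaded once on a background thread at startup
    // (e.g. kicked off from Application_Start in Global.asax).
    public static class CatalogCache
    {
        public static DataTable Products;  // read-only once the load completes

        public static void BeginLoad(string connectionString)
        {
            ThreadPool.QueueUserWorkItem(_ =>
            {
                var table = new DataTable();
                using (var conn = new SqlConnection(connectionString))
                using (var cmd = new SqlCommand(
                    "SELECT ProductID, Sku, Name, Price FROM dbo.Products", conn))
                {
                    new SqlDataAdapter(cmd).Fill(table);
                }
                Products = table;  // publish when ready; pages bind grids to this
            });
        }
    }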

Sorting in Array vs Sorting in SQL

I have around 1000 rows of data. On the ASPX page, whenever the user clicks the sort button, it sorts the result by a specific column.
I propose to sort the result in the SQL query, which is much easier with just an ORDER BY clause.
However, my manager insisted that I store the result in an array and then sort the data within the array, because he thinks calling the database every time the user clicks the sort button will hurt performance.
Just out of curiosity - Does it really matter?
Also, if we disregard the number of rows, performance wise, which of these methods is actually more efficient?
Well, there are three options:
Sort in the SQL
Sort server-side, in your ASP code
Sort client-side, in your Javascript
There's little reason to go with (2), I'd say. It's meat and drink to a database to sort as it returns data: that's what a database is designed to do.
But there's a strong case for (3) if you want to have a button that the user can click. This means it's all done client-side, so you have no need to send anything to the web server. If you have only a few rows (and 1000 is really very few these days), it'll feel much faster, because you won't have to wait for sending the request and getting a response.
Realistically, if you've got so many things that Javascript is too slow as a sorting mechanism, you've got too many things to display them all anyway.
In short, if this is a one-off thing for displaying the initial page, and you don't want the user to have to interact with the page and sort on different columns etc., then go with (1). But if the user is going to want to sort things after the page has loaded, then (3) is your friend.
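If you do go with (1), one caveat worth a sketch: a column name can't be passed as a SQL parameter, so whitelist the user's sort choice rather than concatenating it in blindly. The table and column names below are made up for illustration:

    using System;
    using System.Data;
    using System.Data.SqlClient;

    public static class SortedQueries
    {
        static readonly string[] SortableColumns = { "Name", "Price", "CreatedOn" };

        public static DataTable GetSorted(string connectionString, string sortColumn)
        {
            // Column names can't be parameterized, so only allow known values.
            if (Array.IndexOf(SortableColumns, sortColumn) < 0)
                sortColumn = "Name";  // safe default

            string sql = "SELECT Name, Price, CreatedOn FROM dbo.Items ORDER BY " + sortColumn;

            using (var conn = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand(sql, conn))
            {
                var table = new DataTable();
                new SqlDataAdapter(cmd).Fill(table);
                return table;
            }
        }
    }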
Short Answer
Ah... screw it: there's no short answer to a question like this.
Longer Answer
The best solution depends on a lot of factors. The question is somewhat vague, but for the sake of simplicity let's assume that the 1000 rows are stored in the database and are being retrieved by the client.
Now, a few things to get out of the way:
Performance can mean a variety of things in a variety of situations.
Sorting is (relatively) expensive, no matter where you do it.
Sorting is least expensive when done in the database, as the database already has all the necessary data and is optimized for these operations.
Posting a question on SO to "prove your manager wrong" is a bad idea. (The question could easily have been asked without mentioning the manager.)
Your manager believes that you should upload all the data to the client and do all the processing there. This idea has some merit. With a reasonably sized dataset processing on the client will almost always be faster than making a round trip to the server. Here's the caveat: you have to get all of that data to the client first, and that can be a very expensive operation. 1000 rows is already a big payload to send to a client. If your data set grows much larger then you would be crazy to send all of it at once, particularly if the user really only needs a few rows. In that case you'll have to do some form of paging on the server side, sending chunks of data as the user requests it, usually 10 or 20 rows at a time. Once you start paging at the server your sorting decision is made for you: you have no choice but to do your sorting there. How else would you know which rows to send?
For most "line-of-business" apps your query processing belongs in the database. My generalized recommendation: by all means do your sorting and paging in the database, then return the requested data to the client as a JSON object. Please don't regenerate the entire web page just to update the data in the grid. (I've made this mistake and it's embarrassing.) There are several JavaScript libraries dedicated solely to rendering grids from AJAX data. If this method is executed properly your page will be incredibly responsive and your database will do what it does best.
We had a problem similar to this at my last employer: we had to return large sets of data efficiently, quickly, and consistently into a DataGridView object.
The solution they came up with was to have a set of filters the user could use to narrow down the query results, and to cap the number of rows returned at 500. Sorting was then done by the program on an array of those objects.
The reasons behind this were:
Most people will not process that many rows; they are usually looking for a specific item (hence the filters).
Sorting on the client side saved the server a lot of time, especially when there was the potential for thousands of people querying the data at the same time.
Performance of the GUI object itself started to become an issue at some point (the reason for limiting the returned rows).
I hope that helps you a bit.
From both a data-modeling perspective and an application-architecture perspective, it's considered "best practice" to put sorting/filtering into the "controller" portion of the MVC pattern. That is directly opposed to the answer above that several have already voted for.
The answer to the question is really: "It depends"
If the application stays at only one table, with no joins and a low number of rows, then sorting in JavaScript on the client is likely going to win performance tests.
However, since it's already ASPX, you may be preparing for your data/model to expand. Once there are more tables and joins, and if the UI includes a data grid where the choice of sort column will change on a per-client basis, then maybe the middle tier should be handling this sorting for your application.
I suggest reviewing Tom Dykstra's classic Contoso University ASP.NET example, which has been updated with Entity Framework and MVC 5. It includes a section on sorting, filtering, and paging. This example shows the value of proper MVC architecture and the ease of implementing sorting/filtering on multiple columns.
Remember, applications change (read: "grow") over time, so plan for it by using an architecture pattern such as MVC.

Updating a DB record each time I retrieve it from the DB - is it good practice?

I'm using SQL Server and I have a specific table that can contain ~1 million to ~10 million records max.
For each record I retrieve, I do some checks (I run a few simple lines of code), and then I want to mark the record as checked at DateTime.Now.
So what I do is retrieve a record, check some stuff, run an 'update' query to set the 'last_checked_time' field to DateTime.Now, and then move to the next record.
I can then get all the records ordered by their 'last_checked_time' field (ascending) and iterate over them in order of their check time.
Is this good practice? Can it remain speedy as long as I have no more than 10 million records in that table?
I've read somewhere that every 'update' query is actually a deletion and a creation of a new record.
I'd also like to mention that these records will be frequently retrieved by my ASP.NET website.
I was thinking of writing the 'last_checked_time' to a local text or binary file, but I'm guessing that would mean implementing something the database can already do for you.
If you need that "last checked time" value then the best, most efficient, place to hold it is on the row in the table. It doesn't matter how many rows there are in the table, each update will affect just the row(s) you updated.
How an update is implemented is up to the DBMS, but it is not generally done by deleting and re-inserting the row.
I would recommend retrieving your data, or a portion of it, doing your checks on all of it, and then sending the updates back in transactions to let the database operate more effectively. This means fewer round trips.
As to whether this is good practice, I would say yes, especially since you are using the value in your queries. Definitely do not store the last checked time in a file and try to match it up after you load your database data. An RDBMS is designed to handle this efficiently for you. Don't reinvent the wheel using cubes.
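A hedged sketch of that batching approach, with the table and column names assumed from the question: check a chunk of records in code, then send all the 'last_checked_time' updates back inside a single transaction, reusing one command.

    using System;
    using System.Collections.Generic;
    using System.Data;
    using System.Data.SqlClient;

    public static class RecordChecker
    {
        public static void MarkChecked(string connectionString, IEnumerable<int> checkedIds)
        {
            using (var conn = new SqlConnection(connectionString))
            {
                conn.Open();
                using (var tran = conn.BeginTransaction())
                {
                    using (var cmd = new SqlCommand(
                        "UPDATE dbo.Records SET last_checked_time = @Now WHERE id = @Id",
                        conn, tran))
                    {
                        cmd.Parameters.AddWithValue("@Now", DateTime.Now);
                        var idParam = cmd.Parameters.Add("@Id", SqlDbType.Int);
                        foreach (int id in checkedIds)
                        {
                            idParam.Value = id;  // reuse the command, vary only the id
                            cmd.ExecuteNonQuery();
                        }
                    }
                    tran.Commit();  // one commit for the whole batch
                }
            }
        }
    }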
Personally, I see no issues with it. It seems perfectly reasonable to store the last checked time in the database, especially since it might be used in queries (for example, to find records that haven't been checked in over a week).
Maybe (just maybe) you could create a new table containing two columns: the id of the row in the first table and the checked date.
That way you wouldn't alter the original table, but depending on how the data and the check date are used, you might be forced into a joined query, which is maybe something you also don't want to do.
It makes sense to store the 'checked time' as part of the row you're updating, rather than in a separate file or even a separate table in the database. This approach should provide optimal performance and help to maintain consistency. Solutions involving more than one table or external data stores may introduce a requirement for distributed or multi-table transactional updates that involve significant locking, which can negatively impact performance and make it much more difficult to guarantee consistency.
In general, solutions that minimize the scope of transactions and, by extension, locking, are worth striving for. Also, simplicity itself is a useful goal.

Server-side paging with business logic filtering

I need to show the total number of rows in the grid title.
The grid also has to deal with a large number of records,
so I decided to use the grid's custom paging feature.
I know how to do server-side paging with SQL 2005 ROW_NUMBER, etc.,
but my difficulty is with the complex row-based filtering done in the business logic layer.
I think that doing the complex filtering first (in order to know the item count) on the large number of records will be inefficient and might even cause an out-of-memory exception.
Right now this project (an ASP.NET web app) is in production on .NET Framework 1.1 and SQL 2005.
The next version in production will be on .NET Framework 4.0.
After that we will upgrade to SQL 2008.
Please help me find a solution to this problem.
Thanks.
I would say that if you are afraid of out-of-memory exceptions in production, either the hardware is undersized for the amount of data you have or something in your code is badly wrong :)
I would have everything done in a stored procedure, including filtering, paging, and sorting. Once you have this sorted out on the server and have specified the page size and page index you need, the stored proc simply returns the single page of records you are looking for, already sorted, and you can bind it to your UI controls.
Is this what you wanted or did I get you wrong?
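In case it helps, here's a hedged sketch of the calling side; the stored procedure name and parameters are hypothetical. The proc itself would filter, sort, and page with ROW_NUMBER while returning the total match count on each row via COUNT(*) OVER (), which the grid title can then read:

    using System.Data;
    using System.Data.SqlClient;

    public static class GridData
    {
        public static DataTable GetPage(string connectionString,
                                        int pageIndex, int pageSize, out int totalRows)
        {
            using (var conn = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand("dbo.GetFilteredPage", conn))
            {
                cmd.CommandType = CommandType.StoredProcedure;
                cmd.Parameters.AddWithValue("@PageIndex", pageIndex);
                cmd.Parameters.AddWithValue("@PageSize", pageSize);

                var table = new DataTable();
                new SqlDataAdapter(cmd).Fill(table);

                // The proc returns the same TotalRows value on every row;
                // read it once for the grid title.
                totalRows = table.Rows.Count > 0 ? (int)table.Rows[0]["TotalRows"] : 0;
                return table;
            }
        }
    }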
If you are using .NET 4.0, IQueryable is a viable option. Basically, IQueryable delays execution of the query, so you can apply the business logic and then fetch only the relevant data from the underlying data store (SQL Server in your case).
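A minimal sketch, assuming a LINQ to SQL DataContext with an Orders table (all names here are hypothetical):

    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class OrderPages
    {
        public static Tuple<int, List<Order>> LoadPage(MyDataContext db, int pageIndex, int pageSize)
        {
            // Nothing executes yet: the business-rule filter just composes onto the IQueryable.
            IQueryable<Order> query = db.Orders.Where(o => o.Status == "Open");

            int totalRows = query.Count();               // runs SELECT COUNT(*) in the database

            var page = query.OrderBy(o => o.CreatedOn)
                            .Skip(pageIndex * pageSize)  // translated to ROW_NUMBER paging on SQL 2005+
                            .Take(pageSize)
                            .ToList();                   // only this page is materialized

            return Tuple.Create(totalRows, page);
        }
    }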
But, I would do some micro-benchmarking of the query performance before going down this route.

Using Devart or NHibernate

Dear All
I have a project which manages a lot of data; sometimes I must show almost 1 million rows. If I have two choices and want to make showing the data faster, which technology should I choose: Devart or NHibernate?
I'm using PostgreSQL as the database and want to show the data as fast as possible.
Regards
I can hardly imagine that you really want to show 1 million rows at once.
Even if you have one big table with a million rows, you will probably show them in a form or on a page which allows filtering and/or paging, so your users will only see a few rows at a time.
So I think what you really want is to select, let's say, 50 or 100 rows at once from your big table with a million rows.
For that, you can use ADO.NET or any ORM you want. They all do basically the same thing; it's just a matter of personal preference, and there's no notable performance difference when they're used with this amount of data.
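For example, a hedged ADO.NET sketch with Npgsql that fetches a single page; the table and column names are made up:

    using System.Data;
    using Npgsql;

    public static class ProductPages
    {
        public static DataTable GetPage(string connectionString, int pageIndex, int pageSize)
        {
            const string sql = @"
                SELECT id, name, price
                FROM products
                ORDER BY name
                LIMIT @limit OFFSET @offset;";

            using (var conn = new NpgsqlConnection(connectionString))
            using (var cmd = new NpgsqlCommand(sql, conn))
            {
                cmd.Parameters.AddWithValue("@limit", pageSize);
                cmd.Parameters.AddWithValue("@offset", pageIndex * pageSize);

                var table = new DataTable();
                new NpgsqlDataAdapter(cmd).Fill(table);  // only one page crosses the wire
                return table;
            }
        }
    }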
If you really want to load the whole million rows at once, well...you will get performance problems anyway, no matter what data access technology you use. Even with ADO.NET and a DataReader.
And even if performance were not an issue... it still makes no sense to me.
What do your users do with a million rows of data, all shown at once? They can't see them all at the same time anyway.
If you are going to show a million rows, then no ORM is the right choice.
