I have a Windows application in which a form is bound to data.
The form loads slowly because of the large amount of data. I am also showing paging in the form to navigate through the records.
How can I increase the performance?
Bottom line: your app needs to 'page the data' effectively.
This means you will need to "lazy load" the data. The UI should only load and display the data that it needs to show, and load additional data only when needed.
You didn't provide much information regarding your app and the data that you load, so let's assume that your app fetches 1,000,000,001 records.
Load your form.
If, for instance, your grid shows 25 records per page, use TOP 100 to fetch the top 100 records, filling your first page plus the next three pages.
Upon each Next (or consecutive Nexts) you can hit the database to fetch the next records. Note that you will need some mechanism (ROW_NUMBER?) to keep track of the records being fetched, their row numbers, etc.
This article discusses exactly what you are after.
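To make that concrete, here is a minimal sketch of that batch-and-cache flow, assuming SQL Server; the Products table, its Id and Name columns, and the connection string are placeholders for whatever your app actually uses:

using System;
using System.Collections.Generic;
using System.Data.SqlClient;

// Sketch of batched paging: 25-row pages served from 100-row batches.
// "Products", "Id", "Name" and the connection string are placeholders.
public class PagedFetcher
{
    private const int PageSize = 25;
    private const int BatchSize = 100; // one round-trip serves four pages
    private readonly string _connectionString;
    private readonly Dictionary<int, List<string>> _batchCache = new Dictionary<int, List<string>>();

    public PagedFetcher(string connectionString) { _connectionString = connectionString; }

    // Returns one page, hitting the database only when the batch
    // containing that page hasn't been fetched yet.
    public List<string> GetPage(int pageIndex)
    {
        int batchIndex = (pageIndex * PageSize) / BatchSize;
        if (!_batchCache.ContainsKey(batchIndex))
            _batchCache[batchIndex] = FetchBatch(batchIndex);

        List<string> batch = _batchCache[batchIndex];
        int offset = (pageIndex * PageSize) % BatchSize;
        return batch.GetRange(offset, Math.Min(PageSize, batch.Count - offset));
    }

    private List<string> FetchBatch(int batchIndex)
    {
        // A CTE with ROW_NUMBER() keeps track of which rows are fetched.
        const string sql = @"
            WITH Numbered AS (
                SELECT Name, ROW_NUMBER() OVER (ORDER BY Id) AS RowNum
                FROM Products
            )
            SELECT Name FROM Numbered WHERE RowNum BETWEEN @from AND @to;";

        var rows = new List<string>();
        using (var conn = new SqlConnection(_connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.AddWithValue("@from", batchIndex * BatchSize + 1);
            cmd.Parameters.AddWithValue("@to", (batchIndex + 1) * BatchSize);
            conn.Open();
            using (var reader = cmd.ExecuteReader())
                while (reader.Read())
                    rows.Add(reader.GetString(0));
        }
        return rows;
    }
}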
It's hard to say for certain without knowing more about your application, but the immediate thing that comes to mind is that if your dataset is large, you should be doing pagination on the database side (by constraining the query using row counts) rather than on the application side.
Databinding is a convenience feature of .NET, but it comes with a severe performance overhead. In general it's only acceptable for working with small datasets of less than a few thousand rows bound to a couple of dozen controls at most. If the datasets grow very large, they take their toll very quickly and no amount of tweaking will make the application speedy. The key is always to constrain the amount of memory being juggled by the data binding system at any given time so that it doesn't overload itself with the meta-processing.
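For example, on SQL Server 2012 or later you can constrain the query to a single page with OFFSET/FETCH. A rough sketch, where the Orders table, the Id sort column, and the grid and connection names are made up:

using System.Data;
using System.Data.SqlClient;
using System.Windows.Forms;

// Sketch: bind a single server-side page to the grid, so data binding
// only ever juggles pageSize rows at a time.
void BindPage(DataGridView grid, string connectionString, int pageIndex, int pageSize)
{
    const string sql = @"SELECT * FROM Orders ORDER BY Id
                         OFFSET @skip ROWS FETCH NEXT @take ROWS ONLY;";
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(sql, conn))
    {
        cmd.Parameters.AddWithValue("@skip", pageIndex * pageSize);
        cmd.Parameters.AddWithValue("@take", pageSize);
        var page = new DataTable();
        new SqlDataAdapter(cmd).Fill(page); // Fill opens and closes the connection
        grid.DataSource = page;             // only this page is ever bound
    }
}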
Here are some recommendations:
Find out why you need to bring back such a large set of data. That much data displayed on the screen will not lead to a good user experience. If this is a search result or something similar, limit the results to, say, 100, and let the user know that there are more but that they need more fine-grained search criteria.
Check to make sure that your database query is well optimized and indexed and you are not bringing more data than you need to.
Assuming you are using a DataGridView, see if taking advantage of VirtualMode helps (there is a sketch after these recommendations). The description below is from MSDN, and there is also a link to an example in there.
Virtual mode is designed for use with very large stores of data. When the VirtualMode property is true, you create a DataGridView with a set number of rows and columns and then handle the CellValueNeeded event to populate the cells.
If you are using some other control, you can see if that control provides a similar feature. ListView also has VirtualMode.
Fire up the SQL profiler to see what your application is requesting from the database. You may see some unnecessary calls and opportunities to trim your data needs and lazy load. Also debug and profile your application to see where you spend most of your time.
If you are using SQL Server, implement paging using Common Table Expressions and ROW_NUMBER(). This will let you pull less data from SQL Server and will definitely give better performance.
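As a rough illustration of the VirtualMode recommendation above, here is a minimal sketch; the two-column layout and GetRowFromCache are invented, and a real implementation would pull a batch from the database the first time a row in that batch is requested:

using System.Windows.Forms;

public class VirtualGridSetup
{
    // In VirtualMode the grid asks for cell values on demand through
    // CellValueNeeded instead of being bound to the full dataset.
    public void SetUp(DataGridView grid, int totalRows)
    {
        grid.VirtualMode = true;
        grid.ColumnCount = 2;
        grid.RowCount = totalRows; // rows exist, but no data is loaded yet
        grid.CellValueNeeded += (sender, e) =>
        {
            string[] row = GetRowFromCache(e.RowIndex);
            e.Value = row[e.ColumnIndex];
        };
    }

    // Hypothetical lookup standing in for your own page cache.
    private string[] GetRowFromCache(int rowIndex)
    {
        return new[] { "cell 0 of row " + rowIndex, "cell 1 of row " + rowIndex };
    }
}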
I have an application in which such a large amount of data is loaded at the beginning that the waiting time for the users is no longer justifiable.
At first, only enough data is loaded to fill a listbox explorer, which serves as a browser for loading the remaining information when an item is selected. So much for the data model.
I now intend to maintain a local data source and only update the data that the user selects, but I have to deal with the question of whether I should keep the finished model objects or the raw data.
Has anyone played around with the different approaches and can say (or link to) what the best approach is in terms of maintenance and performance? I work with .NET.
I am writing a small application that queries a web service for the in-game prices of items in a particular game. This game has over 200 items in-game (with their associated uint IDs) across different item types (ore, combat, etc.). In my application, I have a view that allows the user to specify which item he wants to query the price for, and it has 2 comboboxes: one for the item type and a 2nd one that shows the items of that specific type (so when the first combobox changes, the second one shows the items associated with the selected item type).
Also, I do not have direct access to the game's database with all the item types, items, and their associated IDs. I would have to replicate that information (which is available online) in a database of my own, in an XML file, or in another container of the sort.
Knowing that, my question is which would be better: loading the whole database (or parsing the whole XML file) into a List<GameItem> when the application opens, or querying the database (or parsing part of the XML file) each time the user changes the item type combobox? If I do the whole loading at the beginning, the application might take up A LOT of memory for nothing; but on the other hand, if I query the database (or parse the XML file) each time the user changes the item type combobox, there might be a noticeable delay in the application each time he does that operation.
I would start an asynchronous method right after starting the app, where it loads the game items. This way it won't block the UI while the user does whatever they do in your app. I've done this in my own app, where the user reads an ebook while it loads 200 books at the same time. The user is able to continue reading while it loads the books in the background.
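A minimal sketch of that idea in WinForms, assuming async/await; GameItem is the class from the question, while LoadGameItemsFromXml and itemTypeComboBox are invented names:

using System.Collections.Generic;
using System.Threading.Tasks;

// Kick off loading in the background right after startup; await brings
// the result back to the UI thread for safe data binding.
async void MainForm_Load(object sender, System.EventArgs e)
{
    // LoadGameItemsFromXml is a hypothetical blocking loader.
    List<GameItem> items = await Task.Run(() => LoadGameItemsFromXml("items.xml"));
    itemTypeComboBox.DataSource = items; // the UI stayed responsive until here
}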
The first thing you want to do is establish a high-level interface that doesn't bother with or mention these details, so that you can change your mind later if necessary and change as few things as possible in response. Make the interface focus on what it should do rather than how it should do it. Hide away all those 'hows'; make them private.
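For instance, something like this sketch; IGameItemSource and its members are invented for illustration, with GameItem taken from the question:

using System.Collections.Generic;

// Callers only see *what* is provided, not *how* -- whether the items
// come from memory, an XML file, or a database stays private.
public interface IGameItemSource
{
    IEnumerable<GameItem> GetItemsOfType(string itemType);
    decimal GetPrice(uint itemId);
}

You could start with an in-memory implementation and later swap in a database-backed one without touching any calling code.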
Optimization is best applied in hindsight, with profilers and measurements in your hand, and code that can be optimized without being too intrusive/invasive and creating cascading breakages throughout your codebase (by being tucked under a good interface).
Second, keep in mind that a million 32-bit floating-point variables takes just 4 megabytes of RAM. I originally came from a time when 4 megabytes was considered a massive amount of memory. Today we're talking pennies. 200 items is typically nowhere near enough data to concern yourself with the added expense and difficulty of implementing a disk indexing structure, unless each item stores something like a million sub-elements.
So unless you're working with exceptionally massive items, I'd suggest starting with a basic solution of loading them into memory on startup.
A bigger concern for your case and scale, if there's store logic involved, might be security and atomicity much more than performance: ensuring that item transactions either complete 100% successfully or fail/roll back 100% as a whole (never half-finished). You might also want to periodically write to disk anyway to make sure that you don't lose the data in the case of an unexpected server shutdown, but you don't necessarily have to use that file structure for anything more than a safety backup. Though it wasn't clear to me whether you were handling that store-side logic or just providing a convenient client for the customers. If the latter, you can forget about this stuff.
In a Windows C# application form I load more than 500,000 records from a SQL Server database for analysis.
SELECT TOP 500000 * FROM MobileTrans
When I run the above query in SQL Server Management Studio, data starts showing up immediately and the load takes 15 seconds to complete. But when I run this query in my Windows application, it takes 15 seconds without showing anything in the data grid, and after that the data suddenly shows up in the grid all at once.
How can I retrieve the results of the query asynchronously, the same way SQL Server Management Studio does, in my Windows data grid form?
Please send a small sample of code.
You do not need to show 1 million records to anyone. No one can see them all at once.
So first load a reasonable amount of data that one could see and work with in your app.
In short: use server-side paging of the data if this is only about presentation.
By dramatically reducing the amount of data in this way, you may be able to avoid async processing altogether.
If, by the way, you do need to process it asynchronously, I would populate the data retrieved from the DB into a storage structure (Queue<T>, List<T>, ...) which serves as the source for the visual element you display the data on.
Consider that this can easily grow into fairly complicated scenarios, as it's not absolutely clear to me how complex your app is. So maybe the first solution will turn out to be the best one.
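If you do go async, here is a rough sketch of that population, streaming rows with SqlDataReader and reporting them to the UI thread in batches; the Name column, TransRow, dataGridView1, and the connection string are all assumptions, since the question used SELECT *:

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data.SqlClient;
using System.Threading.Tasks;
using System.Windows.Forms;

public class TransRow { public string Name { get; set; } }

public partial class MainForm : Form
{
    // The grid is bound once to this list; rows are appended in batches,
    // so data appears progressively instead of all at once.
    private readonly BindingList<TransRow> _rows = new BindingList<TransRow>();
    private const string ConnectionString = "..."; // placeholder

    private async void LoadButton_Click(object sender, EventArgs e)
    {
        dataGridView1.DataSource = _rows;
        // Progress<T> marshals each reported batch to the UI thread.
        var progress = new Progress<List<TransRow>>(batch =>
        {
            foreach (var row in batch) _rows.Add(row);
        });
        await Task.Run(() => StreamRows(progress));
    }

    private void StreamRows(IProgress<List<TransRow>> progress)
    {
        using (var conn = new SqlConnection(ConnectionString))
        using (var cmd = new SqlCommand("SELECT Name FROM MobileTrans", conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                var batch = new List<TransRow>();
                while (reader.Read())
                {
                    batch.Add(new TransRow { Name = reader.GetString(0) });
                    if (batch.Count == 500) // report in chunks, not per row
                    {
                        progress.Report(batch);
                        batch = new List<TransRow>();
                    }
                }
                if (batch.Count > 0) progress.Report(batch);
            }
        }
    }
}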
EDIT
Here, maybe, is a useful example of how that (deferred loading) can be achieved:
Implementing Virtual Mode with Just-In-Time Data Loading in the Windows Forms DataGridView Control
I have a web project in ASP.NET/C#/MySQL where there can be up to some 30,000 rows of data to process.
It's a reporting tool, and I have to show statistics like counts and sums at several levels.
I want to know which would be the better way to go about this.
I can filter my data down to a limited set of columns through which I can query.
Now, is it a good approach to get the data (all rows) into my application on load, so that whenever the user queries I can filter that data, do my calculations in the code, and show my statistics?
Or I can have a stored procedure do all my calculations, and every time the user queries I can call the stored procedure and get my statistics.
Thanks
Databases are optimized to do this kind of data manipulation. And since you reduce network load as well, I would vote for the second option.
Another possibility is to consider some kind of OLAP solution, where transactional data is already consolidated into smaller chunks of data which in turn can be easily queried.
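As a hedged illustration of the second option: the statistics can be computed entirely on the database side with a grouped aggregate (which could equally live inside a stored procedure). The sales table, its columns, and the connection string are invented:

using System.Data;
using MySql.Data.MySqlClient; // MySQL Connector/NET

// Sketch: let MySQL compute the counts and sums, returning only the
// small result set instead of 30,000 raw rows.
DataTable GetStats(string connectionString, string regionFilter)
{
    const string sql = @"
        SELECT region, COUNT(*) AS cnt, SUM(amount) AS total
        FROM sales
        WHERE (@region IS NULL OR region = @region)
        GROUP BY region;";
    using (var conn = new MySqlConnection(connectionString))
    using (var cmd = new MySqlCommand(sql, conn))
    {
        cmd.Parameters.AddWithValue("@region", (object)regionFilter ?? System.DBNull.Value);
        var stats = new DataTable();
        new MySqlDataAdapter(cmd).Fill(stats);
        return stats;
    }
}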
I would definitely go with the second option. You are talking about a web application: if you want to get all the data at load time, then you must store it somewhere to preserve it across postbacks. If you choose to store it in session state, you will end up consuming the web server's memory, since you have more than one user accessing your site.
If you store it in view state, then you will end up with a very big client response, which will make your page very slow to load on the client side and will cause network traffic.
Option 2 is the best, because stored procedures are precompiled, which means they are much better in terms of performance. You will also reduce network traffic.
In my app, a DataGridView displays Proxy objects.
Proxy has two properties: Address and Status.
The DataGridView is bound to a List which holds the Proxy objects.
The DataGridView and UI become unresponsive due to the heavy memory load as the list reaches a count of 1 million proxies.
The app is harvesting proxies from different websites; how do I scale the application to handle huge lists?
My concern is harvesting and implementing paging at the same time.
Is paging with SQL CE a good solution? Or will SQL CE slow down the harvesting process? Or is there a better solution? I don't know.
The app harvests around 500-1700 proxies per second; extracting 'as fast as possible' is a feature. I know there are other obvious limitations and bottlenecks, but I am ignoring them for now.
Please advise on how I keep the speed and make it scale (best practices); I am not sure about SQL CE.
Now why would you ever want to display 1 million records to the user?! Even if paged, he'd still have to click through, let's say, 10000 pages!
Implement filtering: only display what's necessary and limit it to 7 records. Add a float Score property to Proxy; express it as a percentage, where 0% means google.com didn't load at all and 100% means no slowdown compared to a direct connection (haha).
Then it's
var displayedProxies = myProxies.OrderByDescending(p => p.Score).Take(7);
Think of potential usage scenarios and make the UI fit. For example, if it's targeted at spammers wanting to send out billions of emails, you just need one button: "Export in (machine-readable format name here)". However, if it's just some user wanting to surf anonymously, you can give him a list of "7 random proxies" with a message that the scores are updating. Then just replace those 7 random ones in real time with a list of the 7 best found so far.
I agree: the best approach is to get the data in chunks, calling a stored proc that receives the page number and the number of records you want returned, and then binding those records to the grid.
If there are filters applied to the grid, I would also pass them in to the stored proc (see the sketch below).
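For illustration, the call site might look something like this; GetPagedRecords and its parameters are a hypothetical stored proc, not something from the question:

using System.Data;
using MySql.Data.MySqlClient;

// Sketch: call a paging stored proc with the page number, page size,
// and the grid's current filter, then bind the returned page.
DataTable GetPage(string connectionString, int pageNumber, int pageSize, string filter)
{
    using (var conn = new MySqlConnection(connectionString))
    using (var cmd = new MySqlCommand("GetPagedRecords", conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.AddWithValue("@PageNumber", pageNumber);
        cmd.Parameters.AddWithValue("@PageSize", pageSize);
        cmd.Parameters.AddWithValue("@Filter", (object)filter ?? System.DBNull.Value);

        var page = new DataTable();
        new MySqlDataAdapter(cmd).Fill(page);
        return page; // bind this to the grid
    }
}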
I would disable VIEWSTATE on the datagrid if you are still passing many records (say more than a thousand per page); in fact, if you have too many records and you want this thing to fly, I would prefer a mix of ajax calls to a web service to get the data, coupled with the jquery datatables plugin, which I find fantastic and fairly well documented. Here's the link.
Edit: If you do the jquery datatables/webservice approach, try to convince people not to use IE Version < 9. IE Javascript engine sucks on IE 6 and 7 and less so on IE8 but still pretty bad compared to FF, Chrome, etc.