DataGridView, large scale databinding solutions - c#

I developed an application that uses a DataGridView, and can have upwards of 500k rows in it. It is structured like this currently:
DataGridView.DataSource is a BindingSource
BindingSource.DataSource = AggregateBindingListView
AggBLV.SourceLists = {Lists of Data}
AggBLV.Sort("PropertyName")
AggBLV.ApplyFilter(Predicate)
...
The AggregateBindingListView is a collection that implements:
Component, IBindingListView, IList, IRaiseItemChangedEvents, ICancelAddNew, ITypedList
It is an excellent piece of code developed by http://blogs.warwick.ac.uk/andrewdavey
Anyway, we've been using it for 4 years, and it has become a performance bottleneck.
So here is my question:
I have a collection of 500k items and would like to bind them to a DataGridView. I need multi-column sorting and predicate filtering, with a priority on performance.
We've just upgraded to C# 4.0.
Can I do better than what I have? I can post timing statistics and such, but I'd need something for comparison.

You could try to use data virtualization as shown here. It's intended for WPF, but it could probably work in WinForms with minor adaptations. The idea is that instead of holding all the data at once in memory, you only load the necessary "pages" as needed, and unload them when they're not needed anymore.
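To make the "pages" idea concrete, here is a minimal sketch of a page cache that is independent of any UI framework. The class name PagedCache, the fetchPage delegate, and the least-recently-used eviction policy are assumptions for illustration; they are not taken from the linked article.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: keeps at most maxPages pages in memory and
// fetches missing pages through a caller-supplied delegate.
public sealed class PagedCache<T>
{
    private readonly int _pageSize;
    private readonly int _maxPages;
    private readonly Func<int, int, IList<T>> _fetchPage; // (startIndex, count) -> items
    private readonly Dictionary<int, IList<T>> _pages = new Dictionary<int, IList<T>>();
    private readonly LinkedList<int> _recentlyUsed = new LinkedList<int>(); // most recent first

    public PagedCache(int pageSize, int maxPages, Func<int, int, IList<T>> fetchPage)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException("pageSize");
        if (maxPages <= 0) throw new ArgumentOutOfRangeException("maxPages");
        if (fetchPage == null) throw new ArgumentNullException("fetchPage");
        _pageSize = pageSize;
        _maxPages = maxPages;
        _fetchPage = fetchPage;
    }

    public int LoadedPageCount { get { return _pages.Count; } }

    public T this[int index]
    {
        get
        {
            if (index < 0) throw new ArgumentOutOfRangeException("index");
            int pageNumber = index / _pageSize;
            IList<T> page;
            if (_pages.TryGetValue(pageNumber, out page))
            {
                // Page is already loaded: it will be moved to the front below.
                _recentlyUsed.Remove(pageNumber);
            }
            else
            {
                // Page is missing: load it, then unload the least recently used one if over budget.
                page = _fetchPage(pageNumber * _pageSize, _pageSize);
                _pages.Add(pageNumber, page);
                if (_pages.Count > _maxPages)
                {
                    int oldest = _recentlyUsed.Last.Value;
                    _recentlyUsed.RemoveLast();
                    _pages.Remove(oldest);
                }
            }
            _recentlyUsed.AddFirst(pageNumber);
            return page[index % _pageSize];
        }
    }
}
```

A grid running in virtual mode could ask such a cache for the item at a given row index whenever a cell needs to be painted, so only the pages around the current scroll position stay in memory.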

Related

Is it faster to use AutoGenerateColumns or to define my own columns?

I'm working with a DataGridView that displays a dataset containing >5,000 rows. I have been trying to make it load faster, and have been able to cut the time from ~12 to ~5.5 seconds so far. The next thing I've considered trying is to define all of the columns instead of using AutoGenerateColumns, but I would like to know if this will help it load any faster before I spend time creating the 20+ columns.
Defining the columns should be marginally faster, but maybe you could benefit more from paging or loading the data on demand.
Take a look: How to: Implement Virtual Mode with Just-In-Time Data Loading in the Windows Forms DataGridView Control
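As a rough sketch of what virtual mode looks like with explicitly defined columns: the grid is told how many rows exist and asks for cell values only when it needs to paint them. The form name VirtualGridForm and the string[] row store are assumptions for illustration; a real application would back CellValueNeeded with its own data cache.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

// Hypothetical form: shows rows from an in-memory list through DataGridView virtual mode.
public class VirtualGridForm : Form
{
    private readonly DataGridView _grid = new DataGridView();
    private readonly IList<string[]> _rows; // assumed backing store, one string[] per row

    public DataGridView Grid { get { return _grid; } }

    public VirtualGridForm(IList<string[]> rows, string[] columnNames)
    {
        _rows = rows;
        _grid.Dock = DockStyle.Fill;
        _grid.VirtualMode = true;          // the grid no longer stores cell values itself
        _grid.ReadOnly = true;
        _grid.AllowUserToAddRows = false;  // no extra "new row" at the bottom

        // Columns are defined explicitly instead of being auto-generated.
        foreach (string name in columnNames)
            _grid.Columns.Add(name, name);

        _grid.CellValueNeeded += OnCellValueNeeded;
        _grid.RowCount = rows.Count;       // tells the grid how many rows to scroll over
        Controls.Add(_grid);
    }

    // Raised only for cells the grid actually needs, typically the visible ones.
    private void OnCellValueNeeded(object sender, DataGridViewCellValueEventArgs e)
    {
        e.Value = _rows[e.RowIndex][e.ColumnIndex];
    }
}
```

Just-in-time loading, as described in the linked how-to, would replace the in-memory list with a cache that fetches blocks of rows from the database on demand.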

What are the benefits of using a bindingsource with bindinglist<business obj> as datasource?

I can directly bind my DataGridView control to a bindinglist of my business objects by setting the DataSource property. My business object implements INotifyPropertyChanged, so the DGV gets updated when a new item is added to the Binding List or existing one is updated.
With respect to dealing with single records, I can bind my business object to textboxes and other relevant controls.
I can also derive from BindingList and create a CustomBindingList class to implement the required methods of IBindingList, as explained in the link below:
http://msdn.microsoft.com/en-us/library/aa480736.aspx
Alternatively, I have seen people recommend using a BindingSource. The BindingSource's DataSource is the business object and the DGV's DataSource is the BindingSource.
In any case, basing it on a BindingSource does not offer me:
Filtering (Filter does not work). I need to provide the implementation myself.
Sorting and searching (neither works). I need to provide the implementation myself.
So, Why is the BindingSource approach recommended?
Broader Picture:
I am new to OOP concepts and C#. I work on WinForms database applications and so far have only used the DataSet / DataTable approach. Now I am trying to create and use my own custom classes.
Typically have a Master/Detail form. When I click on a Detail row in the DGV, I want to edit that record in a separate window. So I need to get a handle on the list item represented by that row in the DGV. Trying to find a solution for that has brought me to this point and this doubt.
Given what I want to do, which approach is better and why?
Some pointers here would really help as I am very new to this.
It is recommended to use a BindingSource when multiple controls on the form use the same datasource (Behind the Scenes: Improvements to Windows Forms Data Binding)
Design-time: I personally find the BindingSource very helpful when choosing the properties from my business object when databinding to controls.
To get a handle to the currently selected row, try bindingSource1.Current as MyBusinessObject;
As for filtering and searching: I use a third party dll for grids that have that implemented. So can't help you with that, sorry.
When you work with lists of different types of business objects, don't use the list directly
List<IAnimal> animals = new List<IAnimal>();
animals.Add(new Cat());
animals.Add(new Dog());
bindingSource1.DataSource = animals;
Instead use a BindingList like this:
bindingSource1.DataSource = new BindingList<IAnimal>(animals);
That will make sure all accessed objects in the list are of type IAnimal and saves you some exceptions.
Binding to a DataSource could give you benefits when dealing with a large set of which only a part gets displayed. For instance, look at the Telerik ListView here: http://www.telerik.com/help/winforms/listview-databinding.html (there are many of these component packages; this is just the latest one I am using bits and pieces from).
The view is very lightweight and lets your scroll position determine which objects need to be actually displayed. So if you only look at the first 10 objects and never scroll down only 10 get bound and displayed. This potentially avoids a lot of unneeded data access.
Their GridView functions in the same manner. There is the displayed part of the grid which is separate from the potentially huge underlying grid.
As a bonus, you get filtering, sorting, grouping.
As far as I know, if you are working with a database, you use a bindingSource in the middle in order to establish a bilateral bridge between the database and your control. Otherwise you can just use a bindingList as source for your control.

What WPF control should I use when I need to have a spreadsheet/datagrid with MASSIVE amounts of columns and rows with data?

What WPF control should I use when I need to have a spreadsheet/datagrid with MASSIVE amounts of columns and rows with data?
At most there will be over 26000 columns and rows.
Best Regards, Kjetil
I think you need to take a step back and ask "why?"
Is a human really going to scroll through 26k columns looking for information?
Perhaps this is an opportunity to create a better UI metaphor for analyzing all that data. I’d like to help provide a solution, but without knowing the business domain it would be a futile gesture.
I would give DataGrid a shot for this. WPF DataGrid has built-in support for virtualization. You can also try to set the VirtualizationMode property to Recycling, to see if it gives you better performance.
<DataGrid VirtualizingStackPanel.IsVirtualizing="True"
VirtualizingStackPanel.VirtualizationMode="Recycling">
</DataGrid>
Recently, I encountered a situation similar to yours. I resolved the issue with UI virtualization and data virtualization.
What worked best for me is this, which improves on the well-known data virtualization approaches by Paul McClean and Vincent Van Den Berghe.
I would recommend a DataGrid, but you can use whatever control you want as long as it's virtualizing.
The issue I see is that if you try to bind to a standard array or other collection as your backing to store the data you'll end up with 676 million cells allocated (26000 times 26000) even if there's little to no data in there. My suggestion for your backing store would be to create your own class based on SortedList<int, SortedList<int, object>> that only contains data for cells that are populated. You can then create a this[row, col] operator that gets or sets cell data and that can be used to bind into your control. Then only the cells that have data will be allocated.
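A minimal sketch of that sparse backing store could look like the following. The class name SparseGrid and the convention that assigning null clears a cell are assumptions for illustration.

```csharp
using System.Collections.Generic;

// Hypothetical sparse store: only populated cells take up memory.
public sealed class SparseGrid
{
    // Outer key is the row index, inner key is the column index.
    private readonly SortedList<int, SortedList<int, object>> _rows =
        new SortedList<int, SortedList<int, object>>();

    public int PopulatedCellCount { get; private set; }

    public object this[int row, int col]
    {
        get
        {
            SortedList<int, object> cells;
            object value;
            if (_rows.TryGetValue(row, out cells) && cells.TryGetValue(col, out value))
                return value;
            return null; // an empty cell was never allocated
        }
        set
        {
            SortedList<int, object> cells;
            bool hasRow = _rows.TryGetValue(row, out cells);

            if (value == null)
            {
                // Assumed convention: assigning null clears the cell and frees its slot.
                if (hasRow && cells.Remove(col))
                {
                    PopulatedCellCount--;
                    if (cells.Count == 0) _rows.Remove(row);
                }
                return;
            }

            if (!hasRow)
            {
                cells = new SortedList<int, object>();
                _rows.Add(row, cells);
            }
            if (!cells.ContainsKey(col)) PopulatedCellCount++;
            cells[col] = value;
        }
    }
}
```

With this, grid[25999, 25999] = "x" allocates one row entry and one cell instead of 676 million cells. Note that SortedList inserts are O(n) when keys arrive out of order, so a Dictionary or SortedDictionary may be a better fit if cells are filled randomly.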
Do not use a datagrid.
Use something that doesn't even attempt to work with such huge datasets. Write something that has two scrollbars - when you scroll, it retrieves/updates the subset of data you want to view and displays it in some useful way.
Dealing with such a large amount of data will require some careful and in-depth design and consideration. I suggest you write your own control. If you are unsure how to tackle this very complex task, I suggest you start from first principles, come up with a plan and give it a go - then come back with the questions you will likely have in managing such a huge amount of information.

WPF Datagrid Lazy load

Details
VS-2008 Professional SP1
Version .net 3.5
Language:C#
I have a WPF DataGrid that loads its items from a LINQ to SQL query on a DataContext. The result set contains around 200k rows, and loading, sorting, and filtering them is very slow.
What is a simple way to improve the speed?
A couple of things I found while searching are:
ScrollViewer, data virtualization, etc. People also mention paging, profiling, etc.
Loading data: 200k rows is a lot of data that no one (user) wants to see in one place. It will definitely reduce your UI user experience. So your best bet is to filter your data just to reduce the amount of it (for example do not show closed orders, just show the open ones). If you can't do so, you should use Virtualization. I didn't see any applications that use pagination in order to show data (Of course except in web). Most of the time it isn't such a good approach. But if you are talking about a type of data that is like search engines results you must use it. But keep in mind that most users won't exceed page 10 in search engines results.
Filtering: I would suggest doing it on your server side for such a huge amount of data (SQL Server here), or as I said first filter the whole 200k to reduce the amount on server side and then filter it (for user) in order to find something, on the client side. You might also find the following link helpful:
http://www.codeproject.com/KB/WPF/DataGridFilterLibrary.aspx
Sorting: Again I would suggest server-client solution but you might also find following links helpful:
http://blogs.msdn.com/b/jgoldb/archive/2008/08/26/improving-microsoft-datagrid-ctp-sorting-performance.aspx
http://blogs.msdn.com/b/jgoldb/archive/2008/08/28/improving-microsoft-datagrid-ctp-sorting-performance-part-2.aspx
http://blogs.msdn.com/b/jgoldb/archive/2008/10/30/improving-microsoft-datagrid-sorting-performance-part-3.aspx
Many people don't use default SortMemberPath of WPF datagrid just because it uses reflection on every single record and this will highly reduce the performance of the sorting process.
Hosein
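One common way to avoid the reflection-based sort mentioned above is a strongly typed comparer. The sketch below is only an illustration: the Order class, its properties, and the sort order (customer, then id) are hypothetical.

```csharp
using System;
using System.Collections;

// Hypothetical row type.
public class Order
{
    public int Id { get; set; }
    public string Customer { get; set; }
}

// Compares orders by Customer, then by Id, using direct property access (no reflection).
public sealed class OrderComparer : IComparer
{
    private readonly bool _descending;

    public OrderComparer(bool descending)
    {
        _descending = descending;
    }

    public int Compare(object x, object y)
    {
        // Swap the operands for a descending sort instead of negating the result.
        Order a = (Order)(_descending ? y : x);
        Order b = (Order)(_descending ? x : y);

        int result = string.Compare(a.Customer, b.Customer, StringComparison.Ordinal);
        if (result == 0)
            result = a.Id.CompareTo(b.Id);
        return result;
    }
}
```

In WPF, a comparer like this can be assigned to the CustomSort property of the ListCollectionView behind the grid, typically from a handler for the DataGrid's Sorting event that sets e.Handled = true so the default sort is skipped.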
Here is a very good sample of Data Virtualization (Not UI Virtualization):
http://www.codeproject.com/KB/WPF/WpfDataVirtualization.aspx
Although it doesn't support LINQ IQueryable objects directly, you can use this sample as it is. I'm now working on improving it to work with IQueryable objects directly. I think it's not so hard.
Wow, 200K rows is a lot of data. Paging sounds like a good idea. Try to decide how many rows per page you want, say 50. Upon showing the screen the first time, show only the first 50. Then give the user the option to move between pages.
Sorting might be trickier this way though.
Virtualization can be another option, sadly, I have yet to work with virtualization.
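A paging query along those lines can be sketched with Skip and Take. The helper names below are assumptions; the point is that the sort is applied before the page is cut, so with LINQ to SQL both the ordering and the paging should be translated into the query and run on the server.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical paging helpers.
public static class Paging
{
    // Returns one page of results; pageIndex is zero-based.
    public static List<T> GetPage<T, TKey>(
        IQueryable<T> source,
        Expression<Func<T, TKey>> sortKey,
        int pageIndex,
        int pageSize)
    {
        return source
            .OrderBy(sortKey)            // sort the whole set first
            .Skip(pageIndex * pageSize)  // skip the rows of earlier pages
            .Take(pageSize)              // keep only this page's rows
            .ToList();
    }

    // Number of pages needed to show totalRows rows, rounding up.
    public static int PageCount(int totalRows, int pageSize)
    {
        return (totalRows + pageSize - 1) / pageSize;
    }
}
```

With 200,000 rows and 50 rows per page this gives 4,000 pages. Changing the sort column means re-running the query for the current page rather than sorting the rows already on screen.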
Sometimes you may have only ~30 visible rows to load, and if those rows and their columns are expensive to load because of their number and the complexity of each cell (its template, or how many WPF elements it has), none of the above comments really make a difference. Each row will take its sweet time to load!
What helps is to stagger or lazily load each row on the UI, so that the user sees that the UI is doing something rather than just freezing for ~10+ seconds.
For simplicity, assume that the DataGrid has ItemsSource="{Binding Rows}", and that Rows is an IEnumerable<Row>, where Row is some class you created. Add a property IsVisible to Row (don't forget to raise PropertyChanged, of course).
You could do something like this:
private void OnFirstTimeLoad()
{
    Task.Factory.StartNew(() =>
    {
        foreach (var row in ViewModel.Rows)
        {
            // Copy the loop variable: before C# 5 a lambda captures the shared
            // foreach variable, which matters if ExecuteOnUiThread runs asynchronously.
            var current = row;
            /* this is all you really need,
               note: since you're on the background thread, make sure IsVisible executes on the UI thread, my utils method does just that */
            myUtils.ExecuteOnUiThread(() => current.IsVisible = true);
            /* optional tweak:
               this just forces the UI to refresh so each row repaint staggers nicely */
            Application.Current.Dispatcher
                .Invoke(DispatcherPriority.Background, (SendOrPostCallback) delegate { }, null);
        }
    });
}
oh, and don't forget to trigger in XAML:
<DataGrid.ItemContainerStyle>
<Style TargetType="{x:Type DataGridRow}">
<Setter Property="Visibility" Value="{Binding Path=IsVisible, Converter={StaticResource BoolToVisibility}}"/>
........
Is your DataGrid inside a ScrollViewer? If so, that causes the entire DataGrid (all the rows) to be rendered. I had a similar problem, and removing the ScrollViewer solved the slow loading.
The questions you should be asking are:
Are users going to look through 200K rows of data?
How much data is too much for the users? Maybe alert the user that the query returned too many rows and that you are listing only the first 1000.
Is it worth your time and money to program paging, data virtualization, etc. if users do not look beyond the first 1000 rows?

Windows Forms Control - Huge list of filenames

Which control would be best for showing a huge (300.000+) list of filenames?
I've tried DataGridView, but it seems to be overkill and also slow.
Are there better alternatives?
None.
No USER will be able to handle a single list of 300.000+ entries in a meaningful way. Looks like your design is seriously flawed - do you really have to present the complete list?
Consider using a search box and let the users search the file names (use auto completion/suggestions like Google et.al.) or create a separate list for every starting letter (like most address books do). Or find another way to reduce the number of entries from which the user has to select.
The standard ListView control has a virtual mode designed specifically for your situation. I've used it with a million row list previously and it does the job well.
It is a true virtual mode. In other words, memory allocation and list population time remain low regardless of the size of the overall list. This is unlike the DataGridView, which really starts to slow down and use memory on large lists.
To use virtual mode set:
VirtualMode = true
VirtualListSize = 300000
(or whatever size your list currently is)
Then handle the RetrieveVirtualItem event to populate the list on demand from your list. You may also want or need to handle the CacheVirtualItems and SearchForVirtualItem events.
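Put together, the setup might look like the sketch below. On the standard ListView the switch is the VirtualMode property, used together with VirtualListSize. The form name FileListForm and the in-memory list of file names are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

// Hypothetical form: shows a large list of file names through ListView virtual mode.
public class FileListForm : Form
{
    private readonly ListView _listView = new ListView();
    private readonly IList<string> _fileNames; // assumed source of file names

    public ListView ListView { get { return _listView; } }

    public FileListForm(IList<string> fileNames)
    {
        _fileNames = fileNames;
        _listView.Dock = DockStyle.Fill;
        _listView.View = View.Details;
        _listView.Columns.Add("File name", 400);

        _listView.VirtualMode = true;                       // items are created on demand
        _listView.RetrieveVirtualItem += OnRetrieveVirtualItem;
        _listView.VirtualListSize = fileNames.Count;        // e.g. 300000

        Controls.Add(_listView);
    }

    // Called only for the items the control needs, typically the visible ones.
    private void OnRetrieveVirtualItem(object sender, RetrieveVirtualItemEventArgs e)
    {
        e.Item = new ListViewItem(_fileNames[e.ItemIndex]);
    }
}
```

This version builds a new ListViewItem on every request. Handling CacheVirtualItems to keep the items for the visible range would avoid that repeated allocation.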
Set up pagination and restrict the number of rows displayed by the DataGrid. You can add a combobox to jump between pages. This is a standard solution.
also see this post https://stackoverflow.com/questions/2125963/need-help-in-gridview-and-table
Have you tried the ListView with report style (View.Details in WinForms)? This is the control used by Windows natively in its file browsers.
Following on from gotch4's answer. Here is a good article from CodeProject on how to do paging with a DataGridView.
You might want to check out ObjectListView, specifically the VirtualObjectListView:
http://objectlistview.sourceforge.net/cs/index.html
I forget off hand what license it has been released under so you might want to look at that before using it in a commercial application.
