Sorting List vs. ObservableCollection

I have found out that I am "doing WPF wrong" and, frustratingly, must overhaul a lot of my code.
How could I convert the following:
public static class SortName
{
    public static int Compare(Person a, Person b)
    {
        return a.Name.CompareTo(b.Name);
    }
}
and I call it like:
list.Sort(SortName.Compare);
to the format required for ObservableCollection, and how would I call it? So far I've tried the following, based on what I read here:
class ObservableCollectionSortName<T> : ObservableCollection<T>
{
    public int Compare (Person a, Person b)
    {
        return a.Name.CompareTo(b.Name);
    }
}

The observable collection doesn't implement sorting, for the simple reason that every time an item moves from one location in the list to another the collection raises an event. That would be great for watching animations of the sort algorithm in action, but it would sort of suck for, you know, sorting.
There are two ways to do this; they're very similar, and both start by sorting the items outside their observable collection, e.g. if _Fruits is an ObservableCollection<Fruit>, and you've defined an IComparer<Fruit> for the sort, you'd do:
var sorted = _Fruits.OrderBy(x => x, new FruitComparer());
That creates a new IEnumerable<Fruit> that, when you iterate over it, will have the objects in the new order. There are two things you can do with this.
One is to create a new collection, replace the old one, and force any items control(s) in the UI to rebind to it:
_Fruits = new ObservableCollection<Fruit>(sorted);
OnPropertyChanged("Fruits");
(This assumes that your class implements INotifyPropertyChanged in the usual way.)
The other is to create a new sorted list, and then use it to move the items in your collection:
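int i = 0;
// ToList() snapshots the sorted order before any items are moved.
foreach (Fruit f in sorted.ToList())
{
    // Move is the public method; MoveItem is the protected hook it calls.
    _Fruits.Move(_Fruits.IndexOf(f), i);
    i++;
}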
The second approach is something I'd only try if I had a really serious commitment to not rebinding the items controls, because it's going to raise a ton of collection-changed events.
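As a rough sketch of that second approach, the move loop can be wrapped in an extension method. SortInPlace is a hypothetical name, not a framework method, and the search starts at the target index so that duplicate entries do not confuse it. It raises one CollectionChanged event per item that is out of place, so it is still only a good fit for small collections.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

public static class ObservableCollectionSortExtensions
{
    // Hypothetical helper: reorders the collection in place using Move,
    // so existing bindings to the collection instance stay intact.
    public static void SortInPlace<T>(this ObservableCollection<T> items, Comparison<T> comparison)
    {
        // Snapshot the target order first so nothing is enumerated while moving.
        List<T> sorted = items.ToList();
        sorted.Sort(comparison);

        for (int target = 0; target < sorted.Count; target++)
        {
            // Positions before 'target' are already final, so search from 'target' on.
            int current = target;
            while (!EqualityComparer<T>.Default.Equals(items[current], sorted[target]))
                current++;

            if (current != target)
                items.Move(current, target);
        }
    }
}
```

With the comparison method from the question, the call would look like people.SortInPlace(SortName.Compare), assuming people is an ObservableCollection<Person>.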

If the main reason for sorting your observable collection is to display the data in a sorted list to the user and not for performance reasons (i.e. faster access) then a CollectionViewSource can be used. Bea Stollnitz has a good description on her blog of how to use the CollectionViewSource to implement sorting and grouping of collections for the UI.
This may help, as it means you do not have to implement sorting on your observable collection or worry about the performance hit Robert describes from raising a CollectionChanged event for every move. It still lets the UI display the items in sorted order.

Related

Filling ObservableCollection through use of RX

I was wondering if following scenario could be fixed by using RX?
I have a REST service call that takes an input parameter distance and loads data. This data is then inserted into the ObservableCollection of the ViewModel so that the View will show it...
Pseudo code like this:
public async Task<int> LoadData(int distance)
{
    this.ListOnUI.Clear();
    var dataList = await Task.Run(() => _dataService.GetListAsync(distance));
    foreach(var dataItem in dataList)
    {
        this.ListOnUI.Add(dataItem);
    }
    return dataList.Count;
}
This small code snippet is wrapped inside a method that returns the count of the dataList.
I then check whether that count is at least 20; if not, I call the method again with a larger distance.
So what is wrong with this setup...
Each time the method is called the UI list is cleared
The user sometimes has to wait long until we reach 20 items
While we haven't reached 20 items, the UI will act weird with the clearing of the list
So my gut feeling is telling me this could be solved by using RX somehow, so that we 'chunk' load/add the UI list.
But my knowledge of RX is not good enough to solve it... so any ideas?
REMARK: When we call the LoadData service we get a JSON string that is then mapped to a collection of DataItems. If we did not clear the UI ObservableCollection and just added the items on each iteration, we would get the same item multiple times in the list, because they are newly constructed objects (although with the same data).

Is there any key inside the data objects? If so, you could check in your foreach whether the object is already contained and only add it if it's not. That way you wouldn't have to clear the list (together with all the side effects).
If there is no key, you could create one by hashing the title + distance, or whatever data fields together uniquely identify your data item, and use that for the check.
I don't know whether there is a better way with Reactive Extensions, but it should solve your case at least.
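A minimal sketch of that key check follows. The DataItem shape, its Title and Distance fields, and the KeyedAppender class are all assumptions made for illustration, since the question does not show the real item type. A HashSet of keys keeps the "already contained" test cheap.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical item type: the real DataItem is not shown in the question.
public sealed class DataItem
{
    public string Title { get; set; }
    public int Distance { get; set; }

    // Composite key built from fields assumed to identify an item uniquely.
    public string Key => Title + "|" + Distance;
}

public sealed class KeyedAppender
{
    private readonly HashSet<string> _seenKeys = new HashSet<string>();

    public ObservableCollection<DataItem> ListOnUI { get; } = new ObservableCollection<DataItem>();

    // Adds only the items whose key has not been seen yet; returns how many were added.
    public int AddNew(IEnumerable<DataItem> batch)
    {
        int added = 0;
        foreach (DataItem item in batch)
        {
            // HashSet.Add returns false when the key is already present.
            if (_seenKeys.Add(item.Key))
            {
                ListOnUI.Add(item);
                added++;
            }
        }
        return added;
    }
}
```

Each call with a larger distance can then pass its whole result to AddNew without clearing the list first.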

Modified to calculate a list delta each time. For Contains to work correctly you just need to implement Equals appropriately on the items returned from GetListAsync, perhaps by a contrived key comparison as SB Dev suggested. I'm not sure there's much Rx can bring to the table in the context of the question.
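public async Task<int> LoadData(int distance)
{
    int count = 0;
    IList<object> dataList = null;
    while (count < 20)
    {
        dataList = await Task.Run(() => _dataService.GetListAsync(distance));
        count = dataList.Count;
        // Apply only the delta so the UI list is never cleared.
        var newItems = dataList.Except(ListOnUI).ToList();
        var removedItems = ListOnUI.Except(dataList).ToList();
        removedItems.ForEach(x => ListOnUI.Remove(x));
        newItems.ForEach(ListOnUI.Add);
        // Widen the search before retrying; with a fixed distance the loop would never end.
        // Doubling is an assumed policy: the question does not say how distance grows.
        if (count < 20) distance *= 2;
    }
    return count;
}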
Assuming you are using an ObservableCollection for your list, see Sort ObservableCollection - what is the best approach? for sorting.

The suggested answers got me thinking about using a sortable observable collection and just adding the items as they come in!
I've used the example explained by Andrea here: http://www.xamlplayground.org/post/2010/04/27/Keeping-an-ObservableCollection-sorted-with-a-method-override.aspx
But I used the binary search option noted in the comments of the blog post!
So that the code doesn't stop when it finds items already in the list, I just commented out the line that throws the exception.
For this to work I only needed to implement IComparable.
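The idea can be sketched roughly as below. This is not Andrea's code from the blog post, only an illustration of the same technique under the assumption that items implement IComparable<T>: InsertItem is overridden so that every Add lands at the position found by a binary search, and an item that compares equal to an existing one is skipped silently instead of throwing.

```csharp
using System;
using System.Collections.ObjectModel;

// Illustrative sketch: keeps the collection sorted as items are added.
public class SortedObservableCollection<T> : ObservableCollection<T>
    where T : IComparable<T>
{
    protected override void InsertItem(int index, T item)
    {
        // The requested index is ignored; the sorted position is used instead.
        int lo = 0, hi = Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            int cmp = this[mid].CompareTo(item);
            if (cmp == 0)
                return;            // already present: skip instead of throwing
            if (cmp < 0)
                lo = mid + 1;
            else
                hi = mid;
        }
        base.InsertItem(lo, item);
    }
}
```

Note that the indexer setter and Move are not guarded in this sketch, so calling them directly could still break the ordering.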

How do I populate a List<String> with values already present in Listbox control in Winforms

I have a listbox that is already populated with a couple of values (from user input). Later in my program I want to take these values from the listbox and copy them into a List collection.
One approach is of course to iterate through the items of the listbox and add them to the List collection one by one (in a loop) using the Add method.
But is there a better, more efficient way to do this in one shot, so that all the items of the listbox get copied over to a List collection at once?
I also looked at the AddRange method, but that doesn't seem to help.
Any suggestions for this?

It's not necessarily more efficient (in terms of speed/memory), but you can save some typing via LINQ:
List<string> items = listBox.Items.Cast<object>()
    .Select(item => item.ToString()).ToList();

If it must be a List<string>, I think you will have to enumerate the items and add them to a list (since you are concerned about the efficiency of LINQ). If you can use a string[], you could use the CopyTo method:
string[] destination = new string[listBox1.Items.Count];
listBox1.Items.CopyTo(destination, 0);

The ListBox.ObjectCollection class behind the Items property implements IList. You'll need to cast the items in the collection to String since it's not a generic collection, but you already have a "List collection" of the items.
If you really need a copy of the items in the collection then no, there isn't going to be a more efficient way. There are plenty of ways to add syntactic sugar so that the code is more concise, but ultimately they will all be iterating over the items and creating copies to be added to a new collection.

No there isn't a more efficient way.
If anything existed it would simply abstract away what you will be doing anyways via an extension method or other means and could have the possibility of making your code less readable.
The CopyTo method does exist via the ObjectCollection on the Items property which could then be LINQ'ed to a List<T>.
ListBox lb = new ListBox();
object[] items = new object[lb.Items.Count];
lb.Items.CopyTo(items, 0);

This seems to be a duplicate of: Most succinct way to convert ListBox.items to a generic list
If you're copying from the listbox, what is populating that list? It might be a better option to go about it that way, versus taking 200k items and moving them over. If you're taking all the items in a listbox, populate the List at the same time.
On that note... really? 200k items in a listbox? That right there seems a little over the top for a real-world application.
foreach (...) loops are very expensive - even when compiled into IL: http://diditwith.net/2006/10/05/PerformanceOfForeachVsListForEach.aspx
If you are unable to work with LINQ (you're using <= VS 2005), this is one of the easier methods of doing it.
string[] items = new string[listBox1.Items.Count];
listBox1.Items.CopyTo(items, 0);
List<string> list = new List<string>(items);

@Aaron McIver, your own method is a bit faster in my opinion. See this link for some performance testing I have done.

Custom ObservableCollection<T> or BindingList<T> with support for periodic notifications

Summary
I have a large and rapidly changing dataset which I wish to bind to a UI (DataGrid with grouping). The changes are on two levels:
Items are frequently added or removed from the collection (500 a second each way)
Each item has 4 properties which will change up to 5 times in its lifetime
The characteristics of the data are as follows;
There are ~5000 items in the collection
An item may, within a second, be added then have 5 property changes and then be removed.
An item may also remain in some interim state for a while and should be displayed to the user.
The key requirement which I'm having problems with;
The user should be able to sort the dataset by any property on the object
What I would like to do;
Update the UI only every N seconds
Raise only the relevant NotifyPropertyChangedEvents
If item 1 has a property State which moves from A -> B -> C -> D in the interval, I need/want only one 'State' change event to be raised, A->D.
I appreciate a user doesn't need to have the UI updated thousands of times a second. If an item is added, has its state changed and is removed all within the window of N seconds between UI updates, it should never hit the DataGrid.
DataGrid
The DataGrid is the component which I am using to display the data. I am currently using the XCeed DataGrid as it provides dynamic grouping trivially. I am not emotionally invested in it, the stock DataGrid would be fine if I could provide some dynamic grouping options (Which includes the properties which change frequently).
The bottleneck in my system is currently in the time taken to re-sort when an item's properties change. This takes 98% of CPU in the YourKit Profiler.
A different way to phrase the question
Given two BindingList / ObservableCollection instances which were initially identical, but the first list has since had a series of additional updates (which you can listen for), generate the minimal set of changes to turn one list into the other.
External Reading
What I need is an equivalent of this ArrayMonitor by George Tryfonas but generalized to support adding and removing of items (they will never be moved).
NB I would really appreciate someone editing the title of the question if they can think of a better summary.
EDIT - My Solution
The XCeed grid binds the cells directly to the items in the grid, whereas the sorting & grouping functionality is driven by the ListChanged events raised on the BindingList. This is slightly counterintuitive and ruled out the MonitoredBindingList below, as the rows would update before the groups.
Instead I wrap the items themselves, catching the Property changed events and storing them in a HashSet as Daniel suggested. This works well for me, I periodically iterate over the items and ask them to notify of any changes.
MonitoredBindingList.cs
Here is my attempt at a binding list which can be polled for update notifications. There are likely some bugs with it as it was not useful to me in the end.
It creates a queue of Add/Remove events and keeps track of changes via a list. The ChangeList has the same order as the underlying list so that after we've notified of the add/remove operations you can raise the changes against the right index.
/// <summary>
/// A binding list which allows change events to be polled rather than pushed.
/// </summary>
[Serializable]
public class MonitoredBindingList<T> : BindingList<T>
{
    private readonly object publishingLock = new object();
    private readonly Queue<ListChangedEventArgs> addRemoveQueue;
    private readonly LinkedList<HashSet<PropertyDescriptor>> changeList;
    private readonly Dictionary<int, LinkedListNode<HashSet<PropertyDescriptor>>> changeListDict;

    public MonitoredBindingList()
    {
        this.addRemoveQueue = new Queue<ListChangedEventArgs>();
        this.changeList = new LinkedList<HashSet<PropertyDescriptor>>();
        this.changeListDict = new Dictionary<int, LinkedListNode<HashSet<PropertyDescriptor>>>();
    }

    protected override void OnListChanged(ListChangedEventArgs e)
    {
        lock (publishingLock)
        {
            switch (e.ListChangedType)
            {
                case ListChangedType.ItemAdded:
                    if (e.NewIndex != Count - 1)
                        throw new ApplicationException("Items may only be added to the end of the list");
                    // Queue this event for notification
                    addRemoveQueue.Enqueue(e);
                    // Add an empty change node for the new entry
                    changeListDict[e.NewIndex] = changeList.AddLast(new HashSet<PropertyDescriptor>());
                    break;
                case ListChangedType.ItemDeleted:
                    addRemoveQueue.Enqueue(e);
                    // Remove all changes for this item
                    changeList.Remove(changeListDict[e.NewIndex]);
                    // Shift the later entries down by one index
                    for (int i = e.NewIndex; i < Count; i++)
                    {
                        changeListDict[i] = changeListDict[i + 1];
                    }
                    // Drop the now-unused last key (also when the list became empty)
                    changeListDict.Remove(Count);
                    break;
                case ListChangedType.ItemChanged:
                    changeListDict[e.NewIndex].Value.Add(e.PropertyDescriptor);
                    break;
                default:
                    base.OnListChanged(e);
                    break;
            }
        }
    }

    public void PublishChanges()
    {
        lock (publishingLock)
            Publish();
    }

    internal void Publish()
    {
        while (addRemoveQueue.Count != 0)
        {
            base.OnListChanged(addRemoveQueue.Dequeue());
        }
        // The order of the entries in the changeList matches that of the items in 'this'
        int i = 0;
        foreach (var changesForItem in changeList)
        {
            foreach (var pd in changesForItem)
            {
                var lc = new ListChangedEventArgs(ListChangedType.ItemChanged, i, pd);
                base.OnListChanged(lc);
            }
            // Reset so the same changes are not published again next time
            changesForItem.Clear();
            i++;
        }
    }
}

We are talking about two things here:
The changes to the collection. This raises the event INotifyCollectionChanged.CollectionChanged
The changes to the properties of the items. This raises the event INotifyPropertyChanged.PropertyChanged
The interface INotifyCollectionChanged needs to be implemented by your custom collection. The interface INotifyPropertyChanged needs to be implemented by your items. Furthermore, the PropertyChanged event only tells you which property was changed on an item but not what was the previous value.
This means your items need to have an implementation that goes something like this:
Have a timer that runs every N seconds
Create a HashSet<string> that contains the names of all properties that have been changed. Because it is a set, each property can only be contained one or zero times.
When a property is changed, add its name to the hash set if it is not already in it.
When the timer elapses, raise the PropertyChanged event for all properties in the hash set and clear it afterwards.
Your collection would have a similar implementation. It is, however, a little bit harder, because you need to account for items that have been added and deleted between two timer events. This means that when an item is added, you would add it to a hash set "addedItems". If an item is removed, you add it to a "removedItems" hash set, if it is not already in "addedItems". If it is already in "addedItems", remove it from there. I think you get the picture.
To adhere to the principle of separation of concerns and single responsibility, it would be even better to have your items implement INotifyPropertyChanged in the default way and create a wrapper that does the consolidation of the events. That has the advantage that your items are not cluttered with code that doesn't belong there and this wrapper can be made generic and used for every class that implements INotifyPropertyChanged.
The same goes for the collection: You can create a generic wrapper for all collections that implement INotifyCollectionChanged and let the wrapper do the consolidation of the events.
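The item-level wrapper could be sketched roughly as follows. CoalescingNotifier is a hypothetical name, and the timer itself is left out: the sketch assumes something else (for example a timer tick on the UI thread) calls Flush every N seconds. It only shows the consolidation of events, not how a grid would bind to the wrapper.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;

// Hypothetical wrapper: records which properties changed and re-raises
// one PropertyChanged per property when Flush is called.
public sealed class CoalescingNotifier : INotifyPropertyChanged, IDisposable
{
    private readonly INotifyPropertyChanged _source;
    private readonly HashSet<string> _pending = new HashSet<string>();
    private readonly object _gate = new object();

    public event PropertyChangedEventHandler PropertyChanged;

    public CoalescingNotifier(INotifyPropertyChanged source)
    {
        _source = source;
        _source.PropertyChanged += OnSourceChanged;
    }

    private void OnSourceChanged(object sender, PropertyChangedEventArgs e)
    {
        // A set keeps each property name at most once, however often it changed.
        lock (_gate) { _pending.Add(e.PropertyName ?? string.Empty); }
    }

    // Intended to be called from the periodic timer; returns the number of events raised.
    public int Flush()
    {
        string[] names;
        lock (_gate)
        {
            names = new string[_pending.Count];
            _pending.CopyTo(names);
            _pending.Clear();
        }
        // Raise outside the lock so handlers cannot deadlock against new changes.
        foreach (string name in names)
            PropertyChanged?.Invoke(_source, new PropertyChangedEventArgs(name));
        return names.Length;
    }

    public void Dispose() => _source.PropertyChanged -= OnSourceChanged;
}
```

A State property that moves A -> B -> C -> D between two Flush calls therefore produces a single "State" notification, and the handler reads the current value D from the item.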

How would you obtain the first and last items in a Queue?

Say I have a rolling collection of values where I specify the size of the collection and any time a new value is added, any old values beyond this specified size are dropped off. Obviously (and I've tested this) the best type of collection to use for this behavior is a Queue:
myQueue.Enqueue(newValue)
If myQueue.Count > specifiedSize Then myQueue.Dequeue()
However, what if I want to calculate the difference between the first and last items in the Queue? Obviously I can't access the items by index. But to switch from a Queue to something implementing IList seems like overkill, as does writing a new Queue-like class. Right now I've got:
Dim firstValue As Integer = myQueue.Peek()
Dim lastValue As Integer = myQueue.ToArray()(myQueue.Count - 1)
Dim diff As Integer = lastValue - firstValue
That call to ToArray() bothers me, but a superior alternative isn't coming to me. Any suggestions?

One thing you could do is have a temporary variable that stores the value that was just enqueued because that will be the last value and so the variable can be accessed to get that value.

Seems to me that if you need quick access to both the first and last items, you're using the wrong data structure. Switch to a LinkedList instead, which conveniently has First and Last properties.
Be sure you only add and remove items using AddLast and RemoveFirst, to maintain the queue property. To prevent yourself from inadvertently violating the queue property, consider creating a wrapper class around the linked list and exposing only the properties you need from your queue.
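Such a wrapper might look like the sketch below. It is written in C# rather than the question's VB, and RollingWindow is a hypothetical name. It also folds in the size limit from the question, so the oldest value is dropped automatically.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: a fixed-size FIFO window over a LinkedList
// that exposes both ends without copying.
public sealed class RollingWindow<T>
{
    private readonly LinkedList<T> _items = new LinkedList<T>();
    private readonly int _capacity;

    public RollingWindow(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
    }

    public int Count => _items.Count;

    // Oldest value still in the window.
    public T First => _items.Count > 0
        ? _items.First.Value
        : throw new InvalidOperationException("The window is empty.");

    // Most recently added value.
    public T Last => _items.Count > 0
        ? _items.Last.Value
        : throw new InvalidOperationException("The window is empty.");

    public void Enqueue(T item)
    {
        _items.AddLast(item);
        // Only AddLast and RemoveFirst are used, which preserves queue order.
        if (_items.Count > _capacity)
            _items.RemoveFirst();
    }
}
```

The difference from the question then becomes window.Last - window.First, with no call to ToArray().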

public class LastQ<T> : Queue<T>
{
    public T Last { get; private set; }
    public new void Enqueue(T item)
    {
        Last = item;
        base.Enqueue(item);
    }
}
Edit:
Obviously this basic class should be more robust to do things like protect the Last property on an empty queue. But this should be enough for the basic idea.

You could use a deque (double-ended queue).
I don't think there is one built into System.Collections(.Generic) but here's some info on the data structure. If you implemented something like this you could just use PeekLeft() and PeekRight() to get the first and last values.
Of course, it will be up to you whether or not implementing your own deque is preferable to dealing with the unsexiness of ToArray(). :)
http://www.codeproject.com/KB/recipes/deque.aspx

Your best bet would be to keep track of the last value added to the Queue, then use the myQueue.Peek() function to see the "first" (meaning next) item in the list without removing it.

Properly exposing a List<T>?

I know I shouldn't be exposing a List<T> in a property, but I wonder what the proper way to do it is? For example, doing this:
public static class Class1
{
    private readonly static List<string> _list;
    public static IEnumerable<string> List
    {
        get
        {
            return _list;
            //return _list.AsEnumerable<string>(); behaves the same
        }
    }
    static Class1()
    {
        _list = new List<string>();
        _list.Add("One");
        _list.Add("Two");
        _list.Add("Three");
    }
}
would allow my caller to simply cast back to List<T>:
private void button1_Click(object sender, EventArgs e)
{
    var test = Class1.List as List<string>;
    test.Add("Four"); // This really modifies Class1._list, which is bad™
}
So if I want a really immutable List<T> would I always have to create a new list? For example, this seems to work (test is null after the cast):
public static IEnumerable<string> List
{
    get
    {
        return new ReadOnlyCollection<string>(_list);
    }
}
But I'm worried if there is a performance overhead as my list is cloned every time someone tries to access it?

Exposing a List<T> as a property isn't actually the root of all evil; especially if it allows expected usage such as foo.Items.Add(...).
You could write a cast-safe alternative to AsEnumerable():
public static IEnumerable<T> AsSafeEnumerable<T>(this IEnumerable<T> data) {
    foreach(T item in data) yield return item;
}
But your biggest problem at the moment is thread safety. As a static member, you might have big problems here, especially if it is in something like ASP.NET. Even ReadOnlyCollection over an existing list would suffer from this:
List<int> ints = new List<int> { 1, 2, 3 };
var ro = ints.AsReadOnly();
Console.WriteLine(ro.Count); // 3
ints.Add(4);
Console.WriteLine(ro.Count); // 4
So simply wrapping with AsReadOnly is not enough to make your object thread-safe; it merely protects against the consumer adding data (but they could still be enumerating it while your other thread adds data, unless you synchronize or make copies).

Yes and No. Yes, there is a performance overhead, because a new object is created. No, your list is not cloned, it is wrapped by the ReadOnlyCollection.

If the class has no other purpose, you could inherit from Collection<T> and override InsertItem so that it throws an exception. (List<T>.Add is not virtual, so it cannot be overridden directly.)

Use AsReadOnly() - see MSDN for details

You don't need to worry about the overhead of cloning: wrapping a collection with a ReadOnlyCollection does not clone it. It just creates a wrapper; if the underlying collection changes, the readonly version changes also.
If you worry about creating fresh wrappers over and over again, you can cache it in a separate instance variable.
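A minimal sketch of the cached wrapper, reusing the Class1 and _list names from the question, might look like this. The wrapper is created once, so every access to the property returns the same object, and a cast back to List<string> yields null.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class Class1
{
    private static readonly List<string> _list = new List<string> { "One", "Two", "Three" };

    // Created once; it is a live view over _list, not a copy.
    private static readonly ReadOnlyCollection<string> _readOnly = _list.AsReadOnly();

    public static IEnumerable<string> List => _readOnly;
}
```

Because the wrapper is a view, items later added to _list inside the class are visible through it without recreating anything.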

I asked a similar question earlier:
Difference between List and Collection (CA1002, Do not expose generic lists)
Why does DoNotExposeGenericLists recommend that I expose Collection instead of List?
Based on that I would recommend that you use the List<T> internally, and return it as a Collection<T> or IList<T>. Or, if it is only necessary to enumerate and not add or anything like that, IEnumerable<T>.
On the matter of being able to cast what you return in to other things, I would just say don't bother. If people want to use your code in a way that it was not intended, they will be able to in some way or another. I previously asked a question about this as well, and I would say the only wise thing to do is to expose what you intend, and if people use it in a different way, well, that is their problem :p Some related questions:
How should I use properties when dealing with List members (Especially this answer)
Encapsulation of for example collections

If you expose your list as IEnumerable, I wouldn't worry about callers casting back to List. You've explicitly indicated in the contract of your class that only the operations defined in IEnumerable are allowed on this list. So you have implicitly stated that the implementation of that list could change to pretty much anything that implements IEnumerable.

AsEnumerable and ReadOnlyCollection both have a problem when the collection is modified while your enumeration is midway through: neither is thread safe. Returning an array, and caching it at the call site, can be a much better option.
For example,
public static String[] List
{
    get
    {
        return _list.ToArray();
    }
}
// While using ...
String[] values = Class1.List;
foreach (string v in values)
{
    ...
}
// Instead of calling foreach (string v in Class1.List) again and again,
// reuse values: the array is not copied a second time. It is a cached
// snapshot, so later changes to the list will not show up in it, but
// enumerating it is thread safe.
foreach (string v in values)
{
    ...
}
