Cancellable Sort in .NET? - c#

I'm using a ListView in VirtualMode to show an extremely large number of rows, millions of them.
The row data is stored in a generic List.
Now I want to implement a sort feature that sorts the list using some comparer.
The problem is that a single sort currently takes around 30 seconds, and during that time the user cannot do anything with the ListView and has to wait until it finishes.
Not every user will accept waiting that long; most users would cancel the sort if they could, and I want to offer that cancel feature. Unfortunately, neither the built-in List.Sort nor Array.Sort can be cancelled.
The sort already runs on a separate thread, so I could use Thread.Abort, but that would probably leave the list corrupted, which is unacceptable for me.
Is there anything I can do short of reimplementing the whole sort algorithm myself?
Thanks.

Copy the list, sort the copy in a thread, then replace the original list (if the sort completes without being interrupted).
But I'd go with Martinho's suggestion if possible - having millions of rows in the application to begin with feels wrong to me. Databases can do a far better job of filtering and sorting data before it gets to you.
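A minimal sketch of that copy-and-swap approach, using the later Task/CancellationToken APIs (the RowSorter type and its members are my own names for illustration; the original posts predate these APIs):

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: sort a copy on a background task and swap the
// reference in only if the sort completed without being cancelled.
class RowSorter<T>
{
    private List<T> _rows;

    public RowSorter(List<T> rows) { _rows = rows; }

    public List<T> Rows { get { return _rows; } }

    public Task SortAsync(IComparer<T> comparer, CancellationToken token)
    {
        var copy = new List<T>(_rows); // the UI keeps reading the original
        return Task.Run(() =>
        {
            copy.Sort(comparer);       // the long-running part
            token.ThrowIfCancellationRequested();
            _rows = copy;              // reference swap only on success
        }, token);
    }
}
```

If the task is cancelled, `_rows` still points at the untouched original, so the ListView never sees a half-sorted list.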

I had a similar problem. Beating the framework's Array.Sort is not easy, and using Thread.Abort is not recommended at all.
I decided my sort operation would not be cancellable; however, I did think about this solution (I did not test it):
Implement your own comparer with access to an IAsyncResult-style object that has a CancelRequested bool field. Before every comparison, check the flag; if it is true, throw an exception that is caught in your background thread. With no Thread.Abort, every lock can be released correctly and your original items stay safe. But you still need to build an array, either of references or of keys indexing your original items, because throwing from inside a comparison may leave the array being sorted corrupted. I guess that is rarely the case in practice, but there is no guarantee.
Hope this helps
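That comparer idea might be sketched like this (CancellableComparer is a hypothetical name; note that List&lt;T&gt;.Sort wraps an exception thrown by a comparer in an InvalidOperationException, so the caller should catch that and inspect InnerException):

```csharp
using System.Collections.Generic;
using System.Threading;

// Checks for cancellation before every comparison; throwing aborts
// the sort. Sort a copy so the original items are never corrupted.
class CancellableComparer<T> : IComparer<T>
{
    private readonly IComparer<T> _inner;
    private readonly CancellationToken _token;

    public CancellableComparer(IComparer<T> inner, CancellationToken token)
    {
        _inner = inner;
        _token = token;
    }

    public int Compare(T x, T y)
    {
        _token.ThrowIfCancellationRequested(); // bail out mid-sort
        return _inner.Compare(x, y);
    }
}
```

The sorted copy is simply discarded on cancellation, and the original list is never touched.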

There are several ways you could do this. I might write a class that uses the standard BeginXXX/EndXXX asynchronous pattern with a special CancelXXX method. I have left out a lot of code, but I think there is enough here to make the point. Unfortunately with this method you would have to code your own sorting algorithm.
public class Sorter<T>
{
    public IAsyncResult BeginSort(IList<T> values, AsyncCallback complete)
    {
        MyAsyncResult asyncResult = new MyAsyncResult();
        Thread t = new Thread(() =>
        {
            // Implement your sorting algorithm here.
            // Periodically check asyncResult.Cancel at safe points.
            asyncResult.Complete();
            if (complete != null)
            {
                complete(asyncResult);
            }
        });
        t.Start();
        return asyncResult;
    }

    public void EndSort(IAsyncResult asyncResult)
    {
        MyAsyncResult target = asyncResult as MyAsyncResult;
        if (target == null)
        {
            throw new ArgumentException();
        }
        // Add code here to extract any additional information from the
        // IAsyncResult that you might want to return to the client.
        // Perhaps this method will be empty.
    }

    public void CancelSort(IAsyncResult asyncResult)
    {
        MyAsyncResult target = asyncResult as MyAsyncResult;
        if (target == null)
        {
            throw new ArgumentException();
        }
        target.Cancel = true;
    }

    private class MyAsyncResult : IAsyncResult
    {
        private readonly ManualResetEvent m_WaitHandle = new ManualResetEvent(false);
        private volatile bool m_Cancel = false;
        private volatile bool m_IsCompleted = false;

        public bool Cancel
        {
            get { return m_Cancel; }
            set { m_Cancel = value; }
        }

        public object AsyncState { get { return null; } }
        public WaitHandle AsyncWaitHandle { get { return m_WaitHandle; } }
        public bool CompletedSynchronously { get { return false; } }
        public bool IsCompleted { get { return m_IsCompleted; } }

        public void Complete()
        {
            m_IsCompleted = true;
            m_WaitHandle.Set();
        }
    }
}

Depending entirely on the environment in question, one approach is to let the user cancel waiting for the sort (which is running on a separate thread), while you quietly continue sorting the list in the background and then tell them when it's finished with a subtle notification.

If you need the possibility to cancel the sort, then you have to use a separate thread. You can prevent corruption of the original list by creating a copy and sorting the copy. If the user doesn't cancel the sort, just exchange the original list with the sorted one.


Unity C# Firebase can't modify/access list variable in task scope

I am unable to access or edit a list variable inside a GetValueAsync task. Any code that tries to modify or read the list "ShopItems" returns nothing and prevents any further code from running in the same scope; no errors are returned in the console. There are no issues with the "ItemsLoaded" int variable.
public static void IsItemPurchased(string item)
{
    Debug.Log(ShopManager.ShopItems[0]); // This works
    FirebaseDatabase.DefaultInstance.GetReference("/Shop/" + item + "/").GetValueAsync().ContinueWith(task =>
    {
        if (task.IsFaulted || task.IsCanceled)
        {
            Debug.LogError("Database: Failed to check item status - " + task.Exception);
        }
        else if (task.IsCompleted)
        {
            bool isPurchased;
            if (task.Result.Value == null)
                isPurchased = false;
            else
                isPurchased = (bool)task.Result.Value;
            Debug.Log(ShopManager.ShopItems[0]); // Does not work
            Debug.Log(ShopManager.ItemsLoaded); // This works
            ShopManager.ShopItems.Where(i => i.gameObject.name == item).FirstOrDefault().Purchased = isPurchased; // Variable does not update
            ShopManager.ItemsLoaded++;
        }
    });
}
Some things to look at:
It's generally dangerous to change your game state based on the result of a GetValueAsync. See this SO answer for more details (it's an Android answer, but should apply to iOS too), but the general idea is that calling GetValueAsync will generally request data from the server and return whatever data is currently cached (so you miss out on the latest data if you're out of sync). It would be best to listen for the ValueChanged event and update shop items that way.
// Important: cache this reference so you can clean up later.
var shopReference = FirebaseDatabase.DefaultInstance.GetReference("/Shop/");

// You have a Dictionary (or an Array, if your items are more or less linear
// integers) from which you can get the value of all your items and update
// your shop accordingly. You could also listen per item in your shop.
shopReference.ValueChanged += HandleShopUpdated;

// Important: do this in your OnDestroy. These are C# events with no
// knowledge of Unity's lifecycle.
shopReference.ValueChanged -= HandleShopUpdated;
You're using ContinueWith instead of ContinueWithOnMainThread. Depending on your game, if any of these calls touches something in the UnityEngine namespace (say, gameObject.name), it will likely raise an exception for being on the wrong thread. ContinueWithOnMainThread is a drop-in replacement for ContinueWith, so consider using that.
A general consideration, since you're using ContinueWith rather than ContinueWithOnMainThread (i.e. you can likely ignore this if you follow my previous recommendation): ItemsLoaded is accessed from a task, and ItemsLoaded++ is a read-modify-write operation, so marking the field volatile alone is not enough to make it atomic; use Interlocked.Increment instead. Otherwise there is a small chance that you may miss an increment if multiple threads are working on the same piece of memory at once (fairly unlikely, but also difficult to catch).
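A small illustration of the difference (the type and field names are mine, not from the question): Interlocked.Increment performs the whole read-modify-write as one atomic step, which volatile alone does not guarantee.

```csharp
using System.Threading;

class LoadCounter
{
    private int _itemsLoaded; // shared across threads

    public int ItemsLoaded { get { return Volatile.Read(ref _itemsLoaded); } }

    public void OnItemLoaded()
    {
        // _itemsLoaded++ is three steps (read, add, write) and can lose
        // updates under contention; Interlocked makes it one atomic step.
        Interlocked.Increment(ref _itemsLoaded);
    }
}
```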

How would you convert this to an iterative function instead of recursive with a nested loop?

The function below results in hundreds of levels of recursion. I'm wondering if anybody has a suggestion on how to switch this to a loop instead. I believe the order of execution matters in this case, but I may need to investigate that further. I was hoping I could convert it directly into an iterative function.
As you can hopefully see, the only parameter passed to each level of recursion is "after". So the recursive call itself is relatively simple, but the loop surrounding the call is what throws me off.
I've considered a queue, but the "changed" condition seems to imply depth-first checks. I'd have to perform part of the shifting operation prior to adding an item to the queue, but in the current code the next level of recursion starts immediately afterwards, so I can't just build up a queue of items to process and execute them in order.
I've considered a stack, but I'm not sure how I'd go about implementing that to replace the recursion.
I decided to simplify the code, as it was probably a bit confusing. Here's a skeleton (that you could actually run if you initialize the variables!) that's probably more "pseudo"-like:
private void DataChangedRecursive(LinkedNode node)
{
    InitializeVariables();
    try
    {
        foreach (LinkedNode after in node.After)
        {
            var afterDetails = after.Before;
            bool changed = CheckData(afterDetails);
            if (changed)
            {
                DataChangedRecursive(afterDetails);
            }
        }
    }
    catch
    {
        // Assume relevant error handling at this level in the stack. This
        // probably isn't important to maintain, but it'd be interesting if we could.
        throw;
    }
}

public object InitializeVariables()
{
    // Assume relevant work happens here.
    return new object();
}

public bool CheckData(LinkedNode dr)
{
    // Logic is that something changed, so it needs to save. This does a
    // bunch of comparisons on the current item.
    return dr.DataChanged;
}

public class LinkedNode
{
    public LinkedNode Before { get; set; }
    public bool DataChanged { get; set; }
    public List<LinkedNode> After { get; set; }
}
Finally figured it out; I had an epiphany about it. It turns out the easiest way to deal with recursion plus a loop is to take advantage of IEnumerator. It requires manually driving the iteration (no foreach loops), but the order of the recursion and loops is identical.
I split the function into two parts: the entry function and the function that does the iteration along with the "recursion". Essentially, this guarantees all of a node's children are fully completed first, just like the recursion, before returning to the correct iteration point in the parent. This should theoretically work for any recursive function with a loop inside it.
private void DataChangedRecursive(LinkedNode node)
{
    try
    {
        DataChanged(node);
    }
    catch
    {
        throw;
    }
}

private void DataChanged(LinkedNode node)
{
    var state = new Stack<IEnumerator<LinkedNode>>();
    state.Push(node.After.GetEnumerator());
    while (state.Count > 0)
    {
        InitializeVariables();
        while (state.Peek().MoveNext())
        {
            LinkedNode after = state.Peek().Current;
            LinkedNode afterDetails = after.Before;
            bool dataChanged = CheckData(afterDetails);
            if (dataChanged)
            {
                Save(afterDetails);
                state.Push(afterDetails.After.GetEnumerator());
            }
        }
        state.Pop(); // Remove the current enumerator, as we've completed it.
    }
}
There is a region of memory called the stack, where function calls are stored. Suppose your function f calls a function g: g is pushed onto the stack, and when g finishes executing it is popped from the stack and its result, if any, is returned to f at the location where f was interrupted to call g.
This behavior is automatic, but if you understand it (and I advise you to read some articles about it) you will realize what behavior you need to emulate. Yes, your thought that you need a stack is correct. Each item in your stack will have to store state: whenever you push a new item onto the stack, make sure you do not forget the state of the interrupted item you were processing before the push. Also, when you pop an item from the stack, you need to ensure that its result (in this case you have void, so you won't care about the result, but I'm speaking in general terms) is given back correctly to the new top of the stack, and that you accurately restore the correct state after the pop. So you will need to handle your variables carefully.
Now, let's see how it should work. For each level you have a queue of potential new levels. You will need a loop that handles one step per iteration; a step can be an initialization, the handling of a queue item, or a pop. The loop runs until the stack is empty, so make sure your stack does eventually empty to avoid some frustration. To know what to do next, each level of the stack carries a state, so the program always knows whether a given item has been initialized, is currently iterating its queue, or is ready to be finalized. The queue is the set of possible sub-items. A very simplified pseudocode to illustrate the idea:
root.state <- preinitialized
stack.push(root)
while (not stack.empty()) do
    item <- stack.top()
    if (item.state = preinitialized) then
        item.initializeQueue()
        item.queueIterator.initialize()
        item.state <- queue
    else if (item.state = queue) then
        if (item.queueIterator.hasNext()) then
            next <- item.queueIterator.next()
            if (someCondition) then
                next.state <- preinitialized
                stack.push(next)
            end if
        else
            item.state <- finished
            result <- stack.pop()
            if (not stack.empty()) then
                stack.top().apply(result)
            else
                finalResult <- result
            end if
        end if
    end if
end while
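Rendered in C# against the question's LinkedNode type, the explicit per-frame "state" can simply be whether a frame's iterator has been created yet (the Frame class, the Name field, and the returned visited list are my additions for illustration):

```csharp
using System.Collections.Generic;

class LinkedNode
{
    public string Name;
    public LinkedNode Before;
    public bool DataChanged;
    public List<LinkedNode> After = new List<LinkedNode>();
}

static class IterativeWalker
{
    // One stack frame per "recursive call": the node plus its loop position.
    private class Frame
    {
        public LinkedNode Node;
        public IEnumerator<LinkedNode> Iterator;
    }

    public static List<string> DataChangedIterative(LinkedNode root)
    {
        var visited = new List<string>();
        var stack = new Stack<Frame>();
        stack.Push(new Frame { Node = root });
        while (stack.Count > 0)
        {
            var frame = stack.Peek();
            if (frame.Iterator == null)
            {
                // "preinitialized" state: per-level setup would happen here
                frame.Iterator = frame.Node.After.GetEnumerator();
            }
            if (frame.Iterator.MoveNext())
            {
                // "queue" state: handle the next child
                var afterDetails = frame.Iterator.Current.Before;
                if (afterDetails.DataChanged)
                {
                    visited.Add(afterDetails.Name);
                    stack.Push(new Frame { Node = afterDetails });
                }
            }
            else
            {
                // "finished" state: return to the interrupted parent
                stack.Pop();
            }
        }
        return visited;
    }
}
```

Because the child frame is pushed before the parent's iterator advances again, children complete fully before the parent resumes, exactly as in the recursive version.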

Using ConcurrentBag correctly

Edit: thank you, you made me realize that the code below does not work as I assumed, since I somehow thought that a ConcurrentBag works like a HashSet. Sorry about that, you saved me some headache :)
The following function is the only one that can change _currentSetOfStepsProcessing, and it can be called from different threads. I am not sure I understood the use of ConcurrentBag correctly, so please let me know whether, in your opinion, this can work. The _stepsToDo data structure is never modified once the process starts.
void OnStepDone(InitialiseNewUserBase obj)
{
    var stepToDo = _stepsToDo[_currentSetOfStepsProcessing];
    stepToDo.TryTake(out obj);
    if (stepToDo.Count == 0) // can I assume it will enter here once per ConcurrentBag?
    {
        if (_currentSetOfStepsProcessing < _stepsToDo.Count - 1)
        {
            _currentSetOfStepsProcessing++;
        }
    }
}

List<ConcurrentBag<InitialiseNewUserBase>> _stepsToDo = new List<ConcurrentBag<InitialiseNewUserBase>>();
Action _onFinish;
int _currentSetOfStepsProcessing;
stepToDo.TryTake(out obj); might fail, and you don't handle that.
Why are you out-referencing the method argument? This simply overwrites the argument. Why take an argument only to throw it away? More likely, this is a misunderstanding of some kind.
"Can I assume it will enter here once per ConcurrentBag?" No: since access to the bag is concurrent, multiple accessing threads might each see a count of 0. So you need to handle that case better.
You probably should not make things so difficult; use lock in combination with non-concurrent data structures instead. Concurrent collections would only be a good idea if there were a high frequency of bag operations, which seems unlikely here.
What about this:
foreach (/* processing step */)
{
    Parallel.ForEach(/* items in the step */, x => { /* ... */ });
}
Much simpler.
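Spelled out with a hypothetical step representation (a list of groups, where each group holds independent work items; the StepRunner name is mine):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class StepRunner
{
    // Groups run one after another; items inside a group run in parallel.
    // Parallel.ForEach blocks until the whole group has completed, which
    // replaces the hand-rolled "count the bag down to zero" bookkeeping.
    public static void Run(List<List<Action>> steps, Action onFinish)
    {
        foreach (var step in steps)
        {
            Parallel.ForEach(step, item => item());
        }
        onFinish();
    }
}
```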

How to perform this particular type of locking?

I have the following code:
var sequence = from row in CicApplication.DistributorBackpressure44Cache
               where row.Coater == this.Coater && row.IsDistributorInUse
               select new GenericValue
               {
                   ReadTime = row.CoaterTime.Value,
                   Value = row.BackpressureLeft
               };
this.EvaluateBackpressure(sequence, "BackpressureLeftTarget");
And DistributorBackpressure44Cache is defined as follows:
internal static List<DistributorBackpressure44> DistributorBackpressure44Cache
{
    get
    {
        return _distributorBackpressure44;
    }
}
This is part of a heavily threaded application, where DistributorBackpressure44Cache could be refreshed in one thread and queried, as shown above, in another. The variable 'sequence' above is an IEnumerable that is passed to the method shown, and potentially on to other methods, before actually being executed. My concern is this: what will happen with the above query if DistributorBackpressure44Cache is being refreshed (cleared and repopulated) at the moment the query actually executes?
It wouldn't do any good to put a lock around this code, because the query actually executes at some point later (unless I were to convert it to a list immediately).
If your design can tolerate it, you could ensure snapshot level isolation with this code and avoid locking altogether. However, you would need to do the following:
Make DistributorBackpressure44Cache return a ReadOnlyCollection<T> instead, this way it is explicit you shouldn't mutate this data.
Ensure that any mutations to _distributorBackpressure44 occur on a copy and result in an atomic assignment back to _distributorBackpressure44 when complete.
var cache = _distributorBackpressure44.ToList();
this.RefreshCache(cache);            // this assumes you *need* to know
                                     // about the structure of the old list;
                                     // a design where this is not required
                                     // is preferred
_distributorBackpressure44 = cache;  // some readers will have "old" views of
                                     // the cache, but all readers from some
                                     // time T (where T < Twrite) will use
                                     // the same "snapshot"
You can convert it to a list immediately (that might be best),
or
you can put a lock in the getter for DistributorBackpressure44Cache that synchronizes with the cache-refresh lock. You might want to include both a locked and an unlocked accessor: use the unlocked accessor when the result will be consumed immediately, and the locked one when the result will be used in a deferred-execution situation.
Note that even that won't work if the cache refresh mutates the list _distributorBackpressure44 in place; it only works if the refresh replaces the referenced list.
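To make that replace-don't-mutate requirement concrete, here is a sketch (type and field names are assumed, with int standing in for the row type): readers capture the current reference once, and a refresh builds a new list off to the side before swapping the reference, so in-flight deferred queries keep enumerating the old snapshot.

```csharp
using System.Collections.Generic;

static class SnapshotCache
{
    // volatile: readers always see a fully constructed list
    private static volatile List<int> _items = new List<int>();

    public static List<int> Items { get { return _items; } }

    public static void Refresh(IEnumerable<int> fresh)
    {
        var rebuilt = new List<int>(fresh); // build off to the side
        _items = rebuilt;                   // single atomic reference write
    }
}
```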
Without knowing more about your architecture options, you could do something like this.
List<GenericValue> sequence;
lock (CicApplication.DistributorBackpressure44Cache)
{
    // ToList() forces the query to execute while the lock is held;
    // without it, the deferred query would run after the lock is released.
    sequence = (from row in CicApplication.DistributorBackpressure44Cache
                where row.Coater == this.Coater && row.IsDistributorInUse
                select new GenericValue
                {
                    ReadTime = row.CoaterTime.Value,
                    Value = row.BackpressureLeft
                }).ToList();
}
this.EvaluateBackpressure(sequence, "BackpressureLeftTarget");
Then in the code where you do the clear/update you would have something like this.
lock (CicApplication.DistributorBackpressure44Cache)
{
    var objCache = CicApplication.DistributorBackpressure44Cache;
    objCache.Clear();
    // code to add back items here
    // [...]
}
It would be cleaner to have a central class (Singleton pattern, maybe?) that controls everything surrounding the cache. But I don't know how feasible that is (i.e. putting the query code into another class and passing the parameters in). In lieu of something fancier, the above solution should work as long as you consistently remember to lock() every time you read or write this object.

Observable return collection of collection

I'm just starting to learn Observable and all its variations and have run into a strange problem. Here it is:
I have a WCF service declaration (after the 'Add Service Reference' process):
public IAsyncResult ReceiveAllUsersAsync(AsyncCallback Callback, object State)
{
    // Do some work
}
and here is the closing one:
public IObservable<User> EndReceiveAllUsers(IAsyncResult AsyncResultHandler)
{
    // Do some work (actually it's: return AsyncResultHandler.EndInvoke();)
    // return IObservable<User>
}
As you can see, EndReceiveAllUsers returns a collection of Users.
Next I run Rx like so:
// This will not work
Func<IObservable<User>> callWcfService = Observable.FromAsyncPattern<IObservable<User>>(BeginReceiveAll, EndReceiveAll);

// The actual Func<> signature is:
Func<IObservable<IObservable<User>>> callWcfService = Observable.FromAsyncPattern<IObservable<User>>(BeginReceiveAll, EndReceiveAll);
The problem is that whatever is returned from Observable.FromAsyncPattern is an IObservable<> of IObservable<User>; it actually returns IObservable<IObservable<User>>. How could I return just a single IObservable<User> instead of a collection of results?
It really depends on the behavior you want, but to answer your question directly, you can simply concatenate each sequence of users after the completion of the last one:
IObservable<IObservable<User>> tooMuch = callWcfService();
IObservable<User> justRight = tooMuch.Concat();
Edit:
Observable abstracts the multiple calls to ReceiveAllUsersAsync/EndReceiveAllUsers for you, so each time you get a whole collection of Users, produced as a whole by the Observable. If you want to produce the Users one by one, you need to switch to functions that produce users one at a time. ReceiveAllUsersAsync is not that function, as it waits until all the users are obtained and hands them all back in one batch.
The one thing you can do is convert the obtained IEnumerable<User> to an IObservable<User>, but this will again behave in this way: (1) get all the users under the hood, then (2) produce all of them without pause, which is not what you expect from a decent IObservable<>.
Old answer, for reference:
Looking at http://msdn.microsoft.com/en-us/library/hh212031%28v=vs.103%29.aspx:
public static Func<IObservable<TResult>> FromAsyncPattern<TResult>(
    Func<AsyncCallback, Object, IAsyncResult> begin,
    Func<IAsyncResult, TResult> end
)
So you perhaps just need Observable.FromAsyncPattern<User>, as User is your TResult.