How to call multiple invoke operations one by one? - c#

I have some invoke operations (all different) that I need to call based on the user selected items that are stored in an ObservableCollection and return a string/int value.
When only one item is selected it is straightforward: I call the operation, handle the Completed event and get my return value.
The list holds at most about 8 items, and I need to iterate over it and perform the invoke operation for each item.
The foreach does not wait for an InvokeOperation to finish; it just keeps iterating to the end of the list, so the operations effectively run in parallel.
How do I perform only one operation at a time and move on to the next item only when the previous operation has completed (irrespective of the result)?
One-by-one execution of the InvokeOperations is what I'm looking for. Any clues or hints?
Let me know if I'm unclear..
Cheers.
EDIT: The InvokeOperation(s) are different from each other. Each of them performs a different operation on the DB, which can be time consuming. The main reason for executing them one by one is to update the user's screen with the output (success/fail) of each operation as it finishes, rather than all at once.
// Pseudo-code
foreach (var item in SelectedItems)
{
    var id = item.ID;
    switch (id)
    {
        case 1:
        {
            InvokeOperation<int> inv = context.PerformUpdateFor_1(item);
            inv.Completed += (s, a) =>
            {
                // get the value and assign it to a TextBlock
            };
            break;
        }
        case 2:
        {
            InvokeOperation<int> inv = context.PerformUpdateFor_2(item);
            inv.Completed += (s, a) =>
            {
                // get the value and assign it to a TextBlock
            };
            break;
        }
        // Other cases similar to this
    }
}

Take a look at Reactive Extensions for Silverlight. You will easily be able to wrap your RIA Services InvokeOperation in an IObservable which represents its asynchronous completion. One way to do this involves the Observable.Create method.
After that, you will be able to compose the collection of IObservables in many ways: executing them in sequence using Observable.Concat, executing them in parallel using Observable.ForkJoin, filtering them, joining them etc.
I have used this successfully for RIA Services' load operations, and to a smaller extent invoke operations. The composability of observables in my opinion is much nicer to work with than writing callback methods explicitly.
Here is a blog post outlining the approach for load operations, but the approach will be very similar: Linq Query Expression Syntax with RIA Domain Context
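For comparison, here is a minimal sketch of the one-at-a-time idea using a plain Task/await loop instead of Rx (a swapped-in technique, so it only applies where async/await is available). FakeInvokeOperation<T> is a hypothetical stand-in for RIA Services' InvokeOperation<T>, and ToTask and RunOneByOneAsync are illustrative names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for InvokeOperation<T>: starts immediately, finishes on
// a thread-pool thread, then raises Completed.
public sealed class FakeInvokeOperation<T>
{
    private volatile bool _isComplete;
    public event EventHandler Completed;
    public T Value { get; private set; }
    public bool IsComplete { get { return _isComplete; } }

    public FakeInvokeOperation(T value, int delayMs)
    {
        ThreadPool.QueueUserWorkItem(_ =>
        {
            Thread.Sleep(delayMs);          // simulate the slow DB call
            Value = value;
            _isComplete = true;
            var handler = Completed;
            if (handler != null) handler(this, EventArgs.Empty);
        });
    }
}

public static class SequentialInvoke
{
    // Adapts the Completed event to a Task so the caller can await it.
    public static Task<T> ToTask<T>(FakeInvokeOperation<T> op)
    {
        var tcs = new TaskCompletionSource<T>();
        op.Completed += (s, e) => tcs.TrySetResult(op.Value);
        if (op.IsComplete) tcs.TrySetResult(op.Value); // finished before we subscribed
        return tcs.Task;
    }

    // Starts each operation only after the previous one has completed.
    public static async Task<List<T>> RunOneByOneAsync<T>(
        IEnumerable<Func<FakeInvokeOperation<T>>> starters)
    {
        var results = new List<T>();
        foreach (var start in starters)
        {
            T value = await ToTask(start()); // a per-item UI update would go here
            results.Add(value);
        }
        return results;
    }
}
```

Passing factories (Func<...>) rather than already-started operations is what keeps the calls sequential: the next operation is not created until the previous await has finished.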


How to safely iterate over an IAsyncEnumerable to send a collection downstream for message processing in batches

I've watched the chat on LINQ with IAsyncEnumerable, which gave me some insight into extension methods for IAsyncEnumerables, but frankly it wasn't detailed enough for a real-world application, especially at my experience level, and I understand that samples and documentation don't really exist yet for IAsyncEnumerables.
I'm trying to read from a file, do some transformation on the stream, return an IAsyncEnumerable, and then send those objects downstream once an arbitrary number of objects has been obtained, like:
await foreach (var data in ProcessBlob(downloadedFile))
{
//todo add data to List<T> called listWithPreConfiguredNumberOfElements
if (listWithPreConfiguredNumberOfElements.Count == preConfiguredNumber)
await _messageHandler.Handle(listWithPreConfiguredNumberOfElements);
//repeat the behaviour till all the elements in the IAsyncEnumerable returned by ProcessBlob are sent downstream to the _messageHandler.
}
My understanding from reading on the matter so far is that the await foreach line is working on data that employs the use of Tasks (or ValueTasks), so we don't have a count up front. I'm also hesitant to use a List variable and just do a length-check on that as sharing that data across threads doesn't seem very thread-safe.
I'm using the System.Linq.Async package in the hope that I could use a relevant extension method. I can see some promise in the form of TakeWhile, but I don't fully understand how thread-safe my intended approach is, which is costing me confidence.
Any help or push in the right direction would be massively appreciated, thank you.
There is an operator Buffer that does what you want, in the package System.Interactive.Async.
// Projects each element of an async-enumerable sequence into consecutive
// non-overlapping buffers which are produced based on element count information.
public static IAsyncEnumerable<IList<TSource>> Buffer<TSource>(
this IAsyncEnumerable<TSource> source, int count);
This package contains operators like Amb, Throw, Catch, Defer, Finally etc that do not have a direct equivalent in Linq, but they do have an equivalent in System.Reactive. This is because IAsyncEnumerables are conceptually closer to IObservables than to IEnumerables (because both have a time dimension, while IEnumerables are timeless).
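To make the batching behaviour concrete, here is a rough hand-written sketch of what such an operator does. It is not the library's implementation; InBatches is a hypothetical name chosen to avoid clashing with the real Buffer, and it assumes C# 8 async iterators.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AsyncBatching
{
    // Hypothetical helper, not the library's Buffer: groups items into lists of
    // at most `size` elements and yields a final partial batch if one remains.
    public static async IAsyncEnumerable<IList<T>> InBatches<T>(
        this IAsyncEnumerable<T> source, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        var batch = new List<T>(size);
        await foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == size)
            {
                yield return batch;
                batch = new List<T>(size); // fresh list, so the consumer may keep the old one
            }
        }
        if (batch.Count > 0) yield return batch; // stragglers
    }
}

// Usage with the names from the question:
// await foreach (var batch in ProcessBlob(downloadedFile).InBatches(preConfiguredNumber))
//     await _messageHandler.Handle(batch);
```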
"I'm also hesitant to use a List variable and just do a length-check on that as sharing that data across threads doesn't seem very thread-safe."
You need to think in terms of execution flows, not threads, when dealing with async. Since you are await-ing the processing step, there isn't actually a concurrency problem accessing the list: regardless of which threads are used, the list is only ever accessed by one flow at a time.
If you are still concerned, you could new a list per batch, but that is probably overkill. What you do need, however, is two additions - a reset between batches, and a final processing step:
var listWithPreConfiguredNumberOfElements = new List<YourType>(preConfiguredNumber);
await foreach (var data in ProcessBlob(downloadedFile)) // CAF?
{
listWithPreConfiguredNumberOfElements.Add(data);
if (listWithPreConfiguredNumberOfElements.Count == preConfiguredNumber)
{
await _messageHandler.Handle(listWithPreConfiguredNumberOfElements); // CAF?
listWithPreConfiguredNumberOfElements.Clear(); // reset for a new batch
// (replace this with a "new" if you're still concerned about concurrency)
}
}
if (listWithPreConfiguredNumberOfElements.Any())
{ // process any stragglers
await _messageHandler.Handle(listWithPreConfiguredNumberOfElements); // CAF?
}
You might also choose to use ConfigureAwait(false) in the three spots marked // CAF?

How would I write this code with Reactive Programming?

I just started messing around with reactive programming, and I know just enough to write code but not enough to figure out what's happening when I don't get what I expect. I don't really have a mentor available other than blog posts. I haven't found a very good solution to a situation I'm having, and I'm curious about the right approach.
The problem:
I need to get a Foo, which is partially composed of an array of Bar objects. I fetch the Bar objects from web services. So I represented each web service call as an IObservable from which I expect 0 or 1 elements before completion. I want to make an IObservable that will:
Subscribe to each of the IObservable instances.
Wait for up to a 2 second Timeout.
When either both sequences complete or the timeout happens:
Create an array with any of the Bar objects that were generated (there may be 0.)
Produce the Foo object using that Bar[].
I sort of accomplished this with this bit of code:
public Foo CreateFoo() {
var producer1 = webService.BarGenerator()
.Timeout(TimeSpan.FromSeconds(2), Observable.Empty<Bar>());
var producer2 = // similar to above
var pipe = producer1.Concat(producer2);
Bar[] result = pipe.ToEnumerable().ToArray();
...
}
That doesn't seem right, for a lot of reasons. The most obvious is Concat() will start the sequences serially rather than in parallel, so that's a 4-second timeout. I don't really care that it blocks, it's actually convenient for the architecture I'm working with that it does. I'm fine with this method becoming a generator of IObservable, but there's a few extra caveats here that seem to make that challenging when I try:
I need the final array to put producer1 and producer2's result in that order, if they both produce a result.
I'd like to use a TestScheduler to verify the timeout but haven't succeeded at that yet, I apparently don't understand schedulers at all.
This is, ultimately, a pull model, whatever gets the Foo needs it at a distinct point and there's no value to receiving it 'on the fly'. Maybe this tilts the answer to "Don't use Rx". To be honest, I got stuck enough I switched to a Task-based API. But I want to see how one might approach this with Rx, because I want to learn.
var pipe = producer1
    .Merge(producer2)
    .Buffer(Observable.Timer(TimeSpan.FromSeconds(2), testScheduler))
    .Take(1);
var subscription = pipe
    .Select(list => new Foo(list.ToArray()))
    .Subscribe(foo => { /* Do whatever you want with your foo here. */ });
Buffer takes all elements emitted during a window (in our case in two seconds), and outputs a list.
If you want to stick with your pull model, instead of a subscription you could do:
var list = await pipe;
var foo = new Foo(list.ToArray());
//....
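Since the question mentions falling back to a Task-based API, here is a hedged sketch of that alternative for comparison. It is not Rx; it assumes each web service call is exposed as a Task<T>, and CollectWithinAsync is a made-up name. All producers are assumed to be already running, so they share a single timeout, and results come back in the order the producers were passed in.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class TimedCollect
{
    // Waits until every task finishes or the timeout elapses, then returns the
    // results that completed successfully, in the order the tasks were passed in.
    public static async Task<T[]> CollectWithinAsync<T>(
        TimeSpan timeout, params Task<T>[] producers)
    {
        Task all = Task.WhenAll(producers);
        await Task.WhenAny(all, Task.Delay(timeout));
        return producers
            .Where(p => p.Status == TaskStatus.RanToCompletion) // skip late or failed calls
            .Select(p => p.Result)
            .ToArray();
    }
}
```

A producer that times out or fails simply contributes nothing to the array, which matches the "0 or 1 elements" expectation in the question.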

Multi Threading with LINQ to SQL

I am writing a WinForms application. I am pulling data from my database, performing some actions on that data set and then plan to save it back to the database. I am using LINQ to SQL to perform the query to the database because I am only concerned with 1 table in our database so I didn't want to implement an entire ORM for this.
I have it pulling the dataset from the DB. However, the dataset is rather large. So currently what I am trying to do is separate the dataset into 4 relatively equal sized lists (List<object>).
Then I have a separate background worker to run through each of those lists, perform the action and report its progress while doing so. I have it planned to consolidate those sections into one big list once all 4 background workers have finished processing their section.
But I keep getting an error while the background workers are processing their unique list. Do the objects maintain their tie to the DataContext for the LINQ to SQL even though they have been converted to List objects? Any ideas how to fix this? I have minimal experience with multi-threading so if I am going at this completely wrong, please tell me.
Thanks guys. If you need any code snippets or any other information just ask.
Edit: Oops. I completely forgot to give the error message. In the DataContext designer.cs it gives the error An item with the same key has already been added. on the SendPropertyChanging function.
private void Setup(){
List<MyObject> quarter1 = _listFromDB.Take(5000).ToList();
bgw1.RunWorkerAsync();
}
private void bgw1_DoWork(object sender, DoWorkEventArgs e){
e.Result = functionToExecute(bgw1, quarter1);
}
private List<MyObject> functionToExecute(BackgroundWorker caller, List<MyObject> myList)
{
int progress = 0;
foreach (MyObject obj in myList)
{
string newString = createString();
obj.strText = newString;
//report progress here
caller.ReportProgress(progress++);
}
return myList;
}
This same function is called by all four workers, and each is given a different list for myList depending on which worker called the function.
Because a real answer has yet to be posted, I'll give it a shot.
Given that you haven't shown any LINQ-to-SQL code (no usage of DataContext) - I'll take an educated guess that the DataContext is shared between the threads, for example:
using (MyDataContext context = new MyDataContext())
{
// this is just some random query that has not been materialized with ToList(),
// so query execution is deferred: listFromDB is an IQueryable<>
var listFromDB = context.SomeTable.Where(st => st.Something == true);
System.Threading.Tasks.Task.Factory.StartNew(() =>
{
var list1 = listFromDB.Take(5000).ToList(); // runs the SQL query
// call some function on list1
});
System.Threading.Tasks.Task.Factory.StartNew(() =>
{
var list2 = listFromDB.Take(5000).ToList(); // runs the SQL query
// call some function on list2
});
}
Now the error you got - An item with the same key has already been added. - was because the DataContext object is not thread safe! A lot of stuff happens in the background - DataContext has to load objects from SQL, track their states, etc. This background work is what throws the error (because each thread is running the query, the DataContext gets accessed).
At least this is my own personal experience, having come across the same error while sharing a DataContext between multiple threads. You only have two options in this scenario:
1) Before starting the threads, call .ToList() on the query, making listFromDB not an IQueryable<> but an actual List<>. This means that the query has already run and the threads operate on an actual List, not on the DataContext.
2) Move the DataContext definition into each thread. Because the DataContext is no longer shared, no more errors.
The third option would be to re-write the scenario into something else, like you did (for example, make everything sequential on a single background thread)...
First of all, I don't really see why you'd need multiple worker threads at all. Are these lists in separate databases, tables or servers? Do you really want to show 4 progress bars for 4 lists, or are you somehow merging the progress reports into one weird progress bar? :D
Also, you're trying to speed up processing updates to your database, but you never tell LINQ to SQL to save anything, so you're not really batching transactions; you'll just save everything at the end in one big transaction. Is that really what you're aiming for? The progress bar will just stop at 100% and then spend a lot of time on the SQL side.
Just create one background thread and process everything synchronously, but commit a save every so many rows (I'd suggest something like every 1000 rows, but you should experiment with this). It'll be fast, even with millions of rows.
If you really need this multithreaded solution:
The "another blabla with the same key has been added" error suggests that you are adding the same item to multiple "mylists", or adding the same item to the same list twice, otherwise how would there be any errors at all?
Using Parallel LINQ (PLINQ), you can take advantage of multiple CPU cores to process your data. But if your application is going to run on a single-core CPU, splitting the data into pieces won't give you any performance benefit; instead it will incur some context-switching overhead.
Hope it Helps

new objects added during long loop

We currently have a production application that runs as a windows service. Many times this application will end up in a loop that can take several hours to complete. We are using Entity Framework for .net 4.0 for our data access.
I'm looking for confirmation that if we load new data into the system, after this loop is initialized, it will not result in items being added to the loop itself. When the loop is initialized we are looking for data "as of" that moment. Although I'm relatively certain that this will work exactly like using ADO and doing a loop on the data (the loop only cycles through data that was present at the time of initialization), I am looking for confirmation for co-workers.
Thanks in advance for your help.
//update : here's some sample code in c# - question is the same, will the enumeration change if new items are added to the table that EF is querying?
IEnumerable<myobject> myobjects = (from o in db.theobjects where o.id==myID select o);
foreach (myobject obj in myobjects)
{
//perform action on obj here
}
It depends on your precise implementation.
Once a query has been executed against the database then the results of the query will not change (assuming you aren't using lazy loading). To ensure this you can dispose of the context after retrieving query results--this effectively "cuts the cord" between the retrieved data and that database.
Lazy loading can result in a mix of "initial" and "new" data; however once the data has been retrieved it will become a fixed snapshot and not susceptible to updates.
You mention this is a long running process; which implies that there may be a very large amount of data involved. If you aren't able to fully retrieve all data to be processed (due to memory limitations, or other bottlenecks) then you likely can't ensure that you are working against the original data. The results are not fixed until a query is executed, and any updates prior to query execution will appear in results.
I think your best bet is to change the logic of your application so that, when the "loop" logic is deciding whether to do another iteration or exit, you take the opportunity to load the newly added items into the list. See the pseudo code below:
var repo = new Repository();
while (repo.HasMoreItemsToProcess())
{
var entity = repo.GetNextItem();
}
Let me know if this makes sense.
The easiest way to assure that this happens - if the data itself isn't too big - is to convert the data you retrieve from the database to a List<>, e.g., something like this (pulled at random from my current project):
var sessionIds = room.Sessions.Select(s => s.SessionId).ToList();
And then iterate through the list, not through the IEnumerable<> that would otherwise be returned. Converting it to a list triggers the enumeration, and then throws all the results into memory.
If there's too much data to fit into memory, and you need to stick with an IEnumerable<>, then the answer to your question depends on various database and connection settings.
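The difference between a deferred query and a materialized list can be illustrated with plain LINQ to Objects. This is only a stand-in for the principle: an in-memory List<int> plays the role of the table, and real Entity Framework behaviour additionally depends on the provider and connection settings, as noted above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotDemo
{
    public static (int deferredCount, int snapshotCount) Run()
    {
        var table = new List<int> { 1, 2, 3 };                // stand-in for the table
        IEnumerable<int> deferred = table.Where(x => x > 0);  // query defined, not executed
        List<int> snapshot = deferred.ToList();               // executed now, results copied

        table.Add(4); // "new data" arrives after the query was defined

        // The deferred query runs only now and sees 4 items; the snapshot still has 3.
        return (deferred.Count(), snapshot.Count);
    }
}
```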
I'd take a snapshot of ID's to be processed -- quickly and as a transaction -- then work that list in the fashion you're doing today.
In addition to accomplishing the goal of not changing the sample mid-stream, this also gives you the ability to extend your solution to track status on each item as it's processed. For a long-running process, this can be very helpful for progress reporting, restart/retry capabilities, etc.

Publisher/Subscriber in LINQ?

Problem:
IEnumerable<Signal> feed = GetFeed();
var average1 = feed.MovingAverage(10);
var average2 = feed.MovingAverage(20);
var zipped = average1.Zip(average2, (x,y) => Tuple.Create(x,y));
When I iterate through "zipped", GetFeed().GetEnumerator() gets called twice and creates all sorts of synchronization issues. Is there a LINQ operator that can be used to broadcast values from single producer to multiple consumers? I know about Memoize, but in my case I can't predict buffer size to keep slow and fast consumers "happy".
I am thinking about writing my own operator that would keep separate queues for each consumer, but wanted to check if there is an existing solution.
What you want is Reactive Extensions. It's like LINQ to Objects, but in reverse: you don't pull values, they're pushed through observers.
It takes a little while to get used to it, but judging by what you've posted, it's exactly the right model for you.
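As a rough illustration of the push model, here is a sketch that uses only the IObservable<T>/IObserver<T> interfaces built into .NET, without the Rx library. Broadcaster<T> is a hand-written, non-thread-safe approximation of what Rx's Subject<T> provides, and MovingAverage is a hypothetical consumer; neither name comes from the question.

```csharp
using System;
using System.Collections.Generic;

// Minimal broadcaster: every pushed value is forwarded to all subscribers,
// so the source is read only once no matter how many consumers there are.
public sealed class Broadcaster<T> : IObservable<T>
{
    private readonly List<IObserver<T>> _observers = new List<IObserver<T>>();

    public IDisposable Subscribe(IObserver<T> observer)
    {
        _observers.Add(observer);
        return new Unsubscriber(() => _observers.Remove(observer));
    }

    public void Push(T value)
    {
        foreach (var o in _observers.ToArray()) o.OnNext(value);
    }

    private sealed class Unsubscriber : IDisposable
    {
        private readonly Action _dispose;
        public Unsubscriber(Action dispose) { _dispose = dispose; }
        public void Dispose() { _dispose(); }
    }
}

// Hypothetical consumer: keeps the average of the last `window` values.
public sealed class MovingAverage : IObserver<double>
{
    private readonly Queue<double> _values = new Queue<double>();
    private readonly int _window;
    private double _sum;
    public double Current { get; private set; }

    public MovingAverage(int window) { _window = window; }

    public void OnNext(double value)
    {
        _values.Enqueue(value);
        _sum += value;
        if (_values.Count > _window) _sum -= _values.Dequeue();
        Current = _sum / _values.Count;
    }

    public void OnError(Exception error) { }
    public void OnCompleted() { }
}
```

Each consumer keeps only its own window of values, so there is no shared buffer whose size has to be predicted for slow and fast consumers.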
