Does IEnumerable<T>.Count() actually work for IObservable<T>?

Is there an example out there showing me how the Observable.Count<TSource> Method actually works? The examples I come up with appear to return a count wrapped in an observable instead of the expected count.
For example, I expect 1 to be returned from this:
System.Diagnostics.Debug.WriteLine((Observable.Return<string>("Hello world!")).Count());
Will 1 be returned in the future (because, after all, it is an asynchronous sequence)? Or am I missing something fundamental? As of this writing, I actually assume that .Count() will return the results of T and grow over time as long as results are pushed out. Really? Yes.

The aggregate operators in Rx work a bit differently than in LINQ - they do not immediately return a value, they return a future result (i.e. we can only know what the final Count is once the Observable completes).
So if you write:
Observable.Return("foo").Count().Subscribe(x => Console.WriteLine(x));
>>> 1
"because, after all, it is an asynchronous sequence"
This isn't exactly true. Here, everything runs immediately, as soon as somebody calls Subscribe. There is nothing asynchronous about the code above: there are no extra threads, and everything happens inside the Subscribe call.

I think that using an observable that returns immediately and also using the async/await syntax as rasx did in the comments is confusing matters rather too much.
Let's create a stream with 5 elements that come back one every second and then complete:
private IObservable<long> StreamWith5Elements()
{
    return Observable.Interval(TimeSpan.FromSeconds(1))
                     .Take(5);
}
We can call it using async/await magic as in this LINQPad friendly example:
void Main()
{
    CountExampleAsync().Wait();
}
private async Task CountExampleAsync()
{
    int result = await StreamWith5Elements().Count();
    Console.WriteLine(result);
}
But what's going on here is easy to misread - Count() returns an IObservable<int>, but Rx is super-friendly with await and converts that result stream into a Task<int> - and the await then hands back that task's int result.
When you use await against an IObservable<T>, you are implicitly saying that you expect that observable to call OnNext() with a single result and then call OnCompleted(). What actually happens is that you will get a Task<T> that returns the last value sent before the stream terminated (similar to how AsyncSubject<T> behaves).
This is useful because it means any stream can be mapped to a Task, but it does require some careful thought.
Now, the above example is equivalent to the following more traditional Rx:
void Main()
{
    PlainRxCountExample();
}
private void PlainRxCountExample()
{
    IObservable<int> countResult = StreamWith5Elements().Count();
    countResult.Subscribe(count => Console.WriteLine(count));
    /* block until completed for the sake of the example */
    countResult.Wait();
}
Here you can see that Count() is indeed returning a stream of int - to provide an asynchronous count. It will emit its single value only when the source stream completes.
In the early days of Rx, Count() was in fact synchronous.
However, that's not a terribly useful state of affairs since it "Exits the Monad" - i.e. brings you out of IObservable<T> and prevents you from further composition with Rx operators.
Once you start "thinking in streams", the asynchronous nature of Count() is quite intuitive really, since of course you can only provide a count of a stream when it's finished - and why hang around for that?? :)
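To illustrate the earlier point that await hands back the last value of a stream, here is a minimal sketch, assuming the System.Reactive package is referenced. The class and method names are made up for the illustration; Observable.Range(10, 3) is used only because it produces a known, finite sequence (10, 11, 12).

```csharp
using System;
using System.Reactive.Linq;
using System.Threading.Tasks;

// Hypothetical helper class, not part of Rx itself.
public static class AwaitObservableSketch
{
    // Awaiting a multi-element stream yields only its final element.
    public static async Task<int> LastOfRangeAsync()
    {
        return await Observable.Range(10, 3);
    }

    // Count() is a single-element stream, so await yields that one value.
    public static async Task<int> CountOfRangeAsync()
    {
        return await Observable.Range(10, 3).Count();
    }
}
```

Under these assumptions, LastOfRangeAsync() yields 12 (the last element), while CountOfRangeAsync() yields 3 (the only element of the count stream).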


Difference Await and ContinueWith

I've read some threads regarding the difference between await and ContinueWith, but none has answered my question completely.
I've got a data access layer that inserts records in a database using Dapper.
The InsertAsync method is:
public Task<int> InsertAsync(TEntity entity)
{
    return Connection.InsertAsync(entity, Transaction).ContinueWith(r => Convert.ToInt32(r.Result));
}
I don't use async and await because, as I see it, whoever uses this method will be waiting for the result.
Is that correct?
"I don't use async and await because, as I see it, whoever uses this method will be waiting for the result. Is that correct?"
That is not correct. While the await keyword does indeed wait for Connection.InsertAsync to complete before it calls Convert.ToInt32, the moment it starts waiting for Connection.InsertAsync it releases control back to its caller.
In other words, the caller will not be stuck waiting for Connection.InsertAsync to finish. The caller will be told "this will take a while, feel free to do something else", so it can continue.
Now, if the caller themselves was explicitly told to await the InsertAsync(TEntity) method on the same line that you call the method, then it will wait and it won't do anything else (except release control back to its caller), but that's because it was explicitly instructed to wait at that point.
To explain in code:
// I will wait for this result
var myInt = await Connection.InsertAsync(myEntity);
// I will not do this until I receive myInt
var sum = 1 + 1;
var anotherSum = 2 + 2;
var andAnotherSum = 3 + 3;
Without the await, the caller will just move on to the next command and do its work, all the way up to the point where it is finally told that it must await the task returned from InsertAsync(TEntity).
To explain in code:
// I will ask for this result but not wait for it
var myIntTask = Connection.InsertAsync(myEntity);
// I will keep myself busy doing this work
var sum = 1 + 1;
var anotherSum = 2 + 2;
var andAnotherSum = 3 + 3;
// My work is done. I hope the task is already done too.
// If not, I will have to wait for it because I can't put it off any longer.
var myInt = await myIntTask;
"I've read some threads regarding the difference between await and ContinueWith."
For a simple case like this one, the two behave the same from the caller's point of view. However, the ContinueWith syntax has fallen out of popular favor, and the await syntax is much more favored because it reduces nesting and improves readability.
In terms of waiting, the behavior is the same.
Personally, I suspect that ContinueWith is a leftover artifact from initially trying to design async methods the same way that promises in JS work, but this is just a suspicion.
That should be fine. However, there is a recommendation to always pass a TaskScheduler to ContinueWith, to avoid any ambiguity about which context the continuation will run in, even if it does not matter in this particular case.
I would prefer the version
public async Task<int> InsertAsync(TEntity entity)
{
    var r = await Connection.InsertAsync(entity, Transaction);
    return Convert.ToInt32(r);
}
I consider this easier to read, and by default it will resume the continuation on the context captured from the caller. Behind the scenes it will produce very similar code to your example.
You should definitely prefer async/await over the ContinueWith method.
public async Task<int> InsertAsync(TEntity entity)
{
    var result = await Connection.InsertAsync(entity, Transaction);
    return Convert.ToInt32(result);
}
The primitive ContinueWith method has many hidden gotchas. Exceptions thrown synchronously, exceptions wrapped in AggregateExceptions, TaskScheduler.Current ambiguity, SynchronizationContext not captured, nested Task<Task>s not properly unwrapped, will all come and bite you at one point or another, if you get in the habit of following the ContinueWith route.
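One of these gotchas, the AggregateException wrapping, can be sketched with nothing but the base class library. The names below are hypothetical, and FailingInsertAsync is only a stand-in for a database call that fails; it is not Dapper's API.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical sketch comparing how a failure surfaces in both styles.
public static class ContinuationExceptionSketch
{
    // Stand-in for an insert that fails (an assumption, not a real Dapper call).
    private static Task<int> FailingInsertAsync() =>
        Task.FromException<int>(new InvalidOperationException("insert failed"));

    // Reading t.Result inside the continuation rethrows the failure
    // wrapped in an AggregateException.
    public static Task<int> ViaContinueWith() =>
        FailingInsertAsync().ContinueWith(t => t.Result);

    // await rethrows the original exception directly.
    public static async Task<int> ViaAwait() => await FailingInsertAsync();

    // Awaits the given call and reports the type name of what the caller catches.
    public static async Task<string> ObservedExceptionTypeAsync(Func<Task<int>> call)
    {
        try
        {
            await call();
            return "none";
        }
        catch (Exception ex)
        {
            return ex.GetType().Name;
        }
    }
}
```

A caller that awaits ViaContinueWith() catches an AggregateException, while a caller that awaits ViaAwait() catches the InvalidOperationException itself, so a catch block written for the specific exception type only works with the second version.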

How does this ConcurrentDictionary + Lazy<Task<T>> code work?

There are various posts/answers that say that the .NET/.NET Core's ConcurrentDictionary GetOrAdd method is not thread-safe when the Func delegate is used to calculate the value to insert into the dictionary, if the key didn't already exist.
I'm under the belief that when using the factory method of a ConcurrentDictionary's GetOrAdd method, it could be called multiple times "at the same time/in really quick succession" if a number of requests occur at the "same time". This could be wasteful, especially if the call is "expensive". (@panagiotis-kanavos explains this better than I). With this assumption, I'm struggling to understand how some sample code I made seems to work.
I've created a working sample on .NET Fiddle but I'm stuck trying to understand how it works.
A common suggestion I've read is to have a Lazy<Task<T>> value in the ConcurrentDictionary. The idea is that the Lazy prevents other calls from executing the underlying method.
The main part of the code which does the heavy lifting is this:
public static async Task<DateTime> GetDateFromCache()
{
    var result = await _cache.GetOrAdd("someDateTime", new Lazy<Task<DateTime>>(async () =>
    {
        // NOTE: i've made this method take 2 seconds to run, each time it's called.
        var someData = await GetDataFromSomeExternalDependency();
        return DateTime.UtcNow;
    })).Value;
    return result;
}
This is how I read this:
Check if someDateTime key exists in the dictionary.
If yes, return that. <-- That's a thread-safe atomic action. Yay!
If no, then here we go ....
Create an instance of a Lazy<Task<DateTime>> (which is basically instant)
Return that Lazy instance. (so far, the actual 'expensive' operation hasn't been called, yet.)
Now get the Value, which is a Task<DateTime>.
Now await this task .. which finally does the 'expensive' call. It waits 2 seconds .. and then returns the result (some point in Time).
Now this is where I'm all wrong. Because I'm assuming (above) that the value in the key/value is a Lazy<Task<DateTime>> ... which the await would call each time. If the await is called, one at a time (because the Lazy protects other callers from all calling at the same time) then I would have thought that the result would be a different DateTime with each independent call.
So can someone please explain where I'm wrong in my thinking?
(Please refer to the full running code in .NET Fiddle.)
Because I'm assuming (above) that the value in the key/value is a Lazy<Task<DateTime>>
Yes, that is true.
"which the await would call each time. If the await is called, one at a time (because the Lazy protects other callers from all calling at the same time) then I would have thought that the result would be a different DateTime with each independent call."
await is not a call, it is more like "continue execution when the result is available". Accessing Lazy.Value will create the task, and this will initiate the call to the GetDataFromSomeExternalDependency that eventually returns the DateTime. You can await the task however many times you want and get the same result.
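This can be checked with a small sketch that mirrors the shape of the code in the question. The class name, the FactoryRuns counter and the Task.Delay stand-in for the slow dependency are all assumptions added for the illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: counts how often the expensive delegate actually runs.
public static class LazyTaskCacheSketch
{
    private static readonly ConcurrentDictionary<string, Lazy<Task<DateTime>>> Cache =
        new ConcurrentDictionary<string, Lazy<Task<DateTime>>>();

    private static int _factoryRuns;

    public static int FactoryRuns => Volatile.Read(ref _factoryRuns);

    public static Task<DateTime> GetDateFromCache()
    {
        // A new Lazy is constructed on every call, but only the first one is
        // stored; later calls get the stored instance back and the new one is discarded.
        return Cache.GetOrAdd("someDateTime", new Lazy<Task<DateTime>>(async () =>
        {
            Interlocked.Increment(ref _factoryRuns);
            await Task.Delay(100); // stand-in for the slow external dependency
            return DateTime.UtcNow;
        })).Value;
    }
}
```

Awaiting GetDateFromCache() twice gives the same DateTime both times, and FactoryRuns stays at 1: the second await simply observes the already completed task instead of re-running the delegate.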

C# Rx Observable.Never<> behaves like Observable.Empty<>?

I'm new to Rx and have this code snippet for a try.
Observable.Never<string>().Subscribe(Console.Write);
Observable.Empty<string>().Subscribe(Console.Write);
I expected that Never<string>() will behave like Console.ReadKey which will not end, but as I run these 2 lines, they end immediately, so [Never] behaves like [Empty] to me.
What is the correct understanding of [Never] and is there a good sample usage for it?
Neither Observable.Never() nor Observable.Empty() will emit any values. However, the observable built with Observable.Never() will not complete and instead stays "open/active". Whether the observable completes (Empty()) or not (Never()) may matter at the place where you consume it, but this depends on your actual use case.
Having observables which don't emit any values might sound useless, but maybe you are at a location where you have to provide an observable (instead of using null). So you can write something like this:
public override IObservable<string> NameChanged => Observable.Never<string>();
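The difference between completing and staying open becomes visible once a completion callback is passed to Subscribe. The following is a minimal sketch, assuming the System.Reactive package is referenced; the helper class and its method are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: records the notifications that arrive during Subscribe.
public static class EmptyVersusNeverSketch
{
    public static List<string> Notifications(IObservable<string> source)
    {
        var log = new List<string>();

        // Subscribe with both an OnNext and an OnCompleted callback.
        IDisposable subscription = source.Subscribe(
            value => log.Add("OnNext: " + value),
            () => log.Add("OnCompleted"));

        // Unsubscribe again; for Never() this is the only way the subscription ends.
        subscription.Dispose();
        return log;
    }
}
```

Calling Notifications(Observable.Empty<string>()) returns a list containing only "OnCompleted", while Notifications(Observable.Never<string>()) returns an empty list, since Never() signals neither a value nor completion.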
So I don't have a ton of experience with Rx, but I believe all Subscribe is doing is registering what to do when the observable emits. If your observable never emits (i.e. Empty or Never) then the method is never called. The application is not waiting for the subscription itself to end. If you wanted to wait forever you would use something like
Observable.Never<string>().Wait();
This ties back into the reason you should not use async operations in Subscribe. Take the following code
static void Main(string[] args)
{
    Observable.Range(1, 5).Subscribe(async x => await DoTheThing(x));
    Console.WriteLine("done");
}
static async Task DoTheThing(int x)
{
    await Task.Delay(TimeSpan.FromSeconds(x));
    Console.WriteLine(x);
}
When run, the application will immediately write "done" and exit as soon as the values have been pushed through the observable, because the observable has no way of knowing whether the subscriber's async handler has finished. Hopefully I made that clear, and if someone with more Rx knowledge wants to step in to help, that'd be good.
This link gives you the difference between empty, never, and throw:
http://reactivex.io/documentation/operators/empty-never-throw.html
And this is one usage of Never:
https://rxjs-dev.firebaseapp.com/api/index/const/NEVER

C# await vs continuations: not quite the same?

After reading Eric Lippert’s answer I got the impression that await and call/cc are pretty much two sides of the same coin, with at most syntactic differences. However, upon trying to actually implement call/cc in C# 5, I ran into a problem: either I misunderstand call/cc (which is fairly possible), or await is only reminiscent of call/cc.
Consider pseudo-code like this:
function main:
    foo();
    print "Done"
function foo:
    var result = call/cc(bar);
    print "Result: " + result;
function bar(continuation):
    print "Before"
    continuation("stuff");
    print "After"
If my understanding of call/cc is correct, then this should print:
Before
Result: stuff
Done
Crucially, when the continuation is called, the program state is restored along with the call history, so that foo returns into main and never comes back to bar.
However, if implemented using await in C#, calling the continuation does not restore this call history. foo returns into bar, and there’s no way (that I can see) that await can be used to make the correct call history part of the continuation.
Please explain: did I completely misunderstand the operation of call/cc, or is await just not quite the same as call/cc?
Now that I know the answer, I have to say that there’s a good reason to think of them as fairly similar. Consider what the above program looks like in pseudo-C#-5:
function main:
    foo();
    print "Done"
async function foo:
    var result = await(bar);
    print "Result: " + result;
async function bar():
    print "Before"
    return "stuff";
    print "After"
So while the C# 5 style never gives us a continuation object to pass a value to, overall the similarity is quite striking. Except that this time it’s totally obvious that "After" never gets called, unlike in the true-call/cc example, which is another reason to love C# and praise its design!
await is indeed not quite the same as call/cc.
The kind of totally fundamental call/cc that you are thinking of would indeed have to save and restore the entire call stack. But await is just a compile-time transformation. It does something similar, but not using the real call stack.
Imagine you have an async function containing an await expression:
async Task<int> GetInt()
{
    var intermediate = await DoSomething();
    return calculation(intermediate);
}
Now imagine that the function you call via await itself contains an await expression:
async Task<int> DoSomething()
{
    var important = await DoSomethingImportant();
    return un(important);
}
Now think about what happens when DoSomethingImportant() finishes and its result is available. Control returns to DoSomething(). Then DoSomething() finishes and what happens then? Control returns to GetInt(). The behaviour is exactly as it would be if GetInt() were on the call stack. But it isn’t really; you have to use await at every call that you want simulated this way. Thus, the call stack is lifted into a meta-call-stack that is implemented in the awaiter.
The same, incidentally, is true of yield return:
IEnumerable<int> GetInts()
{
    foreach (var str in GetStrings())
        yield return computation(str);
}
IEnumerable<string> GetStrings()
{
    foreach (var stuff in GetStuffs())
        yield return computation(stuff);
}
Now if I call GetInts(), what I get back is an object that encapsulates the current execution state of GetInts() (so that calling MoveNext() on it resumes operation where it left off). This object itself contains the iterator that is iterating through GetStrings() and calls MoveNext() on that. Thus, the real call stack is replaced by a hierarchy of objects which recreate the correct call stack each time via a series of calls to MoveNext() on the next inner object.
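The way execution is suspended and resumed through MoveNext() can be made visible with a small sketch. It is a simplified, hypothetical variant of the iterators above (GetLengths and the Log list are made up for the illustration), with log entries marking when each body actually runs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: logs when each iterator body actually executes.
public static class IteratorStateSketch
{
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<string> GetStrings()
    {
        Log.Add("GetStrings started");
        yield return "a";
        Log.Add("GetStrings resumed");
        yield return "b";
    }

    public static IEnumerable<int> GetLengths()
    {
        Log.Add("GetLengths started");
        // The outer state object holds the inner enumerator and drives it
        // through MoveNext() each time the outer one is advanced.
        foreach (var s in GetStrings())
            yield return s.Length;
    }
}
```

Calling GetLengths() alone logs nothing, because no code of either body has run yet. The first MoveNext() on its enumerator logs "GetLengths started" and "GetStrings started" and stops at the first yield; the second MoveNext() resumes the inner iterator and logs "GetStrings resumed".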

Async CTP Task.WhenAll issue

I am stuck on something which appears simple but I cannot see what I'm doing wrong. I have a simple class structure 'StaticQuote' which describes values returned from a complex calculation for which I am trying to find the lowest value. Because the Calculate call is expensive I am creating an array of tasks so the calculations execute in parallel, and am then using a Task.WhenAll to wait until they have all finished before comparing the results (which are stored in StaticQuote objects). The problem is that when trying to return the array of StaticQuotes I am getting the following error:
Cannot implicitly convert type 'System.Threading.Tasks.Task<Services.QuoteGeneratorAsync.StaticQuote[]>' to 'Services.QuoteGeneratorAsync.StaticQuote[]'
I have seen similar examples with strings etc. where this assignment works perfectly, so I cannot understand why the right-hand side is not returning an array of StaticQuote results. I am new to multi-threaded code and the Async CTP. Can anyone provide the answer? Many thanks.
example problem:
List<Task<StaticQuote>> Calculations = new List<Task<StaticQuote>>();
foreach()
{
    Calculations.Add(TaskEx.RunEx(() => Calculate(...my params....)));
}
StaticQuote[] Quotes = TaskEx.WhenAll<StaticQuote>(Calculations); // this line won't compile
TaskEx.WhenAll returns a Task<T[]> which indicates when all the other tasks have finished. So you want:
StaticQuote[] quotes = await TaskEx.WhenAll(Calculations);
The await expression "unwraps" a Task<T> to a T. So elsewhere if you've got:
Task<string> downloadTask = webClient.DownloadStringTaskAsync(url);
string result = await downloadTask;
it's exactly the same thing - it's just that the WhenAll version is slightly more complicated because it's got a collection of task inputs and outputs, instead of a single one.
Obviously in order to use await you have to be in an async method to start with.
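Put together, the whole pattern might look like the sketch below. It uses Task.Run and Task.WhenAll, the names these APIs have in released versions of .NET, rather than the CTP's TaskEx. The decimal result type and the Calculate body are placeholders, since the real StaticQuote class and calculation are not shown in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical sketch of "run in parallel, await all, pick the lowest".
public static class WhenAllSketch
{
    // Placeholder for the expensive Calculate call from the question.
    private static decimal Calculate(int input) => input * 1.5m;

    public static async Task<decimal> LowestQuoteAsync(IEnumerable<int> inputs)
    {
        // Start one task per input so the calculations can run in parallel.
        List<Task<decimal>> calculations = inputs
            .Select(input => Task.Run(() => Calculate(input)))
            .ToList();

        // WhenAll returns Task<decimal[]>; await unwraps it to decimal[].
        decimal[] quotes = await Task.WhenAll(calculations);
        return quotes.Min();
    }
}
```

With this placeholder calculation, LowestQuoteAsync(new[] { 4, 2, 6 }) produces 3.0.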
If all of this is still confusing, you might want to read my blog posts about async, as well as those of Eric Lippert. (There are plenty of others available too, of course.)
