I have many methods calling each other that each have to perform certain tasks, some of them asynchronous, and that all operate on a DOM (so only one thread may access the DOM at any time).
For example:
object A() {
/*...A() code 1...*/
var res = B();
/*...A() code 2 that uses res...*/
}
object B() {
/*...B code 1...*/
var res1 = C();
/*...B code 2 that uses res1...*/
var res2 = C();
/*...B code 3 that uses res2...*/
}
object C() {
/*...C code 1...*/
if (rnd.NextDouble() < 0.3) { // unpredictable condition
startAsyncStuff();
/*...C code 2 that uses async result above...*/
}
if (rnd.NextDouble() < 0.7) { // unpredictable condition
startOtherAsyncStuff();
/*...C code 3 that might use any/both async results above...*/
}
}
Now let's say I have a method that wants to execute method A() 1000 times as fast as possible. The async methods can run in separate threads, but all other code must access the DOM one at a time. Ideally, when the async calls are reached, execution of A(), B() and C() is paused so that A() can be called again.
There are two ways I can think of to do this. One is with yield: by changing all the methods to iterators, I can pause and resume execution:
struct DeferResult {
    public object Result;
    public bool Deferred;
}
IEnumerable<DeferResult> A() {
    /*...A() code 1...*/
    object res = null;
    foreach (var dres in B()) {
        if (dres.Deferred) yield return dres; // pass the pause up to the caller
        else res = dres.Result;               // final value produced by B()
    }
    /*...A() code 2 that uses res...*/
}
IEnumerable<DeferResult> B() {
    /*...B code 1...*/
    object res1 = null;
    foreach (var dres1 in C()) {
        if (dres1.Deferred) yield return dres1;
        else res1 = dres1.Result;
    }
    /*...B code 2 that uses res1...*/
    object res2 = null;
    foreach (var dres2 in C()) {
        if (dres2.Deferred) yield return dres2;
        else res2 = dres2.Result;
    }
    /*...B code 3 that uses res2...*/
    yield return new DeferResult { Result = someResult() };
}
IEnumerable<DeferResult> C() {
    /*...C code 1...*/
    if (rnd.NextDouble() < 0.3) { // unpredictable condition
        startAsyncStuff();
        yield return new DeferResult { Deferred = true };
        /*...C code 2 that uses async result above...*/
    }
    if (rnd.NextDouble() < 0.7) { // unpredictable condition
        startOtherAsyncStuff();
        yield return new DeferResult { Deferred = true };
        /*...C code 3 that might use any/both async results above...*/
    }
    yield return new DeferResult { Result = someResult() };
}
void Main() {
    var deferredMethods = new List<IEnumerator<DeferResult>>();
    for (int i = 0; i < 1000; i++) {
        var en = A().GetEnumerator();
        if (en.MoveNext())
            if (en.Current.Deferred)
                deferredMethods.Add(en);
    }
    // then use events from the async methods so when any is done continue
    // running its enumerator to execute the code until the next async
    // operation, or until finished
    // once all 1000 iterations are complete call an AllDone() method.
}
This method has quite some overhead from the iterators, and is a bit more code intensive, however it all runs on one thread so I don't need to synchronize the DOM access.
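The driver loop that the comments in Main() describe could look roughly like the sketch below. It is a single-threaded illustration under simplifying assumptions: IteratorScheduler, Work and RunAll are hypothetical names, the iterators yield a plain bool instead of DeferResult, and every paused iterator is simply resumed in round-robin order, whereas real code would resume an iterator only when its async completion event fires.

```csharp
using System;
using System.Collections.Generic;

static class IteratorScheduler
{
    // Hypothetical stand-in for A(): yields once per simulated async wait.
    public static IEnumerable<bool> Work(int id, int waits, List<string> log)
    {
        for (int i = 0; i < waits; i++)
        {
            log.Add($"job {id} step {i}");
            yield return true; // "deferred": control goes back to the scheduler
        }
        log.Add($"job {id} done");
    }

    // Round-robin driver: keeps resuming paused iterators until all have finished.
    // Returns the number of jobs that ran to completion.
    public static int RunAll(IEnumerable<IEnumerable<bool>> jobs)
    {
        var pending = new Queue<IEnumerator<bool>>();
        foreach (var job in jobs)
            pending.Enqueue(job.GetEnumerator());

        int completed = 0;
        while (pending.Count > 0)
        {
            var en = pending.Dequeue();
            if (en.MoveNext())
            {
                pending.Enqueue(en); // paused again, resume later
            }
            else
            {
                en.Dispose();        // iterator ran to the end
                completed++;
            }
        }
        return completed;
    }
}
```

Because everything runs on the calling thread, the jobs interleave only at their yield points, which is what keeps the DOM access unsynchronized in this approach.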
Another way would be to use threads (1000 simultaneous threads are a bad idea, so I'd implement some kind of thread pooling), but this requires synchronizing DOM access, which is costly.
Are there any other methods I can use to defer code execution under these conditions? What would be the recommended way to do this?
As Karl has suggested, does this need to be multi-threaded? I might go for a multi-threaded solution if:
DOM accesses are random but infrequent.
All other code in A, B and C is substantial in terms of time (compared to the DOM access code).
All other code in A, B and C can be executed in a thread-safe way without any locking; that is, if it depends on some shared state, then you have to synchronize access to that as well.
In such a case, I would consider using a thread pool to launch A multiple times while synchronizing access to the DOM. The cost of DOM synchronization can be reduced using thread-safe caching, although that depends on the kind of DOM access.
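As a rough sketch of the thread-pool idea, the example below runs the independent part of each iteration in parallel and takes a lock only around the shared structure. LockedDomDemo and Run are hypothetical names, and a List<int> stands in for the DOM; the real A(), B() and C() are not modelled.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

static class LockedDomDemo
{
    // Runs `iterations` units of work on the thread pool and returns how many
    // entries ended up in the shared structure.
    public static int Run(int iterations)
    {
        var dom = new List<int>();      // hypothetical stand-in for the DOM
        var domLock = new object();

        Parallel.For(0, iterations, i =>
        {
            int computed = i * 2;       // thread-safe work: touches no shared state

            lock (domLock)              // only one thread touches the "DOM" at a time
            {
                dom.Add(computed);
            }
        });

        return dom.Count;
    }
}
```

The shorter the locked section is relative to the unlocked work, the more this approach pays off, which matches the conditions listed above.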
Is there a way in C# to call a method so that, if the method takes too long to complete, it is canceled and control returns to the calling method? I think I can do this with threading, but what if threading is not needed?
For reference, the method I may need to kill/stop/abort is calling the CorelDraw 15 API. This opens an instance of CorelDraw and I have received non-repeatable errors in this method. Meaning, I can process the same image twice and one time it will freeze or error and the other it will not.
The current solution I am using is to have a second application that does Process.Start(firstAppExecutablePath) and then checks a variable in a text file; if the variable doesn't change after 10 minutes, .Kill(); is called on the process instance. I would prefer to avoid this solution if possible, as it seems clunky and prone to issues. Because it calls .Kill();, things close very messily, although this generally does not cause an issue.
Not built-in, no, since interrupting arbitrary code cannot be done safely (what if it's in the middle of calling a C library function (that doesn't support exceptions) which has just taken a global lock and needs to release it?).
But you can write such support yourself. I wouldn't add threads to the mix unless absolutely necessary, since they come with an entire new dimension of potential problems.
Example:
void Caller()
{
int result;
if (TryDoSomething(out result, 100)) {
System.Console.WriteLine("Result: {0}", result);
}
}
bool TryDoSomething(out int result, int timeoutMillis)
{
var sw = Stopwatch.StartNew();
result = 0x12345678;
for (int i = 0; i != 100000000; ++i) {
if (sw.ElapsedMilliseconds > timeoutMillis)
return false;
result += i / (System.Math.Abs(result % 43) + 1) + (i % 19); // + 1 avoids division by zero
}
return true;
}
Threading is absolutely needed unless you are OK with checking the timeout from within the function, which you probably aren't. So here is a minimalistic approach with threads:
private static bool ExecuteWithTimeout(TimeSpan timeout, Action action)
{
Thread x = new Thread(() => { action(); });
x.Start();
if (!x.Join(timeout))
{
x.Abort(); //Or Interrupt instead, if you use e.g. Thread.Sleep in your method
return false;
}
return true;
}
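Note that Thread.Abort is not supported on .NET Core and .NET 5+, where it throws PlatformNotSupportedException. A cooperative variant of the same idea is sketched below, assuming the work can be changed to poll a CancellationToken; TimeoutRunner and TryRun are hypothetical names. It cannot stop code that never checks the token (such as a hung third-party API call); in that case it only stops waiting for it.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class TimeoutRunner
{
    // Returns true if `work` finished within `timeout`, false otherwise.
    // If `work` throws, Task.Wait rethrows the failure as an AggregateException.
    public static bool TryRun(Action<CancellationToken> work, TimeSpan timeout)
    {
        // Deliberately not disposed here: the work may still read the token
        // after this method has returned.
        var cts = new CancellationTokenSource();
        Task task = Task.Run(() => work(cts.Token));

        if (task.Wait(timeout))
            return true;

        cts.Cancel(); // ask the work to stop; it has to poll the token itself
        return false;
    }
}
```

A caller would pass something like `token => { while (!token.IsCancellationRequested) { /* one unit of work */ } }` as the work delegate.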
I'm talking about a single-threaded (not TaskEx for Windows Phone), synchronous (no async/await), pure Task. (OK, even the basic Task is designed to be async, which makes this question rather senseless.)
Can it be useful in some cases (I have a fairly common kind of app that pulls data from the server, deserializes it and shows the results), or is Task just a foundation for await TaskEx.Run()?
EDIT1: I mean, how this
void Foo()
{
DoSmth();
}
void Main()
{
int a = 1;
Foo();
int b = 1;
}
would differ from
void Main()
{
int a = 1;
Task.Run(() => DoSmth());
int b = 1;
}
Calling Foo(); is also kind of a "promise that the next code will be called after Foo() is done".
EDIT2: I just ran this in a WP7 app:
Debug.WriteLine("OnLoaded {0} ", Thread.CurrentThread.ManagedThreadId);
Task.Factory.StartNew(() =>
{
Thread.Sleep(5000);
Debug.WriteLine("Run Id: {0}", Thread.CurrentThread.ManagedThreadId);
});
Debug.WriteLine("Done");
Got the output:
OnLoaded 1
Done
Run Id: 4
So, is Task.Factory.StartNew() the same as TaskEx.Run()?
EDIT3: so, here is a short summary (as Task.Factory.StartNew() is the same as TaskEx.Run()):
Thread.Sleep(5000); // UI is frozen for 5 seconds
int a = 1; // this is called after 5 seconds
TaskEx.Run(() =>
{
    Thread.Sleep(5000);
    int a = 1; // this is called after 5 seconds
});
int b = 2; // UI is not frozen, this is called instantly
await TaskEx.Run(() => // UI is not frozen, but...
{
    Thread.Sleep(5000);
    int a = 1; // this is called after 5 seconds
});
int b = 2; // this is called when the task is done
A Task is just a way to represent something that will complete in the future. This is most commonly an asynchronous operation or something running in a background thread (via Task.Run/TaskEx.Run).
A "synchronous pure Task" really doesn't make sense - the entire purpose of a Task is to represent something that is not synchronous.
Can it be useful in some cases (I have a fairly common kind of app that pulls data from the server, deserializes it and shows the results),
In this case, since the data is being pulled from a server, it is by its nature a good candidate for an asynchronous operation. This would make it a perfect candidate for Task (or Task<T>).
In response to your edit:
In the first version, everything is just run sequentially.
The second version, using Task.Run, actually causes DoSmth() to execute in a background thread. The Task returned can be used with await to asynchronously wait for it to complete, if you want to do so. This means that DoSmth() will potentially run at the same time as the assignment to b (and subsequent operations).
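For the "pull data from the server, then show results" case, a Task<T> might be used roughly as in the sketch below. ServerPullSketch and FetchLengthAsync are hypothetical names, and the fetch delegate is an assumed stand-in for the real download and deserialization code.

```csharp
using System;
using System.Threading.Tasks;

static class ServerPullSketch
{
    // Runs the (blocking) fetch on a thread-pool thread and resumes once it is done,
    // so the calling thread is free in the meantime.
    public static async Task<int> FetchLengthAsync(Func<string> fetch)
    {
        string payload = await Task.Run(fetch); // background work
        return payload.Length;                  // runs only after the fetch completed
    }
}
```

The returned Task<int> is the "promise" here: the caller can await it, or carry on with other work and look at the result later.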
I have a simple function as the following:
static Task<A> Peirce<A, B>(Func<Func<A, Task<B>>, Task<A>> a)
{
var aa = new TaskCompletionSource<A>();
var tt = new Task<A>(() =>
a(b =>
{
aa.SetResult(b);
return new TaskCompletionSource<B>().Task;
}).Result
);
tt.Start();
return Task.WhenAny(aa.Task, tt).Result;
}
The idea is simple: any implementation of a must return a Task<A> to me. For this purpose, it may or may not use its parameter (of type Func<A, Task<B>>). If it does, our callback will be called and will set the result of aa, and then aa.Task will complete. Otherwise, the result of a does not depend on its parameter, so we simply return its value. In either situation, aa.Task or the result of a will complete, so it should never block unless a does not use its parameter and blocks, or the task returned by a blocks.
The above code works, for example
static void Main(string[] args)
{
Func<Func<int, Task<int>>, Task<int>> t = a =>
{
return Task.FromResult(a(20).Result + 10);
};
Console.WriteLine(Peirce(t).Result); // output 20
t = a => Task.FromResult(10);
Console.WriteLine(Peirce(t).Result); // output 10
}
The problem here is that the two tasks aa.Task and tt must be cleaned up once the result of WhenAny has been determined; otherwise I am afraid there will be a leak of hanging tasks. I do not know how to do this. Can anyone suggest something? Or is this actually not a problem, and C# will do it for me?
P.S. The name Peirce came from the famous "Peirce's Law"(((A->B)->A)->A) in propositional logic.
UPDATE: the point is not to "dispose" the tasks but rather to stop them from running. I have tested this: when I put the "main" logic in a loop of 1000 iterations, it runs slowly (about 1 iteration per second) and creates a lot of threads, so it is a problem to solve.
A Task is a managed object. Unless you are introducing unmanaged resources, you shouldn't worry about a Task leaking resources. Let the GC clean it up and let the finalizer take care of the WaitHandle.
EDIT:
If you want to cancel tasks, consider using cooperative cancellation in the form of a CancellationTokenSource. You can pass this token to any tasks via the overload, and inside of each task, you may have some code as follows:
while (someCondition)
{
if (cancelToken.IsCancellationRequested)
break;
}
That way your tasks can gracefully clean up without throwing an exception. However, you can propagate an OperationCanceledException if you call cancelToken.ThrowIfCancellationRequested(). So the idea in your case would be that whatever finishes first can issue the cancellation to the other tasks so that they aren't hung up doing work.
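The "whatever finishes first cancels the others" idea could be sketched as follows. FirstWins and RaceAsync are hypothetical names, and the sketch assumes every contender accepts a CancellationToken and honors it.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class FirstWins
{
    // Starts all contenders with a shared token, returns the first result,
    // and signals cancellation to the rest.
    public static async Task<int> RaceAsync(params Func<CancellationToken, Task<int>>[] contenders)
    {
        using (var cts = new CancellationTokenSource())
        {
            var running = contenders.Select(c => c(cts.Token)).ToList();
            Task<int> winner = await Task.WhenAny(running);
            cts.Cancel();        // the losers observe the token and stop
            return await winner; // rethrows if the winner itself failed
        }
    }
}
```

A contender such as `async token => { await Task.Delay(1000, token); return 1; }` stops waiting as soon as the token is cancelled, so it does not linger after the race is decided.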
Thanks to @Bryan Crosby's answer, I can now implement the function as follows:
private class CanceledTaskCache<A>
{
public static Task<A> Instance;
}
private static Task<A> GetCanceledTask<A>()
{
if (CanceledTaskCache<A>.Instance == null)
{
var aa = new TaskCompletionSource<A>();
aa.SetCanceled();
CanceledTaskCache<A>.Instance = aa.Task;
}
return CanceledTaskCache<A>.Instance;
}
static Task<A> Peirce<A, B>(Func<Func<A, Task<B>>, Task<A>> a)
{
var aa = new TaskCompletionSource<A>();
Func<A, Task<B>> cb = b =>
{
aa.SetResult(b);
return GetCanceledTask<B>();
};
return Task.WhenAny(aa.Task, a(cb)).Unwrap();
}
and it works pretty well:
static void Main(string[] args)
{
for (int i = 0; i < 1000; ++i)
{
Func<Func<int, Task<String>>, Task<int>> t =
async a => (await a(20)).Length + 10;
Console.WriteLine(Peirce(t).Result); // output 20
t = async a => 10;
Console.WriteLine(Peirce(t).Result); // output 10
}
}
Now it is fast and does not consume too many resources. It can be even faster (about 70 times on my machine) if you do not use the async/await keywords:
static void Main(string[] args)
{
for (int i = 0; i < 10000; ++i)
{
Func<Func<int, Task<String>>, Task<int>> t =
a => a(20).ContinueWith(ta =>
ta.IsCanceled ? GetCanceledTask<int>() :
Task.FromResult(ta.Result.Length + 10)).Unwrap();
Console.WriteLine(Peirce(t).Result); // output 20
t = a => Task.FromResult(10);
Console.WriteLine(Peirce(t).Result); // output 10
}
}
The issue here is that even if you can detect the return value of a(20), there is no way to cancel the async block other than throwing an OperationCanceledException, and that prevents WhenAny from being optimized.
UPDATE: optimised code and compared async/await and native Task API.
UPDATE: If I can write the following code it will be ideal:
static Task<A> Peirce<A, B>(Func<Func<A, Task<B>>, Task<A>> a)
{
var aa = new TaskCompletionSource<A>();
return await? a(async b => {
aa.SetResult(b);
await break;
}) : await aa.Task;
}
Here, await? a : b has the value of a's result if a succeeds, and the value of b if a is cancelled (as with a ? b : c, a's result should have the same type as b).
await break will cancel the current async block.
As Stephen Toub of MS Parallel Programming Team says: "No. Don't bother disposing of your tasks."
tldr: In most cases, disposing of a task does nothing, and when the task actually has allocated unmanaged resources, its finalizer will release them when the task object is collected.
I'm new to Rx.
I'd like to traverse an IEnumerable and publish to multiple DataHandlers that process the data in their respective threads.
Below is my sample program. The publish works and a new thread is created, but the 3 RowHandlers are all running in 1 thread. I need 3 threads. What is the best way to implement this?
class Program
{
public class MyDataGenerator
{
public IEnumerable<int> myData()
{
//Heavy lifting....Don't want to process more than once.
yield return 1;
yield return 2;
yield return 3;
yield return 4;
yield return 5;
yield return 6;
}
}
static void Main(string[] args)
{
MyDataGenerator h = new MyDataGenerator();
Console.WriteLine("Thread id " + Thread.CurrentThread.ManagedThreadId.ToString());
//
var shared = h.myData().ToObservable().Publish();
///////////////////////////////
// Row Handling Requirements
//
// 1. Single Scan of IEnumerable.
// 2. Row handlers process data in their own threads.
// 3. OK if scanning thread blocks while data is processed
//
//Create the RowHandlers
MyRowHandler rn1 = new MyRowHandler();
rn1.ido = shared.Subscribe(i => rn1.processID(i));
MyRowHandler rn2 = new MyRowHandler();
rn2.ido = shared.Subscribe(i => rn2.processID(i));
MyRowHandler rn3 = new MyRowHandler();
rn3.ido = shared.Subscribe(i => rn3.processID(i));
//
shared.Connect();
}
public class MyRowHandler
{
public IDisposable ido = null;
public void processID(int i)
{
var o = Observable.Start(() =>
{
Console.WriteLine(String.Format("Start Thread ID {0} Int{1}", Thread.CurrentThread.ManagedThreadId, i));
Thread.Sleep(30);
Console.WriteLine("Done Thread ID"+Thread.CurrentThread.ManagedThreadId.ToString());
}
);
o.First();
}
}
}
Discovery :
The gains in coding speed and code quality that Rx provides come at the expense of performance. Tasks/delegates are without a doubt several times faster. That means the most important thing to learn about Rx is when to use it. Below is a draft summary guideline. For large volumes I can see a use for Rx in chunking, combining, and other many-stream, many-handler models; however, basic async work should not use Rx.
I'd post an image with a matrix guideline, but the site won't let me post images
If I understand your sequencing requirements correctly and you want three scans running in parallel, you can just observe on the TaskPool and subscribe from there:
...
//Create the RowHandlers
MyRowHandler rn1 = new MyRowHandler();
rn1.ido = shared.ObserveOn(Scheduler.TaskPool).Subscribe(i => rn1.processID(i));
...
Note that since you're then running asynchronously and your main thread doesn't wait for the scans to get done, your program will terminate right away unless you for example put a Console.ReadKey() at the end of the program.
EDIT: Regarding running on the same thread "all the way", you're scheduling a bit strangely for that. If you drop the observable in the row handler, you can use Scheduler.NewThread and get good results:
...
var rowHandler1 = new MyRowHandler();
rowHandler1.ido = shared.ObserveOn(Scheduler.NewThread).Subscribe(rowHandler1.ProcessID);
...
public void ProcessID(int i)
{
Console.WriteLine(String.Format("Start Thread ID {0} Int{1}", Thread.CurrentThread.ManagedThreadId, i));
Thread.Sleep(30);
Console.WriteLine("Done Thread ID" + Thread.CurrentThread.ManagedThreadId.ToString(CultureInfo.InvariantCulture));
}
That will give each subscription its own thread, and stay with it.
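For comparison, the same three requirements (single scan, one thread per handler, scanning thread may block) can also be met without Rx. The sketch below is not the Rx approach from this answer: it swaps in a BlockingCollection per handler, each drained by its own dedicated thread. FanOut and Broadcast are hypothetical names, and the "processing" is reduced to recording the items.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

static class FanOut
{
    // Scans `source` exactly once and hands every item to each handler thread.
    // Returns, per handler, the items it processed.
    public static List<int>[] Broadcast(IEnumerable<int> source, int handlerCount)
    {
        var queues = new BlockingCollection<int>[handlerCount];
        var results = new List<int>[handlerCount];
        var threads = new Thread[handlerCount];

        for (int h = 0; h < handlerCount; h++)
        {
            int index = h; // per-iteration copy for the closure
            queues[index] = new BlockingCollection<int>();
            results[index] = new List<int>();
            threads[index] = new Thread(() =>
            {
                // Blocks until items arrive; ends after CompleteAdding().
                foreach (int item in queues[index].GetConsumingEnumerable())
                    results[index].Add(item);
            });
            threads[index].Start();
        }

        foreach (int item in source)              // the single scan
            foreach (var queue in queues)
                queue.Add(item);

        foreach (var queue in queues) queue.CompleteAdding();
        foreach (var thread in threads) thread.Join();
        return results;
    }
}
```

Each handler keeps its own thread for the whole run, and items reach every handler in source order.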
If I understand the meaning of volatile and MemoryBarrier correctly, then the program below should never be able to show any result.
Yet it catches reordering of write operations every time I run it. It does not matter whether I run it in Debug or Release, or as a 32-bit or 64-bit application.
Why does it happen?
using System;
using System.Threading;
using System.Threading.Tasks;
namespace FlipFlop
{
class Program
{
//Declaring these variables as volatile should instruct compiler to
//flush all caches from registers into the memory.
static volatile int a;
static volatile int b;
//Track a number of iteration that it took to detect operation reordering.
static long iterations = 0;
static object locker = new object();
//Indicates that operation reordering is not found yet.
static volatile bool continueTrying = true;
//Indicates that Check method should continue.
static volatile bool continueChecking = true;
static void Main(string[] args)
{
//Restarting test until able to catch reordering.
while (continueTrying)
{
iterations++;
var checker = new Task(Check);
var writter = new Task(Write);
lock (locker)
{
continueChecking = true;
checker.Start();
}
writter.Start();
checker.Wait();
writter.Wait();
}
Console.ReadKey();
}
static void Write()
{
//Writing is locked until Main will start Check() method.
lock (locker)
{
//Using memory barrier should prevent operation reordering.
a = 1;
Thread.MemoryBarrier();
b = 10;
Thread.MemoryBarrier();
b = 20;
Thread.MemoryBarrier();
a = 2;
//Stops spinning in the Check method.
continueChecking = false;
}
}
static void Check()
{
//Spins until finds operation reordering or stopped by Write method.
while (continueChecking)
{
int tempA = a;
int tempB = b;
if (tempB == 10 && tempA == 2)
{
continueTrying = false;
Console.WriteLine("Caught when a = {0} and b = {1}", tempA, tempB);
Console.WriteLine("In " + iterations + " iterations.");
break;
}
}
}
}
}
You aren't cleaning the variables between tests, so (for all but the first) initially a is 2 and b is 20 - before Write has done anything.
Check can get that initial value of a (so tempA is 2), and then Write can get in, get as far as changing b to 10.
Now Check reads the b (so tempB is 10).
Et voila. No re-order necessary to repro.
Reset a and b to 0 between runs and I expect it will go away.
edit: confirmed; "as is" I get the issue almost immediately (<2000 iterations); but by adding:
while (continueTrying)
{
a = b = 0; // reset <======= added this
it then loops for any amount of time without any issue.
Or as a flow:
Write                      A=   B=   Check
(except first run)         2    20
                                     int tempA = a;
a = 1;                     1    20
Thread.MemoryBarrier();
b = 10;                    1    10
                                     int tempB = b;
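The same interleaving can be replayed deterministically on a single thread, which shows that no reordering is involved. InterleavingReplay and Replay are hypothetical names; the statements are those of Write() and Check(), executed in the order of the flow above.

```csharp
static class InterleavingReplay
{
    // Executes the statements of Write() and Check() in one fixed, legal order.
    public static (int tempA, int tempB) Replay()
    {
        int a = 2, b = 20;   // values left over from the previous test run

        int tempA = a;       // Check: reads the stale a == 2
        a = 1;               // Write
        b = 10;              // Write
        int tempB = b;       // Check: reads the fresh b == 10

        return (tempA, tempB);
    }
}
```

Replay() returns (2, 10), exactly the pair the program reports as a "caught" reordering.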
I don't think this is re-ordering.
This piece of code is simply not thread-safe:
while (continueChecking)
{
int tempA = a;
int tempB = b;
...
I think this scenario is possible:
int tempA = a; executes with the values of the last loop (a == 2)
There is a context switch to the Write thread
b = 10 and the loop stops
There is a context switch to the Check thread
int tempB = b; executes with b == 10
I notice that the calls to MemoryBarrier() enhance the chances of this scenario. Probably because they cause more context-switching.
The result has nothing to do with reordering, memory barriers, or volatile. All these constructs are needed to avoid the effects of compiler or CPU reordering of instructions.
But this program would produce the same result even assuming a fully consistent single-CPU memory model and no compiler optimization.
First of all, notice that there will be multiple Write() tasks started in parallel. They run sequentially due to the lock inside Write(), but a single Check() method can read a and b produced by different instances of Write() tasks.
Because Check() has no synchronization with Write(), it can read a and b at two arbitrary, different moments. Nothing in your code prevents Check() from reading an a produced by a previous Write() at one moment and then a b produced by a following Write() at another. First of all you need synchronization (a lock) in Check(); then you might (but probably not in this case) need memory barriers and volatile to deal with memory model problems.
This is all you need:
int tempA, tempB;
lock (locker)
{
tempA = a;
tempB = b;
}
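A small self-contained sketch of that fix, with the lock taken on both the writing and the reading side, is shown below. SnapshotDemo and CountTornReads are hypothetical names; the writer always stores the same value in a and b, so any snapshot with a != b would be an inconsistent ("torn") read.

```csharp
using System.Threading;

static class SnapshotDemo
{
    static readonly object Gate = new object();
    static int a, b;

    // Runs a writer thread against a reading loop and counts inconsistent snapshots.
    public static int CountTornReads(int writes)
    {
        int torn = 0;
        bool done = false;

        var writer = new Thread(() =>
        {
            for (int i = 1; i <= writes; i++)
                lock (Gate) { a = i; b = i; }   // both fields change atomically
            lock (Gate) { done = true; }
        });
        writer.Start();

        while (true)
        {
            int tempA, tempB;
            bool finished;
            lock (Gate) { tempA = a; tempB = b; finished = done; } // consistent snapshot
            if (tempA != tempB) torn++;
            if (finished) break;
        }

        writer.Join();
        return torn;
    }
}
```

With the lock in place on both sides the count stays at 0; the lock also provides the necessary memory ordering, so neither volatile nor explicit barriers are needed for these fields.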
If you use MemoryBarrier in writer, why don't you do that in checker? Put Thread.MemoryBarrier(); before int tempA = a;.
Calling Thread.MemoryBarrier(); so many times blocks all of the advantages of the method. Call it only once before or after a = 1;.