Just for the heck of it I'm trying to emulate how JRuby generators work using threads in C#.
Also, I'm fully aware that C# has built in support for yield return, I'm just toying around a bit.
I guess it's a sort of poor man's coroutine: threads keep multiple call stacks alive, even though none of the call stacks should execute at the same time.
The idea is like this:
The consumer thread requests a value
The worker thread provides a value and yields back to the consumer thread
Repeat until the worker thread is done
So, what would be the correct way of doing the following?
//example
class Program
{
static void Main(string[] args)
{
ThreadedEnumerator<string> enumerator = new ThreadedEnumerator<string>();
enumerator.Init(() =>
{
for (int i = 1; i < 100; i++)
{
enumerator.Yield(i.ToString());
}
});
foreach (var item in enumerator)
{
Console.WriteLine(item);
};
Console.ReadLine();
}
}
//naive threaded enumerator
public class ThreadedEnumerator<T> : IEnumerator<T>, IEnumerable<T>
{
private Thread enumeratorThread;
private T current;
private bool hasMore = true;
private bool isStarted = false;
AutoResetEvent enumeratorEvent = new AutoResetEvent(false);
AutoResetEvent consumerEvent = new AutoResetEvent(false);
public void Yield(T item)
{
//wait for consumer to request a value
consumerEvent.WaitOne();
//assign the value
current = item;
//signal that we have yielded the requested
enumeratorEvent.Set();
}
public void Init(Action userAction)
{
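Action WrappedAction = () =>
{
userAction();
//wait for the consumer's final request
consumerEvent.WaitOne();
//clear the flag before signalling, so MoveNext cannot read a stale value
hasMore = false;
enumeratorEvent.Set();
};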
ThreadStart ts = new ThreadStart(WrappedAction);
enumeratorThread = new Thread(ts);
enumeratorThread.IsBackground = true;
isStarted = false;
}
public T Current
{
get { return current; }
}
public void Dispose()
{
enumeratorThread.Abort();
}
object System.Collections.IEnumerator.Current
{
get { return Current; }
}
public bool MoveNext()
{
if (!isStarted)
{
isStarted = true;
enumeratorThread.Start();
}
//signal that we are ready to receive a value
consumerEvent.Set();
//wait for the enumerator to yield
enumeratorEvent.WaitOne();
return hasMore;
}
public void Reset()
{
throw new NotImplementedException();
}
public IEnumerator<T> GetEnumerator()
{
return this;
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return this;
}
}
Ideas?
There are many ways to implement the producer/consumer pattern in C#.
The best way, I guess, is using TPL (Task, BlockingCollection). See an example here.
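As a rough illustration of the TPL route, here is a minimal sketch that runs the generator body on a worker task and hands values to the consumer through a bounded BlockingCollection. The names ThreadedGenerator, Create and emit are made up for this example, and it assumes the consumer reads the sequence to the end (abandoning the loop early would leave the worker blocked).

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original question or of the TPL itself.
public static class ThreadedGenerator
{
    // Runs `body` on a worker task; every call to the supplied delegate
    // hands one value over to the consuming thread.
    public static IEnumerable<T> Create<T>(Action<Action<T>> body)
    {
        // Capacity 1 keeps the producer at most one item ahead of the consumer.
        var buffer = new BlockingCollection<T>(boundedCapacity: 1);

        var producer = Task.Run(() =>
        {
            try { body(buffer.Add); }
            finally { buffer.CompleteAdding(); } // always terminate the sequence
        });

        // Blocks while the buffer is empty; ends once CompleteAdding() was called
        // and the buffer has been drained.
        foreach (var item in buffer.GetConsumingEnumerable())
            yield return item;

        producer.Wait(); // surfaces an exception from `body` as an AggregateException
    }
}
```

With this in place, the loop from the question could be written as `foreach (var s in ThreadedGenerator.Create<string>(emit => { for (int i = 1; i < 100; i++) emit(i.ToString()); }))`.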
I am using C# and have an enumerator whose data I read sequentially.
It is an object from a third-party library and does not support Parallel.ForEach.
while(enumerator.Next())
{
var item = enumerator.Read();
ProcessItem(item);
}
void ProcessItem(Item item)
{
// Is a lock required here?
if(item.prop == "somevalue")
this._list.Add(item);
}
I want to achieve multithreading here while reading the content.
while(enumerator.Next())
{
// This code should run in a multi-threaded way
var item = enumerator.Read();
// ProcessItem method puts these items on a class level list property
// Is there any Lock required?
ProcessItem(item);
}
I am new to multithreading. Please share any code samples that satisfy the above requirement.
Yes, some locking is required. You can achieve it either with lock or with a concurrent collection type.
Using lock:
void ProcessItem(Item item)
{
if(item.prop == "somevalue")
{
lock(_list)
{
_list.Add(item);
}
}
}
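The concurrent-collection alternative could look roughly like the sketch below. The Item and ItemCollector classes are stand-ins invented for this example; ConcurrentBag<T> is used on the assumption that the order of the collected items does not matter.

```csharp
using System.Collections.Concurrent;

// Stand-in for the library's item type (hypothetical).
public class Item
{
    public string Prop { get; set; }
}

// Hypothetical holder for the class-level list from the question.
public class ItemCollector
{
    // ConcurrentBag<T> accepts Add calls from many threads without an explicit lock.
    private readonly ConcurrentBag<Item> _matches = new ConcurrentBag<Item>();

    public int Count => _matches.Count;

    public void ProcessItem(Item item)
    {
        if (item.Prop == "somevalue")
            _matches.Add(item); // unordered; ConcurrentQueue<T> would keep insertion order
    }
}
```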
Edit: based on the detail you provided, you can wrap the enumerator from the external library in your own enumerator, as below, so that you can use Parallel.ForEach on it.
Assume the enumerator you have is something like MockEnumerator. We wrap it in a standard IEnumerator and IEnumerable so that Parallel.ForEach can consume it.
class Program
{
class Item
{
public int SomeProperty { get; }
public Item(int prop)
{
SomeProperty = prop;
}
}
class MockEnumerator
{
private Item[] _items = new Item[] { new Item(1), new Item(2) };
// start before the first item so that Read() returns the item Next() just advanced to
private int _position = -1;
public bool Next()
{
return ++_position < _items.Length;
}
public Item Read()
{
return _items[_position];
}
}
class EnumeratorWrapper : IEnumerator<Item>, IEnumerable<Item>
{
private readonly MockEnumerator _enumerator;
public EnumeratorWrapper(MockEnumerator enumerator)
{
this._enumerator = enumerator;
}
public Item Current => _enumerator.Read();
object IEnumerator.Current => Current;
public void Dispose()
{
}
public IEnumerator<Item> GetEnumerator()
{
// Parallel.ForEach calls this overload, so it must return the enumerator rather than throw
return this;
}
public bool MoveNext()
{
return _enumerator.Next();
}
public void Reset()
{
}
IEnumerator IEnumerable.GetEnumerator()
{
return this;
}
}
private static List<Item> _list = new List<Item>();
static void Main(string[] args)
{
var enumerator = new EnumeratorWrapper(new MockEnumerator());
Parallel.ForEach(enumerator, item =>
{
if (item.SomeProperty == 1)//someval
{
lock (_list)
{
_list.Add(item);
}
}
});
}
}
This is a good example for task-based parallelization. Each processing of an item corresponds to a task. Hence, you can change the loop to the following:
var tasks = new List<Task<int>>();
while(enumerator.MoveNext())
{
var item = enumerator.Current;
Task<int> task = new Task<int>(() => ProcessItem(item));
task.Start();
tasks.Add(task);
}
foreach(Task<int> task in tasks)
{
int i = task.Result;
classList.Add(i);
}
Note that the synchronization on the classList is implicitly given by first spawning all tasks in the while loop and then merging the results in the foreach loop. The synchronization is specifically given by the access to Result which waits until the corresponding task is finished.
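The same fan-out and merge idea can also be sketched with Task.Run and Task.WhenAll. The TaskFanOut class and its parameters are hypothetical; the sketch assumes the per-item work is independent and returns an int, as in the snippet above.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper illustrating the spawn-then-merge pattern.
public static class TaskFanOut
{
    public static int[] ProcessAll(IEnumerable<int> source, Func<int, int> processItem)
    {
        // The source is enumerated on the calling thread only;
        // just processItem runs on the thread pool.
        var tasks = new List<Task<int>>();
        foreach (var item in source)
        {
            var captured = item; // give each task its own copy of the loop variable
            tasks.Add(Task.Run(() => processItem(captured)));
        }

        // WhenAll completes when every task has finished; the result array
        // is in the same order as the tasks were supplied.
        return Task.WhenAll(tasks).GetAwaiter().GetResult();
    }
}
```

Because the results are merged on a single thread after all tasks have finished, no lock is needed around the final list.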
I have a static class and it has a static function IsDataCorrect() which does a http request.
The function can be called from multiple threads at the same time. I want the first thread to do the request, and the others to be rejected (meaning they should get false as the return value; they should not just be blocked!) until half a second after the first thread has finished the request.
After that, the next winning thread should be able to do the next request, others should be rejected, and so on.
This is my approach, could someone please confirm if that is reasonable:
static class MyClass
{
private static bool IsBusy = false;
private static object lockObject = new object();
public static bool IsDataCorrect(string testString)
{
lock (lockObject)
{
if (IsBusy) return false;
IsBusy = true;
}
var uri = $"https://something.com";
bool htmlCheck = GetDocFromUri(uri, 2);
var t = new Thread(WaitBeforeFree);
t.Start();
//Fast Evaluations
//...
return htmlCheck;
}
private static void WaitBeforeFree()
{
Thread.Sleep(500);
IsBusy = false;
}
}
Threads calling the function would still be serialized when checking the IsBusy flag, since only one thread at a time can check it due to the synchronization on lockObject. Instead, you can simply attempt to acquire a lock; you then don't need a flag, since the lock itself serves as the flag. Second, rather than launching a new thread every time just to sleep and reset the flag, I would check a DateTime field.
static class MyClass
{
private static DateTime NextEntry = DateTime.Now;
private static ReaderWriterLockSlim timeLock = new ReaderWriterLockSlim();
private static object lockObject = new object();
public static bool IsDataCorrect(string testString)
{
bool tryEnterSuccess = false;
try
{
timeLock.EnterReadLock();
try
{
if (DateTime.Now < NextEntry) return false;
}
finally
{
timeLock.ExitReadLock();
}
Monitor.TryEnter(lockObject, ref tryEnterSuccess);
if (!tryEnterSuccess) return false;
var uri = $"https://something.com";
bool htmlCheck = GetDocFromUri(uri, 2);
//Fast Evaluations
//...
timeLock.EnterWriteLock();
try
{
NextEntry = DateTime.Now.AddMilliseconds(500);
} finally {
timeLock.ExitWriteLock();
}
return htmlCheck;
return htmlCheck;
} finally {
if (tryEnterSuccess) Monitor.Exit(lockObject);
}
}
}
This way is more efficient because no new threads are launched, and access to the DateTime field is safe yet concurrent, so threads only stop when they absolutely have to. Otherwise, everything keeps moving along with minimal resource usage.
I see you have solved the problem correctly, but I think there is still room to make it correct, efficient and simple at the same time :).
How about this way?
EDIT: Edited to make the cool-down easier and part of the example.
public static class ConcurrentCoordinationExtension
{
private static int _executing = 0;
public static bool TryExecuteSequentially(this Action actionToExecute)
{
// compare _executing with zero; if it is zero, set it to 1.
// CompareExchange returns the original value:
// zero means we entered successfully, non-zero means somebody else is executing
if (Interlocked.CompareExchange(ref _executing, 1, 0) != 0) return false;
try
{
actionToExecute.Invoke();
return true;
}
finally
{
Interlocked.Exchange(ref _executing, 0);//
}
}
public static bool TryExecuteSequentially(this Func<bool> actionToExecute)
{
// compare _executing with zero; if it is zero, set it to 1.
// CompareExchange returns the original value:
// zero means we entered successfully, non-zero means somebody else is executing
if (Interlocked.CompareExchange(ref _executing, 1, 0) != 0) return false;
try
{
return actionToExecute.Invoke();
}
finally
{
Interlocked.Exchange(ref _executing, 0);//
}
}
}
class Program
{
static void Main(string[] args)
{
DateTime last = DateTime.MinValue;
Func<bool> operation= () =>
{
//cool-down condition was not met
if (DateTime.UtcNow - last < TimeSpan.FromMilliseconds(500)) return false;
last = DateTime.UtcNow;
//some stuff you want to process sequentially
return true;
};
operation.TryExecuteSequentially();
}
}
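For comparison, here is a sketch that folds the busy flag and the half-second cool-down into one small class, using Interlocked for the gate and Stopwatch timestamps instead of DateTime. CooldownGate and TryRun are names invented for this example, and the cool-down is assumed to start when the guarded work finishes.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper: at most one caller runs at a time, and further
// callers are rejected until the cool-down after the last run has elapsed.
public sealed class CooldownGate
{
    private readonly long _cooldownTicks; // cool-down in Stopwatch ticks
    private int _busy;                    // 0 = free, 1 = a caller is inside
    private long _nextAllowed;            // Stopwatch timestamp; 0 = nothing pending

    public CooldownGate(TimeSpan cooldown)
    {
        _cooldownTicks = (long)(cooldown.TotalSeconds * Stopwatch.Frequency);
    }

    // Returns false immediately if rejected; otherwise returns the result of `work`.
    public bool TryRun(Func<bool> work)
    {
        // Only the caller that flips _busy from 0 to 1 gets in; nobody blocks.
        if (Interlocked.CompareExchange(ref _busy, 1, 0) != 0) return false;
        try
        {
            // Still inside the cool-down window of the previous run?
            if (Stopwatch.GetTimestamp() < Interlocked.Read(ref _nextAllowed)) return false;
            try
            {
                return work();
            }
            finally
            {
                // Start the cool-down when the work ends, even if it threw.
                Interlocked.Exchange(ref _nextAllowed, Stopwatch.GetTimestamp() + _cooldownTicks);
            }
        }
        finally
        {
            Interlocked.Exchange(ref _busy, 0);
        }
    }
}
```

In the scenario above, IsDataCorrect would construct one static gate with TimeSpan.FromMilliseconds(500) and pass the HTTP check to TryRun.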
I have a situation where I have multiple producers and multiple consumers. The producers enter jobs into a queue. I chose BlockingCollection and it works great, since I need the consumers to wait for a job to be found. However, if I use the GetConsumingEnumerable() feature, the order of the items in the collection changes... this is not what I need.
It even says in MSDN http://msdn.microsoft.com/en-us/library/dd287186.aspx
that it does not preserve the order of the items.
Does anyone know an alternative for this situation?
I see that the Take method is available but does it also provide a 'wait' condition for the consumer threads?
It says http://msdn.microsoft.com/en-us/library/dd287085.aspx
'A call to Take may block until an item is available to be removed.' Is it better to use TryTake? I really need the thread to wait and keep checking for a job.
Take blocks the thread until something becomes available.
TryTake, as the name implies, tries to do so but returns a bool indicating whether it succeeded or failed.
This allows more flexibility:
while(goingOn){
if( q.TryTake(out var item) ){
Process(item);
}
else{
DoSomething_Usefull_OrNotUseFull_OrEvenSleep();
}
}
instead of
while(goingOn){
var item = q.Take();
//we'll wait here until an item becomes available, and then we:
Process(item);
}
My votes are for TryTake :-)
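On the ordering concern from the question: a BlockingCollection<T> created without an explicit backing collection wraps a ConcurrentQueue<T>, so items are removed in FIFO order. The sketch below (FifoDemo is a made-up name) shows this for one producer and one consumer; with several consumers the items are still taken in order, but the order in which their processing finishes is not guaranteed.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical demo class, single producer and single consumer.
public static class FifoDemo
{
    public static List<int> Run(int count)
    {
        // No backing collection supplied, so a ConcurrentQueue<T> (FIFO) is used.
        var queue = new BlockingCollection<int>();
        var consumed = new List<int>();

        var consumer = Task.Run(() =>
        {
            // Blocks while the queue is empty; the loop ends after
            // CompleteAdding() once the queue has been drained.
            foreach (var job in queue.GetConsumingEnumerable())
                consumed.Add(job);
        });

        for (int i = 0; i < count; i++) queue.Add(i);
        queue.CompleteAdding();

        consumer.Wait();
        return consumed;
    }
}
```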
EXAMPLE:
public class ProducerConsumer<T> {
public struct Message {
public T Data;
}
private readonly ThreadRunner _producer;
private readonly ThreadRunner _consumer;
public ProducerConsumer(Func<T> produce, Action<T> consume) {
var q = new BlockingCollection<Message>();
_producer = new Producer(produce,q);
_consumer = new Consumer(consume,q);
}
public void Start() {
_producer.Run();
_consumer.Run();
}
public void Stop() {
_producer.Stop();
_consumer.Stop();
}
private class Producer : ThreadRunner {
public Producer(Func<T> produce, BlockingCollection<Message> q) : base(q) {
_produce = produce;
}
private readonly Func<T> _produce;
public override void Worker() {
try {
while (KeepRunning) {
var item = _produce();
MessageQ.TryAdd(new Message{Data = item});
}
}
catch (ThreadInterruptedException) {
WasInterrupted = true;
}
}
}
public abstract class ThreadRunner {
protected readonly BlockingCollection<Message> MessageQ;
protected ThreadRunner(BlockingCollection<Message> q) {
MessageQ = q;
}
protected Thread Runner;
protected bool KeepRunning = true;
public bool WasInterrupted;
public abstract void Worker();
public void Run() {
Runner = new Thread(Worker);
Runner.Start();
}
public void Stop() {
KeepRunning = false;
Runner.Interrupt();
Runner.Join();
}
}
class Consumer : ThreadRunner {
private readonly Action<T> _consume;
public Consumer(Action<T> consume,BlockingCollection<Message> q) : base(q) {
_consume = consume;
}
public override void Worker() {
try {
while (KeepRunning) {
Message message;
if (MessageQ.TryTake(out message, TimeSpan.FromMilliseconds(100))) {
_consume(message.Data);
}
else {
//There's nothing in the queue, so I have some spare time...
//An excellent moment to update my statistics or write some history to log files.
//For now we sleep:
Thread.Sleep(TimeSpan.FromMilliseconds(100));
}
}
}
catch (ThreadInterruptedException) {
WasInterrupted = true;
}
}
}
}
}
USAGE:
[Fact]
public void ConsumerShouldConsume() {
var produced = 0;
var consumed = 0;
Func<int> produce = () => {
Thread.Sleep(TimeSpan.FromMilliseconds(100));
produced++;
return new Random(2).Next(1000);
};
Action<int> consume = c => { consumed++; };
var t = new ProducerConsumer<int>(produce, consume);
t.Start();
Thread.Sleep(TimeSpan.FromSeconds(5));
t.Stop();
Assert.InRange(produced,40,60);
Assert.InRange(consumed, 40, 60);
}
I'm writing a wrapper around a 3rd party library, and it has a method to scan the data it manages. The method takes a callback method that it calls for each item in the data that it finds.
e.g. The method is essentially: void Scan(Action<object> callback);
I want to wrap it and expose a method like IEnumerable<object> Scan();
Is this possible without resorting to a separate thread to do the actual scan and a buffer?
You can do this quite simply with Reactive:
class Program
{
static void Main(string[] args)
{
foreach (var x in CallBackToEnumerable<int>(Scan))
Console.WriteLine(x);
}
static IEnumerable<T> CallBackToEnumerable<T>(Action<Action<T>> functionReceivingCallback)
{
return Observable.Create<T>(o =>
{
// Schedule this onto another thread, otherwise it will block:
Scheduler.Later.Schedule(() =>
{
functionReceivingCallback(o.OnNext);
o.OnCompleted();
});
return () => { };
}).ToEnumerable();
}
public static void Scan(Action<int> act)
{
for (int i = 0; i < 100; i++)
{
// Delay to prove this is working asynchronously.
Thread.Sleep(100);
act(i);
}
}
}
Remember that this doesn't take care of things like cancellation, since the callback method doesn't really allow it. A proper solution would require work on the part of the external library.
You should investigate the Rx project — this allows an event source to be consumed as an IEnumerable.
I'm not sure if it allows vanilla callbacks to be presented as such (it's aimed at .NET events) but it would be worth a look as it should be possible to present a regular callback as an IObservable.
Here is a blocking enumerator (the Scan method needs to run in a separate thread)
public class MyEnumerator : IEnumerator<object>
{
private readonly Queue<object> _queue = new Queue<object>();
private ManualResetEvent _event = new ManualResetEvent(false);
public void Callback(object value)
{
lock (_queue)
{
_queue.Enqueue(value);
_event.Set();
}
}
public void Dispose()
{
}
public bool MoveNext()
{
_event.WaitOne();
lock (_queue)
{
Current = _queue.Dequeue();
if (_queue.Count == 0)
_event.Reset();
}
return true;
}
public void Reset()
{
_queue.Clear();
}
public object Current { get; private set; }
object IEnumerator.Current
{
get { return Current; }
}
}
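static void Main(string[] args)
{
var enumerator = new MyEnumerator();
// Scan must run on its own thread; otherwise it would finish before we start reading
var scanThread = new Thread(() => Scan(enumerator.Callback));
scanThread.IsBackground = true;
scanThread.Start();
// MoveNext never returns false, so this loop blocks once Scan has finished
while (enumerator.MoveNext())
{
Console.WriteLine(enumerator.Current);
}
}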
You could wrap it in a simple IEnumerable<Object>, but I would not recommend it. An IEnumerable implies that you can run multiple enumerators over the same list, which you can't in this case.
How about this one:
IEnumerable<Object> Scan()
{
List<Object> objList = new List<Object>();
Action<Object> action = (obj) => { objList.Add(obj); };
Scan(action);
return objList;
}
Take a look at the yield keyword -- which will allow you to have a method that looks like an IEnumerable but which actually does processing for each return value.
I have a thread, which creates a variable number of worker threads and distributes tasks between them. This is solved by passing the threads a TaskQueue object, whose implementation you will see below.
These worker threads simply iterate over the TaskQueue object they were given, executing each task.
private class TaskQueue : IEnumerable<Task>
{
public int Count
{
get
{
lock(this.tasks)
{
return this.tasks.Count;
}
}
}
private readonly Queue<Task> tasks = new Queue<Task>();
private readonly AutoResetEvent taskWaitHandle = new AutoResetEvent(false);
private bool isFinishing = false;
private bool isFinished = false;
public void Enqueue(Task task)
{
Log.Trace("Entering Enqueue, lock...");
lock(this.tasks)
{
Log.Trace("Adding task, current count = {0}...", Count);
this.tasks.Enqueue(task);
if (Count == 1)
{
Log.Trace("Count = 1, so setting the wait handle...");
this.taskWaitHandle.Set();
}
}
Log.Trace("Exiting enqueue...");
}
public Task Dequeue()
{
Log.Trace("Entering Dequeue...");
if (Count == 0)
{
if (this.isFinishing)
{
Log.Trace("Finishing (before waiting) - isCompleted set, returning empty task.");
this.isFinished = true;
return new Task();
}
Log.Trace("Count = 0, lets wait for a task...");
this.taskWaitHandle.WaitOne();
Log.Trace("Wait handle let us through, Count = {0}, IsFinishing = {1}", Count, this.isFinishing);
if(this.isFinishing)
{
Log.Trace("Finishing - isCompleted set, returning empty task.");
this.isFinished = true;
return new Task();
}
}
Log.Trace("Entering task lock...");
lock(this.tasks)
{
Log.Trace("Entered task lock, about to dequeue next item, Count = {0}", Count);
return this.tasks.Dequeue();
}
}
public void Finish()
{
Log.Trace("Setting TaskQueue state to isFinishing = true and setting wait handle...");
this.isFinishing = true;
if (Count == 0)
{
this.taskWaitHandle.Set();
}
}
public IEnumerator<Task> GetEnumerator()
{
while(true)
{
Task t = Dequeue();
if(this.isFinished)
{
yield break;
}
yield return t;
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
As you can see, I'm using an AutoResetEvent object to make sure that the worker threads don't exit prematurely, i.e. before getting any tasks.
In a nutshell:
the main thread assigns a task to a thread by Enqueue-ing a task to its TaskQueue
the main thread notifies the thread that there are no more tasks to execute by calling the TaskQueue's Finish() method
the worker thread retrieves the next task assigned to it by calling the TaskQueue's Dequeue() method
The problem is that the Dequeue() method often throws an InvalidOperationException, saying that the Queue is empty. As you can see, I added some logging, and it turns out that the AutoResetEvent doesn't block the Dequeue(), even though there were no calls to its Set() method.
As I understand it, calling AutoResetEvent.Set() allows a waiting thread (one that previously called AutoResetEvent.WaitOne()) to proceed, and then automatically resets the event, blocking the next waiter.
So what can be wrong? Did I get something wrong? Do I have an error somewhere?
I've been sitting on this for 3 hours now, but I cannot figure out what's wrong.
Please help me!
Thank you very much!
Your dequeue code is incorrect. You check the Count under lock, then fly by the seat of your pants, and then you expect the tasks to have something. You cannot retain assumptions once you release the lock :). Your Count check and tasks.Dequeue must occur under the same lock:
bool TryDequeue(out Task task)
{
task = null;
lock (this.tasks) {
if (0 < tasks.Count) {
task = tasks.Dequeue();
}
}
if (null == task) {
Log.Trace ("Queue was empty");
}
return null != task;
}
Your Enqueue() code is similarly riddled with problems. Your Enqueue/Dequeue don't ensure progress (you will have dequeue threads blocked waiting even though there are items in the queue). Your signature of Enqueue() is wrong. Overall your post is very very poor code. Frankly, I think you're trying to bite off more than you can chew here... Oh, and never log under lock.
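One detail that may explain why the wait handle appeared to let a consumer through without a matching Set(): an AutoResetEvent that is set while nobody is waiting stays signaled until the next WaitOne() consumes it. In the posted queue, a consumer that finds Count > 0 never waits, so an earlier Set() can remain latched and release a later WaitOne() on an empty queue. A tiny sketch of the latching behaviour (AutoResetEventDemo is a made-up name):

```csharp
using System.Threading;

// Hypothetical demo of how an AutoResetEvent latches a Set() with no waiter.
public static class AutoResetEventDemo
{
    // Returns whether two successive zero-timeout waits succeeded.
    public static bool[] Run()
    {
        using (var handle = new AutoResetEvent(false))
        {
            handle.Set();                    // nobody is waiting: the signal is latched
            bool first = handle.WaitOne(0);  // consumes the latched signal without blocking
            bool second = handle.WaitOne(0); // the event has reset itself, so this times out
            return new[] { first, second };
        }
    }
}
```

Run() yields true for the first wait and false for the second.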
I strongly suggest you just use ConcurrentQueue.
If you don't have access to .NET 4.0, here is an implementation to get you started:
public class ConcurrentQueue<T>:IEnumerable<T>
{
volatile bool fFinished = false;
ManualResetEvent eventAdded = new ManualResetEvent(false);
private Queue<T> queue = new Queue<T>();
private object syncRoot = new object();
public void SetFinished()
{
lock (syncRoot)
{
fFinished = true;
eventAdded.Set();
}
}
public void Enqueue(T t)
{
Debug.Assert (false == fFinished);
lock (syncRoot)
{
queue.Enqueue(t);
eventAdded.Set();
}
}
private bool Dequeue(out T t)
{
do
{
lock (syncRoot)
{
if (0 < queue.Count)
{
t = queue.Dequeue();
return true;
}
if (false == fFinished)
{
eventAdded.Reset ();
}
}
if (false == fFinished)
{
eventAdded.WaitOne();
}
else
{
break;
}
} while (true);
t = default(T);
return false;
}
public IEnumerator<T> GetEnumerator()
{
T t;
while (Dequeue(out t))
{
yield return t;
}
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
A more detailed answer from me is pending, but I just want to point out something very important.
If you're using .NET 3.5, you can use the ConcurrentQueue<T> class. A backport is included in the Rx extensions library, which is available for .NET 3.5.
Since you want blocking behavior, you would need to wrap a ConcurrentQueue<T> in a BlockingCollection<T> (also available as part of Rx).
It looks like you are trying to replicate a blocking queue. One already exists in the .NET 4.0 BCL as BlockingCollection. If .NET 4.0 is not an option for you, then you can use this code. It uses the Monitor.Wait and Monitor.Pulse methods instead of AutoResetEvent.
public class BlockingCollection<T>
{
private Queue<T> m_Queue = new Queue<T>();
public T Take() // Dequeue
{
lock (m_Queue)
{
while (m_Queue.Count <= 0)
{
Monitor.Wait(m_Queue);
}
return m_Queue.Dequeue();
}
}
public void Add(T data) // Enqueue
{
lock (m_Queue)
{
m_Queue.Enqueue(data);
Monitor.Pulse(m_Queue);
}
}
}
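The class above has no way to tell consumers that production has ended, which the question's Finish() method was trying to do. One possible extension, sketched under a different name (SimpleBlockingQueue, with CompleteAdding and TryTake as invented members) to avoid clashing with the BCL type:

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical variant of the Monitor-based queue with completion support.
public class SimpleBlockingQueue<T>
{
    private readonly Queue<T> _queue = new Queue<T>();
    private bool _completed;

    public void Add(T item)
    {
        lock (_queue)
        {
            _queue.Enqueue(item);
            Monitor.Pulse(_queue);     // wake one waiting consumer
        }
    }

    public void CompleteAdding()
    {
        lock (_queue)
        {
            _completed = true;
            Monitor.PulseAll(_queue);  // wake every consumer so each can observe completion
        }
    }

    // Blocks until an item is available. Returns false once the queue is
    // drained and CompleteAdding() has been called.
    public bool TryTake(out T item)
    {
        lock (_queue)
        {
            // Re-check the condition after every wake-up, all under the same lock.
            while (_queue.Count == 0 && !_completed)
                Monitor.Wait(_queue);  // releases the lock while waiting

            if (_queue.Count == 0)
            {
                item = default(T);
                return false;
            }
            item = _queue.Dequeue();
            return true;
        }
    }
}
```

A worker loop then becomes `T task; while (queue.TryTake(out task)) { ... }`.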
Update:
I am fairly certain that it is not possible to implement a producer-consumer queue using AutoResetEvent if you want it to be thread-safe for multiple producers and multiple consumers (I am prepared to be proven wrong if someone can come up with a counterexample). Sure, you will see examples on the internet, but they are all wrong. In fact, one such attempt by Microsoft is flawed in that the queue can get live-locked.