I am seeing an odd issue where the .NET client for MongoDB throws a "The wait queue for acquiring a connection to server 127.0.0.1:27017 is full." exception.
I have a semaphore that guards any call to MongoDB, with a size of 10, meaning there are never more than 10 concurrent calls to Mongo.
The default connection pool size for the .NET driver is 100, which is more than 10, so 10 concurrent calls should not be an issue.
To replicate this I have the following code. It is contrived, yes, but it makes the issue visible.
I also found this spec for MongoDB
https://github.com/mongodb/specifications/blob/master/source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst#id94
Is that related?
Does each calling thread (a thread pool worker in this case) go into the wait queue and try to grab a connection? And if I have more worker threads, even if the concurrency level is low, do connections still have to be assigned to each new calling worker thread?
using System;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;
using System.Threading;

namespace ConsoleApp58
{
    public class AsyncSemaphore
    {
        private readonly SemaphoreSlim _semaphore;

        public AsyncSemaphore(int maxConcurrency)
        {
            _semaphore = new SemaphoreSlim(maxConcurrency, maxConcurrency);
        }

        public async Task<T> WaitAsync<T>(Task<T> task)
        {
            await _semaphore.WaitAsync();
            // proves we have the correct max concurrent calls
            // Console.WriteLine(_semaphore.CurrentCount);
            try
            {
                var result = await task;
                return result;
            }
            finally
            {
                _semaphore.Release();
            }
        }
    }

    class Program
    {
        public class SomeEntity
        {
            public ObjectId Id { get; set; }
            public string Name { get; set; }
        }

        static void Main(string[] args)
        {
            var settings = MongoClientSettings.FromUrl(MongoUrl.Create("mongodb://127.0.0.1:27017"));
            // settings.MinConnectionPoolSize = 10;
            // settings.MaxConnectionPoolSize = 1000;
            // I get that I can tweak settings, but I want to know why this occurs at all?
            // If we guard the calls with a semaphore, how can this happen?
            var mongoClient = new MongoClient(settings);
            var someCollection = mongoClient.GetDatabase("dummydb").GetCollection<SomeEntity>("some");
            var a = new AsyncSemaphore(10);

            // is this somehow related?
            // https://github.com/mongodb/specifications/blob/master/source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst#id94
            _ = Task.Run(() =>
            {
                while (true)
                {
                    // this bit is protected by a semaphore of size 10
                    // (we will flood the thread pool with ongoing tasks, yes)
                    _ = a.WaitAsync(RunTask(someCollection))
                        // after the task is done, dump the result:
                        // dot is OK, else exception message
                        .ContinueWith(x =>
                        {
                            if (x.IsFaulted)
                            {
                                Console.WriteLine(x.Exception);
                            }
                        });
                }
            });

            Console.ReadLine();
        }

        private static async Task<SomeEntity> RunTask(IMongoCollection<SomeEntity> pids)
        {
            // simulate some mongo interaction here
            var res = await pids.Find(x => x.Name == "").FirstOrDefaultAsync();
            return res;
        }
    }
}
Connections take time to be established. You do not instantly get 100 usable connections. If you create a client and immediately issue even 10 operations while there are no established connections yet, you can hit the wait queue timeout.
Some drivers also had a wait queue length limit. It's not standardized and, as I understand it, should be deprecated, but it may continue to exist for compatibility reasons. Consult your driver docs to see how to raise it.
Then, either increase waitQueueTimeoutMS, ramp up the load gradually, or wait for the connections to be established before starting the load (you can use CMAP events for the latter).
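For example, here is a minimal sketch of the warm-up approach using the .NET driver's CMAP events; the pool-size value and the warm-up trigger are illustrative assumptions:

using System.Threading;
using MongoDB.Driver;
using MongoDB.Driver.Core.Events;

var opened = 0;
var warmedUp = new ManualResetEventSlim();

var settings = MongoClientSettings.FromUrl(MongoUrl.Create("mongodb://127.0.0.1:27017"));
settings.MinConnectionPoolSize = 10; // ask the driver to maintain at least 10 connections
settings.ClusterConfigurator = cb =>
    cb.Subscribe<ConnectionOpenedEvent>(e =>
    {
        // signal once the minimum pool has been established
        if (Interlocked.Increment(ref opened) >= 10)
            warmedUp.Set();
    });

var mongoClient = new MongoClient(settings);
// touch the cluster so connections start being established
mongoClient.GetDatabase("dummydb").ListCollectionNames();
warmedUp.Wait(); // now start the real load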
Also make sure your concurrency bound of 10 outstanding operations is actually working properly.
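In the code above it isn't: RunTask(someCollection) is invoked before WaitAsync, so the Find is already in flight by the time the semaphore slot is acquired, and the tight loop therefore starts an unbounded number of concurrent operations. A sketch of a wrapper that takes a factory instead, so the operation only starts after the slot is held:

public class AsyncSemaphore
{
    private readonly SemaphoreSlim _semaphore;

    public AsyncSemaphore(int maxConcurrency)
    {
        _semaphore = new SemaphoreSlim(maxConcurrency, maxConcurrency);
    }

    // Accept a Func<Task<T>> so the operation is started only
    // after the semaphore has been acquired.
    public async Task<T> WaitAsync<T>(Func<Task<T>> taskFactory)
    {
        await _semaphore.WaitAsync();
        try
        {
            return await taskFactory();
        }
        finally
        {
            _semaphore.Release();
        }
    }
}

// Usage: pass a lambda, not an already-running task.
// _ = a.WaitAsync(() => RunTask(someCollection));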
Related
I have a Windows service that reads data from the database and processes this data using multiple REST API calls.
Originally, this service ran on a timer: it would read unprocessed data from the database and process it using multiple threads, limited by a SemaphoreSlim. This worked well, except that the database read had to wait for all processing to finish before reading again.
ServicePointManager.DefaultConnectionLimit = 10;
Original that works:
// Runs every 5 seconds on a timer
private void ProcessTimer_Elapsed(object sender, ElapsedEventArgs e)
{
    var hasLock = false;
    try
    {
        Monitor.TryEnter(timerLock, ref hasLock);
        if (hasLock)
        {
            ProcessNewData();
        }
        else
        {
            log.Info("Failed to acquire lock for timer."); // This happens all of the time
        }
    }
    finally
    {
        if (hasLock)
        {
            Monitor.Exit(timerLock);
        }
    }
}
public void ProcessNewData()
{
    var unprocessedItems = GetDatabaseItems();
    if (unprocessedItems.Count > 0)
    {
        var downloadTasks = new Task[unprocessedItems.Count];
        var maxThreads = new SemaphoreSlim(semaphoreSlimMinMax, semaphoreSlimMinMax); // semaphoreSlimMinMax = 10 is max threads
        for (var i = 0; i < unprocessedItems.Count; i++)
        {
            maxThreads.Wait();
            var iClosure = i;
            downloadTasks[i] = Task.Run(async () =>
            {
                try
                {
                    await ProcessItemsAsync(unprocessedItems[iClosure]);
                }
                catch (Exception ex)
                {
                    // handle exception
                }
                finally
                {
                    maxThreads.Release();
                }
            });
        }
        Task.WaitAll(downloadTasks);
    }
}
To improve efficiency, I rewrote the service so that GetDatabaseItems runs in a separate thread from the rest, with a ConcurrentDictionary of unprocessed items between them that GetDatabaseItems fills and ProcessNewData empties.
The problem is that while 10 unprocessed items are sent to ProcessItemsAsync, they are processed two at a time instead of all 10.
The code inside ProcessItemsAsync calls var response = await client.SendAsync(request);, which is where the delay occurs. All 10 threads make it to this code but come out of it two at a time. None of this code changed between the old version and the new.
Here is the code in the new version that did change:
public void Start()
{
    ServicePointManager.DefaultConnectionLimit = maxSimultaneousThreads; // 10

    // Start getting unprocessed data
    getUnprocessedDataTimer.Interval = getUnprocessedDataInterval; // 5 seconds
    getUnprocessedDataTimer.Elapsed += GetUnprocessedData; // writes data into a ConcurrentDictionary
    getUnprocessedDataTimer.Start();

    cancellationTokenSource = new CancellationTokenSource();

    // Create a new thread to process data
    Task.Factory.StartNew(() =>
    {
        try
        {
            ProcessNewData(cancellationTokenSource.Token);
        }
        catch (Exception ex)
        {
            // error handling
        }
    }, TaskCreationOptions.LongRunning);
}
private void ProcessNewData(CancellationToken token)
{
    // Check if task has been canceled.
    while (!token.IsCancellationRequested)
    {
        if (unprocessedDictionary.Count > 0)
        {
            try
            {
                var throttler = new SemaphoreSlim(maxSimultaneousThreads, maxSimultaneousThreads); // maxSimultaneousThreads = 10
                var tasks = unprocessedDictionary.Select(async item =>
                {
                    await throttler.WaitAsync(token);
                    try
                    {
                        if (unprocessedDictionary.TryRemove(item.Key, out var value))
                        {
                            await ProcessItemsAsync(value);
                        }
                    }
                    catch (Exception ex)
                    {
                        // handle error
                    }
                    finally
                    {
                        throttler.Release();
                    }
                });
                Task.WhenAll(tasks);
            }
            catch (OperationCanceledException)
            {
                break;
            }
        }
        Thread.Sleep(1000);
    }
}
Environment
.NET Framework 4.7.1
Windows Server 2016
Visual Studio 2019
Attempts to fix:
I tried the following, all with the same bad result (two await client.SendAsync(request) calls completing at a time):
Set Max threads and ServicePointManager.DefaultConnectionLimit to 30
Manually create threads using Thread.Start()
Replace async/await pattern with sync HttpClient calls
Call the data processing using Task.Run(async () => ...) and Task.WaitAll(downloadTasks)
Replace the new long-running thread for ProcessNewData with a timer
What I want is to run GetUnprocessedData and ProcessNewData concurrently with an HttpClient connection limit of 10 (set in config) so that 10 requests are processed at the same time.
Note: the issue is similar to "HttpClient.GetAsync executes only 2 requests at a time?", but here the DefaultConnectionLimit is increased and the service runs on a Windows Server. The original code also creates more than 2 connections when it runs.
Update
I went back to the original project to make sure it still worked; it did. I added a new timer to perform some unrelated operations and the HttpClient issue came back. I removed the timer; everything worked. I added a new thread to do parallel processing; the problem came back.
This is not a direct answer to your question, but a suggestion for simplifying your service that could make debugging any problem easier. My suggestion is to implement the producer-consumer pattern, using an iterator for producing the unprocessed items and a parallel loop for consuming them. Ideally the parallel loop would have async delegates, but since you are targeting the .NET Framework you don't have access to the .NET 6 method Parallel.ForEachAsync. So I will suggest the slightly wasteful approach of using a synchronous parallel loop that blocks threads. You could use either the Parallel.ForEach method or PLINQ, as in the example below:
private IEnumerable<Item> Iterator(CancellationToken token)
{
    while (true)
    {
        Task delayTask = Task.Delay(5000, token);
        foreach (Item item in GetDatabaseItems()) yield return item;
        delayTask.GetAwaiter().GetResult();
    }
}

public void Start()
{
    //...
    ThreadPool.SetMinThreads(degreeOfParallelism, Environment.ProcessorCount);

    new Thread(() =>
    {
        try
        {
            Partitioner
                .Create(Iterator(token), EnumerablePartitionerOptions.NoBuffering)
                .AsParallel()
                .WithDegreeOfParallelism(degreeOfParallelism)
                .WithCancellation(token)
                .ForAll(item => ProcessItemAsync(item).GetAwaiter().GetResult());
        }
        catch (OperationCanceledException) { } // Ignore
    }).Start();
}
The Iterator fetches unprocessed items from the database in batches, and yields them one by one. The database won't be hit more frequently than once every 5 seconds.
The PLINQ query is going to fetch a new item from the Iterator each time it has a worker available, according to the WithDegreeOfParallelism policy. The setting EnumerablePartitionerOptions.NoBuffering ensures that it won't try to fetch more items in advance.
The ThreadPool.SetMinThreads call is used to boost the availability of ThreadPool threads, since PLINQ is going to use lots of them. Without it the ThreadPool would not be able to satisfy the demand immediately, although it would gradually inject more threads and eventually catch up. But since you already know how many threads you'll need, you can configure the ThreadPool from the start.
In case you dislike the idea of blocking threads, a simple substitute for Parallel.ForEachAsync can be built on the TPL Dataflow library (which requires installing a NuGet package); see the sketch below.
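Here is a minimal sketch of such a substitute, assuming the System.Threading.Tasks.Dataflow package is installed; the ForEachAsync name and the usage line are illustrative, not an established API:

using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class ParallelAsync
{
    // Runs an async body for each item, with a capped degree of parallelism.
    public static Task ForEachAsync<T>(IEnumerable<T> source,
        int degreeOfParallelism, Func<T, Task> body)
    {
        var block = new ActionBlock<T>(body, new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = degreeOfParallelism
        });
        foreach (var item in source) block.Post(item);
        block.Complete();
        return block.Completion; // completes when all posted items are processed
    }
}

// Hypothetical usage with the names from the question:
// await ParallelAsync.ForEachAsync(unprocessedItems, 10, item => ProcessItemsAsync(item));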
The issue turned out to be the place where ServicePointManager.DefaultConnectionLimit is set.
In the version where HttpClient was only doing two requests at a time, ServicePointManager.DefaultConnectionLimit was being set before the threads were being created but after the HttpClient was initialized.
Once I moved it into the constructor before the HttpClient is initialized, everything started working.
Thank you very much to @Theodor Zoulias for the help.
TL;DR: Set ServicePointManager.DefaultConnectionLimit before initializing the HttpClient.
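A minimal sketch of the fix, with hypothetical member names; the essential part is only the ordering of the two statements:

using System.Net;
using System.Net.Http;

public class DataProcessor
{
    private readonly HttpClient client;

    public DataProcessor()
    {
        // Must run before the first HttpClient/ServicePoint is created;
        // on .NET Framework the default for a non-ASP.NET process is
        // typically 2 concurrent connections per endpoint.
        ServicePointManager.DefaultConnectionLimit = 10;

        client = new HttpClient();
    }
}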
We have an old 3rd-party system (let's call it Junksoft® 95) that we interface with via PowerShell (it exposes a COM object), and I'm in the process of wrapping it in a REST API (ASP.NET Framework 4.8 and WebAPI 2). I use the System.Management.Automation NuGet package to create a PowerShell instance in which I instantiate Junksoft's COM API as a dynamic object that I then use:
// I'm omitting some exception handling and maintenance code for brevity
powerShell = System.Management.Automation.PowerShell.Create();
powerShell.AddScript(@"Add-Type -Path C:\Path\To\Junksoft\Scripting.dll");
powerShell.AddScript("New-Object Com.Junksoft.Scripting.ScriptingObject");
dynamic junksoftAPI = powerShell.Invoke()[0];

// Now we issue commands to junksoftAPI like this:
junksoftAPI.Login(user, pass);
int age = junksoftAPI.GetAgeByCustomerId(custId);
List<string> names = junksoftAPI.GetNames();
This works fine when I run all of this on the same thread (e.g. in a console application). However, for some reason it usually doesn't work when I put junksoftAPI into a System.Web.Caching.Cache and use it from different controllers in my web app. I say usually because it actually works when ASP.NET happens to give the incoming call to the thread that junksoftAPI was created on. If it doesn't, Junksoft 95 gives me an error.
Is there any way for me to make sure that all interactions with junksoftAPI happen on the same thread?
Note that I don't want to turn the whole web application into a single-threaded application! The logic in the controllers and elsewhere should run as normal on different threads. Only the Junksoft interactions should happen on the Junksoft-specific thread, something like this:
[HttpGet]
public async Task<IHttpActionResult> GetAge(...)
{
    //finding customer ID in database...
    ...
    int custAge = await Task.Run(() =>
    {
        // this should happen on the Junksoft-specific thread and not the next available thread
        var cache = new System.Web.Caching.Cache();
        var junksoftAPI = cache.Get(...); // this has previously been added to the cache on the Junksoft-specific thread
        return junksoftAPI.GetAgeByCustomerId(custId);
    });
    //prepare a response using custAge...
}
You can create your own singleton worker thread to achieve this. Here is the code, which you can plug into your web application:
public class JunkSoftRunner
{
    private static JunkSoftRunner _instance;

    // singleton pattern to restrict all the actions to be executed on a single thread only
    public static JunkSoftRunner Instance => _instance ?? (_instance = new JunkSoftRunner());

    private readonly SemaphoreSlim _semaphore;
    private readonly AutoResetEvent _newTaskRunSignal;
    private TaskCompletionSource<object> _taskCompletionSource;
    private Func<object> _func;

    private JunkSoftRunner()
    {
        _semaphore = new SemaphoreSlim(1, 1);
        _newTaskRunSignal = new AutoResetEvent(false);
        var contextThread = new Thread(ThreadLooper)
        {
            Priority = ThreadPriority.Highest
        };
        contextThread.Start();
    }

    private void ThreadLooper()
    {
        while (true)
        {
            // wait till the next task signal is received
            _newTaskRunSignal.WaitOne();

            // next task execution signal is received
            try
            {
                // try to execute the task and get the result
                var result = _func.Invoke();
                // task executed successfully, set the result
                _taskCompletionSource.SetResult(result);
            }
            catch (Exception ex)
            {
                // task execution threw an exception, set the exception and continue with the looper
                _taskCompletionSource.SetException(ex);
            }
        }
    }

    public async Task<TResult> Run<TResult>(Func<TResult> func, CancellationToken cancellationToken = default(CancellationToken))
    {
        // allows only one caller to run at a time
        await _semaphore.WaitAsync(cancellationToken);

        // this caller has acquired the semaphore and entered
        try
        {
            // create a new task completion source to wait for func to get executed on the context thread
            _taskCompletionSource = new TaskCompletionSource<object>();
            // set the function to be executed by the context thread
            _func = () => func();
            // signal the waiting context thread that it is time to execute the task
            _newTaskRunSignal.Set();
            // wait for and return the result once the task execution has finished on the context/looper thread
            return (TResult)await _taskCompletionSource.Task;
        }
        finally
        {
            // release the semaphore to allow other threads to acquire it
            _semaphore.Release();
        }
    }
}
Console Main method for testing:
public class Program
{
    // testing the junk soft runner
    public static void Main()
    {
        // get the singleton instance
        var softRunner = JunkSoftRunner.Instance;

        // simulate web requests on different threads
        for (var i = 0; i < 10; i++)
        {
            var taskIndex = i;
            // launch a web request on a new thread
            Task.Run(async () =>
            {
                Console.WriteLine($"Task{taskIndex} (ThreadID:'{Thread.CurrentThread.ManagedThreadId}') Launched");
                return await softRunner.Run(() =>
                {
                    Console.WriteLine($"->Task{taskIndex} Completed On '{Thread.CurrentThread.ManagedThreadId}' thread.");
                    return taskIndex;
                });
            });
        }
    }
}
Output:
Notice that, though the function was launched from different threads, some portion of the code always executed on the same context thread, with ID '5'.
But beware: though all the web requests run on independent threads, they will eventually wait for their work items to be executed on the singleton worker thread. This will eventually create a bottleneck in your web application; that is, however, a limitation of this design.
Here is how you could issue commands to the Junksoft API from a dedicated STA thread, using a BlockingCollection class:
public class JunksoftSTA : IDisposable
{
    private readonly BlockingCollection<Action<Lazy<dynamic>>> _pump;
    private readonly Thread _thread;

    public JunksoftSTA()
    {
        _pump = new BlockingCollection<Action<Lazy<dynamic>>>();
        _thread = new Thread(() =>
        {
            var lazyApi = new Lazy<dynamic>(() =>
            {
                var powerShell = System.Management.Automation.PowerShell.Create();
                powerShell.AddScript(@"Add-Type -Path C:\Path\To\Junksoft.dll");
                powerShell.AddScript("New-Object Com.Junksoft.ScriptingObject");
                dynamic junksoftAPI = powerShell.Invoke()[0];
                return junksoftAPI;
            });
            foreach (var action in _pump.GetConsumingEnumerable())
            {
                action(lazyApi);
            }
        });
        _thread.SetApartmentState(ApartmentState.STA);
        _thread.IsBackground = true;
        _thread.Start();
    }

    public Task<T> CallAsync<T>(Func<dynamic, T> function)
    {
        var tcs = new TaskCompletionSource<T>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        _pump.Add(lazyApi =>
        {
            try
            {
                var result = function(lazyApi.Value);
                tcs.SetResult(result);
            }
            catch (Exception ex)
            {
                tcs.SetException(ex);
            }
        });
        return tcs.Task;
    }

    public Task CallAsync(Action<dynamic> action)
    {
        return CallAsync<object>(api => { action(api); return null; });
    }

    public void Dispose() => _pump.CompleteAdding();

    public void Join() => _thread.Join();
}
The purpose of using the Lazy class is to surface a possible exception during the construction of the dynamic object by propagating it to the callers.
...exceptions are cached. That is, if the factory method throws an exception the first time a thread tries to access the Value property of the Lazy<T> object, the same exception is thrown on every subsequent attempt.
Usage example:
// A static field stored somewhere
public static readonly JunksoftSTA JunksoftStatic = new JunksoftSTA();
await JunksoftStatic.CallAsync(api => { api.Login("x", "y"); });
int age = await JunksoftStatic.CallAsync(api => api.GetAgeByCustomerId(custId));
In case you find that a single STA thread is not enough to serve all the requests in a timely manner, you could add more STA threads, all of them running the same code (private readonly Thread[] _threads; etc). The BlockingCollection class is thread-safe and can be consumed concurrently by any number of threads.
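For example, a minimal sketch of that variant, replacing the single _thread field with an array; the thread count and the CreateJunksoftApi factory are illustrative assumptions, with the factory containing the same PowerShell setup shown above:

private readonly Thread[] _threads;

public JunksoftSTA(int threadCount)
{
    _pump = new BlockingCollection<Action<Lazy<dynamic>>>();
    _threads = new Thread[threadCount];
    for (int i = 0; i < threadCount; i++)
    {
        _threads[i] = new Thread(() =>
        {
            // Each thread owns a separate lazily-created Junksoft instance,
            // while all threads consume from the same thread-safe pump.
            var lazyApi = new Lazy<dynamic>(CreateJunksoftApi);
            foreach (var action in _pump.GetConsumingEnumerable())
                action(lazyApi);
        });
        _threads[i].SetApartmentState(ApartmentState.STA);
        _threads[i].IsBackground = true;
        _threads[i].Start();
    }
}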
If you had not said that it was a 3rd-party tool, I would have assumed it is a GUI class. For practical reasons, it is a very bad idea to have multiple threads write to them; .NET enforces a strict "only the creating thread shall write" rule from 2.0 onward.
Web servers in general, and ASP.NET in particular, use a pretty big thread pool. We are talking tens to hundreds of threads per core. That means it is really hard to nail any request down to a specific thread. You might as well not try.
Again, looking at the GUI classes might be your best bet. You could basically make a single thread with the sole purpose of imitating a GUI's event queue. The main/UI thread of your average Windows Forms application is responsible for creating every GUI class instance. It is kept alive by polling/processing the event queue, and it ends only when it receives a cancel command via the event queue. Dispatching just puts orders into that queue, so we can avoid cross-threading issues.
This is a follow-up question to the question linked below. On the next level, I now want to use maximal task concurrency to connect to expected hosts on a large set of IP addresses, using TCP/IP on a specific port.
My own research, as well as community references, has led me to key articles, for example:
How to check TCP/IP port availability using C# (Socket Communication)
Checking if ip with port is available?
How to set the timeout for a TcpClient?
A very impressive solution for large-scale pinging: Multithreading C# GUI ping example
And of course the precursor to this question: C#, Maximize Thread Concurrency
This allowed me to set up my own code, which works fine but currently takes a full 30 seconds to finish scanning 255 IPs using only one specific port. Given that the test machine has 8 logical cores, this observation suggests that my construct actually spawns at most 8 concurrent tasks (255/8 = 31.85).
The function I wrote returns a list of responding IPs (IPs), which is a subset of the list of all IPs and ports to be checked (IP_Ports). This is my current code; it works fine, but is not yet suitable for use on larger networks due to what I suspect is a lack of efficient task concurrency:
// Check remote host connectivity
public static class CheckRemoteHost
{
    // Private class members
    private static bool AllDone = false;
    private static object lockObj = new object();
    private static List<string> IPs;

    // Wrapper: manage async method <TCP_check>
    public static List<string> TCP(Dictionary<string, int> IP_Ports, int TimeoutInMS = 100)
    {
        // Locals
        IPs = new List<string>();

        // Perform remote host check
        AllDone = false;
        TCP_check(IP_Ports, TimeoutInMS);
        while (!AllDone) { Thread.Sleep(50); }

        // Finish
        return IPs;
    }

    private static async void TCP_check(Dictionary<string, int> IP_Ports, int timeout)
    {
        // async worker method: check remote host via TCP-IP

        // Build task-set for parallel IP queries
        var tasks = IP_Ports.Select(host => TCP_IPAndUpdateAsync(host.Key, host.Value, timeout));

        // Start execution queue
        await Task.WhenAll(tasks).ContinueWith(t =>
        {
            AllDone = true;
        });
    }

    private static async Task TCP_IPAndUpdateAsync(string ip, int port, int timeout)
    {
        // method to call IP-check

        // Run method asynchronously
        await Task.Run(() =>
        {
            // Locals
            TcpClient client;
            IAsyncResult result;
            bool success;

            try
            {
                client = new TcpClient();
                result = client.BeginConnect(ip, port, null, null);
                success = result.AsyncWaitHandle.WaitOne(TimeSpan.FromMilliseconds(timeout));
                if (success)
                {
                    lock (lockObj)
                    {
                        IPs.Add(ip);
                    }
                }
            }
            catch (Exception e)
            {
                // do nothing
            }
        });
    }
} // end public static class CheckRemoteHost
So my question is: how can I maximize the task concurrency of requesting a response using TCP/IP at Port X such that I can obtain very fast IP-Port network scans on large internal networks?
Details
The default task scheduler is usually the ThreadPool scheduler. That means the number of concurrent tasks will be limited by the available threads in the pool.
Remarks
The thread pool provides new worker threads or I/O completion threads on demand until it reaches the minimum for each category. By default, the minimum number of threads is set to the number of processors on a system. When the minimum is reached, the thread pool can create additional threads in that category or wait until some tasks complete. Beginning with the .NET Framework 4, the thread pool creates and destroys threads in order to optimize throughput, which is defined as the number of tasks that complete per unit of time. Too few threads might not make optimal use of available resources, whereas too many threads could increase resource contention.
(Source: https://msdn.microsoft.com/en-us/library/system.threading.threadpool.getminthreads(v=vs.110).aspx)
You are likely just under the threshold at which the thread pool would spin up new threads, since tasks keep completing; hence you only ever have 8 concurrent tasks running at once.
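A quick way to test this hypothesis is to raise the thread pool's minimum before starting the scan; a minimal sketch, where 100 is an arbitrary illustrative value:

using System;
using System.Threading;

// Read the current minimums (they default to the processor count).
ThreadPool.GetMinThreads(out int minWorker, out int minIo);
Console.WriteLine($"min worker: {minWorker}, min IO: {minIo}");

// Raise the floor so the pool injects threads immediately
// instead of ramping up gradually under load.
ThreadPool.SetMinThreads(100, minIo);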
Solutions
1. Use ConnectAsync with a timeout.
Instead of creating a separate task that blocks while waiting for the connect, you can call ConnectAsync and race it against a delay to create a timeout. ConnectAsync doesn't seem to block the threadpool threads.
public static async Task<bool> ConnectAsyncWithTimeout(this Socket socket, string host, int port, int timeout = 0)
{
    if (timeout < 0)
        throw new ArgumentOutOfRangeException("timeout");
    try
    {
        var connectTask = socket.ConnectAsync(host, port);
        var res = await Task.WhenAny(connectTask, Task.Delay(timeout));
        await res;
        return connectTask == res && connectTask.IsCompleted && !connectTask.IsFaulted;
    }
    catch (SocketException se)
    {
        return false;
    }
}
Example usage
private static async Task TCP_IPAndUpdateAsync(string ip, int port, int timeout)
{
    // method to call IP-check
    var client = new TcpClient();
    var success = await client.Client.ConnectAsyncWithTimeout(ip, port, timeout);
    if (success)
    {
        lock (lockObj)
        {
            IPs.Add(ip);
        }
    }
}
2. Use long-running tasks.
Using Task.Factory.StartNew, you can specify that the task is LongRunning. The thread pool task scheduler will then create a new dedicated thread for the task instead of using a pool thread, which gets around the 8-thread limit you are hitting. However, this is not a good solution if you plan to naively create thousands of tasks, since at that point the bottleneck will be thread context switches. You could, however, split all of the work between, for example, 100 tasks, as sketched below.
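A minimal sketch of that idea, splitting the host list between a fixed number of long-running workers; the worker count, the strided partitioning, and the CheckHost helper are illustrative assumptions:

static Task[] ScanWithDedicatedThreads(IList<KeyValuePair<string, int>> hosts,
    int workers, int timeout)
{
    var tasks = new Task[workers];
    for (int w = 0; w < workers; w++)
    {
        int index = w;
        tasks[index] = Task.Factory.StartNew(() =>
        {
            // Each worker handles every N-th host (strided partitioning).
            for (int i = index; i < hosts.Count; i += workers)
                CheckHost(hosts[i].Key, hosts[i].Value, timeout); // hypothetical blocking check
        }, TaskCreationOptions.LongRunning); // dedicated thread, bypasses the pool
    }
    return tasks; // call Task.WaitAll(tasks) to block until the scan completes
}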
3. Use non-blocking connect
This method doesn't require creating multiple tasks. Instead, you can initiate multiple connects from a single thread and check the status of multiple sockets at once. This method is a bit more involved, though; if you'd rather go with this approach and want a more complete example, comment to let me know. Here is a quick snippet of how to use the API:
var socket = new Socket(SocketType.Stream, ProtocolType.Tcp);
socket.Blocking = false;
try
{
    socket.Connect("127.0.0.1", 12345);
}
catch (SocketException se)
{
    // Ignore the "A non-blocking socket operation could not be completed immediately" error
    if (se.ErrorCode != 10035)
        throw;
}

// Check the connection status of the socket.
var writeCheck = new List<Socket>() { socket };
var errorCheck = new List<Socket>() { socket };
Socket.Select(null, writeCheck, errorCheck, 0);

if (writeCheck.Contains(socket))
{
    // Connection opened successfully.
}
else if (errorCheck.Contains(socket))
{
    // Connection refused
}
else
{
    // Connect still pending
}
With the help of Google and the community, I was able to build a nice set of methods allowing me to call a function asynchronously. The function tests remote host properties, so it idles most of the time. For this reason I would like to maximize the number of concurrently launched threads, such that all calls can be processed in the minimum amount of time.
Here is the code I have so far:
// Check remote host connectivity
public static class CheckRemoteHost
{
    // Private class members
    private static bool AllDone = false;
    private static object lockObj = new object();
    private static List<string> IPs;

    // Wrapper: manage async method <Ping>
    public static List<string> Ping(HashSet<string> IP_Ports, int TimeoutInMS = 100)
    {
        // async worker method: check remote host via <Ping>
        // Locals
        IPs = new List<string>();

        // Perform remote host check
        AllDone = false;
        Ping_check(IP_Ports, TimeoutInMS);
        while (!AllDone) { CommonLib.Utils.ApplicationWait(10, 10); }

        // Finish
        return IPs;
    }

    private static async void Ping_check(HashSet<string> IP_Ports, int timeout)
    {
        // Locals
        var tasks = new List<Task>();

        // Build task-set for parallel Ping checks
        foreach (string host in IP_Ports)
        {
            var task = PingAndUpdateAsync(host, timeout);
            tasks.Add(task);
        }

        // Start execution queue
        await Task.WhenAll(tasks).ContinueWith(t =>
        {
            AllDone = true;
        });
    }

    private static async Task PingAndUpdateAsync(string ip, int timeout)
    {
        // Locals
        System.Net.NetworkInformation.Ping ping;
        System.Net.NetworkInformation.PingReply reply;

        try
        {
            ping = new System.Net.NetworkInformation.Ping();
            reply = await ping.SendPingAsync(ip, timeout);
            if (reply.Status == System.Net.NetworkInformation.IPStatus.Success)
            {
                lock (lockObj)
                {
                    IPs.Add(ip);
                }
            }
        }
        catch
        {
            // do nothing
        }
    }
} // end public static class CheckRemoteHost
This code is tested quite extensively, and it seems stable and reliably reports live hosts. Having said that, I know that it only spawns 8 threads at a time (= the number of logical cores on my test machine).
The key portion of the code is this:
// Start execution queue
await Task.WhenAll(tasks).ContinueWith(t =>
{
    AllDone = true;
});
This is where I would like to increase/maximize the number of concurrently launched threads to something like 25 per core (remember, the thread's job is 99% idle).
So far, my thread concurrency research has brought up the explicit-thread and Parallel.For approaches. However, these seem to have the same shortcoming of spawning no more than 8 threads.
Any help would be very much appreciated, so thank you very much in advance everyone for looking!
You're making your life hard with the code you have. It's got a lot of plumbing that isn't needed and you're sharing static fields that would cause your code to fail if you called Ping a second time while the first one is running.
You need to get rid of all of that stuff.
I'd suggest using Microsoft's Reactive Framework - just NuGet "System.Reactive" and add using System.Reactive.Linq; to your code. Then you can do this:
public static class CheckRemoteHost
{
    public static IList<string> Ping(HashSet<string> IP_Ports, int TimeoutInMS = 100)
    {
        var query =
            from host in IP_Ports.ToObservable()
            from status in Observable.FromAsync(() => PingAsync(host, TimeoutInMS))
            where status
            select host;
        return query.ToList().Wait();
    }

    private static async Task<bool> PingAsync(string ip, int timeout)
    {
        try
        {
            var ping = new System.Net.NetworkInformation.Ping();
            var reply = await ping.SendPingAsync(ip, timeout);
            return reply.Status == System.Net.NetworkInformation.IPStatus.Success;
        }
        catch
        {
            return false;
        }
    }
}
That's it. That's all of the code you need. It automatically maximises thread use to get the job done.
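If you ever need to cap how many pings are in flight at once, rather than letting Rx start them all, one option is the Merge operator with a max-concurrency argument. A sketch of a variant of the Ping method's query body, where the cap of 200 is an illustrative assumption:

var query = IP_Ports
    .Select(host => Observable.FromAsync(async () =>
        new { host, alive = await PingAsync(host, TimeoutInMS) }))
    .Merge(200) // at most 200 concurrent pings in flight
    .Where(x => x.alive)
    .Select(x => x.host);

return query.ToList().Wait();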
Problem: I've got tons of emails to send; presently there's an average of 10 emails in the queue at any point in time. The code I have processes the queue one message at a time; that is, it receives the message, processes it, and eventually sends the email. This causes a considerable delay in sending emails to users when they sign up for the service.
I've begun to think of modifying the code to process the messages in parallel, say 5 at a time. I'm imagining writing a method and using the CTP to call this method in parallel, say, 5 times.
I'm a little bit lost on how to implement this. The cost of making a mistake is exceedingly great, as users will be disappointed if things go wrong.
Request: I need help in writing code that processes messages in Azure Service Bus in parallel.
Thanks.
My code in a nutshell.
public ... Run()
{
    _myQueueClient.BeginReceive(ProcessUrgentEmails, _myQueueClient);
}

void ProcessUrgentEmails(IAsyncResult result)
{
    // cast the `result` as a QueueClient
    // used EndReceive on an object of BrokeredMessage
    // processed the message, then called
    sendEmail.BeginComplete(ProcessEndComplete, sendEmail);
}

// This method is never called despite having it as the callback function above.
void ProcessEndComplete(IAsyncResult result)
{
    Trace.WriteLine("ENTERED ProcessEndComplete method...");
    var bm = result.AsyncState as BrokeredMessage;
    bm.EndComplete(result);
}
This page gives you performance tips when using Windows Azure Service Bus.
About parallel processing: you could have a pool of threads for processing, and every time you get a message, you grab one from that pool and assign it the message. You need to manage that pool.
OR, you could retrieve multiple messages at once and process them using the TPL... for example, the BeginReceiveBatch/EndReceiveBatch methods allow you to retrieve multiple "items" from the queue (async) and then use AsParallel to convert the IEnumerable returned by the previous methods and process the messages in multiple threads.
VERY simple and BARE BONES sample:
var messages = await Task.Factory.FromAsync<IEnumerable<BrokeredMessage>>(
    Client.BeginReceiveBatch(3, null, null), Client.EndReceiveBatch);
messages.AsParallel().WithDegreeOfParallelism(3).ForAll(item =>
{
    ProcessMessage(item);
});
That code retrieves 3 messages from the queue and processes them in "3 threads". (Note: it is not guaranteed to use 3 threads; .NET will analyze the system resources and use up to 3 threads if necessary.)
You could also remove the WithDegreeOfParallelism part and .NET will use whatever number of threads it needs.
At the end of the day there are multiple ways to do it; you have to decide which one works best for you.
UPDATE: Sample without using ASYNC/AWAIT
This is a basic sample (without error checking) using the regular Begin/End async pattern.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Net;
using System.Threading;
using Microsoft.ServiceBus;
using Microsoft.ServiceBus.Messaging;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.ServiceRuntime;

namespace WorkerRoleWithSBQueue1
{
    public class WorkerRole : RoleEntryPoint
    {
        // The name of your queue
        const string QueueName = "QUEUE_NAME";
        const int MaxThreads = 3;

        // QueueClient is thread-safe. Recommended that you cache it
        // rather than recreating it on every request.
        QueueClient Client;
        bool IsStopped;
        int dequeueRequests = 0;

        public override void Run()
        {
            while (!IsStopped)
            {
                // Increment request counter
                Interlocked.Increment(ref dequeueRequests);
                Trace.WriteLine(dequeueRequests + " request(s) in progress");
                Client.BeginReceive(new TimeSpan(0, 0, 10), ProcessUrgentEmails, Client);

                // If we have made too many requests, wait for them to finish before requesting again.
                while (dequeueRequests >= MaxThreads && !IsStopped)
                {
                    System.Diagnostics.Trace.WriteLine(dequeueRequests + " requests in progress, waiting before requesting more work");
                    Thread.Sleep(2000);
                }
            }
        }

        void ProcessUrgentEmails(IAsyncResult result)
        {
            var qc = result.AsyncState as QueueClient;
            var sendEmail = qc.EndReceive(result);

            // We have received a message or timed out... either way we decrease our counter
            Interlocked.Decrement(ref dequeueRequests);

            // If we have a message, process it
            if (sendEmail != null)
            {
                var r = new Random();

                // Process the message
                Trace.WriteLine("Processing message: " + sendEmail.MessageId);
                System.Threading.Thread.Sleep(r.Next(10000));

                // Mark it as completed
                sendEmail.BeginComplete(ProcessEndComplete, sendEmail);
            }
        }

        void ProcessEndComplete(IAsyncResult result)
        {
            var bm = result.AsyncState as BrokeredMessage;
            bm.EndComplete(result);
            Trace.WriteLine("Completed message: " + bm.MessageId);
        }

        public override bool OnStart()
        {
            // Set the maximum number of concurrent connections
            ServicePointManager.DefaultConnectionLimit = 12;

            // Create the queue if it does not exist already
            string connectionString = CloudConfigurationManager.GetSetting("Microsoft.ServiceBus.ConnectionString");
            var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);
            if (!namespaceManager.QueueExists(QueueName))
            {
                namespaceManager.CreateQueue(QueueName);
            }

            // Initialize the connection to the Service Bus queue
            Client = QueueClient.CreateFromConnectionString(connectionString, QueueName);
            IsStopped = false;
            return base.OnStart();
        }

        public override void OnStop()
        {
            // Wait for all requests to finish (or time out) before closing
            while (dequeueRequests > 0)
            {
                System.Diagnostics.Trace.WriteLine(dequeueRequests + " request(s), waiting before stopping");
                Thread.Sleep(2000);
            }

            // Close the connection to the Service Bus queue
            IsStopped = true;
            Client.Close();
            base.OnStop();
        }
    }
}
Hope it helps.