C#, Maximize Thread Concurrency - c#

With the help of Google and community, I was able to build a nice set of methods allowing me to asynchronously call a function. This function is testing remote host properties, so it is idling most of the time. For this reason I would like to maximize the number of concurrent threads launched such that all calls can be processed in the minimum amount of time.
Here is the Code I have so far:
// Check remote host connectivity
public static class CheckRemoteHost
{
// Private Class members
private static bool AllDone = false;
private static object lockObj = new object();
private static List<string> IPs;
// Wrapper: manage async method <Ping>
public static List<string> Ping(HashSet<string> IP_Ports, int TimeoutInMS = 100)
{// async worker method: check remote host via <Ping>
// Locals
IPs = new List<string>();
// Perform remote host check
AllDone = false;
Ping_check(IP_Ports, TimeoutInMS);
while (!AllDone) { CommonLib.Utils.ApplicationWait(10, 10); }
// Finish
return IPs;
}
private static async void Ping_check(HashSet<string> IP_Ports, int timeout)
{
// Locals
var tasks = new List<Task>();
// Build task-set for parallel Ping checks
foreach (string host in IP_Ports)
{
var task = PingAndUpdateAsync(host, timeout);
tasks.Add(task);
}
// Start execution queue
await Task.WhenAll(tasks).ContinueWith(t =>
{
AllDone = true;
});
}
private static async Task PingAndUpdateAsync(string ip, int timeout)
{
// Locals
System.Net.NetworkInformation.Ping ping;
System.Net.NetworkInformation.PingReply reply;
try
{
ping = new System.Net.NetworkInformation.Ping();
reply = await ping.SendPingAsync(ip, timeout);
if(reply.Status == System.Net.NetworkInformation.IPStatus.Success)
{
lock (lockObj)
{
IPs.Add(ip);
}
}
}
catch
{
// do nothing
}
}
}// end public static class CheckRemoteHost
This code has been tested quite extensively; it seems stable and reliably reports live hosts. Having said that, I know that it only spawns 8 threads at a time (the number of logical cores on my test machine).
The key portion of the code is this:
// Start execution queue
await Task.WhenAll(tasks).ContinueWith(t =>
{
AllDone = true;
});
This is where I would like to increase, ideally maximize, the number of concurrently launched threads to something like 25 per core (remember, the thread's job is 99% idle).
So far, my thread concurrency research has brought up the explicit thread and Parallel.For approaches. However, these seem to have the same shortcoming of spawning no more than 8 threads.
Any help would be very much appreciated, so thank you very much in advance everyone for looking!

You're making your life hard with the code you have. It's got a lot of plumbing that isn't needed and you're sharing static fields that would cause your code to fail if you called Ping a second time while the first one is running.
You need to get rid of all of that stuff.
I'd suggest using Microsoft's Reactive Framework - just NuGet "System.Reactive" and add using System.Reactive.Linq; to your code. Then you can do this:
public static class CheckRemoteHost
{
public static IList<string> Ping(HashSet<string> IP_Ports, int TimeoutInMS = 100)
{
var query =
from host in IP_Ports.ToObservable()
from status in Observable.FromAsync(() => PingAsync(host, TimeoutInMS))
where status
select host;
return query.ToList().Wait();
}
private static async Task<bool> PingAsync(string ip, int timeout)
{
try
{
var ping = new System.Net.NetworkInformation.Ping();
var reply = await ping.SendPingAsync(ip, timeout);
return reply.Status == System.Net.NetworkInformation.IPStatus.Success;
}
catch
{
return false;
}
}
}
That's it. That's all of the code you need. It's automatically maximising the thread use to get the job done.
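The reason no thread tuning is needed is that an awaited I/O call does not hold a thread while it is pending, so the number of operations in flight is not tied to the number of cores. The following is a minimal sketch of that point, not part of the answer above: Task.Delay stands in for SendPingAsync (an assumption, no real ping is sent), and ConcurrencyDemo, FakeProbeAsync and MeasurePeakAsync are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ConcurrencyDemo
{
    private static int _inFlight;
    private static int _peak;

    // Stand-in for an awaited network call such as SendPingAsync.
    // Assumption: Task.Delay models the idle wait; nothing is sent on the network.
    private static async Task FakeProbeAsync(int delayMs)
    {
        int now = Interlocked.Increment(ref _inFlight);

        // Record the highest number of operations seen in flight so far.
        int seen;
        while (now > (seen = Volatile.Read(ref _peak)))
            Interlocked.CompareExchange(ref _peak, now, seen);

        await Task.Delay(delayMs);   // no thread is held while this is pending
        Interlocked.Decrement(ref _inFlight);
    }

    // Starts all operations, waits for them, and returns the observed peak.
    public static async Task<int> MeasurePeakAsync(int operations, int delayMs)
    {
        _inFlight = 0;
        _peak = 0;
        await Task.WhenAll(Enumerable.Range(0, operations).Select(_ => FakeProbeAsync(delayMs)));
        return Volatile.Read(ref _peak);
    }

    public static async Task Main()
    {
        int peak = await MeasurePeakAsync(200, 200);
        Console.WriteLine($"Logical cores: {Environment.ProcessorCount}, peak operations in flight: {peak}");
    }
}
```

On a typical machine the reported peak is far above the core count, because every operation reaches its await before the first delay elapses.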

Related

Azure DeviceClient does not shut down dotnetty threads on program exit

When using Microsoft.Azure.Devices.Client.DeviceClient .net framework 4.8 closing out the application leaves multiple Threads running. Specifically DotNetty.Common.dll! DotNetty.Common.Concurrency.SingleThreadEventExecutor.PollTask
Versions 1.34.0 & 1.35.0 of Microsoft.Azure.Devices have this same problem.
Are we using DeviceClient improperly?
Is it an async thing I'm not understanding?
Am I missing a call to shut it down properly?
From examples online, I shouldn't have to do anything special, and it should close itself out.
However, it still hangs. The code below is close to our implementation; I have yet to make a standalone repro, so I haven't reproduced this problem with only the DeviceClient code running.
When the program exits, is_running gets set to false, and the program closes down other threads. Eventually we call
Environment.Exit(0);
This should be all the relevant code
private async void thread_method()
{
using (var _deviceClient = DeviceClient.CreateFromConnectionString(connection, TransportType.Mqtt))
{
while (is_running)
{
var db = new Database(); // roughly an open entity framework connection
List <class> unprocessed_messages = db.GetUnprocessed();
List<List<Messages>> processed = breakup_method(unprocessed_messages);
foreach (var sublist in processed)
{
if (!await SendMessages(sublist , _deviceClient))
break;
// the processed sublist was successful
db.SaveChanges(); // make sure we dont send again
}
}
Thread.Sleep(500);
await _deviceClient.CloseAsync();
}
}
private async Task<bool> SendMessages(List<Message> messages, DeviceClient _deviceClient)
{
try
{
CancellationTokenSource cancellationTokenSource = new CancellationTokenSource(5000);
CancellationToken cancellationToken = cancellationTokenSource.Token;
await _deviceClient.SendEventBatchAsync(messages, cancellationToken);
if (cancellationToken.IsCancellationRequested)
return false;
return true;
}
catch (Exception e)
{
// logging
}
return false;
}
A different approach, which doesn't actively send anything: just open, sleep until the program exits, then close, all in a using statement.
8 threads are still running PollTask. I waited for them to close for as long as it took to set up everything above, which was at least 5 minutes.
private async void thread_method()
{
using (var _deviceClient = DeviceClient.CreateFromConnectionString(connection, TransportType.Mqtt))
{
await _deviceClient.OpenAsync();
while (is_running) Thread.Sleep(500);
await _deviceClient.CloseAsync();
}
}
Last update: reproduced in a standalone console app, so this is 100% not a problem in my code.
// Repost just in case
class Program
{
private static string _connection_string = $"HostName={url};DeviceId={the_id};SharedAccessKey={key}";// fill in your own values
public static bool is_running = false;
static void Main(string[] args)
{
is_running = true;
new System.Threading.Thread(new System.Threading.ThreadStart(thread_method)).Start();
Console.WriteLine("enter to exit");
String line = Console.ReadLine();
is_running = false;
}
public static async void thread_method()
{
using (var _deviceClient = DeviceClient.CreateFromConnectionString(_connection_string, TransportType.Mqtt))
{
await _deviceClient.OpenAsync();
while (is_running) System.Threading.Thread.Sleep(500);
await _deviceClient.CloseAsync();
}
}
}
https://github.com/Azure/azure-sdk-for-net/issues/24550
Bumped to the proper location
https://github.com/Azure/azure-iot-sdk-csharp/issues/2194
This was not a configuration issue; a DotNetty bug was causing the hang.
The fix is to upgrade to a newer Azure SDK version: Microsoft.Azure.Devices > 1.35.0.

MongoDB Client throws Wait Queue Full exception

I am seeing an odd issue where the .NET client for MongoDB throws a "The wait queue for acquiring a connection to server 127.0.0.1:27017 is full." exception.
I have a semaphore that guards any call to MongoDB, with a size of 10.
Meaning, there are never more than 10 concurrent calls to Mongo.
The default connection pool size is 100 for the .NET driver, which is more than 10.
so 10 concurrent calls should not be an issue.
To replicate this I have the following code, contrived yes, but it makes the issue visible.
I also found this spec for MongoDB
https://github.com/mongodb/specifications/blob/master/source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst#id94
Is that related?
Does each calling thread (thread pool worker in this case) go into the wait queue and try to grab a connection, and if I have more worker threads, even if concurrency level is low, the connections still have to be assigned to this new calling worker thread?
using System;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;
using System.Threading;
namespace ConsoleApp58
{
public class AsyncSemaphore
{
private readonly SemaphoreSlim _semaphore;
public AsyncSemaphore(int maxConcurrency)
{
_semaphore = new SemaphoreSlim(
maxConcurrency,
maxConcurrency
);
}
public async Task<T> WaitAsync<T>(Task<T> task)
{
await _semaphore.WaitAsync();
//proves we have the correct max concurrent calls
// Console.WriteLine(_semaphore.CurrentCount);
try
{
var result = await task;
return result;
}
finally
{
_semaphore.Release();
}
}
}
class Program
{
public class SomeEntity
{
public ObjectId Id { get; set; }
public string Name { get; set; }
}
static void Main(string[] args)
{
var settings = MongoClientSettings.FromUrl(MongoUrl.Create("mongodb://127.0.0.1:27017"));
// settings.MinConnectionPoolSize = 10;
// settings.MaxConnectionPoolSize = 1000;
// I get that I can tweak settings, but I want to know why this occurs at all?
// if we guard the calls with a semaphore, how can this happen?
var mongoClient = new MongoClient(settings);
var someCollection = mongoClient.GetDatabase("dummydb").GetCollection<SomeEntity>("some");
var a = new AsyncSemaphore(10);
// is this somehow related ?
// https://github.com/mongodb/specifications/blob/master/source/connection-monitoring-and-pooling/connection-monitoring-and-pooling.rst#id94
_ = Task.Run(() =>
{
while (true)
{
// this bit is protected by a semaphore of size 10
// (we will flood the thread pool with ongoing tasks, yes)
_ = a.WaitAsync(RunTask(someCollection))
//after the task is done, dump the result
// dot is OK, else exception message
.ContinueWith(x =>
{
if (x.IsFaulted)
{
Console.WriteLine(x.Exception);
}
});
}
}
);
Console.ReadLine();
}
private static async Task<SomeEntity> RunTask(IMongoCollection<SomeEntity> pids)
{
//simulate some mongo interaction here
var res = await pids.Find(x => x.Name == "").FirstOrDefaultAsync();
return res;
}
}
}
Connections take time to be established. You do not instantly get 100 usable connections. If you create a client and immediately request even 10 operations, while there are no available connections, you can hit the wait queue timeout.
Some drivers also had a wait queue length limit. It's not standardized and should be deprecated in my understanding but may continue to exist for compatibility reasons. Consult your driver docs to see how to raise it.
Then, either increase waitQueueTimeoutMS or ramp up the load gradually or wait for connections to be established prior to starting the load (you can use CMAP events for the latter).
Make sure your concurrency bound of 10 outstanding operations is actually working properly too.
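On that last point, one thing worth checking in the question's code: RunTask(someCollection) is called before WaitAsync is entered, and an async method starts executing as soon as it is called, so the Mongo operation is already under way by the time the semaphore is awaited. A minimal sketch of a wrapper that defers the start until a slot is free is shown below; AsyncThrottle and RunAsync are hypothetical names, not part of the MongoDB driver.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class AsyncThrottle
{
    private readonly SemaphoreSlim _semaphore;

    public AsyncThrottle(int maxConcurrency)
    {
        _semaphore = new SemaphoreSlim(maxConcurrency, maxConcurrency);
    }

    // Takes a factory instead of an already-created Task, so the operation
    // is not started until a semaphore slot has been acquired.
    public async Task<T> RunAsync<T>(Func<Task<T>> operation)
    {
        await _semaphore.WaitAsync();
        try
        {
            return await operation();
        }
        finally
        {
            _semaphore.Release();
        }
    }
}
```

With this shape the call site would become a.RunAsync(() => RunTask(someCollection)). Note that the while (true) loop in the question would still queue an unbounded number of waiters on the semaphore; only the number of operations actually running is capped.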

C# Windows Async Pinging Network - different results each run

I've written a class that asynchronously pings a subnet. It works, however, the number of hosts returned will sometimes change between runs. Some questions:
Am I doing something wrong in the code below?
What can I do to make it work better?
The ScanIPAddressesAsync() method is called like this:
NetworkDiscovery nd = new NetworkDiscovery("192.168.50.");
nd.RaiseIPScanCompleteEvent += HandleScanComplete;
nd.ScanIPAddressesAsync();
namespace BPSTestTool
{
public class IPScanCompleteEvent : EventArgs
{
public List<String> IPList { get; set; }
public IPScanCompleteEvent(List<String> _list)
{
IPList = _list;
}
}
public class NetworkDiscovery
{
private static object m_lockObj = new object();
private List<String> m_ipsFound = new List<string>();
private String m_ipBase = null;
public List<String> IPList
{
get { return m_ipsFound; }
}
public EventHandler<IPScanCompleteEvent> RaiseIPScanCompleteEvent;
public NetworkDiscovery(string ipBase)
{
this.m_ipBase = ipBase;
}
public async void ScanIPAddressesAsync()
{
var tasks = new List<Task>();
m_ipsFound.Clear();
await Task.Run(() => AsyncScan());
return;
}
private async void AsyncScan()
{
List<Task> tasks = new List<Task>();
for (int i = 2; i < 255; i++)
{
String ip = m_ipBase + i.ToString();
if (m_ipsFound.Contains(ip) == false)
{
for (int x = 0; x < 2; x++)
{
Ping p = new Ping();
var task = HandlePingReplyAsync(p, ip);
tasks.Add(task);
}
}
}
await Task.WhenAll(tasks).ContinueWith(t =>
{
OnRaiseIPScanCompleteEvent(new IPScanCompleteEvent(m_ipsFound));
});
}
protected virtual void OnRaiseIPScanCompleteEvent(IPScanCompleteEvent args)
{
RaiseIPScanCompleteEvent?.Invoke(this, args);
}
private async Task HandlePingReplyAsync(Ping ping, String ip)
{
PingReply reply = await ping.SendPingAsync(ip, 1500);
if ( reply != null && reply.Status == System.Net.NetworkInformation.IPStatus.Success)
{
lock (m_lockObj)
{
if (m_ipsFound.Contains(ip) == false)
{
m_ipsFound.Add(ip);
}
}
}
}
}
}
One problem I see is async void. The only reason async void is even allowed is for event handlers. If it's not an event handler, it's a red flag.
Asynchronous methods always start running synchronously until the first await that acts on an incomplete Task. In your code, that is at await Task.WhenAll(tasks). At that point, AsyncScan returns - before all the tasks have completed. Usually, it would return a Task that will let you know when it's done, but since the method signature is void, it cannot.
So now look at this:
await Task.Run(() => AsyncScan());
When AsyncScan() returns, then the Task returned from Task.Run completes and your code moves on, before all of the pings have finished.
So when you report your results, the number of results will be random, depending on how many happened to finish before you displayed the results.
If you want to make sure that all of the pings are done before continuing, then change AsyncScan() to return a Task:
private async Task AsyncScan()
And change the Task.Run to await it:
await Task.Run(async () => await AsyncScan());
However, you could also just get rid of the Task.Run and just have this:
await AsyncScan();
Task.Run runs the code in a separate thread. The only reason to do that is in a UI app where you want to move CPU-heavy computations off of the UI thread. When you're just doing network requests like this, that's not necessary.
On top of that, you're also using async void here:
public async void ScanIPAddressesAsync()
Which means that wherever you call ScanIPAddressesAsync() is unable to wait until everything is done. Change that to async Task and await it too.
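Putting those changes together, a minimal sketch of the scan with Task-returning methods might look like the following. It is not the asker's class: SubnetScanner and the injectable probe delegate are hypothetical, added so the logic can be exercised without a network, and it sends a single ping per address for brevity. Results are gathered from Task.WhenAll, so no shared list or lock is needed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.NetworkInformation;
using System.Threading.Tasks;

public class SubnetScanner
{
    private readonly string _ipBase;
    private readonly Func<string, Task<bool>> _probe;

    // probe is injectable (hypothetical design choice); by default a real
    // ICMP ping with a 1500 ms timeout is sent.
    public SubnetScanner(string ipBase, Func<string, Task<bool>> probe = null)
    {
        _ipBase = ipBase;
        _probe = probe ?? PingAsync;
    }

    // Returns a Task, so the caller can await the complete result set.
    public async Task<List<string>> ScanAsync()
    {
        // Host parts 2..254, matching the loop bounds in the question.
        List<string> addresses = Enumerable.Range(2, 253).Select(i => _ipBase + i).ToList();

        // Start every probe, then wait for all of them to finish.
        bool[] alive = await Task.WhenAll(addresses.Select(_probe));

        return addresses.Where((ip, index) => alive[index]).ToList();
    }

    private static async Task<bool> PingAsync(string ip)
    {
        try
        {
            using (var ping = new Ping())
            {
                PingReply reply = await ping.SendPingAsync(ip, 1500);
                return reply.Status == IPStatus.Success;
            }
        }
        catch (PingException)
        {
            return false;   // treat an unreachable or failed ping as "not alive"
        }
    }
}
```

A caller would then write something like List<string> hosts = await new SubnetScanner("192.168.50.").ScanAsync(); and only continue once every ping has completed.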
This code needs a lot of refactoring, and concurrency bugs like this are hard to pinpoint. My bet is on await Task.Run(() => AsyncScan()); which is wrong because AsyncScan() is async and Task.Run(...) will return before it is complete.
My second guess is m_ipsFound, which is shared state: many threads may be reading and writing it simultaneously, and List<T> is not a thread-safe data type.
Also, as a side point, a return on the last line of a method adds nothing to readability, and async void should be avoided. Always use async Task even if you return nothing. You can read more in this very good answer.

Parallel processing using TPL in windows service

I have a Windows service that consumes a messaging system to fetch messages. I have also created a callback mechanism, using the Timer class, that checks for a message at a fixed interval and then fetches and processes it. Currently the service processes messages one by one, but I want processing to run in parallel once messages arrive. So when the first message arrives it should be processed on one task, and even if that processing has not finished, the next message should be picked up after the configured interval (the callback already works) and processed on a different task.
Below is my code:
Task.Factory.StartNew(() =>
{
Subsriber<Message> subsriber = new Subsriber<Message>()
{
Interval = 1000
};
subsriber.Callback(Process, m => m != null);
});
public static void Process(Message message)
{
if (message != null)
{
// Processing logic
}
else
{
}
}
But with Task.Factory I am not able to control the number of tasks running in parallel. How can I configure the maximum number of tasks, so that messages are processed as tasks become available?
Update:
Updated my above code to add multiple tasks
Below is the code:
private static void Main()
{
try
{
int taskCount = 5;
Task.Factory.StartNewAsync(() =>
{
Subsriber<Message> consumer = new Subsriber<Message>()
{
Interval = 1000
};
consumer.Callback(Process, msg => msg != null);
}, taskCount);
Console.ReadLine();
}
catch (Exception e)
{
Console.WriteLine(e.Message);
}
public static void StartNewAsync(this TaskFactory
target, Action action, int taskCount)
{
var tasks = new Task[taskCount];
for (int i = 0; i < taskCount; i++)
{
tasks[i] = target.StartNew(action);
}
}
public static void Process(Message message)
{
if (message != null)
{
}
else
{ }
}
}
I think what you're looking for would result in quite a large sample, so I'm just trying to demonstrate how you would do this with ActionBlock<T>. There are still a lot of unknowns, so I left the sample as a skeleton you can build on. In the sample, the ActionBlock will handle and process all your messages in parallel as they're received from your messaging system.
public class Processor
{
private readonly IMessagingSystem _messagingSystem;
private readonly ActionBlock<Message> _handler;
private bool _pollForMessages;
public Processor(IMessagingSystem messagingSystem)
{
_messagingSystem = messagingSystem;
_handler = new ActionBlock<Message>(msg => Process(msg), new ExecutionDataflowBlockOptions()
{
MaxDegreeOfParallelism = 5 //or any configured value
});
}
public async Task Start()
{
_pollForMessages = true;
while (_pollForMessages)
{
var msg = await _messagingSystem.ReceiveMessageAsync();
await _handler.SendAsync(msg);
}
}
public void Stop()
{
_pollForMessages = false;
}
private void Process(Message message)
{
//handle message
}
}
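To see the MaxDegreeOfParallelism bound in isolation, and how to drain the block on shutdown, here is a small self-contained sketch. It assumes the System.Threading.Tasks.Dataflow library is available (a NuGet package on .NET Framework); BoundedProcessing is a hypothetical name and Task.Delay stands in for real message processing.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class BoundedProcessing
{
    // Processes every item with at most maxParallel handlers running at once
    // and returns the highest number of handlers observed running together.
    public static async Task<int> RunAsync(int itemCount, int maxParallel)
    {
        int running = 0, peak = 0;
        var gate = new object();

        var handler = new ActionBlock<int>(async item =>
        {
            lock (gate) { running++; if (running > peak) peak = running; }
            await Task.Delay(20);          // stand-in for real message processing
            lock (gate) { running--; }
        },
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxParallel });

        // Items beyond the parallelism limit wait in the block's input queue.
        for (int i = 0; i < itemCount; i++)
            await handler.SendAsync(i);

        handler.Complete();        // signal that no more items will arrive
        await handler.Completion;  // wait for queued and running items to finish
        return peak;
    }
}
```

Calling Complete() and awaiting Completion is the piece a Stop() method would typically need if in-flight messages must finish before the service shuts down.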
More Examples
And Ideas
OK, sorry, I'm short on time, but here's the general idea/skeleton of what I was thinking of as an alternative.
If I'm honest, though, I think ActionBlock<T> is the better option because so much is done for you. The only limit is that you can't dynamically scale the amount of work it does at once, although I think that limit can be set quite high. Doing it the way shown below gives you more control, or lets you run a dynamic number of tasks, but you'll have to do a lot of things manually. For example, to limit the number of tasks running at a time, you'd have to implement a queueing system (something ActionBlock handles for you) and then maintain it. I guess it depends on how many messages you're receiving and how fast your process handles them.
You'll have to check it out and think about how it applies to your use case, as some of the details are a little sketchily implemented on my side, particularly around the ConcurrentBag idea.
So the idea behind what I've thrown together here is that you can start any number of tasks, add to the running tasks, or cancel tasks individually by using the collection.
The main thing, I think, is making the method that the Callback runs fire off a thread that does the work, instead of subscribing within a separate thread.
I used Task.Factory.StartNew as you did, but stored the returned Task in an object (TaskInfo) that also has its CancellationTokenSource and its Id (assigned externally) as properties, and then added that to a collection of TaskInfo, which is a property of the class this is all part of:
Updated: to avoid this being too confusing, I've just updated the code that was here previously.
You'll have to update bits of it and fill in the blanks, such as whatever you have in place of my HeartbeatController and the few events that get called, because they're beyond the scope of the question, but the idea would be the same.
public class TaskContainer
{
private ConcurrentBag<TaskInfo> Tasks;
public TaskContainer(){
Tasks = new ConcurrentBag<TaskInfo>();
}
//entry point
//UPDATED
public void StartAndMonitor(int processorCount)
{
for (int i = 0; i < processorCount; i++)
{
Processor task = new Processor(i);
CreateProcessorTask(task);
}
this.IsRunning = true;
MonitorTasks();
}
private void CreateProcessorTask(Processor processor)
{
CancellationTokenSource cancellationTokenSource = new CancellationTokenSource();
Task taskInstance = Task.Factory.StartNew(
() => processor.Start(cancellationTokenSource.Token)
);
//bind status update event
processor.ProcessorStatusUpdated += ReportProcessorProcess;
Tasks.Add(new TaskInfo()
{
ProcessorId = processor.ProcessorId,
Task = taskInstance,
CancellationTokenSource = cancellationTokenSource
});
}
//this method gets called once but the HeartbeatController gets an action as a param that it then
//executes on a timer. I haven't included that but you get the idea
//This method also checks for tasks that have stopped and restarts them if the manifest call says they should be running.
//Will also start any new tasks included in the manifest and stop any that aren't included in the manifest.
internal void MonitorTasks()
{
HeartbeatController.Beat(() =>
{
HeartBeatHappened?.Invoke(this, null);
List<int> tasksToStart = new List<int>();
//this is an api call or whatever drives your config that says what tasks must be running.
var newManifest = this.GetManifest(Properties.Settings.Default.ResourceId);
//task Removed Check - If a Processor is removed from the task pool, cancel it if running and remove it from the Tasks List.
List<int> instanceIds = new List<int>();
newManifest.Processors.ForEach(x => instanceIds.Add(x.ProcessorId));
var removed = Tasks.Select(x => x.ProcessorId).ToList().Except(instanceIds).ToList();
if (removed.Count() > 0)
{
foreach (var extaskId in removed)
{
var task = Tasks.FirstOrDefault(x => x.ProcessorId == extaskId);
task.CancellationTokenSource?.Cancel();
}
}
foreach (var newtask in newManifest.Processors)
{
var oldtask = Tasks.FirstOrDefault(x => x.ProcessorId == newtask.ProcessorId);
//Existing task check
if (oldtask != null && oldtask.Task != null)
{
if (!oldtask.Task.IsCanceled && (oldtask.Task.IsCompleted || oldtask.Task.IsFaulted))
{
var ex = oldtask.Task.Exception;
tasksToStart.Add(oldtask.ProcessorId);
continue;
}
}
else //New task Check
tasksToStart.Add(newtask.ProcessorId);
}
foreach (var item in tasksToStart)
{
var taskToRemove = Tasks.FirstOrDefault(x => x.ProcessorId == item);
if (taskToRemove != null)
Tasks.Remove(taskToRemove);
var task = newManifest.Processors.FirstOrDefault(x => x.ProcessorId == item);
if (task != null)
{
CreateProcessorTask(task);
}
}
});
}
}
//UPDATED
public class Processor{
private int ProcessorId;
private Subsriber<Message> subsriber;
public Processor(int processorId) => ProcessorId = processorId;
public void Start(CancellationToken token)
{
Subsriber<Message> subsriber = new Subsriber<Message>()
{
Interval = 1000
};
subsriber.Callback(Process, m => m != null);
}
private void Process(Message message)
{
//do work
}
}
Hope this gives you an idea of how else you can approach your problem and that I didn't miss the point :).
Update
To use events to update progress or which tasks are processing, I'd extract them into their own class, which then has subscribe methods on it, and when creating a new instance of that class, assign the event to a handler in the parent class which can then update your UI or whatever you want it to do with that info.
So the content of Process() would look more like this:
Processor processor = new Processor();
Task task = Task.Factory.StartNew(() => processor.ProcessMessage(cancellationTokenSource.Token));
processor.StatusUpdated += ReportProcess;

Maximize Task Concurrency - using TCP/IP with specific Port Filtering

This is a follow-up question to this question. On the next level, I now want to use maximal task concurrency to connect to expected hosts on a large set of IP addresses, using TCP/IP on a specific port.
My own research, as well as community references, has led me to key articles, for example:
How to check TCP/IP port availability using C# (Socket Communication)
Checking if ip with port is available?
How to set the timeout for a TcpClient?
A very impressive solution for large-scale pinging: Multithreading C# GUI ping example
And of course the precursor to this question: C#, Maximize Thread Concurrency
This allowed me to set up my own code, which works fine but currently takes a full 30 seconds to finish scanning 255 IPs on a single port. Given that the test machine has 8 logical cores, this observation suggests that my construct spawns at most 8 concurrent tasks (255/8 = 31.875).
The function I wrote returns a list of responding IPs {IPs} which is a subset of the List of all IPs {IP_Ports} to be checked. This is my current code, working fine but not yet suitable for use on larger networks due to what I suspect is lack of efficient task concurrency:
// Check remote host connectivity
public static class CheckRemoteHost
{
// Private Class members
private static bool AllDone = false;
private static object lockObj = new object();
private static List<string> IPs;
// Wrapper: manage async method <TCP_check>
public static List<string> TCP(Dictionary<string, int> IP_Ports, int TimeoutInMS = 100)
{
// Locals
IPs = new List<string>();
// Perform remote host check
AllDone = false;
TCP_check(IP_Ports, TimeoutInMS);
while (!AllDone) { Thread.Sleep(50); }
// Finish
return IPs;
}
private static async void TCP_check(Dictionary<string, int> IP_Ports, int timeout)
{// async worker method: check remote host via TCP-IP
// Build task-set for parallel IP queries
var tasks = IP_Ports.Select(host => TCP_IPAndUpdateAsync(host.Key, host.Value, timeout));
// Start execution queue
await Task.WhenAll(tasks).ContinueWith(t =>
{
AllDone = true;
});
}
private static async Task TCP_IPAndUpdateAsync(string ip, int port, int timeout)
{// method to call IP-check
// Run method asynchronously
await Task.Run(() =>
{
// Locals
TcpClient client;
IAsyncResult result;
bool success;
try
{
client = new TcpClient();
result = client.BeginConnect(ip, port, null, null);
success = result.AsyncWaitHandle.WaitOne(TimeSpan.FromMilliseconds(timeout));
if (success)
{
lock (lockObj)
{
IPs.Add(ip);
}
}
}
catch (Exception e)
{
// do nothing
}
});
}
}// end public static class CheckRemoteHost
So my question is: how can I maximize the task concurrency of requesting a response using TCP/IP at Port X such that I can obtain very fast IP-Port network scans on large internal networks?
Details
The default task scheduler is usually the ThreadPool scheduler. That means the number of concurrent tasks will be limited by the available threads in the pool.
Remarks
The thread pool provides new worker threads or I/O completion threads on demand until it reaches the minimum for each category. By default, the minimum number of threads is set to the number of processors on a system. When the minimum is reached, the thread pool can create additional threads in that category or wait until some tasks complete. Beginning with the .NET Framework 4, the thread pool creates and destroys threads in order to optimize throughput, which is defined as the number of tasks that complete per unit of time. Too few threads might not make optimal use of available resources, whereas too many threads could increase resource contention.
(Source: https://msdn.microsoft.com/en-us/library/system.threading.threadpool.getminthreads(v=vs.110).aspx)
You are likely just under the threshold where the threadpool would spin up new threads since tasks are being completed. Hence why you only have 8 concurrent tasks running at once.
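For completeness, the minimum described in the quoted remarks can be inspected and raised with ThreadPool.GetMinThreads and ThreadPool.SetMinThreads. The sketch below is only an illustration of that API under the assumption that blocking waits are kept; ThreadPoolSettings and EnsureMinWorkerThreads are hypothetical names. It treats the symptom rather than the cause, since each blocking wait still occupies a thread.

```csharp
using System;
using System.Threading;

public static class ThreadPoolSettings
{
    // Raises the worker-thread minimum to at least `desired` and returns the
    // value in effect afterwards. The I/O completion minimum is left unchanged.
    public static int EnsureMinWorkerThreads(int desired)
    {
        ThreadPool.GetMinThreads(out int worker, out int io);
        if (desired > worker)
        {
            // SetMinThreads returns false (and changes nothing) if the
            // request is rejected, e.g. when it exceeds the pool maximum.
            ThreadPool.SetMinThreads(desired, io);
        }

        ThreadPool.GetMinThreads(out worker, out io);
        return worker;
    }
}
```

The non-blocking approaches are usually the better fit for a scan like this, because they avoid tying up one thread per pending connection in the first place.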
Solutions
1. Use ConnectAsync with a timeout.
Instead of creating a separate task that blocks while waiting for the connection, you can call ConnectAsync and combine it with a delay to create a timeout. ConnectAsync doesn't seem to block the thread pool threads.
public static async Task<bool> ConnectAsyncWithTimeout(this Socket socket, string host, int port, int timeout = 0)
{
if (timeout < 0)
throw new ArgumentOutOfRangeException("timeout");
try
{
var connectTask = socket.ConnectAsync(host, port);
var res = await Task.WhenAny(connectTask, Task.Delay(timeout));
await res;
return connectTask == res && connectTask.IsCompleted && !connectTask.IsFaulted;
}
catch(SocketException se)
{
return false;
}
}
Example usage
private static async Task TCP_IPAndUpdateAsync(string ip, int port, int timeout)
{// method to call IP-check
// Dispose the client when done so the socket is released
using (var client = new TcpClient())
{
var success = await client.Client.ConnectAsyncWithTimeout(ip, port, timeout);
if (success)
{
lock (lockObj)
{
IPs.Add(ip);
}
}
}
}
2. Use long running tasks.
Using Task.Factory.StartNew, you can specify that the task is LongRunning. The thread pool task scheduler will then create a new thread for the task instead of using a thread pool thread. This gets around the 8-thread limit you are hitting. However, this is not a good solution if you plan to naively create thousands of tasks, since at that point the bottleneck will be thread context switches. You could, however, split all of the work between, for example, 100 tasks.
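A minimal sketch of that split is shown below: a fixed number of LongRunning workers pull items from a shared queue and run a blocking check on each. LongRunningScan, Filter and blockingCheck are hypothetical names; in the scanner, blockingCheck would wrap the blocking connect with its timeout.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class LongRunningScan
{
    // Runs a blocking check for every item on a fixed number of dedicated
    // threads and returns the items for which the check succeeded.
    public static List<T> Filter<T>(IEnumerable<T> items, Func<T, bool> blockingCheck, int workerCount)
    {
        var queue = new ConcurrentQueue<T>(items);
        var hits = new ConcurrentBag<T>();

        Task[] workers = Enumerable.Range(0, workerCount)
            .Select(_ => Task.Factory.StartNew(() =>
            {
                // Each worker pulls items until the shared queue is empty.
                while (queue.TryDequeue(out T item))
                {
                    if (blockingCheck(item))
                        hits.Add(item);
                }
            }, TaskCreationOptions.LongRunning))
            .ToArray();

        Task.WaitAll(workers);
        return hits.ToList();   // order is not preserved
    }
}
```

The worker count caps the number of threads regardless of how many addresses are scanned, which keeps context switching bounded.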
3. Use non-blocking connect
This method doesn't require creating multiple tasks. Instead, you can start multiple connects on a single thread and check the status of multiple sockets at once. This method is a bit more involved, though. If you would rather go with this approach and want a more complete example, leave a comment to let me know. Here is a quick snippet showing how to use the API.
var socket = new Socket(SocketType.Stream, ProtocolType.Tcp);
socket.Blocking = false;
try
{
socket.Connect("127.0.0.1", 12345);
}
catch(SocketException se)
{
//Ignore the "A non-blocking socket operation could not be completed immediately" error
if (se.ErrorCode != 10035)
throw;
}
//Check the connection status of the socket.
var writeCheck = new List<Socket>() { socket };
var errorCheck = new List<Socket>() { socket };
Socket.Select(null, writeCheck, errorCheck, 0);
if (writeCheck.Contains(socket))
{
//Connection opened successfully.
}
else if (errorCheck.Contains(socket))
{
//Connection refused
}
else
{
//Connect still pending
}
