I'm working on a program that pings a long list of IP addresses. Currently there are about 250 in the database to ping, and it takes a long time to get through all of them. The program also sends email alerts when the status changes (from failed to success or vice versa). At worst, a ping takes two seconds (when it fails), so it generally takes about 8 minutes for the whole program to cycle through.
I'd like the email alerts to be closer to real time if possible. Would calling the ping function asynchronously allow each ping to fire off without waiting for the response from the one before it? I'm new to asynchronous programming of any sort, and I'm not sure if this is an appropriate situation to use it in.
If it is, any pointers towards resources for getting started with this would be much appreciated!
The network should support sending 250 pings in under one second. You should start all pings at once using async IO, then use Task.WaitAll to collect the results. That way you are done within two seconds at worst.
With 250 work items I would strongly prefer a solution using async IO, though you could use threads as well if you like.
Find out what async methods the Ping class provides, and learn about async/await and Task. This is a good use case for them.
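To make that concrete, here is a minimal sketch (the address list is a placeholder; the real one would come from your database). It starts every ping at once with Ping.SendPingAsync and gathers the results with Task.WhenAll, so the whole sweep takes roughly as long as the slowest single ping:

    using System;
    using System.Linq;
    using System.Net.NetworkInformation;
    using System.Threading.Tasks;

    class PingSweep
    {
        static async Task Main()
        {
            // Placeholder list; the real program would load these from the database.
            string[] addresses = { "192.168.0.1", "192.168.0.2", "192.168.0.3" };

            // Start all pings at once; each task completes when its reply
            // arrives or the 2000 ms timeout elapses.
            var tasks = addresses.Select(async address =>
            {
                using (var ping = new Ping())
                {
                    try
                    {
                        PingReply reply = await ping.SendPingAsync(address, 2000);
                        return new { Address = address, Up = reply.Status == IPStatus.Success };
                    }
                    catch (PingException)
                    {
                        // e.g. the host name could not be resolved
                        return new { Address = address, Up = false };
                    }
                }
            });

            foreach (var result in await Task.WhenAll(tasks))
                Console.WriteLine("{0}: {1}", result.Address, result.Up ? "up" : "down");
        }
    }

Each completed task is also a natural place to compare the new status against the stored one and fire the email alert immediately, instead of waiting for the whole cycle to finish.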
Related
I have about 10,000 jobs that I want to be handled by approximately 100 threads. Once a thread finishes, the freed 'slot' should get a new job until there are no more jobs available.
Side note: processor load is not an issue; these jobs are mostly waiting for results or (socket) timeouts. The figure of 100 is something I am going to play with to find an optimum. Each job takes between 2 seconds and 5 minutes, so I want to assign new jobs to free threads rather than pre-assign all jobs to threads.
My problem is that I am not sure how to do this. I'm primarily using Visual Basic .NET (but C# is also OK).
I tried to make an array of threads, but since each job/thread also returns a value (and takes 2 input vars), I used 'WithEvents' and found out that you cannot do that on an array... maybe a collection would work? But I also need a way to manage the threads and feed them new jobs, and all results should go back to the main form (UI thread)...
I have it all running in one thread, but now I want to speed things up.
And then I thought: actually, this is a rather common problem. There is a bunch of work to be done that needs to be distributed over a number of worker threads. So that's why I am asking: what's the most common solution here?
I tried to make the question as generic as possible, so lots of people with the same kind of problem can be helped by your reply. Thanks!
Edit:
What I want to do in more detail is the following. I currently have about 1200 connected sensors that I want to read from via sockets. The first thing I want to know is whether the device is online (can connect on ip:port) or not. What happens after connecting depends on the device type, which is known once connected. From some devices I just read back a sensor value. Other devices need calibration to be performed, taking up to 5 minutes with mostly wait times and some reading/setting of values. All via the socket. Some even have FTP that I need to download a file from, but that I do via the socket too.
My problem: lots of waiting time, so lots of opportunity to do things in parallel and speed it up hugely.
My starting point is a list of ip:port addresses, and I want to end up with a file that shows the results; the results are also shown in a textbox on the main form (next to a start/pause/stop button).
This was very helpful:
Multi Threading with Return value : vb.net
It explains the concept of a BackgroundWorker, which takes away a lot of the hassle. I am now trying to see where it will bring me.
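For reference, a common shape for this job/worker problem in C# (Job, JobResult and the fake work below are placeholders) is a fixed pool of worker tasks pulling from a shared BlockingCollection: each worker grabs the next job the moment it finishes the previous one, so nothing is pre-assigned:

    using System;
    using System.Collections.Concurrent;
    using System.Linq;
    using System.Threading.Tasks;

    class Job { public string Ip; public int Port; }            // placeholder input
    class JobResult { public string Ip; public string Status; } // placeholder output

    class WorkerPool
    {
        static void Main()
        {
            var jobs = new BlockingCollection<Job>();
            var results = new ConcurrentQueue<JobResult>();

            // Fill the queue (in reality: the 10,000 ip:port entries).
            foreach (var n in Enumerable.Range(1, 10000))
                jobs.Add(new Job { Ip = "10.0." + (n / 256) + "." + (n % 256), Port = 502 });
            jobs.CompleteAdding();

            // 100 workers; each takes the next job as soon as it finishes
            // the previous one, until the queue is drained.
            Task[] workers = Enumerable.Range(0, 100).Select(_ => Task.Run(() =>
            {
                foreach (var job in jobs.GetConsumingEnumerable())
                {
                    // Placeholder for the real socket work (connect, read, calibrate...).
                    results.Enqueue(new JobResult { Ip = job.Ip, Status = "online" });
                }
            })).ToArray();

            Task.WaitAll(workers);
            Console.WriteLine("Done: " + results.Count + " results");
        }
    }

In a WinForms app, report each result back to the main form with Control.Invoke or IProgress<T> rather than touching the UI from a worker thread.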
I have to refactor a fairly time-consuming process in one of my applications, and after doing some research I think it's a perfect match for the TPL. I wanted to clarify my understanding of it and ask if there are any more issues I should take into account.
In a few words: I have a Windows service which runs overnight and sends out emails with data updates to around 10,000 users. At present, the whole process takes around 8 hrs to complete. I would like to reduce it to 2 hrs max.
The application workflow follows these steps:
1. Iterate through the full list of users
2. Check if this user has to be notified
3. If so, create an email body by calling an external service
4. Send the email
Analysis of the code has shown that step 3 is the most time-consuming one, taking around 3.5 sec to complete. That means that when processing 10,000 users, my application waits well over 6 hrs in total for responses from the external service! I think this is good enough reason to try to introduce some asynchronous and parallel processing.
So, my plan is to use the Parallel class and its ForEach method to iterate through the users in step 1. As I understand it, this should distribute the processing of each user onto separate threads, making them run in parallel. The processes are completely independent of each other and don't return any values; any exception thrown will be persisted to the logs database. As regards step 3, I would like to convert the call to the external service into an async call. As I understand it, this would release the thread so it could be reused by the Parallel class to start processing the next user in the list?
I had a read through the MS documentation regarding the TPL, especially the Potential Pitfalls in Data and Task Parallelism document, and the only point I'm not sure about is "Avoid Writing to Shared Memory Locations": I am using a local integer to count the total number of emails processed. As for the rest, I'm quite positive they're not applicable to my scenario.
My question, without any implementation as yet: is what I'm trying to achieve possible (especially the async/await part for the external service call)? Should I be aware of any other obstacles that might affect my implementation? Is there any better way of improving the workflow?
Just to clarify: I'm using .NET v4.0.
Yes, you can use the TPL for your problem. If you cannot influence your external service, then this might be the best way.
However, you can make the biggest gains if you can get your external source to accept batches, because then the source can actually optimize the work. Right now you have the overhead of 10,000 messages to serialize, send, work on, receive and deserialize; that is stuff that could be done once. In addition, your external source might be able to optimize the work it does if it knows it will get multiple records.
So the bottom line is: if you need to optimize locally, the TPL is fine. If you want to optimize your whole process for real gains, try to find out whether your external source can help you, because that is where you can make some actual progress.
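If the TPL route is the one you take, note that async/await only shipped with .NET 4.5 / C# 5, so on .NET 4.0 you would need the Microsoft.Bcl.Async package for the await part, or stick to plain Parallel.ForEach as below. A minimal sketch (User, the notification check and the slow service call are stand-ins for your own types):

    using System;
    using System.Linq;
    using System.Threading;
    using System.Threading.Tasks;

    class User { public int Id; }

    class Notifier
    {
        // Stand-ins for the real checks and services.
        static bool NeedsNotification(User u) { return u.Id % 2 == 0; }
        static string BuildBody(User u) { Thread.Sleep(100); return "body"; } // simulates the slow external call
        static void SendEmail(User u, string body) { /* SMTP call here */ }

        static void Main()
        {
            User[] users = Enumerable.Range(1, 200).Select(i => new User { Id = i }).ToArray();
            int sent = 0; // shared counter: increment with Interlocked, never with ++

            Parallel.ForEach(
                users,
                new ParallelOptions { MaxDegreeOfParallelism = 16 }, // tune against the service
                user =>
                {
                    if (!NeedsNotification(user)) return;
                    try
                    {
                        string body = BuildBody(user);   // the ~3.5 s external call
                        SendEmail(user, body);
                        Interlocked.Increment(ref sent);
                    }
                    catch (Exception ex)
                    {
                        // in the real service this would go to the logs db
                        Console.Error.WriteLine("User " + user.Id + ": " + ex.Message);
                    }
                });

            Console.WriteLine("Emails processed: " + sent);
        }
    }

Because the work is I/O-bound rather than CPU-bound, the explicit MaxDegreeOfParallelism matters, and counting with Interlocked.Increment is the safe answer to the "Avoid Writing to Shared Memory Locations" concern.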
You didn't show any code, and I'm assuming that step 4 (send an e-mail) is not that fast either.
With the presented case, unless your external service from step 3 (create an email body by calling external service) processes requests in parallel and supports a good load of simultaneous requests, you will not gain much with this refactor.
In other words, test the external service and the e-mail server first for:
Parallel request execution
The way to test this is to send at least 2 simultaneous requests and observe how long it takes to process them.
If it takes about double the time of a single request, the requests are undergoing some serial processing: either they're being queued or some broad lock is being taken.
Load test
Go up to 4, 8, 12, 16, 20, etc., and see where it starts to degrade.
You should set a limit on the number of simultaneous requests, either at a level that keeps per-request efficiency above, say, 80% of a lone request (assuming you're the sole consumer),
or at a few requests below the point where degradation starts (e.g. divided by the number of consumers), to leave the external service available for other consumers.
Only then can you decide whether the refactor is worth it. If you can't change the external service or the e-mail server, you must weigh whether they offer enough parallel capability without degrading.
Even so, be realistic: don't let your service push the external service and the e-mail server to their limits in production.
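As a rough sketch of the load test described above (CallService is a stand-in for one request to the external service), something like this shows how total and per-request times change as concurrency grows:

    using System;
    using System.Diagnostics;
    using System.Linq;
    using System.Threading;
    using System.Threading.Tasks;

    class LoadProbe
    {
        // Stand-in for one call to the external service.
        static void CallService() { Thread.Sleep(500); }

        static void Main()
        {
            foreach (int n in new[] { 1, 2, 4, 8, 12, 16, 20 })
            {
                var sw = Stopwatch.StartNew();
                Task[] batch = Enumerable.Range(0, n)
                    .Select(_ => Task.Factory.StartNew(CallService))
                    .ToArray();
                Task.WaitAll(batch);
                sw.Stop();
                Console.WriteLine("{0,2} simultaneous: {1} ms total, {2} ms/request",
                    n, sw.ElapsedMilliseconds, sw.ElapsedMilliseconds / n);
            }
        }
    }

If the per-request column stays roughly flat as n grows, the service parallelizes well; if the total time grows roughly linearly with n, it is effectively serial.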
I'm trying to implement a basic UDP client. One of its functions is the ability to probe computers to see if a UDP server is listening. I need to scan lots of these computers quickly.
I can't use the Socket.BeginReceiveFrom method and run a timeout waiting for it to complete, because callbacks may occur after the timeout is over, and since many computers are being probed quickly, I found that later callbacks ended up using modified data because a new probe was already underway when the callback was finally invoked.
I can't use the Socket.ReceiveFrom method and set a Socket.ReceiveTimeout because the SocketException being thrown and handled takes a long time (not sure why; I'm not running much code to handle it), meaning it takes about 2 seconds per computer rather than 100 ms as hoped.
Is there any way of running a timeout on a synchronous call to ReceiveFrom without using exceptions to determine when the call has failed/succeeded? Or is there a tactic I've not yet taken that you think could work?
Any advice is appreciated.
I decided to rewrite the probe code using TCP.
However, I later discovered the Socket.ReceiveFromAsync method, which, seeing as it only receives a single datagram per call, would have made life easier.
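For the record, an exception-free timeout also exists on the synchronous path: Socket.Poll blocks for up to a given number of microseconds and simply returns false on timeout instead of throwing. A minimal sketch (the probe payload, target and timeout are made up):

    using System;
    using System.Net;
    using System.Net.Sockets;
    using System.Text;

    class UdpProbe
    {
        // Returns true if 'target' answered within 'timeoutMs'. Poll reports
        // readability instead of raising SocketException the way a
        // ReceiveTimeout does, so a timeout costs no exception handling.
        static bool Probe(IPEndPoint target, int timeoutMs)
        {
            using (var socket = new Socket(AddressFamily.InterNetwork,
                                           SocketType.Dgram, ProtocolType.Udp))
            {
                socket.SendTo(Encoding.ASCII.GetBytes("probe"), target);

                // Poll takes its timeout in microseconds and returns true
                // only when a datagram is waiting to be read.
                if (!socket.Poll(timeoutMs * 1000, SelectMode.SelectRead))
                    return false; // timed out, no exception involved

                try
                {
                    var buffer = new byte[1024];
                    EndPoint remote = new IPEndPoint(IPAddress.Any, 0);
                    socket.ReceiveFrom(buffer, ref remote);
                    return true;
                }
                catch (SocketException)
                {
                    return false; // e.g. ICMP "port unreachable" surfacing as a reset
                }
            }
        }

        static void Main()
        {
            bool ok = Probe(new IPEndPoint(IPAddress.Parse("192.168.0.10"), 5000), 100);
            Console.WriteLine(ok ? "listening" : "no response");
        }
    }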
We've built this app that needs to have some calculations done on a remote machine (actually a MatLab server). We're using web services to connect to the MatLab server and perform the calculations.
In order to speed things up, we've used Parallel.ForEach() in order to have multiple service calls going at the same time. If we're very conservative in setting ParallelOptions.MaxDegreeOfParallelism (DOP) to 4 or something, everything works fine and well.
However, if we let the framework decide on the DOP, it will spawn so many threads that it brings the remote machine to its knees and timeouts start occurring (> 10 minutes).
How can we solve this issue? What I would LOVE to be able to do is use the response time to throttle the calls: if the response time is less than 30 sec, keep adding threads; as soon as it's over 30 sec, use fewer. Any suggestions?
N.B. Related to the response in this question: https://stackoverflow.com/a/20192692/896697
The simplest way would be to tune for the best number of concurrent requests and hardcode it, as you have done so far. However, there are some nicer options if you are willing to put in some effort.
You could move from Parallel.ForEach to a thread pool. That way, as things come back from the remote server, you can manually or programmatically tune the number of available threads, reducing/increasing it as things slow down/speed up, or even killing threads if needed.
You could also do a variant of the above using Tasks, which are the newer way of doing parallel/async work in .NET.
Another option would be to use a timer and/or jobs model to schedule jobs every x milliseconds, which could then be throttled/relaxed as results return from the server. The easiest way to get started would be with Quartz.Net.
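As a crude sketch of the response-time feedback idea (CallService stands in for one MatLab web-service call, and all the thresholds are made up): a SemaphoreSlim caps the number of in-flight calls, and fast responses grant extra slots up to a ceiling. Shrinking cleanly is harder, so this version only ever stops growing:

    using System;
    using System.Diagnostics;
    using System.Linq;
    using System.Threading;
    using System.Threading.Tasks;

    class AdaptiveThrottle
    {
        static SemaphoreSlim _slots = new SemaphoreSlim(4); // start conservative
        static int _extraSlots = 0;
        const int MaxSlots = 32;

        // Stand-in for one MatLab web-service call.
        static void CallService() { Thread.Sleep(1000); }

        static void RunJob()
        {
            _slots.Wait();
            try
            {
                var sw = Stopwatch.StartNew();
                CallService();
                sw.Stop();

                // Crude feedback loop: widen while responses stay under 30 s.
                if (sw.Elapsed < TimeSpan.FromSeconds(30))
                {
                    if (Interlocked.Increment(ref _extraSlots) + 4 <= MaxSlots)
                        _slots.Release(); // grant one extra permanent slot
                    else
                        Interlocked.Decrement(ref _extraSlots); // ceiling reached
                }
            }
            finally
            {
                _slots.Release(); // give back the slot we took
            }
        }

        static void Main()
        {
            Task.WaitAll(Enumerable.Range(0, 100)
                .Select(_ => Task.Factory.StartNew(RunJob)).ToArray());
        }
    }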
The requirements for the TCP server:
1. Receive from each client and send the result back to the same client (that is all the server does).
2. Cater for 100 clients.
3. Speed is an important factor: even at 100 client connections, it should not be laggy.
For now I have been using the C# async methods, but I find that I always encounter lag at around 20 connections. By laggy I mean taking almost 15-20 seconds to get the result. At around 5-10 connections, the time to get a result is almost immediate.
When the TCP server gets a message, it actually interacts with a DLL which does some processing to return a result. I'm not exactly sure what the workflow behind it is, but at small scale there is no visible problem, so I thought the problem might be with my TCP server.
Right now, I'm thinking of using a sync method: I would have a while loop blocking on the accept method, and spawn a new thread for each client after accept. But at 100 connections, that is definitely overkill.
I chanced upon IOCP; I'm not exactly sure about it, but it seems to be something like a connection pool, as the way it handles TCP is quite like the normal way.
For these TCP methods I am also not sure whether it is better to open and close the connection each time a message needs to be passed. On average, messages are passed from each client at around 5-10 minute intervals.
Another alternative might be to use the web (I'm looking at a generic handler) to form only one connection with the server. Any message that needs to be handled would be passed to this generic handler, which then sends and receives messages from the server.
I need advice, especially from those who have done TCP at large scale. I do not have 100 PCs to test with, so this is quite hard for me. Language-wise, C# or C++ will do; I'm more familiar with C#, but will consider porting to C++ for speed.
You must be doing it wrong. I have personally written C#-based servers that could handle 1000+ connections, each sending more than 1 message per second, with <10 ms response time, on commodity hardware.
If you have such high response times, it must be your server process that is blocking. Perhaps contention on locks, perhaps plain bad code, perhaps blocking on external access leading to thread-pool exhaustion. Unfortunately, there are plenty of ways to screw this up and only a few ways to get it right. There are good guidelines out there, starting with the fundamentals covered in Rick Vicik's High Performance Windows Programming articles, continuing with the SocketAsyncEventArgs example, which covers the most performant way of writing socket apps in .NET since the Socket Performance Enhancements in Version 3.5, and so on and so forth.
If you find yourself lost in the task ahead (as it seems you happen to be), I would urge you to embrace an established communication framework, perhaps WCF with a net binding, and use WCF's declarative service-model programming. That way you'll piggyback on WCF's performance. While this may not be enough for some, it will get you far enough, certainly much further than you are right now.
I don't see why C# should be any worse than C++ in this situation; chances are that you've not yet hit upon the 'right way' to handle the incoming connections. Spawning a separate thread for each client would certainly be a step in the right direction, assuming the workload for each thread is more I/O-bound than CPU-intensive. Whether you spawn a thread per connection or use a thread pool to manage a number of threads is another matter, and something to determine through experimentation, also considering whether 100 clients really is your maximum!
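For a sense of scale, a modern async/await server along these lines (a simple echo standing in for the real DLL call) serves hundreds of connections without a thread per client, because each await returns its thread to the pool while the socket is idle:

    using System;
    using System.Net;
    using System.Net.Sockets;
    using System.Threading.Tasks;

    class EchoServer
    {
        static async Task Main()
        {
            var listener = new TcpListener(IPAddress.Any, 9000);
            listener.Start();
            Console.WriteLine("Listening on :9000");

            while (true)
            {
                TcpClient client = await listener.AcceptTcpClientAsync();
                // One handler per client, but no dedicated thread per client.
                _ = HandleClientAsync(client);
            }
        }

        static async Task HandleClientAsync(TcpClient client)
        {
            using (client)
            using (NetworkStream stream = client.GetStream())
            {
                var buffer = new byte[4096];
                int read;
                while ((read = await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
                {
                    // Stand-in for the DLL call that computes the result.
                    // If that call blocks, wrap it in Task.Run so it does
                    // not starve the IO threads.
                    await stream.WriteAsync(buffer, 0, read); // send the result back
                }
            }
        }
    }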