I am wondering whether this would work. I have a simple C# command-line application. It sends out emails at a set time (through the Windows Task Scheduler).
I am wondering whether the following would be a good idea if the SMTP server fails.
In the handler for SmtpException I put the thread to sleep for, say, 15 minutes. When it wakes up it just calls that method again. This time, hopefully, the SMTP server would be back up. If not, it would keep doing this until the SMTP server is back online.
Is there some downside that I am missing about this? I would of course log that this is happening.
This is not a bad idea, in fact what you are effectively implementing is a simple variation of the Circuit-Breaker pattern.
The idea behind the pattern is that if an external resource is down, it will probably not come back up a few milliseconds later. It might need some time to recover. Typically the circuit breaker pattern is used as a means to fail fast - so that the user can get an error sooner - or in order not to consume more resources on the failing system. When you have work that can be put in a queue and does not require instant delivery, as you do, it is perfectly reasonable to wait for the resource to become available again.
Some things to note though: You might want to have a maximum count of retries, before failing completely, and you might want to start off with a delay less than 15 minutes.
Exponential back-off is the common choice here, I think. It is like the strategy that TCP uses when trying to make a connection: double the timeout on each failed attempt. It prevents your program from flooding the event log with repeated failure notifications before somebody notices that something is wrong, which can take a while.
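A minimal sketch of the two suggestions above (a retry limit plus a delay that doubles after each failure) might look like the following. The names RetryDemo and TryWithBackoff and all parameters are hypothetical, and the sleep function is passed in as a delegate only so that the sketch can be exercised without really waiting. In the real program you would probably catch SmtpException rather than Exception.

```csharp
using System;

static class RetryDemo
{
    // Runs 'action' until it succeeds or 'maxAttempts' attempts have failed.
    // The delay doubles after every failure and is capped at 'maxDelay'.
    // Returns true on success, false if every attempt failed.
    public static bool TryWithBackoff(Action action, int maxAttempts,
                                      TimeSpan initialDelay, TimeSpan maxDelay,
                                      Action<TimeSpan> sleep)
    {
        TimeSpan delay = initialDelay;
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                action();
                return true;
            }
            catch (Exception ex) // assumption: narrow this to SmtpException in real code
            {
                Console.Error.WriteLine("Attempt {0} failed: {1}", attempt, ex.Message);
                if (attempt == maxAttempts)
                    break; // give up; the caller decides how to report it
                sleep(delay);
                // Double the delay for the next round, but never exceed maxDelay.
                delay = TimeSpan.FromTicks(Math.Min(delay.Ticks * 2, maxDelay.Ticks));
            }
        }
        return false;
    }
}
```

In the email sender this could be called with something like TryWithBackoff(SendAllMails, 8, TimeSpan.FromSeconds(30), TimeSpan.FromMinutes(15), System.Threading.Thread.Sleep), where SendAllMails stands for whatever method currently sends the messages.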
However, using the task scheduler certainly doesn't help. You really ought to reprogram the scheduled task so your program isn't consuming machine resources needlessly while it waits. But using the ITaskService interface from .NET isn't that easy. Check out this project.
I would strongly recommend using a Windows Service. Long-running processes that run in the background, wait for long periods of time and need a controlled, logged, 'monitorable' lifetime: it's what Windows Services do.
Thread.Sleep would do the job, but if you want it to be interruptible from another thread or by something else going on, I would recommend Monitor.Wait (MSDN ref). Then you can run your process in a thread created and managed by the Service, and if you need to stop/interrupt it, you call Monitor.Pulse on the same sync object and the thread will come back to life.
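As a rough illustration of the Monitor.Wait / Monitor.Pulse idea, here is a small sketch. The class name InterruptibleDelay and its members are made up for this example. A stop flag is kept next to the sync object so that a Stop() call that arrives before the wait starts is not lost.

```csharp
using System;
using System.Threading;

sealed class InterruptibleDelay
{
    private readonly object _sync = new object();
    private bool _stopRequested;

    // Blocks for up to 'timeout'. Returns true if Stop() was called
    // (before or during the wait), false if the full delay elapsed.
    public bool WaitOrStop(TimeSpan timeout)
    {
        lock (_sync)
        {
            if (!_stopRequested)
                Monitor.Wait(_sync, timeout); // releases the lock while waiting
            return _stopRequested;
        }
    }

    // Called from another thread (for example the service's OnStop handler).
    public void Stop()
    {
        lock (_sync)
        {
            _stopRequested = true;
            Monitor.PulseAll(_sync); // wake up any thread blocked in WaitOrStop
        }
    }
}
```

The retry loop would then call WaitOrStop(TimeSpan.FromMinutes(15)) instead of Thread.Sleep and exit when it returns true.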
Also ref:
Best architecture for a 30 + hour query
Hope that helps!
Infinite loops are always a worry. You should set it up so that it will fail after N attempts, and you definitely should have some way to shut it down from the user console.
Failure is not such a bad thing when the failure isn't yours. Let it fail and report why it failed.
Your choices are limited, assuming that this is just a temporary condition and that it has worked at some point. The only thing you can do is report the problem, get somebody to fix it, and then retry the operation later. What you do need to do is safeguard the messages so that you do not lose any.
If you stick with what you have, watch out for concurrency; perhaps use a named mutex to ensure that only a single process is running at a time.
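The named-mutex suggestion could be sketched roughly like this. SingleInstance and RunExclusive are hypothetical names, and the mutex name is whatever unique string you choose for your application. The point is that a second copy started by the scheduler while the first is still sleeping and retrying simply backs off.

```csharp
using System;
using System.Threading;

static class SingleInstance
{
    // Runs 'work' only if no other thread or process currently holds the
    // named mutex. Returns false (without running 'work') if it is taken.
    public static bool RunExclusive(string mutexName, Action work)
    {
        using (var mutex = new Mutex(false, mutexName))
        {
            bool acquired;
            try
            {
                acquired = mutex.WaitOne(TimeSpan.Zero); // do not block
            }
            catch (AbandonedMutexException)
            {
                // The previous owner exited without releasing; we own it now.
                acquired = true;
            }

            if (!acquired)
                return false;

            try
            {
                work();
                return true;
            }
            finally
            {
                mutex.ReleaseMutex();
            }
        }
    }
}
```

Note that a mutex is owned by a thread, so work() has to run on the same thread that acquired it, as it does here.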
I send out notifications to all our developers in a similar fashion. Only, I store the message body and subject in the database. After a message has been successfully processed, I set a success flag in the database. This way it's easy to track and report errors, and retries are a cakewalk.
Is it possible to send a heartbeat to Hangfire (Redis storage) to tell the system that the process is still alive? At the moment I set the InvisibilityTimeout to TimeSpan.MaxValue to prevent Hangfire from restarting the job. But if the process fails or the server restarts, the job will never be removed from the list of running jobs. So my idea was to remove the large timeout and send a kind of heartbeat instead. Is this possible?
I found https://discuss.hangfire.io/t/hangfire-long-job-stop-and-restart-several-time/4282/2 which deals with how to keep a long-running job alive in Hangfire.
The user zLanger says that jobs are considered dead and restarted once you ...
[...] are hitting hangfire’s invisibilityTimeout. You have two options.
increase the timeout to more than the job will ever take to run
have the job send a heartbeat to let hangfire’s know it’s still alive.
That's not new to you. But interestingly, the follow-up question there is:
How do you implement heartbeat on job?
This remains unanswered there, a hint that your problem is really not trivial.
I have never handled long-running jobs in Hangfire, but I know the problem from other queuing systems like the former Sun Grid Engine, which is how I got interested in your question.
Back in the day, I had exactly your problem with Sun Grid Engine, and the department's computer guru told me that one should avoid long-running jobs at any cost, according to some mathematical queuing theory (I will try to contact him and find the reference to the book he quoted). His idea is maybe worth sharing with you:
If you have some job which takes longer than the tolerated maximal running time of the queuing system, do not submit the job itself, but rather multiple calls of a wrapper script which is able to (1) start, (2) freeze-stop, (3) unfreeze-continue the actual task.
This stop-continue can indeed be a suspend (CTRL+Z respectively fg in Linux) on operating-system level, see e.g. unix.stackexchange.com on that issue.
In practice, I had the binary myMonteCarloExperiment.x and the wrapper script myMCjobStarter.sh. The maximum compute time I had was a day. I would fill the queue with hundreds of calls of the wrapper script, with the boundary condition that only one of them should be running at a time. The script would check whether there was already a process myMonteCarloExperiment.x started anywhere on the compute cluster; if not, it would start an instance. In case there was a suspended process, the wrapper script would resume it, let it run for 23 hours and 55 minutes, and then suspend the process again. In any other case, the wrapper script would report an error.
This approach does not implement a job heartbeat, but it does indeed run a lengthy job. It also keeps the queue administrator happy, because no Hangfire job logs have to be cleaned up.
Further references
How to prevent a Hangfire recurring job from restarting after 30 minutes of continuous execution seems to be a good read
Here is my problem. I have a WCF project, which doesn't really matter, in fact, because I believe this is more about C#/.NET. In my WCF service, when a client requests one of the methods, I validate the input, and if validation succeeds I start some business logic calculations. I want to start this logic in another thread/task so that after the input validation I can immediately return a response. It's something like this:
XXXX MyMethod(MyArgument arg)
{
    var validation = _validator.Validate(arg);
    if (validation.Succeed)
    {
        // Start the long-running work in the background; it is not awaited.
        Task.Run(() => businessLogic());
    }
    return new MyResponseModel();
}
I need to do it like this because my businessLogic can involve long-running calculations with database saves at the end, but the client requesting the service has to know immediately whether the model is correct.
In my businessLogic calculations/saves, which will be running on a background thread, I have to catch exceptions if something fails and save them in the database. (It's pretty big logic, so many exceptions can be thrown; for example, after the calculations I'm persisting the object in the database, so a save error can be thrown if the database is offline.)
How do I correctly implement this, and what should I use for such requirements? I am wondering whether using Task.Run and invoking all the logic in the action is good practice.
You can do it like this.
Be aware, though, that worker processes can exit at any time. In that case outstanding work will simply be lost. Maybe you should queue the work to a message queue instead.
Also, if the task "crashes" you will not be notified in any way. Implement your own error logging.
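To make the "implement your own error logging" point concrete, here is one possible shape for it. BackgroundWork, RunLogged and the logError callback are names invented for this sketch. logError stands for whatever persists the error (for example a database insert), and since the question mentions that the database itself may be offline, the sketch falls back to the console if logging fails too.

```csharp
using System;
using System.Threading.Tasks;

static class BackgroundWork
{
    // Runs 'businessLogic' on the thread pool and routes any exception to
    // 'logError' instead of letting it vanish inside an unobserved task.
    public static Task RunLogged(Action businessLogic, Action<Exception> logError)
    {
        return Task.Run(() =>
        {
            try
            {
                businessLogic();
            }
            catch (Exception ex)
            {
                try
                {
                    logError(ex); // e.g. save the error to the database
                }
                catch (Exception logEx)
                {
                    // Last resort if the primary log target is unavailable too.
                    Console.Error.WriteLine("Logging failed: {0}; original error: {1}", logEx, ex);
                }
            }
        });
    }
}
```

This only addresses the lost-exception problem. It does not protect against the worker process exiting while work is outstanding.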
Also, there is no limit to the number of tasks that you can spawn like this. If processing is too slow more and more work will queue up. This might not at all be a problem if you know that the server will not be overloaded.
It was suggested that Task.Run will use threads and therefore not scale. This is not necessarily so. Usually, the bottleneck of any processing is not the number of threads but the backend resources being used (database, disk, services, ...). Even using hundreds of threads is not in any way likely to be a bottleneck. Async IO is not a way around backend resource constraints.
I wrote some code that mass-imports a high volume of users into AD. To refrain from overloading the server, I put a Thread.Sleep() in the code, executed at every iteration.
Is this a good use of the method, or is there a better alternative (.NET 4.0 applies here)?
Does Thread.Sleep() even aid in performance? What is the cost and performance impact of sleeping a thread?
The Thread.Sleep() method will just put the thread in a paused state for the specified amount of time. There are three different ways to achieve the same thing, calling a Sleep() method on three different types, and they all have different features. Most importantly, if you use Sleep() on the main UI thread, it will stop processing messages during that pause and the GUI will look locked. You need to use a BackgroundWorker to run the job that needs to sleep.
My opinion is to use the Thread.Sleep() method and just follow my previous advice. In your specific case I guess you'll have no issues. If you put some efforts looking for the same exact topic on SO, I'm sure you'll find much better explanations about what I just summarized before.
If you have no way to receive a feedback from the called service, like it would happen on a typical event driven system (talking in abstract..we could also say callback or any information to understand how the service is affected by your call), the Sleep may be the way to go.
I think that Thread.Sleep is one way to handle this; @cHao is correct that using a timer would allow you to do this in another fashion. Essentially, you're trying to cut down the number of commands sent to the AD server over a period of time.
In using timers, you're going to need to devise a way to detect trouble (that's more intuitive than a try/catch). For instance, if your server starts stalling and responding slower, you're going to continue stacking commands that the server can't handle (which may cascade in other errors).
When working with AD I've seen the Domain Controller freak out when too many commands come in (similar to a DoS attack) and bring the server to a crawl or crash. I think by using the sleep method you're creating a manageable and measurable flow.
In this instance, using a thread with a low priority may slow it down, but not to any controllable level. The thread priority will only be a factor on the machine sending the commands, not to the server having to process them.
Hope this helps; cheers!
If what you want is to not overload the server, you can just reduce the priority of the thread.
Thread.Sleep() does not consume any resources. However, the correct way to do this is to set the priority of the thread to a value below Normal: Thread.CurrentThread.Priority = ThreadPriority.Lowest, for example.
Thread.Sleep is not a case of "evil, do not do it ever", but maybe (just maybe) the fact that you need to use it reflects some flaw in the solution design. This is not a rule at all, though.
Personally, I have never found a situation where I had to use Thread.Sleep.
Right now I'm working on an ASP.NET MVC application that uses a background thread to load a lot of data from database into a memory cache and after that write some data to the database.
The only measure I have used to prevent this thread from eating all my web server and database processors was to reduce the thread priority to the Lowest level. That thread takes about 35 minutes to conclude all the operations instead of 7 minutes with a Normal priority thread. By the end of the process, the thread will have done about 230k selects against the database server, but this has not affected my database or web server performance in a way perceptible to the user.
Tip: remember to set the priority back to Normal if you are using a thread from the ThreadPool.
Here you can read about Thread.Priority:
http://msdn.microsoft.com/en-us/library/system.threading.thread.priority.aspx
Here is a good article about why not to use Thread.Sleep in a production environment:
http://msmvps.com/blogs/peterritchie/archive/2007/04/26/thread-sleep-is-a-sign-of-a-poorly-designed-program.aspx
EDIT: As others said here, just reducing your thread priority may not prevent the thread from sending a large number of commands/data to AD. Maybe you'll get better results if you rethink the whole thing and use timers or something like that. I personally think that reducing the priority could resolve your problem, although I think you need to run some tests with your data to see what happens to your server and the other servers involved in the process.
You could schedule the thread at BelowNormal priority instead. That said, that could potentially lead to your task never running if something else overloads the server. (Assuming Windows scheduling works the way the documentation on scheduling threads mentions for "some operating systems".)
That said, you said you're moving data into AD. If it's over the network, it's entirely possible the CPU impact of your code will be negligible compared to I/O and processing on the AD side.
I don't see any issue with it except that during the time you put the thread to sleep then that thread will not be responsive. If that is your main thread then your GUI will become non responsive. If it is a background thread then you won't be able to communicate with it (eg to cancel it). If the time you sleep is short then it shouldn't matter.
I don't think reducing the priority of the thread will help as 1) your code might not even be running on the server and 2) most of the work being done by the server is probably not going to be on your thread anyway.
Thread.Sleep does not aid performance (unless your thread has to wait for some resource). It incurs at least some overhead, and the amount of time that you sleep for is not guaranteed. The OS can decide to have your thread sleep longer than the amount of time you specify.
As such, it would make more sense to do a significant batch of work between calls to Thread.Sleep().
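One way to read that advice is to pause once per batch rather than once per item. The sketch below is only an illustration: Throttle and ForEachInBatches are hypothetical names, and the batch size and pause length are values you would have to tune against your own AD server. The sleep function is injected so the loop can be tried out without real delays; in production you would pass Thread.Sleep.

```csharp
using System;
using System.Collections.Generic;

static class Throttle
{
    // Applies 'action' to every item and pauses for 'pause' after each
    // full batch of 'batchSize' items (but not after the final item).
    public static void ForEachInBatches<T>(IEnumerable<T> items, int batchSize,
                                           TimeSpan pause, Action<T> action,
                                           Action<TimeSpan> sleep)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException("batchSize");

        int processed = 0;
        foreach (T item in items)
        {
            // A full batch has been completed and more work follows: pause first.
            if (processed > 0 && processed % batchSize == 0)
                sleep(pause);

            action(item);
            processed++;
        }
    }
}
```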
Thread.Sleep() is a CPU-less wait state. Its overhead should be pretty minimal. If you execute Thread.Sleep(0), you don't [necessarily] sleep, but you voluntarily surrender the rest of your time slice so the scheduler can run another thread of equal priority that is ready to run.
You can also lower your thread's priority by setting Thread.Priority.
Another way of throttling your task is to use a Timer:
// instantiate a timer that 'ticks' 10 times per second (your ideal rate might be different)
Timer timer = new Timer( ImportUserIntoActiveDirectory , null , 0 , 100 ) ;
where ImportUserIntoActiveDirectory is the timer callback that imports just one user into AD:
private void ImportUserIntoActiveDirectory( object state )
{
    // import just one user into AD
    return ;
}
This lets you dial things in. The event handler is called on thread pool worker threads, so you don't tie up your primary thread. Let the OS do the work for you: all you do is decide on your target transaction rate.
I have a Windows Service that performs a long-running process. It is triggered by a timer and the entire process can take a few minutes to complete. When the timer elapses the service instantiates a management object that performs the various tasks, logs the results and then exits.
I have not implemented anything to handle those occasions when the server is shutdown during the middle of the process. It could cause some problems. What is the best practice to handle this?
Can only give vague suggestions since I don't know what task you are actually doing.
If it is something to do with a database, there are transactions, which can be rolled back if they are not committed.
If it involves some file manipulation, perhaps take a look at this article on Transactional NTFS. You can use it in combination with a TransactionScope object to ensure an atomic transaction.
If you are dealing with web services, the service boundary will dictate when one transaction ends and the next one begins, so use a compensation model: if you break something on your part, you need to provide a way, after recovery, to notify the other end or execute compensation scripts there. (Think about ordering a book online and how to handle backorders, cancellations, etc.)
For a tracking mechanism, log every step with timestamps for troubleshooting in case something like a shutdown occurs.
If you're describing what is essentially a batch process, it's OK to have a timer that does work at an interval - much of the world works that way.
If it's long running, try to keep your units of work, or batches, small enough that your process can at least check whether it has been signaled to stop. This will allow the service to exit gracefully instead of essentially ignoring the service stop message.
Somewhere in your timer function you have a property, IsShutdownRequired or some such, that you're checking (assuming some loop processing). This property is set to true when the service stop control message arrives, which allows your process to exit gracefully by either not trying to do more work or, as Jimmy suggested, rolling back that work if it is in a transaction.
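A stripped-down sketch of that shutdown check might look like this. BatchProcessor, RequestShutdown and ProcessAll are illustrative names, and the integer work items stand in for whatever the real units of work are. In an actual service, RequestShutdown would be called from the OnStop override.

```csharp
using System;
using System.Collections.Generic;

sealed class BatchProcessor
{
    // volatile so that a write from the service control thread is seen
    // promptly by the worker thread.
    private volatile bool _isShutdownRequired;

    public void RequestShutdown()
    {
        _isShutdownRequired = true;
    }

    // Processes items one small unit at a time and stops early once a
    // shutdown has been requested. Returns the number of items completed.
    public int ProcessAll(IEnumerable<int> workItems, Action<int> processOne)
    {
        int done = 0;
        foreach (int item in workItems)
        {
            if (_isShutdownRequired)
                break; // leave the remaining items for the next run

            processOne(item);
            done++;
        }
        return done;
    }
}
```

The smaller each unit of work is, the sooner the loop notices the flag, which is the reason for keeping the batches small.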
Ideally, smaller batches would be better than one big one.
I am developing a multithreaded server which works nicely so far - one separate thread for accepting clients, and a thread pool for reading and processing data. Today I added a new thread for doing some work and sending messages to clients every 500 ms (just 2-5 messages). I have noticed quite a massive slowdown, but I'm not sure why - it's a separate thread, and it's not due to iterating and locking collections, because when I commented out the SendMessage call (by putting // before it), it was still as fast as before.
The SendMessage basically iterates all connected clients and for each of them calls SendData method which writes data to their networkstream.
What am I missing? I still think those are different threads and I hope its not due to stream.write..
Thank you in advance!
If you can try to post a code sample or a summary, your message sending implementation would make a good candidate.
First, purely general advice.
This is a good time to whip out a profiler. This kind of guessing is tempting, and often a good mental exercise, but most of the time programmers are wrong about what they think is making their software slow. A profiler will tell you, for example, if your program is spending 90% of its execution time inside of one method.
Second, a speculative guess.
It sounds like your message command runs off a timer. Make sure that you aren't having issues with reentrancy - for example if your sendmessage loop takes longer than 500ms to complete (and together with creating a new thread and multiple unpredictable latency network calls it could well do that), and you have the whole operation in a lock, then the timer will keep spawning off threadpool threads that are sitting in that lock waiting for the previous operation to complete - and there is a finite number of available threadpool threads. To check if this is a problem you don't even need a profiler, when latency gets bad pause the debugger and check up on your list of currently executing threads.
If this is the case, consider doing something else - for example, have a single thread that runs in an infinite loop using a wait handle as a blocking mechanism, and a timer that sets the wait handle every 500 ms.
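The single-thread-plus-wait-handle arrangement could be sketched as follows. PeriodicSender is a made-up name and sendMessages stands for the existing SendMessage routine. Because the timer only sets an AutoResetEvent, ticks that arrive while a send is still running collapse into one pending signal instead of piling up thread pool threads.

```csharp
using System;
using System.Threading;

sealed class PeriodicSender : IDisposable
{
    private readonly AutoResetEvent _tick = new AutoResetEvent(false);
    private readonly ManualResetEvent _stop = new ManualResetEvent(false);
    private readonly Thread _worker;
    private readonly Timer _timer;

    public PeriodicSender(Action sendMessages, TimeSpan interval)
    {
        _worker = new Thread(() =>
        {
            var handles = new WaitHandle[] { _stop, _tick };
            // WaitAny returns the index of the signaled handle; 0 means stop.
            while (WaitHandle.WaitAny(handles) != 0)
                sendMessages(); // only ever runs on this one thread
        });
        _worker.IsBackground = true;
        _worker.Start();

        // The timer callback does no work itself; it only signals the worker.
        _timer = new Timer(_ =>
        {
            try { _tick.Set(); }
            catch (ObjectDisposedException) { /* late tick during shutdown */ }
        }, null, interval, interval);
    }

    public void Dispose()
    {
        _timer.Dispose();
        _stop.Set();
        _worker.Join(); // wait for a send in progress to finish
        _tick.Dispose();
        _stop.Dispose();
    }
}
```

Usage would be along the lines of new PeriodicSender(SendMessage, TimeSpan.FromMilliseconds(500)), disposed when the server shuts down.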
But it will be much easier to help you if you post some code snippets, and run a profiler (Ants or DotTrace both work great).
Threads & threadpools for things like socket servers is the old way to do things. It's very unscalable (optimally you would like to not have more threads than cores), and full of locks.
Try converting your code to asynchronous code. You only need 1 thread, and you get callbacks whenever input arrives or when new data can be sent. The resulting code is much faster and doesn't have these bottleneck problems.
I know that advice along the lines of "no, no, rewrite everything, you should do it like this" is not really helpful, since it doesn't answer the exact question you asked. But if you do have the time, I still think it's good advice. Or else, it's good advice for the next server you'll make ;^)