I am using Quartz.net.
I have configured a job with the DisallowConcurrentExecution attribute because I want only a single instance of that job to execute at a time.
I have configured a trigger that fires every 10 seconds, but in some situations my job takes more than a minute to complete. Once this happens, the Last Execution Time and Next Execution Time are no longer correct. They still refer to the old times.
I am new to Quartz. I understand that the thread pool may keep the job queued and start a new instance once the running one completes, because of the attribute configuration, but why are the execution times not maintained properly?
Please help.
Double-posted here: https://github.com/quartznet/quartznet/issues/173
This works as designed. Quartz considers your trigger misfired as it
didn't run when it was supposed to (job's concurrent execution
protection prohibited it). You need to tweak your misfire handling
configuration.
http://www.quartz-scheduler.net/documentation/quartz-2.x/tutorial/more-about-triggers.html
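To make the misfire idea concrete, here is a minimal sketch in plain Java. It is not the Quartz.NET API: the class and method names are hypothetical, the 60-second threshold used in the usage note is only an example value, and the "skip to the next slot" policy modelled here is just one of several misfire behaviours a scheduler can offer.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical model of misfire detection; not Quartz code.
final class MisfireSketch {

    // A firing counts as misfired once it is late by more than the threshold.
    static boolean isMisfired(Instant scheduled, Instant now, Duration threshold) {
        return Duration.between(scheduled, now).compareTo(threshold) > 0;
    }

    // "Skip missed firings" policy: return the first slot strictly after now.
    static Instant nextFireAfter(Instant scheduled, Duration interval, Instant now) {
        if (now.isBefore(scheduled)) {
            return scheduled; // not due yet, nothing was missed
        }
        // Number of whole intervals that have elapsed since the scheduled time.
        long elapsedIntervals =
                Duration.between(scheduled, now).toMillis() / interval.toMillis();
        return scheduled.plus(interval.multipliedBy(elapsedIntervals + 1));
    }
}
```

With a 10-second interval and a job that blocks the trigger for 95 seconds, `isMisfired` (with an assumed 60-second threshold) reports a misfire, and `nextFireAfter` jumps to the slot 100 seconds after the original fire time instead of replaying the nine missed slots.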
Related
I run multiple instances of my application, and I have configured Hangfire to run as part of my Startup.cs configuration.
I want to generate a monthly report, and I'd like to ensure it gets enqueued only once. DisableConcurrentExecution doesn't help, as it only prevents executions from running at the same time.
I read about Mutex as well:
When we create multiple background jobs based on this method, they will be executed one after another on a best-effort basis with the limitations described below. If there’s a background job protected by a mutex currently executing, other executions will be throttled (rescheduled by default a minute later), allowing a worker to process other jobs without waiting.
According to my understanding, Mutex will prevent concurrent execution, but it'll run my reports X times (where X is the number of my instances), one after another.
How can I ensure that a cron job is enqueued only once?
How can I add the job without having to call it through an endpoint (e.g. POST <server>/api/enqueue_jobs)?
I don't have snippets to provide because I'm stuck on the configuration itself. I hope this question won't be closed, because I have put effort into trying to solve it on my own.
I have implemented a recurring job which needs to run every minute. Now and then the job has a hiccup, because it involves an API call that can take a bit longer to respond. As a result, the job is enqueued a second time even though the previous run has not finished.
My Question:
How do I prevent a Hangfire job from running if another instance of the same job is already running?
Thank you!
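One generic way to get "skip this run if the previous one is still going" is a try-acquire guard around the job body. The following is only a sketch in plain Java, not Hangfire code: `SkipIfRunningJob` and `tryRun` are hypothetical names, and the flag lives in a single process, so with several servers the same idea would need a lock held in shared storage instead.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical in-process guard: overlapping runs are skipped, not queued.
final class SkipIfRunningJob {
    private final AtomicBoolean running = new AtomicBoolean(false);
    private final Runnable work;

    SkipIfRunningJob(Runnable work) {
        this.work = work;
    }

    // Returns true if the work ran, false if another run was already active.
    boolean tryRun() {
        // Atomically claim the flag; fail fast if someone else holds it.
        if (!running.compareAndSet(false, true)) {
            return false;
        }
        try {
            work.run();
            return true;
        } finally {
            running.set(false); // always release, even if the work throws
        }
    }
}
```

The scheduler can keep firing every minute; a firing that arrives while the slow API call is still in progress simply returns `false` and the next firing gets another chance.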
Is it possible to send a heartbeat to Hangfire (Redis storage) to tell the system that the process is still alive? At the moment I set the InvisibilityTimeout to TimeSpan.MaxValue to prevent Hangfire from restarting the job. But if the process fails or the server restarts, the job will never be removed from the list of running jobs. So my idea was to remove the large timeout and send a kind of heartbeat instead. Is this possible?
I found https://discuss.hangfire.io/t/hangfire-long-job-stop-and-restart-several-time/4282/2 which deals with how to keep a long-running job alive in Hangfire.
The user zLanger says that jobs are considered dead and restarted once you ...
[...] are hitting hangfire’s invisibilityTimeout. You have two options.
increase the timeout to more than the job will ever take to run
have the job send a heartbeat to let hangfire’s know it’s still alive.
That's not new to you. But interestingly, the follow-up question there is:
How do you implement heartbeat on job?
This remains unanswered there, a hint that your problem is really not trivial.
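For readers unfamiliar with the term, this is what a heartbeat means in general. The sketch below is plain Java and makes no claim about how Hangfire works internally or whether it exposes such a hook: `HeartbeatLease`, `beat` and `isAlive` are hypothetical names, and the 30-minute timeout in the usage note is an assumed example value.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical lease: the job is presumed dead if it stops renewing in time.
final class HeartbeatLease {
    private final Duration timeout;
    private Instant lastBeat;

    HeartbeatLease(Duration timeout, Instant startedAt) {
        this.timeout = timeout;
        this.lastBeat = startedAt;
    }

    // Called periodically by the running job to say "still alive".
    synchronized void beat(Instant now) {
        lastBeat = now;
    }

    // Called by the watchdog: alive while the last beat is within the timeout.
    synchronized boolean isAlive(Instant now) {
        return Duration.between(lastBeat, now).compareTo(timeout) <= 0;
    }
}
```

With a 30-minute timeout, a job that beats at minute 25 is still considered alive at minute 50 but not at minute 56, whereas a crashed job stops beating and is released one timeout after its last beat. That is the property a fixed TimeSpan.MaxValue timeout cannot give.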
I have never handled long-running jobs in Hangfire, but I know the problem from other queuing systems like the former SunGrid Engine which is how I got interested in your question.
Back in the day, I had exactly your problem with SunGrid, and the department's computer guru told me that one should avoid long-running jobs at any cost, according to some mathematical queuing theory (I will try to contact him and find the reference to the book he quoted). His idea is maybe worth sharing with you:
If you have some job which takes longer than the tolerated maximal running time of the queuing system, do not submit the job itself, but rather multiple calls of a wrapper script which is able to (1) start, (2) freeze-stop, (3) unfreeze-continue the actual task.
This stop-continue can indeed be a suspend and resume on the operating-system level (CTRL+Z and fg, respectively, in Linux), see e.g. unix.stackexchange.com on that issue.
In practice, I had the binary myMonteCarloExperiment.x and the wrapper script myMCjobStarter.sh. The maximum compute time I had was a day. I would fill the queue with hundreds of calls of the wrapper script, with the boundary condition that only one of them should be running at a time. The script would check whether a process myMonteCarloExperiment.x had already been started anywhere on the compute cluster. If not, it would start an instance. If there was a suspended process, the wrapper script would bring it back to the foreground, let it run for 23 hours and 55 minutes, and then suspend it again. In any other case, the wrapper script would report an error.
This approach does not implement a job heartbeat, but it does indeed run a lengthy job. It also keeps the queue administrator happy by avoiding the need to clean up Hangfire job logs.
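The decision logic of the wrapper described above can be summarised in a few lines. This is only an illustrative sketch in Java rather than the original shell script; the enum and method names are hypothetical, and actually starting, suspending or resuming a process is left out.

```java
import java.time.Duration;

// Hypothetical summary of the wrapper script's decision; no process control here.
final class WrapperSketch {
    enum TaskState { NOT_STARTED, SUSPENDED, RUNNING }
    enum Action { START, RESUME_THEN_SUSPEND, REPORT_ERROR }

    // Time slice granted per queue slot: just under the one-day limit.
    static final Duration SLICE = Duration.ofHours(23).plusMinutes(55);

    static Action decide(TaskState state) {
        switch (state) {
            case NOT_STARTED:
                return Action.START;               // first wrapper call starts the task
            case SUSPENDED:
                return Action.RESUME_THEN_SUSPEND; // later calls continue it for one SLICE
            default:
                return Action.REPORT_ERROR;        // e.g. the task is already running
        }
    }
}
```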
Further references
How to prevent a Hangfire recurring job from restarting after 30 minutes of continuous execution seems to be a good read
I created a job that implements IStatefulJob and according to the quartz docs
"if a job is stateful, and a trigger attempts to 'fire' the job while it is already
executing, the trigger will block (wait) until the previous execution completes"
Is there any way to remove the block and kill the newly fired instance of the job?
The job I am running can have wildly different run times based on the amount of data behind it and I am concerned that if we have a number of jobs waiting to run that it could have a negative effect...
Thanks
Unfortunately no. As the job implementor, you are responsible for making sure that the job keeps track of whether it has reached its time limit of 'good behavior'. Normally there's no need, as jobs take a somewhat predictable time to complete.
The same goes when you want to interrupt all jobs in the scheduler: you need to implement IInterruptableJob and set a flag that your main job loop watches.
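The "flag that your main job loop watches" pattern looks roughly like this. The sketch is plain Java rather than Quartz.NET code, and `InterruptibleLoop` with its methods is a hypothetical stand-in for a job class that implements the interrupt callback.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.IntConsumer;

// Hypothetical cooperative cancellation: the loop checks a flag between items.
final class InterruptibleLoop {
    private final AtomicBoolean interrupted = new AtomicBoolean(false);

    // What the scheduler's interrupt callback would do: just raise the flag.
    void interrupt() {
        interrupted.set(true);
    }

    // Processes items 0..totalItems-1 and returns how many were completed.
    int run(int totalItems, IntConsumer processItem) {
        int done = 0;
        while (done < totalItems && !interrupted.get()) {
            processItem.accept(done); // one unit of work; never cut off midway
            done++;
        }
        return done;
    }
}
```

The job stops only at the next check, so the granularity of the work items determines how quickly it reacts to an interrupt.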
You can always rethink the design. It shouldn't be a problem to queue the same job, as it has the same duty to perform. With misfire instructions you can configure misfired instances (those queued for too long) to be discarded, so that the trigger waits for the next fire time.
I have the following problem with Quartz:
A job is scheduled to run every 10 minutes. Sometimes (rarely) the job might take longer than 10 minutes. In such cases, Quartz will put the same job on the queue to run after the current execution has finished. Normally that is no problem: the job will run two times in a row and all is well and functioning. However, in some cases the second run also takes more than 10 minutes. I would expect Quartz to just put it in the queue one more time. Instead, the job never gets queued and is not run again. Everything else works normally except this job, which never runs again until the system is restarted.
Is this the expected behavior? Is there any way that I can modify it to better suit my needs?
There was an issue with more jobs running and the Quartz thread pool being maxed out.