I would like to use Hangfire to create long-running fire-and-forget tasks. If the web server dies and the background job is retried, I would like it to pick up where it left off.
In the example below, let's say that foo.RetryCount reaches 3, the server restarts, and Hangfire reruns the job. In this case I would like to run the task only 7 more times (based on MaxAttemps), instead of restarting from zero.
I thought Hangfire persisted the arguments passed to the method in their current state, but as far as I can tell they are reset.
var foo = new Foo { RetryCount = 0, MaxAttemps = 10 };
BackgroundJob.Enqueue(() => RequestAndRetryOnFailure(foo));

void RequestAndRetryOnFailure(Foo foo)
{
    // make request to server, if fail, wait for a
    // while and try again later if not foo.MaxAttemps is reached
    foo.RetryCount++;
}
I use Hangfire extensively for many different actions and constantly need to reschedule jobs that started but couldn't complete because of certain constraints.
The persistence you are referring to applies to the serialized version of the job that is stored when it is enqueued; changes made to the arguments while the job executes are not saved back.
What I would recommend is to schedule the job to run again after a certain delay if the server is not available. Because the newly scheduled job is serialized with the updated arguments, this also helps the job restart correctly if it is scheduled and Hangfire reboots.
var foo = new Foo { RetryCount = 0, MaxAttemps = 10 };
BackgroundJob.Enqueue(() => RequestAndRetryOnFailure(foo));

void RequestAndRetryOnFailure(Foo foo)
{
    // make request to server, if fail, wait for a
    // while and try again later if not foo.MaxAttemps is reached
    bool requestFailed = !TryMakeRequest(); // TryMakeRequest is a placeholder for your own request logic
    if (requestFailed)
    {
        foo.RetryCount++;
        if (foo.RetryCount < foo.MaxAttemps)
        {
            // schedule a new job; foo is serialized again with the updated RetryCount
            BackgroundJob.Schedule(() => RequestAndRetryOnFailure(foo), TimeSpan.FromSeconds(30));
        }
        // otherwise do nothing: the maximum number of attempts has been reached
    }
}
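To see why the arguments look "reset" on a rerun, here is a minimal sketch of the snapshot behaviour described above. It does not use Hangfire at all: System.Text.Json stands in for whatever serializer the job storage uses, and the SnapshotDemo class is purely illustrative.

```csharp
using System;
using System.Text.Json;

public class Foo
{
    public int RetryCount { get; set; }
    public int MaxAttemps { get; set; }
}

public static class SnapshotDemo
{
    public static void Main()
    {
        // Enqueue time: the argument is serialized once, as a snapshot.
        string storedArgument = JsonSerializer.Serialize(new Foo { RetryCount = 0, MaxAttemps = 10 });

        // First run: the job works on a deserialized copy and mutates only that copy.
        Foo firstRun = JsonSerializer.Deserialize<Foo>(storedArgument);
        firstRun.RetryCount = 3;

        // Rerun after a restart: a fresh copy is built from the same stored snapshot.
        Foo secondRun = JsonSerializer.Deserialize<Foo>(storedArgument);
        Console.WriteLine(secondRun.RetryCount); // prints 0
    }
}
```

The mutation made during the first run never reaches the stored snapshot, which is why progress has to be carried forward explicitly, for example by scheduling a new job with the updated object.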
I have a few jobs executed one after the other via ContinueJobWith<MyHandler>(parentJobId, x => x.DoWork()).
However, the second job is not getting processed and always sits in Awaiting state:
The job itself is like this:
Why can this happen, and where should I look for a resolution?
We are using Autofac as DI container, but we have our own JobActivator implementation because we have to deal with multitenancy.
We are using SQL Server 2019 for storage.
The Hangfire version is 1.7.10.
This is an MVC 5 application.
I've not seen any errors or exceptions in any logs or during debugging.
After going through this, I've added the following to our Autofac registration:
builder.RegisterType<BackgroundJobStateChanger>()
    .As<IBackgroundJobStateChanger>()
    .InstancePerLifetimeScope();
This made no difference.
This is how the jobs are executed:
var parentJobId = _backgroundJobClient.Schedule<Handler>(h => h.ConvertCertToTraining(certId, command.SetUpOneToOneRelationship), TimeSpan.FromSeconds(1));
var filesCopyJObId = _backgroundJobClient.ContinueJobWith<Handler>(parentJobId, h => h.CopyAttachedFiles());
_backgroundJobClient.ContinueJobWith<Handler>(filesCopyJObId, h => h.NotifyUser(command.CertificationToBeConvertedIds, _principal.GetEmail()));
All the parameters are either int, bool or string. If I enqueue the awaiting jobs by hand, they are executed without issues.
I've added Hangfire logging, but could not see any issues there: the server starts and stops and jobs change status, with no obvious errors.
What else should I consider, and where or how should I debug this?
From the looks of it, the first job (ID 216348) completed successfully, but your second job (ID 216349) is waiting on parent ID 216347. According to the Hangfire documentation and my experience, the parent ID should be that of the job you want to finish before the second job executes.
According to the Hangfire documentation on ContinueJobWith, "Continuations are executed when its parent job has been finished". From your screenshots, it is not clear what's going on with job 216347. Once job 216347 completes, job 216349 should kick off. If you are expecting 216349 to start after 216348 finishes, check your code and make sure the correct parent ID is passed to the second job.
Update
Based on this thread, add the ContinuationsSupportAttribute to GlobalJobFilters.Filters where you configure the Hangfire service. This should make your Hangfire instance aware of continuation jobs.
GlobalJobFilters.Filters.Add(new ContinuationsSupportAttribute());
During the investigation, it turned out that we were replacing JobFilterProviderCollection with our own collection:
var filterProviderCollection = new JobFilterProviderCollection
{
    new MyFilterProvider(...)
};
var backgroundJobClient = new BackgroundJobClient(JobStorage.Current, filterProviderCollection);
MyFilterProvider looked like this:
public IEnumerable<JobFilter> GetFilters(Job job)
{
    return new JobFilter[]
    {
        new JobFilter(new HangfireTenantFilter(_tenantDetail, _principal), JobFilterScope.Global, null),
        new JobFilter(new HangfireFunctionalityFilter(_functionalityFilter), JobFilterScope.Global, null),
    };
}
It turned out that the code processing continuations only took filters from this filter collection, so ContinuationsSupportAttribute was not executed at the right time. Re-adding the default Hangfire filters from GlobalJobFilters.Filters fixed the situation:
public IEnumerable<JobFilter> GetFilters(Job job)
{
    var customFilters = new List<JobFilter>()
    {
        new JobFilter(new HangfireTenantFilter(_tenantDetail, _principal), JobFilterScope.Global, null),
        new JobFilter(new HangfireFunctionalityFilter(_functionalityFilter), JobFilterScope.Global, null),
    };
    customFilters.AddRange(GlobalJobFilters.Filters);
    return customFilters;
}
I have an application which is used to trigger long-running data queues. By long-running, I mean around 12-16 hours per queue, and none of them can be executed in parallel. Each queue has individual steps which need to succeed before the next one runs.
I have already increased the timeout when initializing ChromeDriver to 1000 minutes:
webDriver = new ChromeDriver(path, options, TimeSpan.FromMinutes(1000));
I am using WebDriverWait to check after 1000 minutes that all steps have succeeded. In case of a failure, I still have to wait for 1000 minutes before I can tell the dev team about it.
Is there a better approach to solve this problem? It also keeps my browser open for 1000 minutes.
Regarding your question: is there a better way to solve this problem? With Selenium, not really. You'd have better luck taking a different approach, such as calling an API, than going through UI testing. However, it's still possible, just not ideal.
My best idea for this problem would be to set up some sort of controller that can manage your WebDriver instances and also keep track of the 12-16 hour queue time. Since I don't have any specific information about your project architecture or the queues you are testing, this will be a very generic implementation.
Here's a simple DriverManager class, that controls creating & terminating WebDriver sessions:
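public class DriverManager
{
    // the WebDriver instance currently managed by this class
    private IWebDriver Driver;

    public IWebDriver CreateDriver()
    {
        // code to initialize your WebDriver instance here, for example:
        Driver = new ChromeDriver();
        return Driver;
    }

    public void CloseWebDriverSession()
    {
        Driver.Close();
        Driver.Quit();
    }
}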
Next, here's a test case implementation that utilizes DriverManager to close & reopen WebDriver as needed.
public class TestBothQueues
{
    // this driver instance will keep track of your session throughout the test case
    public IWebDriver driver;

    [Test]
    public void ShouldRunBothQueues()
    {
        // declare instance of DriverManager class
        DriverManager manager = new DriverManager();
        // start a webdriver instance
        driver = manager.CreateDriver();
        // run the first queue
        RunFirstQueue();
        // terminate the WebDriver so we don't have browser open for 12 hours
        manager.CloseWebDriverSession();
        // wait 12 hours
        Thread.Sleep(TimeSpan.FromHours(12));
        // start another WebDriver session to start the second queue
        driver = manager.CreateDriver();
        // run the second queue
        RunSecondQueue();
        // terminate when we are finished
        manager.CloseWebDriverSession();
    }
}
A few notes on this:
You can also convert this code into a while loop if you would like to start a WebDriver instance to check the queue on a time interval. For example, if the queue takes 12-16 hours to finish, you may want to wait 12 hours, then check the queue once per hour until you can verify it is completed. That would look something like this:
// first, wait initial 12 hours
Thread.Sleep(TimeSpan.FromHours(12));
// keep track of whether or not queue is finished
bool isQueueFinished = false;
while (!isQueueFinished)
{
    // start webdriver instance to check the queue
    IWebDriver driver = manager.CreateDriver();
    // check if queue is finished
    isQueueFinished = CheckIfQueueFinished(driver);
    // if queue is finished, while loop will break
    // if queue is not finished, close the WebDriver instance, and start again
    if (!isQueueFinished)
    {
        // close the WebDriver since we won't need it
        manager.CloseWebDriverSession();
        // wait another hour
        Thread.Sleep(TimeSpan.FromHours(1));
    }
}
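The loop above only distinguishes "finished" from "not finished", so a failed queue would still be polled until the end. As a rough sketch of one way to report failures early, the helper below polls a status check and returns as soon as the queue is no longer running. QueueStatus, QueuePoller and the fake check in Main are hypothetical names for illustration; in practice the check would open a WebDriver session (or call an API) and read the real status.

```csharp
using System;
using System.Threading.Tasks;

public enum QueueStatus { Running, Succeeded, Failed }

public static class QueuePoller
{
    // Polls checkStatus every interval until the queue leaves Running or the timeout elapses.
    public static async Task<QueueStatus> WaitForQueueAsync(
        Func<Task<QueueStatus>> checkStatus, TimeSpan interval, TimeSpan timeout)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        while (true)
        {
            QueueStatus status = await checkStatus();
            if (status != QueueStatus.Running)
                return status; // success or failure is reported as soon as it is known

            if (DateTime.UtcNow + interval > deadline)
                throw new TimeoutException("Queue did not finish in time.");

            await Task.Delay(interval);
        }
    }
}

public static class Program
{
    public static async Task Main()
    {
        int calls = 0;
        // Stand-in check: reports Running twice, then Failed.
        Func<Task<QueueStatus>> fakeCheck = () =>
            Task.FromResult(++calls < 3 ? QueueStatus.Running : QueueStatus.Failed);

        QueueStatus result = await QueuePoller.WaitForQueueAsync(
            fakeCheck, TimeSpan.FromMilliseconds(10), TimeSpan.FromSeconds(5));
        Console.WriteLine(result); // prints Failed
    }
}
```

With an hourly interval and a 16-hour timeout, a failure would surface within about an hour instead of after the full wait.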
Hope this helps.
I want to schedule a timer-triggered method to call other methods, but somehow the CronJob method won't run if I use it to call one of my own methods. I simply get this console output:
"
Found the following functions:
...ProcessQueueMessage
...Functions.CronJob
Job host started
"
Nothing else happens for a couple of minutes, and then it might suddenly start working. But if I only use the CronJob() method to run its own Console.WriteLine("Timer job fired") statement, everything works.
I have been trying to find a solution to this problem for hours now, but no one seems to have the same problem. Any ideas on what I'm doing wrong?
public static void CronJob([TimerTrigger("*/3 * * * * *", RunOnStartup = true)] TimerInfo timerInfo)
{
    Console.WriteLine("Timer job fired! ");
    DoTask();
}

private static void DoTask()
{
    Console.WriteLine("Doing task...");
}
Main method:
static void Main()
{
    var config = new JobHostConfiguration();
    if (config.IsDevelopment)
    {
        config.UseDevelopmentSettings();
    }
    var host = new JobHost(config);
    config.UseTimers();
    // The following code ensures that the WebJob will be running continuously
    host.RunAndBlock();
}
According to your description, the problem is not related to whether you call your own code. The root cause is that a blob lease (the Singleton Lock) is taken for a default time of 30 seconds.
As Rob Reagan mentioned, you could set JobHostConfiguration.Tracing.ConsoleLevel to Verbose. When the WebJob hangs, you should then see the message "Unable to acquire Singleton lock".
For more detail, you could refer to this issue.
When the listener starts for a particular TimerTrigger function, a blob lease (the Singleton Lock) is taken for a default time of 30 seconds. This is the lock that ensures that only a single instance of your scheduled function is running at any time. If you kill your console app, that lease will still be held until it expires naturally.
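As a sketch of the tracing suggestion above, this is roughly how the Main method from the question could look with verbose console tracing turned on. It assumes the same WebJobs SDK 2.x style API used in the question (JobHostConfiguration and UseTimers from the WebJobs extensions package).

```csharp
using System.Diagnostics;
using Microsoft.Azure.WebJobs;

class Program
{
    static void Main()
    {
        var config = new JobHostConfiguration();
        if (config.IsDevelopment)
        {
            config.UseDevelopmentSettings();
        }

        // Verbose console tracing surfaces messages about the singleton lock,
        // which makes the wait for the blob lease visible.
        config.Tracing.ConsoleLevel = TraceLevel.Verbose;
        config.UseTimers();

        var host = new JobHost(config);
        // Keeps the WebJob running continuously
        host.RunAndBlock();
    }
}
```

If the delay only shows up right after stopping and restarting the console app, it is consistent with the previous run's lease still being held.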
I have a function that is supposed to run every night at 12 AM and do some work.
It usually takes 2 hours.
I want to create a trigger that calls it.
So I created an Azure Function app with a timer trigger that sends an HTTP request to my controller, which calls my function.
I created the controller function below just for testing.
[HttpGet]
public async Task<bool> updateFromRegAdmin()
{
    try
    {
        RegEditApi_Service.retrieveRegAdminApiCredentials();
        return true;
    }
    catch (Exception e)
    {
        Logger.writeToLog(Logger.LOG_SEVERITY_TYPE.Error, "", "updateFromRegAdmin ", e.Message);
        return false;
    }
}
As I said, the function "retrieveRegAdminApiCredentials" runs for 2 hours.
The problem is that the request times out after a few minutes.
So how can I create a request that just triggers the inner function and lets it run in the background?
By the way, I can't create a trigger on the server without an HTTP request, because my company has scaled-out servers on Azure (the trigger would run multiple times and create DB duplicates).
My previous solution was:
public class JobScheduler
{
    public static void Start()
    {
        IScheduler scheduler = StdSchedulerFactory.GetDefaultScheduler();
        scheduler.Start();
        IJobDetail job = JobBuilder.Create<GetExchangeRates>().Build();
        ITrigger trigger = TriggerBuilder.Create()
            .WithDailyTimeIntervalSchedule
            (s =>
                s.WithIntervalInHours(24)
                .OnEveryDay()
                .StartingDailyAt(TimeOfDay.HourAndMinuteOfDay(00, 00))
            )
            .Build();
        scheduler.ScheduleJob(job, trigger);
    }
}

public class GetExchangeRates : IJob
{
    public void Execute(IJobExecutionContext context)
    {
        Random random = new Random();
        int randomNumber = random.Next(100000, 900000);
        Thread.Sleep(randomNumber);
        RegEditApi_Service.retrieveRegAdminApiCredentials();
    }
}
If I understand you correctly, what you have is an Azure Function timer trigger that sends an HTTP request to your server, which runs "RegEditApi_Service.retrieveRegAdminApiCredentials()".
The problem is that your function times out. To solve this, you should have the HTTP endpoint in front of "retrieveRegAdminApiCredentials()" return immediately after accepting the request.
If you need some return value from the server, you should have the server put a message on a queue (such as an Azure Storage queue) and have another Azure Function that listens to this queue and receives the message.
If the result of the long operation is relatively small, you can put the result in the message itself. Otherwise, you would need to perform another operation to fetch it, but this operation should be much quicker: the long-running work is already done and its answer stored, so you only retrieve it and possibly do some cleanup.
You can also look into Azure Durable Functions. It is intended for this use case, but is still in preview, and I'm not sure how much benefit it will give you:
https://learn.microsoft.com/en-us/azure/azure-functions/durable-functions-overview#pattern-3-async-http-apis
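A minimal sketch of the "return immediately" idea follows. It is framework-agnostic: BackgroundJobTracker and JobState are hypothetical names, and the controller action would call Start(...) and return the job id (for example with a 202 Accepted response) instead of waiting for the work to finish.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public enum JobState { Running, Succeeded, Failed }

public class BackgroundJobTracker
{
    private readonly ConcurrentDictionary<Guid, JobState> _jobs =
        new ConcurrentDictionary<Guid, JobState>();

    // Starts the work on a thread-pool thread and returns an id immediately.
    public Guid Start(Action longRunningWork)
    {
        Guid id = Guid.NewGuid();
        _jobs[id] = JobState.Running;

        Task.Run(() =>
        {
            try
            {
                longRunningWork();
                _jobs[id] = JobState.Succeeded;
            }
            catch (Exception)
            {
                // A real implementation would log the exception here.
                _jobs[id] = JobState.Failed;
            }
        });

        return id;
    }

    // Returns the state of a job, or null if the id is unknown.
    public JobState? GetState(Guid id) =>
        _jobs.TryGetValue(id, out JobState state) ? state : (JobState?)null;
}
```

In the question's terms, the action could call tracker.Start(RegEditApi_Service.retrieveRegAdminApiCredentials) on a shared tracker instance and return right away. Keep in mind that work started this way lives inside the web process and is lost if the process is recycled, which is one reason the queue-based approach above is more robust.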
It looks like you need a dedicated component that can schedule and execute a queue of tasks. There are nice frameworks for that, but if you dislike those for whatever reason, make sure you start or reuse an idle thread and run the long operation there. Your API can then return something like 200 OK, meaning that the process has started successfully.
Key idea: separate your threads explicitly. That's actually quite challenging.
Azure Functions on a Consumption plan time out after 5 minutes by default.
On a Consumption plan, you can raise this limit only up to 10 minutes. To run longer than that, host your function on an App Service plan.
I am working on a WPF (C#) application that is currently connected to a database. Does anyone have an idea how I can query or check the database (SQL Server 2008) connection every 15 seconds, to check for updates and to notify the user whether it can connect or not?
Any code/information would be greatly appreciated. Thanks!
Take a look at SqlDependency, which provides a notification when the result of a given query changes.
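A rough sketch of how SqlDependency is typically wired up is shown below. The OrdersWatcher class, the dbo.Orders table and its columns are made up for illustration; it also assumes Service Broker is enabled on the database and that the query follows the notification rules (explicit column list, two-part table name).

```csharp
using System;
using System.Data.SqlClient;

public class OrdersWatcher
{
    private readonly string _connectionString;

    public OrdersWatcher(string connectionString)
    {
        _connectionString = connectionString;
    }

    public void Start()
    {
        // Starts the listener that receives notifications for this connection string.
        SqlDependency.Start(_connectionString);
        Subscribe();
    }

    public void Stop() => SqlDependency.Stop(_connectionString);

    private void Subscribe()
    {
        using (var connection = new SqlConnection(_connectionString))
        using (var command = new SqlCommand("SELECT Id, Status FROM dbo.Orders", connection))
        {
            var dependency = new SqlDependency(command);
            dependency.OnChange += OnChange;

            connection.Open();
            // Executing the command is what registers the notification.
            using (command.ExecuteReader()) { }
        }
    }

    private void OnChange(object sender, SqlNotificationEventArgs e)
    {
        // Each notification fires only once, so detach and subscribe again.
        ((SqlDependency)sender).OnChange -= OnChange;
        Console.WriteLine("Notification received: " + e.Info);

        if (e.Type == SqlNotificationType.Change)
        {
            Subscribe();
        }
    }
}
```

The OnChange handler does not run on the UI thread, so a WPF application would need to marshal any UI update through the Dispatcher.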
To detect changes in data passively, use SqlDependency as already suggested. Otherwise, you need to implement a timer that runs asynchronously and polls the database, comparing each result with the last one.
private async Task PollDb()
{
    int lastCount = -1;
    while (true)
    {
        await Task.Delay(15000); // 15 second interval
        // run the query on a thread-pool thread so the UI stays responsive
        ICollection<object> newRecord = await Task.Run<ICollection<object>>(() =>
        {
            /* Get results here */
            return new List<object>();
        });
        if (newRecord.Count != lastCount)
        {
            // Update app
        }
        lastCount = newRecord.Count;
    }
}
If you just want to check that the database is responding without the burden of a heavy query, just execute the query 'SELECT GETDATE()'.
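A minimal sketch of such a lightweight connectivity check, assuming System.Data.SqlClient and a connection string supplied by the caller (DbHealthCheck is an illustrative name):

```csharp
using System;
using System.Data.SqlClient;
using System.Threading.Tasks;

public static class DbHealthCheck
{
    // Returns true if the server answers a trivial query, false if it cannot be reached.
    public static async Task<bool> CanConnectAsync(string connectionString)
    {
        try
        {
            using (var connection = new SqlConnection(connectionString))
            using (var command = new SqlCommand("SELECT GETDATE()", connection))
            {
                await connection.OpenAsync();
                await command.ExecuteScalarAsync();
                return true;
            }
        }
        catch (SqlException)
        {
            return false;
        }
        catch (InvalidOperationException)
        {
            return false;
        }
    }
}
```

Calling this from the 15-second polling loop and showing the result to the user covers the "can it connect or not" part of the question without running a heavy query.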