I have a list of APIs from different clients saved in a database table, and each API has its own time interval at which it should be called. What should my approach be for calling these APIs? New rows may also be added to the API table over time. Should I go for dynamic timers?
I have an application (GUI) which clients use to add new records.
These records represent an API url and the time (Schedule) at which that API should be called.
Your challenge is to write code that is able to call all the client-specified APIs at the specified schedule/time.
To me, API calling and handling the responses (storing into the DB etc.) should be one component, and scheduling when to call which API should be another component (something like a cron job). This way, when the time is right, the appropriate API call is triggered. This also gives you the flexibility to do multiple tries/retries in a day, etc.
Update after your comment:
You have an application (GUI) which clients use to add new records.
These records represent an API url and the time (Schedule) at which that API should be called.
Your challenge is to write code that is able to call all the client-specified APIs at the specified schedule/time.
If I have got that problem right, my original suggestion stands.
Component 1 - Scheduler
Use Quartz.NET (or create your own using a Timer etc.) and create a service (say WCF) or process which reads records from the database and identifies all the schedules and the API URLs that need to be called. When the scheduled time arrives, Quartz.NET will trigger your handler method, where you make a call to Component 2 and pass on the API URL.
Component 2 - API Engine
When it receives a call from Component 1, it makes the API call and fetches the response. Store/process it as required.
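As a rough sketch only: with Quartz.NET 3.x, Component 1 could schedule one job per database row and hand the URL to Component 2 through the job data map. The row shape (Id, Url, CronExpression) is a placeholder for whatever your table actually holds:

using System.Net.Http;
using System.Threading.Tasks;
using Quartz;
using Quartz.Impl;

// At startup (and whenever a new record is added), schedule one job per row.
var row = new { Id = "client-1", Url = "https://example.com/api", CronExpression = "0 0/5 * * * ?" };

var scheduler = await StdSchedulerFactory.GetDefaultScheduler();
await scheduler.Start();

var job = JobBuilder.Create<CallApiJob>()
    .WithIdentity(row.Id)                  // unique id per table row
    .UsingJobData("apiUrl", row.Url)       // the client's API url
    .Build();

var trigger = TriggerBuilder.Create()
    .WithCronSchedule(row.CronExpression)  // the row's schedule
    .Build();

await scheduler.ScheduleJob(job, trigger);

// Component 2 in miniature: Quartz calls Execute at each scheduled time.
public class CallApiJob : IJob
{
    public async Task Execute(IJobExecutionContext context)
    {
        var url = context.MergedJobDataMap.GetString("apiUrl");
        using var http = new HttpClient();
        var response = await http.GetStringAsync(url);
        // ... store/process the response (DB etc.) as required
    }
}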
There are various schedulers that can be used to do this automatically. For example, you could use Quartz.NET and its AdoJobStore. I haven't used that myself, but it sounds appropriate:
With the use of the included AdoJobStore, all Jobs and Triggers configured as "non-volatile" are stored in a relational database via ADO.NET.
Alternatively, your database may well have timers built into it. However, if this is primarily an academic exercise (as suggested by "your challenge") you may not be able to use these.
I would keep a table of scheduled tasks, with columns specifying:
When the task should next be run
What the task should do
How to work out the next iteration of that task afterwards
If the task has been started, when it was started
If the task completed, when it completed
You can then write code in an infinite loop to just scan that table, e.g. once per minute. It should look for all tasks with a "next time" earlier than now that haven't completed:
If the task hasn't been started, update the row to show that it has been started (now), and start executing the task
If the task was started recently, ignore it
If the task was started "a long time ago" (i.e. longer than it would take to run successfully), either mark it as "broken" somehow, or restart
When a task completes successfully, update the row to indicate that it's finished, and add another row for the next time it should be started.
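As a sketch, one pass of that scan could look like this in C# (GetDueTasks, MarkStarted, MarkBrokenOrRestart and RunTask are hypothetical stand-ins for your own data-access code):

using System;
using System.Threading;
using System.Threading.Tasks;

while (true)
{
    // All tasks with a "next time" earlier than now that haven't completed
    foreach (var task in GetDueTasks(DateTime.UtcNow))
    {
        if (task.StartedAt is null)
        {
            MarkStarted(task.Id, DateTime.UtcNow); // update the row first...
            _ = Task.Run(() => RunTask(task));     // ...then start executing
        }
        else if (DateTime.UtcNow - task.StartedAt > TimeSpan.FromMinutes(30))
        {
            MarkBrokenOrRestart(task);             // started "a long time ago"
        }
        // else: started recently, so ignore it
    }
    Thread.Sleep(TimeSpan.FromMinutes(1));         // scan once per minute
}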
You'll need to work out exactly what your error strategy is:
How long should the gap be between a task starting and you deciding it's failed?
Do you always want to restart the task, or should some failures be permanent?
Do you need to record how often a task failed, and give up after a certain number of tries?
What do you do if you explicitly notice that the task has failed while you're executing it? (Rather than just by the fact that it was started a long time ago.)
For extra reliability, you'd need to think of other aspects too:
Do you need multiple task runners?
How can you spot when a task runner has failed, and restart that?
How do you deal with multiple task runners trying to start the same task at the same time?
You may not need to actually implement everything here, but it's worth considering each of these points.
I run multiple instance of my application, and I have configured Hangfire to run as part of my Startup.cs configurations.
I want to generate a monthly report, and I'd like to ensure it's getting enqueued only once. DisableConcurrentExecution doesn't help, as it only prevents executions from overlapping in time.
I read about Mutex as well:
When we create multiple background jobs based on this method, they will be executed one after another on a best-effort basis with the limitations described below. If there’s a background job protected by a mutex currently executing, other executions will be throttled (rescheduled by default a minute later), allowing a worker to process other jobs without waiting.
According to my understanding, Mutex will prevent concurrent execution, but it'll run my reports X times (where X is the number of my instances), one after another.
How can I ensure to enqueue a cron job only once?
How can I add the job without having to call it through an endpoint (e.g. POST <server>/api/enqueue_jobs)?
I don't have snippets to provide because I'm stuck with the configuration itself; I hope this question won't be closed, because I put effort into trying to solve it on my own.
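For reference, Hangfire's RecurringJob.AddOrUpdate is idempotent per recurring-job id, so registering the job from every instance's startup code still yields exactly one recurring job and needs no HTTP endpoint. A minimal sketch (ReportService and the job id are illustrative):

using Hangfire;

// Runs in Startup.cs/Program.cs on every instance; because all instances use
// the same "monthly-report" id, later calls overwrite the same registration
// instead of creating duplicates.
RecurringJob.AddOrUpdate<ReportService>(
    "monthly-report",
    svc => svc.GenerateMonthlyReport(),
    Cron.Monthly());

public class ReportService
{
    public void GenerateMonthlyReport() { /* build and send the report */ }
}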
In my multi-tenant application I have a background process that runs in a WebJob and can take several minutes. The time varies according to each customer's data.
But sometimes, when I'm testing something, I start the process and (looking at the logs) soon I realize something is wrong and I want to cancel that specific run.
I cannot just kill all the messages in the queue or stop the WebJob, because I'd be killing the processes that are running for the other customers.
And I want to do it programmatically so I can put a Cancel button in my web application.
I was not able to find the best architecture approach (or a pattern) to work with this kind of execution cancellation.
I read about passing a CancellationTokenSource, but I couldn't think of how I would call the Cancel() method on the specific run that I want to cancel. Should I store all currently running tokens in a static class? And then send another message to the webjob telling that I want to cancel it?
(I think that might be the answer, but I'm afraid I'm overthinking. That's why I'm asking it here.)
My Function is as simple as:
public static void EngineProcessQueue([QueueTrigger("job-for-process")] string message, TextWriter log)
{
    // Inside this method there is a huge codebase,
    // and I'm afraid that I'll have to put "if (token.IsCancellationRequested)" in lots of places...
    // (but that's another question)
    ProcessQueueMessage(message, log);
}
QueueTrigger is essentially a function trigger, and the kind of cancel you want is not supported out of the box.
Once execution has entered the function, the business logic may contain asynchronous operations; even if we deleted or stopped the QueueTrigger at that point, business data would already be affected and could not be rolled back.
The following is my personal suggestion, because I think the cancel operation can be improved from the business-logic side:
Use a Redis cache and create an object named mypools to store your business commands.
When the WebJob runs, we can get all the queues (they are also visible in Azure Storage Explorer) and save them in mypools with a special command.
The format of the command could be ClientName-TriggerName-Status-Extend, such as Acompany-jobforprocess-run-null; while this command has not finished executing, we can change it to Acompany-jobforprocess-cancel-null.
We can set the Azure WebJob queue name at runtime and then handle the business dynamically in the program. For business that has already executed, perform a data rollback.
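A rough sketch of that cancel-flag idea with StackExchange.Redis (the key format and the GetWorkItems/Process helpers are illustrative, not part of any SDK):

using System.IO;
using StackExchange.Redis;

public static class Functions
{
    static readonly ConnectionMultiplexer Redis = ConnectionMultiplexer.Connect("localhost");

    public static void ProcessQueueMessage(string message, TextWriter log)
    {
        var db = Redis.GetDatabase();
        var key = "mypools:Acompany-jobforprocess:" + message; // one status entry per run

        foreach (var item in GetWorkItems(message))            // hypothetical work loop
        {
            if (db.StringGet(key) == "cancel")                 // the Cancel button sets this value
            {
                log.WriteLine("Run cancelled, stopping.");
                return;
            }
            Process(item);                                     // hypothetical unit of work
        }
    }
}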
The Partition class in the Tabular AMO library has a method for refreshing the partition (RequestRefresh). I can use the AMO library to fire this off; however, this method appears to be asynchronous, and I cannot find a way of monitoring the request to know when the processing has completed (either refreshed or failed).
The Partition class does have a State property, but in practice this always appears to report as ready, even during processing or after a failure that caused no data to be written into the partition.
I need to be able to programmatically refresh my cube partitions, but I have tasks that I need to schedule after the build has completed. I could watch the refresh time, but that feels like the wrong way to do this, and failed attempts do not appear to change that value (therefore requiring some form of timeout or another method for detecting failed refreshes).
Please add the following line after RequestRefresh. SaveChanges is synchronous and the refresh operation isn't actually executed until SaveChanges is run:
partition.RequestRefresh(RefreshType.Full);
db.Model.SaveChanges();
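Putting that together, something like the following lets you run follow-up tasks only after the refresh has finished; the connection string and object names are placeholders, and the assumption that a failed refresh surfaces as an exception from SaveChanges is mine:

using System;
using Microsoft.AnalysisServices.Tabular;

var server = new Server();
server.Connect("DataSource=localhost");                            // placeholder
var db = server.Databases.FindByName("MyTabularDb");               // placeholder
var partition = db.Model.Tables["Sales"].Partitions["Sales 2024"]; // placeholders

partition.RequestRefresh(RefreshType.Full);
try
{
    db.Model.SaveChanges();  // blocks until the refresh completes
    RunPostBuildTasks();     // hypothetical follow-up work
}
catch (Exception ex)         // assumption: a failed refresh throws here
{
    HandleFailedRefresh(ex); // hypothetical error handling
}
server.Disconnect();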
I've been building a web service to synchronize data between SalesForce and Zendesk at my company. In the process of doing so, I've built several optimizations to drastically reduce execution time, such as caching some of the larger datasets that are retrieved from each service.
However, this comes at a price. When caching the data, it can take upwards of 3-5 minutes to download everything through SalesForce's and Zendesk's APIs.
To combat this, I was thinking of having a background worker that automatically cached all the required data every day at midnight. However, I'm not sure what the best method of doing this would be.
Would it suffice to build a class that merely has a worker thread that checks every several minutes to see if it is after midnight, and activate it on launch from Global.asax? Or is there some sort of scheduler already in existence?
EDIT
There seems to be some division between using something like FluentScheduler or Quartz.NET to house everything within my application, versus using something like the Windows Task Scheduler and writing a secondary application to call a function of my application. It seems that using a third-party library would be simpler, but is there any inherent benefit to using the Windows Task Scheduler?
I think you want to add your data caching logic to a project of type "console application". You'll be able to deploy this to your server and run it as a scheduled task using the Windows Task Scheduler. If you've not worked with this project type or scheduled tasks before, there are Stack Overflow questions which should help here, here, and here. You can add command line parameters if you need, and you should have a look at adding a mutex so that only one instance of your code will ever run at once.
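A sketch of that mutex: a named system-wide mutex acquired in Main is usually enough (the mutex name and the caching method are placeholders).

using System;
using System.Threading;

class Program
{
    static void Main()
    {
        // The name must be unique to your app; the "Global\" prefix makes it machine-wide.
        using var mutex = new Mutex(initiallyOwned: true, @"Global\MyCacheRefresher", out bool createdNew);
        if (!createdNew)
        {
            Console.WriteLine("Another instance is already running; exiting.");
            return;
        }
        RefreshCaches();
    }

    static void RefreshCaches() { /* download and cache the SalesForce/Zendesk data */ }
}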
Add an endpoint that knows how to do it, and use the Windows Task Scheduler to call that new caching endpoint.
I have a Work Tracker WPF application deployed on Windows Server 2008, and this Tracker application communicates with a (Tracker) Windows service via WCF.
Users can create, edit, delete, or cancel any work entry from the Work Tracker GUI application. Internally it sends a request to the Windows service. The Windows service receives the work request and processes it using multithreading. Each work request entry actually creates n work files (based on work priority) in an output folder location.
So each work request takes some time to complete the work-addition process.
Now my question: if I cancel the work entry currently being created, I want to stop the Windows service's current work at runtime. The thread that is creating output files for the work should be stopped, all the threads should be killed, and all the thread resources should be removed as soon as the user requests the cancel.
My workaround:
I use the Windows service OnCustomCommand method to send custom values to the Windows service at runtime. What happens is that the service finishes processing the current work (i.e. creating output files for the work item received) and only then reaches the custom command for cancelling the request.
Is there any way to stop the work item request as soon as we receive the custom command?
Any work around is much appreciated.
Summary
You are essentially talking about running a task host for long-running tasks, and being able to cancel those tasks. Your specific question seems to want to know the best way to implement this in .NET. Your architecture is good, although you are brave to roll your own rather than using existing frameworks, and you haven't mentioned whether you need to scale your architecture later.
My preference is for using the TPL Task object. It supports cancellation, and is easy to poll for progress, etc. You can only use this in .NET 4 onwards.
It is hard to provide code without basically designing a whole job hosting engine for you and knowing your .NET version. I have described the steps in detail below, with references to example code.
Your approach of using the Windows Service OnCustomCommand is fine, you could also use a messaging service (see below) if you have that option for client-service comms. This would be more appropriate for a scenario where you have many clients talking to a central job service, and the job service is not on the same machine as the client.
Running and cancelling tasks on threads
Before we look at your exact context, it would be good to review MSDN - Asynchronous Programming Patterns. There are three main .NET patterns to run and cancel jobs on threads, and I list them in order of preference for use:
TAP: Task-based Asynchronous Pattern
Based on Task, which has been available only since .NET 4
The preferred way to run and control any thread-based activity from .NET 4 onwards
Much simpler to implement than EAP
EAP: Event-based Asynchronous Pattern
Your only option if you don't have .NET 4 or later.
Hard to implement, but once you have understood it you can roll it out and it is very reliable to use
APM: Asynchronous Programming Model
No longer relevant unless you maintain legacy code or use old APIs.
Even with .NET 1.1 you can implement a version of EAP, so I will not cover this, as you say you are implementing your own solution.
The architecture
Imagine this like a REST-based service.
The client submits a job, and gets returned an identifier for the job
A job engine then picks up the job when it is ready, and starts running it
If the client doesn't want the job any more, then they delete the job, using its identifier
This way the client is completely isolated from the workings of the job engine, and the job engine can be improved over time.
The job engine
The approach is as follows:
For a submitted task, generate a universal identifier (UID) so that you can:
Identify a running task
Poll for results
Cancel the task if required
Return that UID to the client
Queue the job using that identifier
When you have resources, run the job by creating a Task
Store the Task in a dictionary against the UID as a key
When the client wants results, they send the request with the UID and you return progress by checking against the Task that you retrieve from the dictionary. If the task is complete they can then send a request for the completed data, or in your case just go and read the completed files.
When they want to cancel they send the request with the UID, and you cancel the Task by finding it in the dictionary and telling it to cancel.
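A minimal sketch of that bookkeeping with the TPL (the shape of the work delegate is up to you):

using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public class JobEngine
{
    private readonly ConcurrentDictionary<Guid, (Task Job, CancellationTokenSource Cts)> _jobs = new();

    public Guid Submit(Action<CancellationToken> runWorkItem)
    {
        var uid = Guid.NewGuid();
        var cts = new CancellationTokenSource();
        var job = Task.Run(() => runWorkItem(cts.Token), cts.Token);
        _jobs[uid] = (job, cts);
        return uid; // the client keeps this to poll or cancel
    }

    public bool IsComplete(Guid uid) =>
        _jobs.TryGetValue(uid, out var entry) && entry.Job.IsCompleted;

    public void Cancel(Guid uid)
    {
        if (_jobs.TryGetValue(uid, out var entry))
            entry.Cts.Cancel(); // the work must observe the token (see below)
    }
}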
Cancelling inside a job
Inside your code you will need to regularly check your cancellation token to see if you should stop running code (see How do I abort/cancel TPL Tasks? if you are using the TAP pattern, or Albahari if you are using EAP). At that point you will exit your job processing, and your code, if designed well, should dispose of IDisposables where required, remove big strings from memory, etc.
The basic premise of cancellation is that you check your cancellation token:
After a block of work that takes a long time (e.g. a call to an external API)
Inside a loop (for, foreach, do or while) that you control, you check on each iteration
Within a long block of sequential code that might take "some time", you insert points to check on a regular basis
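For example, inside a work item (filesToCreate and WriteWorkFile are illustrative):

void RunWorkItem(CancellationToken token)
{
    foreach (var file in filesToCreate)       // checked on each iteration
    {
        token.ThrowIfCancellationRequested(); // throws OperationCanceledException,
                                              // ending the Task in the Cancelled state
        WriteWorkFile(file);                  // one unit of long-running work
    }
}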
You need to define how quickly you need to react to a cancellation - for a windows service it should be within milliseconds, preferably, to make sure that windows doesn't have problems restarting or stopping the service.
Some people do this whole process with threads, and by terminating the thread - this is ugly and not recommended any more.
Reliability
You need to ask: what happens if your server restarts, the windows service crashes, or any other exception happens causing you to lose incomplete jobs? In this case you may want a queue architecture that is reliable in order to be able to restart jobs, or rebuild the queue of jobs you haven't started yet.
If you don't want to scale, this is simple: use a local database that the Windows service stores job information in.
On submission of a job, record its details in the database
When you start a job, record that against the job record in the database
When the client collects the job, mark it for delayed garbage collection in the database, and then delete it after a set amount of time (1 hour, 1 day ...)
If your service restarts and there are "in progress jobs" then requeue them and then start your job engine again.
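On restart, that recovery step might look like this (LoadInProgressJobs and RunJob are placeholders for your data layer and work, and engine is the JobEngine sketched earlier):

// Requeue anything that was started but never completed, then start the engine.
foreach (var job in LoadInProgressJobs()) // e.g. StartedAt IS NOT NULL AND CompletedAt IS NULL
    engine.Submit(token => RunJob(job, token));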
If you do want to scale, or your clients are on many computers, and you have a job engine "farm" of 1 or more servers, then look at using a message queue instead of directly communicating using OnCustomCommand.
Message queues have multiple benefits. They allow you to reliably submit jobs to a central queue that many workers can pick up and process, and they decouple your clients and servers so you can scale out your job-running services. They ensure jobs are reliably submitted and processed in a highly decoupled fashion, and this can work locally or globally, but always reliably; you can even combine it with running your Windows service on cloud workers which you can dynamically scale.
Examples of technologies are MSMQ (if you want to maintain your own, or must stay inside your own firewall) or Windows Azure Service Bus (WASB), which is cheap and already done for you. In either case you will want to use Patterns and Best Practices for Enterprise Integration. In the case of WASB there are many (MSDN), many (MSDN samples for BrokeredMessaging etc.), many (new Task-based API) developer resources, as well as NuGet packages for you to use.