I have been asked to use Amazon SQS in our new system. Our business depends on tasks/requests sent by clients to our support agents. Once a client submits a task/request, it should be queued in my SQL Server database, and queued tasks should be assigned to agents who are not busy, because an agent can handle only one task/request at a time. So if 10 tasks/requests come into my system, all of them should be queued, and the system should forward each task to an agent who is currently free. Once an agent solves a task, he should get the next one, if any; otherwise the system should wait until some agent finishes his current task before assigning a new one. And of course there must not be any duplication in task/request handling, and so on.
What do I need, now?
A simple reference that explains what Amazon SQS is, as this is my first time using a queuing service.
How can I use it with C# and SQL Server? I have read this topic, but I still feel that something is missing, as I am not able to get started. I am just looking for a way to process a task at run time, assign it to an agent, close it, and then get a new one, as I explained above.
Asking us to design a system based on a paragraph of prose is a pretty tall order.
SQS is simply a cloud queue system. Based on your description, I'm not sure it would make your system any better.
First off, you are already storing everything in your database, so why do you need to store things in the queue as well? If you want queue semantics while storing data in your database, you could consider SQL Server Service Broker (https://technet.microsoft.com/en-us/library/ms345108(v=sql.90).aspx#sqlsvcbr_topic2), which supports queues within SQL Server. Alternatively, unless your scale is pretty high (maybe 100+ tasks/second), you could just query the table for tasks that need to be picked up.
Secondly, it sounds like you might have a workflow around tasks that could extend to more than just a single queue for agents to pick them up. For example, do you have any follow-up on the tasks (emailing clients to ask them how their service was, putting a task on hold until a client gets back to you, etc.)? If so, you might want to look at Simple Workflow Service (https://aws.amazon.com/swf/) or, since you are already on Microsoft's stack, Windows Workflow (https://msdn.microsoft.com/en-us/library/ee342461.aspx).
BTW, SQS does not guarantee "exactly once" delivery by default, so if duplication is a big problem for you then you will either have to do your own deduplication or use FIFO queues (http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/FIFO-queues.html), which support deduplication but are limited to 300 transactions/second (roughly 100 messages/second, accounting for the standard send -> receive -> delete APIs). With batching that number could be much higher, but considering your use case it doesn't sound like you would be able to use batching without a lot of work.
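"Doing your own deduplication" usually means making the consumer idempotent. A minimal sketch, assuming every message carries a stable identifier (`message_id` here is a hypothetical parameter, and the class name is made up for illustration):

```python
class IdempotentConsumer:
    """Runs a handler at most once per message id."""

    def __init__(self, handler):
        self._handler = handler
        # Assumption: an in-memory set is enough for the sketch. A real system
        # would keep the seen ids in durable storage, such as a database table.
        self._seen = set()

    def handle(self, message_id, body):
        """Process the message; return False if it was a duplicate delivery."""
        if message_id in self._seen:
            return False
        self._handler(body)
        # Record the id only after the handler succeeded, so a failed
        # attempt can be retried when the message is delivered again.
        self._seen.add(message_id)
        return True
```

With this in place, a second delivery of the same message is acknowledged and ignored instead of being handed to an agent again.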
Related
I have a subscription model and want to perform renewal-related logic such as issuing a new invoice, sending emails, etc. For example, a user would purchase the subscription today, and the renewal is in a year's time. I've been using an Azure Queue recently, and think it would apply to such a renewal.
Is it possible to use the Azure Queue by pushing messages using BrokeredMessage.ScheduledEnqueueTimeUtc (http://msdn.microsoft.com/en-us/library/microsoft.servicebus.messaging.brokeredmessage.scheduledenqueuetimeutc.aspx) for such long term scheduled messages?
I've used it for shorter-term scheduling, like sending notifications in one minute's time, and it works great.
This way, I can even have multiple processes listening to the queue and be sure that only one process would perform the renewal logic. This would solve a lot of locking-related problems, as that is more or less built into the Azure Queue via leasing and related features.
Yes, you can use it for long-term scheduling; scheduled messages have the same guarantees as normal ones. But there are a few things you need to be aware of:
ScheduledEnqueueTimeUtc is the time when the message becomes available on the queue (within hundreds of milliseconds), but it is not necessarily delivered at that time; delivery depends on the load and state of the queue. So it's fine for business processes but not for time-sensitive (millisecond-level) usage. This is not a problem in your case, unless your subscription cancellation is really time sensitive.
It affects your storage quota (not really a problem with current quotas, but over a span of years it might become one).
As far as I'm aware you can't access scheduled messages before ScheduledEnqueueTimeUtc, they are invisible.
An extremely awesome source of information on Azure messaging
From a technological perspective it's fine, but in your case I would also think about other potential problems that can arise over a span of years:
Message versioning
What happens when you would like to move from Azure to something else (AWS?)
What if you decide next year to replace Azure Service Bus with NServiceBus
I am basically creating a site for recruiters. One of the features of my application requires posting to Facebook periodically. The posting frequency can be from 0 (Never) to 4 (High).
For example, if a recruiter has 4 open jobs and has the posting frequency set to 4, each job should be posted in its turn: 1st job on the 1st day, 2nd job on the 2nd, 3rd job on the 3rd, etc., and on the 5th day the 1st job again (round-robin fashion).
Had he set the posting frequency to 2, two jobs would be posted daily (thus each job would be posted every 2 days).
My only question is what type of threading I should create for this, since this is all dynamic. Also, are there any guidelines on what type of information I should store in the database?
I need just a general strategy to solve this problem. No code..
I think you need to separate it from your website. I mean it's better to run the logic for posting jobs in a service hosted on IIS (I am not sure whether such a thing exists, but I guess it does).
You also need a table for the job queue to remember which jobs need to be posted; your service would then pick them up and post them one by one.
To decide whether it is time to post a job, you can define a timer with a configurable interval that checks whether there is any job to post.
Make sure that you keep verbose log details if posting fails. This is important because Facebook may change its API, or your API key may become invalid, or something else may go wrong, and then you need to know what happened.
I also strongly suggest having a web page that reports the status of the jobs-to-post queue and, if jobs failed, what caused the problem.
If your program runs non-stop, you can just use one of the Timer classes available in the .NET Framework, without the need to go for full-blown concurrency (e.g. via the Task Parallel Library).
I suspect, though, that you'll need more than that: some kind of mechanism to detect which jobs were successfully posted and which were "missed" because the program was not running (or because of network problems etc.), so they can be posted the next time the program is started (or the network becomes available). A small local database (such as SQLite or MS SQL Server Compact) should serve this purpose nicely.
If the requirements are as simple as you described, then I wouldn't use threading at all. It wouldn't even need to be a long-running app. I'd create a simple app that would just try to post a job and then exit immediately. However, I would schedule it to run once every given period (via Windows Task Scheduler).
This app would check first if it hasn't posted any job yet for the given posting frequency. Maybe put a "Last-Successful-Post-Time" setting in your datastore. If it's allowed to post, the app would just query the highest priority job and then post it to Facebook. Once it successfully posts to Facebook, that job would then be downgraded to the lowest priority.
The job priority could just be a simple integer column in your data store. Lower values mean higher priorities.
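The priority rotation described above can be sketched in a few lines. This is only an illustration of the strategy, assuming the priorities are held in a dictionary (in practice they would be the integer column in the data store) and that `post` is a hypothetical callable that returns True when the Facebook post succeeded.

```python
def post_next_job(priorities, post):
    """Post the highest-priority job, then move it to the back of the line.

    priorities: dict mapping job id -> integer (lower value = higher priority).
    post: hypothetical callable taking a job id, returning True on success.
    Returns the id of the job that was posted, or None if nothing was posted.
    """
    if not priorities:
        return None
    # Lowest number wins; the job id breaks ties so the order is deterministic.
    job_id = min(priorities, key=lambda j: (priorities[j], j))
    if not post(job_id):
        return None  # priority is left unchanged, so the job is retried next run
    # Downgrade to the lowest priority: one more than the current maximum.
    priorities[job_id] = max(priorities.values()) + 1
    return job_id
```

Calling this once per scheduled run yields the round-robin order from the question: with four jobs, the fifth run posts the first job again.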
Edit:
I guess what I'm suggesting is if you have clear boundaries in your requirements, I would suggest breaking your project into multiple applications. This way there is a separation of concerns. You wouldn't then need to worry how to spawn your Facebook notification process inside your web site code.
I have a project in which I need to build a service. We will add about 500 RSS feeds from different sites to it, and we want this service to collect new entries from these sources and save the title and URL in my SQL Server database.
How can I determine the best architecture design, and what code would help me with that?
These indications are not specific to your stack (C#, ASP.NET), but I would definitely not recommend doing any of this in the request-response cycle of your web app. It must be done asynchronously, but results can be served from the database that you populate with the feed entries.
It's likely that you'll have to build an architecture where you poll each feed every X minutes. Whether you use a cron job or a daemon that runs continuously, you'll have to poll each feed one after the other (or with some kind of concurrency, but the design is the same). Please make use of HTTP headers like ETag and If-Modified-Since to avoid fetching data that hasn't been updated.
Then you will need to parse the feeds themselves. It's very likely that you'll have to support different flavors of RSS and Atom, but most parsers actually support both.
Finally, you'll have to store the entries and, more importantly, make sure before you insert them that you haven't already added them. You should use the id or guid of the entries, but it's likely that you'll have to use your own system too (links, hashes...) because many feeds do not have these.
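The conditional-request and deduplication steps above can be sketched with the standard library alone. This is a simplified illustration: the function names are made up for the example, the parser handles only plain RSS 2.0 `item` elements (not Atom), and the actual HTTP fetch and database insert are left out.

```python
import hashlib
import xml.etree.ElementTree as ET


def conditional_headers(etag=None, last_modified=None):
    """Build request headers that let the server answer 304 Not Modified."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag  # value of the ETag from the last response
    if last_modified:
        headers["If-Modified-Since"] = last_modified  # last Last-Modified value
    return headers


def entry_key(guid, link, title):
    """Use the feed's own guid when present, else a hash of link and title."""
    if guid:
        return guid
    return hashlib.sha256(f"{link}\n{title}".encode("utf-8")).hexdigest()


def new_entries(rss_xml, seen_keys):
    """Return (title, link) pairs from an RSS 2.0 document not seen before.

    seen_keys is a set that is updated in place. In a real service it would
    be backed by a unique column in the database.
    """
    fresh = []
    for item in ET.fromstring(rss_xml).iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        guid = (item.findtext("guid") or "").strip()
        key = entry_key(guid, link, title)
        if key not in seen_keys:
            seen_keys.add(key)
            fresh.append((title, link))
    return fresh
```

The poller would send `conditional_headers(...)` with each request, skip parsing when the response is 304, and otherwise pass the body to `new_entries` and store what comes back.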
If you want to reduce the amount of polling that you'll have to do, while still keeping timely results, you'll have to implement PubSubHubbub for the feeds that support it.
If you don't want to deal with any of the numerous issues described earlier (polling in a timely manner, parsing content, diffing to keep entries unique...), I would recommend using Superfeedr, as it deals with all the pain points.
I am not going to go into details about implementation or detailed architecture here (mostly from lack of time at this particular moment), but I will say this:
It's not the web service that should consume the RSS feeds; it should merely be responsible for spawning the work to do so asynchronously.
You should not use threads from the ThreadPool to do this, for two reasons. One is that the work can be assumed to be more or less time-consuming (ThreadPool is recommended primarily for short-running tasks). The other, perhaps more important, is that ThreadPool threads are used to serve incoming web requests, and you don't want to compete with that.
I'm doing a project with some timing constraints right now. Setup is: A web service accepts (tiny) xml files and I have to process these, fast.
First and most naive idea was to handle this processing in the request dispatcher itself, but that didn't scale and was doomed from the start.
So now I'm looking at a varying load of incoming requests that each produce ~50 jobs on my side. The technologies available for use are limited by the customer's rules. If it's not SQL Server or MSMQ, it probably won't fly.
I thought about going down the MSMQ route (the web service just submits messages, with multiple consumer processes later on), and small proof-of-concept modules worked like a charm.
There's one problem though: the priority of these jobs might change a lot while they are in the queue. The system is fairly time critical, so if we, for whatever reason, cannot process incoming jobs in a timely fashion, we need to prefer the latest ones.
Basically the use case changes from reliable messaging in general to LIFO under (too) heavy load. Old entries still have to be processed, but they have lost all of their priority.
Is there any manageable way to build something like this in MS MQ?
Expanding the business side, as requested:
The processing of the incoming job is bound to some tracks, where physical goods are moved around. If I cannot process the messages in time, the things are "gone".
I still want the results for statistical purpose, but really need to focus on the newer messages now.
Think of me being able to influence mechanical things and reroute things moving on a track - if they didn't move past point X yet..
So, if I understand this, you want to be able to switch between sorting the queue by priority OR by arrival time, depending on the situation. MSMQ can only sort the queue by priority AND by arrival time.
Although I understand what you are trying to do, I don't quite see the business justification for it. Can you expand on this?
I would propose using a service to move messages from the incoming queue to a number of work queues for processing. Under normal load, there would be several queues, each with a monitoring thread.
Under heavy load, new traffic would all go to just one "panic" queue until the load dropped. The threads on the other work queues could be paused if necessary.
Cheers, John Breakwell
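The routing idea in this answer can be sketched in memory as follows. It is not MSMQ code: the queues are plain deques, the class name and the `panic_threshold` parameter are assumptions made up for the example, and the "newest first" rule for the panic queue is taken from the question's requirement to prefer the latest messages rather than from the answer itself.

```python
from collections import deque
from itertools import cycle


class LoadAwareRouter:
    """Spreads messages over work queues, diverting to a panic queue under load."""

    def __init__(self, work_queue_count, panic_threshold):
        self.work_queues = [deque() for _ in range(work_queue_count)]
        self.panic_queue = deque()
        self.panic_threshold = panic_threshold  # backlog size that counts as heavy load
        self._next = cycle(range(work_queue_count))

    def backlog(self):
        """Number of messages still waiting in the normal work queues."""
        return sum(len(q) for q in self.work_queues)

    def route(self, message):
        """Place a message on a queue and return that queue's name."""
        if self.backlog() >= self.panic_threshold:
            self.panic_queue.append(message)
            return "panic"
        index = next(self._next)  # simple round-robin over the work queues
        self.work_queues[index].append(message)
        return f"work-{index}"

    def next_message(self):
        """Serve the panic queue first, newest message first; else oldest work item."""
        if self.panic_queue:
            return self.panic_queue.pop()
        for q in self.work_queues:
            if q:
                return q.popleft()
        return None
```

While the panic queue holds messages, the older work items wait, which corresponds to pausing the threads on the other work queues. Once the panic queue is drained, the old entries are still processed, in arrival order.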
I have a requirement to read data from a table (SQL Server 2005) and send that data to another application every 5 seconds. I am looking for the best approach to do this.
Right now I am planning to write a console application (.NET and C#) that reads the data from SQL Server 2005 (a QUEUE table that is filled by different applications) and sends it to the other application (a central server) over TCP/IP. The console application would run as a scheduled task every 5 seconds. I am assuming the task scheduler will discard a new run if the task is already running (to avoid concurrent executions).
Has anybody come across a similar situation? Please share your experience and advise me on the best approach.
Thanks in advance for your valuable time spending for my request.
-Por-hills-
We have done similar work. If you are going to query a SQL database every 5 seconds, be sure to use a stored procedure that is optimized to be very fast. It should not update data unless absolutely necessary. This approach is typically called 'polling', and I've found that it is acceptable if your SQL Server is not otherwise bogged down with too many other calls.
In approaches we've used, a Windows Service that does the polling works well.
How to communicate results to another app depends on what the other app is doing, what type of interface you can make into it, and how quickly you need the results. The WCF class libraries from Microsoft provide many workable approaches for real-time communication. My preference is to write to the application's database and then have the application read the data (if that works for that app). If you need something real time, WCF is the way to go, and I'd suggest using a stateless protocol like HTTP (standard HTTP posts) if a response time under 5 seconds is required, or TCP/IP if subsecond response time is required.
Since I assume your central storage is also SQL Server 2005, have you considered using what SQL Server 2005 offers out of the box to achieve your requirements? Rather than poll every 5 seconds, marshal and unmarshal TCP/IP, implement authentication and authorization for the TCP/IP pipe, scale TCP transmission with boxcarring, manage message acknowledgments and retries, deal with central site availability, fragment large messages, implement fairness in transmission and so on and so forth, why not simply use Service Broker? It does all you need and more, out of the box, already tested, already tuned for performance and scalability.
Getting reliable messaging right is not trivial, and you should focus your efforts on meeting your business specifics, not reinventing the wheel.
I would recommend writing a Windows Service (since you are using C#) that has a timer which runs every 5 seconds. That way you won't be starting and stopping an application all the time, it can run even when no one is logged into the machine, and it will automatically start when the machine is restarted.
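The timer loop inside such a service can be sketched like this. It is a language-neutral illustration written in Python rather than a Windows Service: `run_every` and `step` are hypothetical names, and `step` stands for "read the queue table and send the rows". Because one loop runs the steps in sequence, two runs can never overlap, which is the concern raised in the question about concurrent executions.

```python
import threading
import time


def run_every(interval_seconds, step, stop_event):
    """Call step() repeatedly until stop_event is set.

    Consecutive runs start at least interval_seconds apart. If a run takes
    longer than the interval, the next one starts right after it finishes.
    """
    while not stop_event.is_set():
        started = time.monotonic()
        step()
        elapsed = time.monotonic() - started
        # wait() returns early once stop_event is set, so shutdown is prompt.
        stop_event.wait(max(0.0, interval_seconds - elapsed))
```

A service would start this loop on a background thread when it starts, for example `threading.Thread(target=run_every, args=(5, step, stop_event))`, and set `stop_event` when it is asked to stop.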
For one of my projects, I needed to do something periodically. I opted for a service and set up a timer that takes care of reading the data. You might consider that solution. It has worked well for me.
I suggest creating a Windows service rather than an application and handling the timing yourself: create a timer and execute one step on each timer event. For the communication you have many choices; I would consider using standard technologies like a web service or Windows Communication Foundation.
Besides this custom solution, I would evaluate whether the task can be solved using Microsoft Integration Services.
Finally, another question comes to mind: why do you need this application? Why don't the applications consuming the data query the database themselves? Is the expensive polling required? Is it possible for the data producers to signal the availability of new data directly to the data consumers?
I am not sure about the details of your project, specifically related to security but maybe it would be better to create an SSIS package and schedule it as a job?