Email Notification Architecture Questions - C#

We are looking to develop an email notification service where emails can be scheduled hourly, daily, or weekly, triggered either by actions within our system (e.g. user registration) or on a fixed schedule (e.g. a summary email every Friday).
What is the best way to ensure duplicate emails are not sent? We considered having the application write a record to a queue whenever a system action happens, but that seems to add another point of failure. Or we could make all notifications data-driven, for example, select all users whose created-on date is later than some cutoff. But with this scenario we need a way to make sure that if the service runs again, duplicate emails are not sent.
Any ideas would be great!

My 2 cents
1) Queues. Queues are great for tasks where you want a 'single entry and single exit' type of architecture. Queues decouple systems and let you load-balance the work; they are usually used with multiple workers on either end. Here, though, you would just add (possibly a great many) messages to the queue and later run a bulk dequeue; IMO that is an irrational use of memory and resources.
2) Data-driven via Users. Much easier to implement; however, for each notification you will check every user, which puts a heavy load on the DB.
3) Data-driven via UserNotifications. Alternatively, you can create a separate UserNotifications table, to which each user is added once he has registered. It is much easier to select the users due within a given time frame, and you don't hold them all in memory. Once the notification is sent, you remove the user from the UserNotifications table.
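To make option 3 concrete, here is a minimal sketch, assuming a hypothetical UserNotifications(UserId, Email, RegisteredOn) table and a placeholder SMTP helper. Deleting the row only after a successful send is what answers the duplicate question: a re-run simply finds no row left to process.

```csharp
// A sketch of option 3; table/column names and SendWelcomeEmail are assumptions.
using System;
using System.Collections.Generic;
using System.Data.SqlClient;

class WelcomeEmailJob
{
    public static void Run(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();

            // Pick up users registered more than an hour ago, for example.
            var select = new SqlCommand(
                "SELECT UserId, Email FROM UserNotifications WHERE RegisteredOn < @cutoff",
                conn);
            select.Parameters.AddWithValue("@cutoff", DateTime.UtcNow.AddHours(-1));

            var due = new List<Tuple<int, string>>();
            using (var reader = select.ExecuteReader())
            {
                while (reader.Read())
                    due.Add(Tuple.Create(reader.GetInt32(0), reader.GetString(1)));
            }

            foreach (var user in due)
            {
                SendWelcomeEmail(user.Item2);   // your SMTP code

                // Remove the row only after the send succeeds, so a crash
                // retries the user instead of silently dropping them.
                var delete = new SqlCommand(
                    "DELETE FROM UserNotifications WHERE UserId = @id", conn);
                delete.Parameters.AddWithValue("@id", user.Item1);
                delete.ExecuteNonQuery();
            }
        }
    }

    static void SendWelcomeEmail(string email) { /* ... */ }
}
```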


Web application + SQL server triggers / thresholds

We have a web application that is hosted in IIS. In our database that serves the application we have all kinds of different data values. We are trying to figure out a way to have an email sent to a client if a certain data value exists or exceeds a threshold value.
Generic Example:
Say we have a table that lists widgets and their 'in inventory' quantity. Every time someone sells a widget, this quantity value would be depleted. We want to send an email to the manager when the widget quantity gets below 5 and tell him to reorder more widgets.
We don't want to have SQL triggers that check the quantity any time a 'depletion' transaction takes place. Instead, we want some type of background monitoring process that checks the level of the widgets on a timed basis. How can we accomplish this? A Windows Service / WinForms application? Something built into IIS that will run ASP.NET C# code?
Polling-based monitoring should be your last resort. It uses too many resources for a simple task, and most of the time it will just find that there is nothing to do. And it doesn't scale as your data grows.
Instead, you should focus on the code that changes those values and act there, on the spot. The check also becomes lighter: only the one item being changed is checked, not all of them, and only once, not every x seconds/minutes/hours.
Apart from the architectural considerations, to answer your question: just as Jonathan said, anything that can read a database and send emails will do, but I'd consider a Windows Service for this job because that's what they were made for: background jobs running all the time, independent of logged-in users. You also get some extra benefits like automatic startup and recovery options.
Anything that can read the database and send an email could accomplish this: a console app, WinForms app, web app; it doesn't really matter.
It may be more efficient to monitor when the values are changed (what changes them? A web application?) and have that application also send the notifications.
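As a rough illustration of checking at the point of change, here's a minimal sketch; the Widgets table, threshold, and SMTP details are assumptions, not the poster's actual schema:

```csharp
// Decrement stock and check the reorder threshold in the same code path
// that changes the value, so only the touched widget is ever checked.
using System.Data.SqlClient;
using System.Net.Mail;

class WidgetService
{
    const int ReorderThreshold = 5;

    public void SellWidget(SqlConnection conn, int widgetId, int quantitySold)
    {
        // Decrement and read back the new quantity in one statement.
        var cmd = new SqlCommand(
            @"UPDATE Widgets
              SET Quantity = Quantity - @sold
              OUTPUT INSERTED.Quantity
              WHERE WidgetId = @id", conn);
        cmd.Parameters.AddWithValue("@sold", quantitySold);
        cmd.Parameters.AddWithValue("@id", widgetId);

        int newQuantity = (int)cmd.ExecuteScalar();

        // Checked once, at the moment of change; no polling needed.
        if (newQuantity < ReorderThreshold)
            SendReorderEmail(widgetId, newQuantity);
    }

    void SendReorderEmail(int widgetId, int quantity)
    {
        using (var smtp = new SmtpClient("smtp.example.com"))
        {
            smtp.Send("noreply@example.com", "manager@example.com",
                      $"Reorder widget {widgetId}",
                      $"Only {quantity} left in inventory.");
        }
    }
}
```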

Thread.Sleep for up to seven days in an Azure WebJob

I'm currently working on an Azure WebJob application that has a queue, newUsersQueue. This queue is populated by an associated website, which adds new users to the queue when they create an account, and the objective of the WebJob is to send an email to users who have had no activity during their first week of membership. Each object in the queue has data to identify the user, and a DateTime RegisteredDate.
I am assuming the queue will be ordered by the date users join, so that the oldest users end up first in the queue, because of FIFO (first in, first out).
My current approach is then to grab the first item in the queue from the WebJob and simply Thread.Sleep(RegisteredDate.AddDays(7).Subtract(DateTime.Now)).
Since resources are limited, I'm now worried this might be expensive. Maybe there are other issues too that I haven't thought about (this thread might sleep for up to 7 days; a waste of a thread?).
Is there a better (more cost-effective) approach to achieve this? I've been considering using a timer or some such. What would the benefits of a timer or other approach be over sleeping?
Is there any risk of losing a queue message by sleeping for a week after popping it off the queue? (Is it loaded into memory and removed from the queue?) Persistence is of course important too.
From what I've learned so far, my decision essentially boils down to how Azure WebJobs handles queues. Will it start up new threads at will to handle queued messages, or will it stick to one thread and have it take its time with the existing queue?
In other words, will the above method start a new thread for each new user, or will it stick to one thread and handle queued objects one at a time?
Create a scheduled WebJob.
Choose recurring, run it as often as you want, and make sure the logic you use to check for users matching your criteria is efficient; that's probably the most expensive part, assuming you have a lot of users. If not, this isn't really expensive at all.
If you're worried about expensive queries, one way to handle it is to add new users to a separate table, check that table every day, and remove users who have logged in. When a user reaches 7 days, send the email and then remove them (or do whatever else you want with them). Then, assuming you only had 300 new members a week, you'd only have to query 300 users, not 10k.
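If it helps, here is a minimal sketch of that daily check; a scheduled WebJob is just a console app the scheduler launches. The NewUsers table, its columns, and the SMTP helper are hypothetical:

```csharp
// Daily check against a hypothetical NewUsers(UserId, Email, RegisteredOn,
// LastActivityOn) table that the site inserts into at registration and
// updates on login.
using System;
using System.Data.SqlClient;

class InactiveUserJob
{
    static void Main()
    {
        using (var conn = new SqlConnection("<connection string>"))
        {
            conn.Open();

            // Drop anyone who has logged in; they don't need the reminder.
            new SqlCommand(
                "DELETE FROM NewUsers WHERE LastActivityOn IS NOT NULL", conn)
                .ExecuteNonQuery();

            // Email users who registered 7+ days ago and never came back.
            var cmd = new SqlCommand(
                "SELECT UserId, Email FROM NewUsers WHERE RegisteredOn <= @cutoff",
                conn);
            cmd.Parameters.AddWithValue("@cutoff", DateTime.UtcNow.AddDays(-7));

            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    SendReminderEmail(reader.GetString(1));
            }

            // They've been handled, so remove them from the working table.
            var cleanup = new SqlCommand(
                "DELETE FROM NewUsers WHERE RegisteredOn <= @cutoff", conn);
            cleanup.Parameters.AddWithValue("@cutoff", DateTime.UtcNow.AddDays(-7));
            cleanup.ExecuteNonQuery();
        }
    }

    static void SendReminderEmail(string email) { /* your SMTP code */ }
}
```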

Scalability and availability

I am quite confused about which approach to take and what best practice is.
Let's say I have a C# application which does the following:
it sends emails from a queue; the emails to send and all their content are stored in the DB.
Now, I know how to make my C# application almost scalable, but I need to go somewhat further.
I want some way of distributing the tasks across, say, X servers, so it is not just one server doing all the processing but the work is shared amongst the servers.
If one server goes down, then the load is shared between the other servers. I know NLB does this, but I'm not looking for NLB here.
Sure, you could add a column of some kind in the DB table to indicate which server should be assigned to process that record, and each of the applications on the servers would have an ID of some kind that matches the value in the DB and they would only pull their own records - but this I consider to be cheap, bad practice and unrealistic.
Having a DB table row lock as well, is not something I would do due to potential deadlocks and other possible issues.
I am also NOT suggesting taking threading "to the extreme" here, but yes, there will be threading per item to process, or batching items up per thread for x number of threads.
How should I approach this, and what do you recommend for making a C# application that is scalable and highly available? The aim is to have X servers, each running the same application, each able to fetch records and process them, with the processing load shared amongst the servers so that if one server or service fails, the others can take on that load until another server is put back.
Sorry for my lack of understanding or knowledge, but I have been thinking about this quite a lot and have lost sleep trying to come up with a good, robust solution.
I would be thinking of batching up the work, so each app only pulls back x records at a time, marking those retrieved records as taken with a bool field in the table. I'd amend the SELECT statement to pull only records not marked as taken/done. Table locks would be OK in this instance for very short periods, to ensure there is no overlap between apps processing the same records.
EDIT: It's not very elegant, but you could have a datestamp and a status for each entry (instead of a bool field as above). Then you could run a periodic Agent job which runs a sproc to reset the status of any records which have a status of In Progress but have gone beyond a time threshold without being set to Complete. They would then be ready for reprocessing by another app later on.
This may not be enterprise-y enough for your tastes, but I'd bet my hide that there are plenty of apps out there in the enterprise which are just as unsophisticated and work just fine. The best things work with the least complexity.
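For what it's worth, here is a minimal sketch of the claiming step, assuming a hypothetical EmailQueue(EmailId, Status, ClaimedOn) table. The single UPDATE ... OUTPUT statement marks a batch as In Progress and returns it in one atomic step, so two servers can never claim the same rows, and READPAST lets a second server skip rows the first is currently locking; the reset job from the EDIT would flip stale In Progress rows back to Pending.

```csharp
// Atomically claim a batch of pending emails for this server.
using System.Collections.Generic;
using System.Data.SqlClient;

class EmailWorker
{
    public static List<int> ClaimBatch(SqlConnection conn, int batchSize)
    {
        var cmd = new SqlCommand(
            @"UPDATE TOP (@batch) q
              SET Status = 'InProgress', ClaimedOn = SYSUTCDATETIME()
              OUTPUT INSERTED.EmailId
              FROM EmailQueue q WITH (ROWLOCK, UPDLOCK, READPAST)
              WHERE Status = 'Pending'", conn);
        cmd.Parameters.AddWithValue("@batch", batchSize);

        var claimed = new List<int>();
        using (var reader = cmd.ExecuteReader())
        {
            while (reader.Read())
                claimed.Add(reader.GetInt32(0));
        }
        return claimed;   // process these, then set Status = 'Done'
    }
}
```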

NServiceBus - Content-based routing & auditing - is my approach OK?

I have a little trouble deciding which way to go while designing the message flow in our system.
Because of the volatile nature of our business processes (e.g. calculating freight costs), we use a workflow framework to be able to change a process on the fly.
The general process looks something like this:
The interface is a service which connects to the customer's system via whatever interface the customer provides (web services, TCP endpoints, database polling, files, you name it). Then a command is sent to the executor containing the received data and the id of the workflow to be executed.
The first problem comes when we want to distribute load across multiple worker services.
Say we have different processes like printing parcel labels, calculating prices, and sending notification mails. Printing the labels should never be delayed just because a ton of mailing workflows is being executed. So we want to be able to route commands to different workers based on the work they do.
Because all commands are just "execute workflow XY", we would be required to implement our own content-based routing. NServiceBus does not support this out of the box, mostly because it's considered an anti-pattern.
Is there a better way to do this when you are not able to use different message types to route your messages?
The second problem comes when we want to add monitoring. Because an endpoint can only subscribe to one queue for each message type, we cannot let all executors simply publish an "I completed a workflow" message. The current solution would be to Bus.Send the message to a preconfigured auditing endpoint. This feels a little like cheating to me ;)
Is there a better way to consolidate the published messages of multiple workers into one queue again? If it were not for problem #1, I think all workers could use the same input queue, but that is not possible in this scenario.
You can try to make your routing not content-based but headers-based, which should be much easier. You are not interested in whether the workflow prints labels or not; you are interested in whether this command is a priority one or not. So you can probably add this information to the message header...
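For example, a sketch only: the header name, queue names, and ExecuteWorkflow type are made up, and the SendOptions API shown is the NServiceBus v6 style (older versions expose headers via Bus.OutgoingHeaders instead):

```csharp
// Tag the outgoing command with a priority header and send it to a queue
// dedicated to that priority class, instead of inspecting message content.
using System.Threading.Tasks;
using NServiceBus;

class WorkflowDispatcher
{
    public async Task Dispatch(IMessageSession session, string workflowId, bool highPriority)
    {
        var options = new SendOptions();

        // e.g. a dedicated endpoint for label printing, a bulk one for mail.
        options.SetHeader("Workflow.Priority", highPriority ? "High" : "Bulk");
        options.SetDestination(highPriority ? "PriorityWorkers" : "BulkWorkers");

        await session.Send(new ExecuteWorkflow { WorkflowId = workflowId }, options);
    }
}

class ExecuteWorkflow : ICommand
{
    public string WorkflowId { get; set; }
}
```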

Any architecture tips for sending out daily, weekly email updates that require calculation

I have a web app that will send out daily or weekly email updates depending on the users' permissions and their alert settings (daily, weekly, monthly, or none).
Each email to an account (which can have multiple users) requires a few DB calls and calculations. This makes these daily/weekly emails pretty expensive as the number of users increases.
Are there any general tips on writing these services? I'm looking for some architecture tips or patterns and not really topics like email deliverability.
I would cache the data ahead of the processing time if you are handling very large sets of information, so that the DB 'calculations' can be omitted from the processing cycle itself. Effectively, break the processing up so that the DB-intensive work is done a bit before the scheduled sending. When it comes time to actually send the emails out, I would imagine you can then process a very large volume quickly without a whole lot of tuning up front. Granted, I also don't know what kind of volume we're talking about here.
You might also thread the application so that the data to process is split into logical chunks, reducing the overall amount that has to be processed all at once; depending on your situation, this might streamline things. Granted, I normally don't recommend getting into threading unless there is a good reason to, and you may have one. At the very least, use a background-worker type of threaded process and fire off a few, depending on how you segment your data.
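To illustrate the chunking idea, a minimal sketch; the Account type, chunk size, and BuildDigest method are placeholders:

```csharp
// Split accounts into fixed-size chunks and pre-build digests in parallel,
// capping the degree of parallelism so the DB isn't swamped.
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class DigestBuilder
{
    public void BuildAll(IReadOnlyList<Account> accounts)
    {
        var chunks = accounts
            .Select((account, index) => new { account, index })
            .GroupBy(x => x.index / 500, x => x.account);

        Parallel.ForEach(
            chunks,
            new ParallelOptions { MaxDegreeOfParallelism = 4 },
            chunk =>
            {
                foreach (var account in chunk)
                    BuildDigest(account);   // the DB calls + calculations
            });
    }

    void BuildDigest(Account account) { /* ... */ }
}

class Account { public int Id; }
```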
When handling exceptions, remember not to let them bring your processing down; handle them through logging or notification of some sort and then move on. You wouldn't want one error to mess things up for further processing. I'm sure you probably planned for that, though.
Also, send your emails asynchronously so they don't block processing. It's probably an obvious observation, but sometimes little things like that are overlooked and can create quite the bottleneck when sending out lots of emails.
Lastly, test it with a reasonable load beforehand, and shoot for well over capacity.
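As a rough sketch of the asynchronous sending, using SmtpClient.SendMailAsync from .NET 4.5; the host name is a placeholder:

```csharp
// Send a batch of emails without blocking the thread, logging failures
// per message so one bad address doesn't stop the rest of the batch.
using System;
using System.Collections.Generic;
using System.Net.Mail;
using System.Threading.Tasks;

class DigestSender
{
    public async Task SendAll(IEnumerable<MailMessage> messages)
    {
        using (var smtp = new SmtpClient("smtp.example.com"))
        {
            foreach (var message in messages)
            {
                try
                {
                    // Awaiting keeps the thread free while the send is in flight.
                    await smtp.SendMailAsync(message);
                }
                catch (SmtpException ex)
                {
                    // Log and move on, per the exception-handling advice above.
                    Console.Error.WriteLine($"Send failed for {message.To}: {ex.Message}");
                }
            }
        }
    }
}
```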
You may want to check out SQL Server Reporting Services.
You may have to translate your current setup into the Reporting Services format, but in return you'll get a whole administrative interface for scheduling report generation, letting users modify report inputs, caching historical/current reports, and letting users manage their own email subscriptions.
http://msdn.microsoft.com/en-us/library/ms160334.aspx
