I have a process that sends about 1500 emails once a week. The process lives in a Django management command that I plan to run from a crontab. It has a loop that checks whether each user wants to receive emails and which language they should receive them in, like this:
for user in users:
    # Check if user accepts emails
    if user['send_mail']:
        # Get language for the email
        lang = ""
        if user['lang'] == "es":
            lang = "es"
        elif user['lang'] == "fr":
            lang = "fr"
        else:
            lang = "en"
        email = user['email']
        # Send email
        send_mail()
1500 emails is not much, but I want this to be scalable, since the number of emails depends on the number of users registered on the platform.
I do not know whether it is scalable as it stands, or whether it would be better to use a Redis queue or Celery.
I am using Amazon Simple Email Service (SES).
You have two different issues to deal with here:
First, while it is pretty easy to SEND 1500 emails, whether those 1500 emails will be RECEIVED is a more complex matter. Your email can easily be blocked or diverted to a spam folder. Your whole domain could be blocked by some mail services. To limit the chance of this, you need to have DKIM and SPF records set up properly, and there are other things that commercial mail senders do to keep things smooth. So if you are not interested in taking on that challenge, you are better off working with a professional service like SES.
But sure, you can also just use postfix or any other mail relay software to set up your own mail server locally, even right on the same machine. Set up your own DNS records and send the mail directly to the recipient without SES or anyone else in between, but then you have to deal with any spam blocker problems yourself.
Second, assuming you use SES, you have to make sure that all your emails are safely handed over from your machine to Amazon. This is where trouble can come in. You don't want to generate half your emails and have them delivered, then hit a problem (say, a network outage) and have no way of sending only the ones that were not sent without resending everything. That can be a tricky bit of code to write perfectly.
The easiest solution technically is to install a local SMTP relay server (e.g. postfix) configured with Amazon as its "smarthost". Then configure Django to use "localhost" as its SMTP server.
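On the Django side, pointing outgoing mail at the local relay is a matter of a few settings. The fragment below is a minimal sketch, assuming the relay listens on the default SMTP port 25 on the same machine without TLS or authentication; adjust the values if your postfix setup differs.

```python
# settings.py (fragment)
# Assumption: a local relay such as postfix accepts mail on localhost:25
# without TLS or authentication. Change these if your relay is set up differently.
EMAIL_BACKEND = "django.core.mail.backends.smtp.EmailBackend"
EMAIL_HOST = "localhost"
EMAIL_PORT = 25
EMAIL_USE_TLS = False
```

With this in place, calls such as django.core.mail.send_mail hand each message to the local relay rather than talking to SES directly.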
With that in place, when your cron job runs, it will only take a few seconds, because all the emails go straight into postfix's directories on your local drive and are queued there.
Then postfix, because it is configured with SES's SMTP server as its smarthost (sometimes called smart relay), won't send any email directly to the recipient, but will forward all the emails to SES to be delivered to the final recipient. If there's any problem doing that, postfix (or whatever mail relay software you prefer) will retry each message until things work out.
It's made for that, it's tried, tested, works...
So that is the easiest path for you.
If you choose to use the SES REST API, then it is the responsibility of your code to make sure that each message is delivered to Amazon exactly once. If you have sent the first 1000 emails and a network failure or crash then keeps you from sending the last 500, it is up to your code to recover from that without resending the first 1000 emails. And for that, yes, queuing systems are useful. Celery or just RabbitMQ by itself can work. Or just make a queue by storing records in your database of which messages need to be sent, then deleting those records as each email is sent.
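The database-backed queue idea can be sketched roughly as follows. This is only an illustration using Python's built-in sqlite3 module; the `outbox` table, the `enqueue` and `drain` helpers, and the `send` callback (standing in for the real SES call) are all hypothetical names, not part of Django or SES.

```python
import sqlite3


def enqueue(conn, recipients):
    """Store one pending row per recipient in a hypothetical 'outbox' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS outbox (email TEXT PRIMARY KEY, lang TEXT)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO outbox (email, lang) VALUES (?, ?)", recipients
    )
    conn.commit()


def drain(conn, send):
    """Send every pending message, deleting each row only after send() succeeds.

    Returns the number of messages sent in this run. If send() raises,
    the failed row and all later rows stay queued for the next run.
    """
    sent = 0
    rows = conn.execute(
        "SELECT email, lang FROM outbox ORDER BY email"
    ).fetchall()
    for email, lang in rows:
        send(email, lang)  # placeholder for the real call to SES
        conn.execute("DELETE FROM outbox WHERE email = ?", (email,))
        conn.commit()
        sent += 1
    return sent
```

If a run is interrupted, running `drain` again picks up only the rows that are still in the table. Note that a crash between a successful `send` and the `DELETE` would still resend that one message on the next run, so this sketch gives at-least-once rather than exactly-once delivery.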
But writing code like that so that it works perfectly in every circumstance can be tricky. Sometimes it is OK to reinvent the wheel; sometimes you need a better wheel :) But in this case I think you are better off using an SMTP relay server.