Creating an e-mail parser as a service?


I am trying to hash out how I would create an e-mail parser. I understand technically how to do it, but I cannot figure out implementation details.
So, a user sends an e-mail to an address, the mail server receives it, and my app parses it based on subject and content, drops it in a bucket (an e-mail account or a database), and then I can act upon it.
So do I use existing mail server software (like Zimbra, which we already have running), or do I create an app that listens on port 25 and does specifically what I need (meaning no mail server software running on this box)?
My goal here is to create myself a series of organization tools for personal use in an automated way based upon what I e-mail myself.

Writing something to listen on port 25 and act as an SMTP server will be involved and probably overkill for what you want.
I think there are two main options. The first is to leave your existing mail server in place and then poll an account on that mail server over IMAP (or POP3) to retrieve the emails and then process them using a script. It really doesn't matter what language you're comfortable with as there are libraries for handling IMAP connections and then parsing the email in most languages.
Alternatively, you could look at a service like http://CloudMailin.com that does this for you. It will receive the email and send it to a web app that you create, via an HTTP POST in a format such as JSON.
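As a rough illustration of the webhook option, here is a minimal sketch of a receiver using only the Python standard library. The JSON keys "subject" and "plain", the BUCKETS table, and the port are all assumptions made for the example, not the documented payload of CloudMailin or any other service; check the provider's documentation for the real field names.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical routing table: subject prefix -> bucket name.
BUCKETS = {"todo:": "tasks", "note:": "notes"}


def choose_bucket(subject):
    """Return the bucket for a subject line, or 'inbox' if no prefix matches."""
    lowered = subject.strip().lower()
    for prefix, bucket in BUCKETS.items():
        if lowered.startswith(prefix):
            return bucket
    return "inbox"


def handle_payload(raw_body):
    """Decode a JSON POST body and decide where the message goes.

    The 'subject' and 'plain' keys are assumed, not a documented schema.
    """
    payload = json.loads(raw_body)
    subject = payload.get("subject", "")
    return {
        "bucket": choose_bucket(subject),
        "subject": subject,
        "text": payload.get("plain", ""),
    }


class InboundMailHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        result = handle_payload(self.rfile.read(length))
        # A real app would store `result` in a database at this point.
        self.send_response(200)
        self.end_headers()
        self.wfile.write(result["bucket"].encode("utf-8"))


def serve(port=8080):
    """Start the receiver; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), InboundMailHandler).serve_forever()
```

Calling serve() starts the listener; the routing logic lives in plain functions so it can be tried without a network connection.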

I would go for a Python script that polls the mailbox, run from a cron job. Python lets you access IMAP very easily and has powerful regular expression functions for parsing the email content.
Try something like:
import email
import imaplib
import re

M = imaplib.IMAP4_SSL('imap.gmail.com')
M.login('user', 'pass')
M.select('Imap_folder')
typ, data = M.search(None, 'ALL')  # every message in the selected folder
pattern = re.compile(r"your regular expr here", re.MULTILINE)
for num in data[0].split():
    typ, msg_data = M.fetch(num, '(RFC822)')
    raw_email = msg_data[0][1]  # the raw message, as bytes
    mail = email.message_from_bytes(raw_email)  # parse the raw message into a mail object
    # search the decoded text of the whole message (headers and body)
    res = pattern.search(raw_email.decode('utf-8', errors='replace'))
M.logout()
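Once the raw message has been fetched, the standard email package can pull out the subject and the plain-text body, which is usually easier to match against than the raw source. The following is a minimal sketch; the sample message and the summarize helper are made up for illustration.

```python
from email import message_from_bytes, policy


def summarize(raw_email):
    """Return (subject, plain_text_body) for raw RFC 822 message bytes."""
    # policy.default yields an EmailMessage, which provides get_body().
    msg = message_from_bytes(raw_email, policy=policy.default)
    subject = str(msg["Subject"] or "")
    body_part = msg.get_body(preferencelist=("plain",))
    text = body_part.get_content() if body_part is not None else ""
    return subject, text


# Hypothetical message, standing in for the bytes returned by an IMAP fetch.
sample = (
    b"From: me@example.com\r\n"
    b"To: inbox@example.com\r\n"
    b"Subject: todo: renew passport\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"Due by Friday.\r\n"
)
subject, text = summarize(sample)
```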

Related

How to programmatically parse emails on an Outlook server and execute a script/task

My intranet web application (written in C#/ASP.Net MVC) sends email notifications in certain situations. I would like to intercept replies to such emails and perform actions based on the content of such replies.
I have no preference for scripting language - it could be Powershell, Python, VBA, anything - as long as I can parse the subject and body of the email, I can then alter the database of my web application through this script and pick up changes with an automated task, but I really have no clue where to start. I would be really grateful if someone could point me in the right direction.
QUESTION
How can I intercept emails sent to the Outlook server and perform action based on the content of such emails?

It sounds like you need an inbound email parsing service. I've worked with the one from SendGrid and it will catch any replies to a specific email address, and then post the email contents to an action on a controller in your MVC app. This will give you access to the full email contents and you can process it as needed.
See Setting up Inbound Parse

Receive, Store, Interact with emails rails application

With my Rails application, I'm supposed to provide the following features:
There are a limited number of users interacting with my system (on the order of 10 to 20).
Like any normal mail client, users should have an inbox page showing received messages, be able to respond to individual emails, etc.
The mail client part cannot be an external application; they want everything packaged into a single application!
Normally these emails should be stored for future use.
In order to send and receive email, we do not need to set up a mail server. They will provide the server and we will fetch the messages with POP3 or something else. The same goes for sending emails.
The application itself often needs to look into these messages as well, so it should be able to access the corresponding email objects.
Separate parts of this application can be handled with individual gems such as Mailman, ActionMailer, etc.
But what would be your suggestions to get this done?

I suggest customizing an open source solution according to your needs. This is a gem/project that you should look at: https://github.com/mailboxer/mailboxer. It has all the features that you mentioned, and it's straightforward to customize.

Accept and parse information from email

What I would like to accomplish is for a user to be able to send an email to a designated email address. Then once that email has been obtained, run a script that parses the body section of the email and carries out various tasks with the information provided in the email
(The data coming in from the email will be structured in an xml type format).
In my mind this seems like a simple task to accomplish but I'm not all that familiar with the inner workings of email. My questions are:
How will I know once an email has been obtained from the sender so that it can be processed?
How can I use php to obtain the text found in the email?

If you have PHP running on your mail server, then it's possible to pipe incoming mail to a PHP script. Then, your PHP script can parse the body of each incoming message as it arrives. See http://harrybailey.com/2009/02/send-or-pipe-an-email-to-a-php-script/ for more info.
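The same piping idea can be sketched in any language that reads standard input; the version below uses Python rather than PHP purely for illustration. It assumes the MTA hands the script one complete message on stdin, and that the plain-text body is a small XML document; the element names in the usage note are hypothetical.

```python
import sys
import xml.etree.ElementTree as ET
from email import message_from_binary_file, policy


def extract_fields(msg):
    """Parse the plain-text body as XML and return {tag: text} for the root's children."""
    body = msg.get_body(preferencelist=("plain",))
    if body is None:
        return {}
    root = ET.fromstring(body.get_content().strip())
    return {child.tag: (child.text or "").strip() for child in root}


def main():
    # The MTA pipes one complete message to this script on standard input.
    msg = message_from_binary_file(sys.stdin.buffer, policy=policy.default)
    print(extract_fields(msg))
```

A script used as the pipe target would call main() at the bottom. For a body such as <order><item>widget</item><qty>3</qty></order>, extract_fields would return the item and qty values as a dictionary. Since the senders may not be trusted, a hardened XML parser could be a safer choice than ElementTree.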

I agree with the other responses that piping to a PHP application should be your first option (if the mail server and PHP run on the same machine).
Otherwise if you have an external mail server you will need to poll the email address at regular intervals and check for new emails.
Extracting plain text body from an email message is not too hard, but getting to attachments can be a pain sometimes.
You will need to use a cron job to trigger your script at regular intervals.
To fetch the email from the server and parse it, I have used the PHP IMAP functions:
http://php.net/manual/en/book.imap.php
You can also use 3rd party services like MandrillApp which will receive the message, break it into parts, and call your application via a webhook.
http://help.mandrill.com/entries/21699367-Inbound-Email-Processing-Overview
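To make the difference between the body and attachments concrete, here is a minimal sketch in Python (not PHP; the answer above uses the PHP IMAP functions). It relies on the standard email package; the split_message helper and the demo message are invented for the example.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def split_message(raw_email):
    """Return (plain_text, [(filename, content), ...]) for raw message bytes."""
    msg = message_from_bytes(raw_email, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body is not None else ""
    # iter_attachments() skips the parts that form the message body.
    attachments = [
        (part.get_filename(), part.get_content())
        for part in msg.iter_attachments()
    ]
    return text, attachments


# Build a hypothetical message with one attachment to exercise the helper.
demo = EmailMessage()
demo["Subject"] = "report"
demo.set_content("See attached.")
demo.add_attachment(
    b"a,b\n1,2\n",
    maintype="application",
    subtype="octet-stream",
    filename="data.csv",
)
text, attachments = split_message(demo.as_bytes())
```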

Running a PHP script on email arrival in an IMAP Server

I'm trying to implement a webmail in PHP. I would like to write a PHP CLI script which is run on every email arrival to store some parts of (not all of) incoming email into database for search purposes. Then when the user finished searching and chose an email to show, a connection is made to mail server to retrieve the complete email. In order to implement this scenario I need to make some sort of connection among emails within database and mail server.
Since my knowledge of working with mail servers is limited to Zend Framework's API, what I believe I need in order to retrieve an email from an IMAP server is a message number or a message unique id (this later one seems not to be supported by all mail servers).
So far, I've managed to find .forward (and some other ways) to hook my PHP CLI script into the MTA so that it runs on every email arrival. This way I can store emails in the database. But this won't do, since the message unique id is created by the MDA, so the MTA does not know it and cannot provide it to me. This means I cannot find the emails later when I want to retrieve them from the mail server.
At last, here's my question: Is there a way to introduce a PHP CLI script to a MDA for emails' arrival? If this is dependent on the mail server, which servers do support this and how? My personal choice would be Dovecot or Courier, but any other mail server would do as well.

This is tricky: there are many ways to set up delivery. Some of them work with the underlying mail store directly, bypassing your IMAP server altogether, while others use e.g. Dovecot's facilities.
Have you considered building on top of the notify plugin which ships with Dovecot?

It seems to be impossible to hook such a PHP CLI script into an IMAP server (at least, I'm sure of that for Dovecot). Anyway, the workaround I found for this problem is to use my own PHP script to insert the new mails into the IMAP server, retrieve their ids, and then store the ids in the database for future reference. To be clear, emails are handed to my PHP CLI script by the MTA, not the MDA. As I said before, this is easily done using a .forward file.
[UPDATE]
Unfortunately, it seems this solution cannot be implemented either. The way to insert a new email into an IMAP server is the APPEND command, and to get the UID of the newly added mail, the server must support the UIDPLUS extension. Neither Dovecot nor Courier supports this extension at the moment! If they did, it seems the server would return the UID in an APPENDUID response.
[UPDATE]
It is my bad since Courier does support UIDPLUS. So this solution is valid and the one I'm going to implement.
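For illustration, the APPENDUID response code defined by the UIDPLUS extension has the form [APPENDUID <uidvalidity> <uid>], so extracting the new UID is a small parsing job. The sketch below is in Python rather than PHP, and the parse_appenduid helper is hypothetical; how the completion text is exposed depends on the client library in use.

```python
import re

# Matches the UIDPLUS response code, e.g. b"[APPENDUID 38505 3955] APPEND completed".
APPENDUID_RE = re.compile(rb"\[APPENDUID (\d+) (\d+)\]")


def parse_appenduid(response_text):
    """Return (uidvalidity, uid) from an APPEND completion line, or None if absent."""
    match = APPENDUID_RE.search(response_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

With Python's imaplib, the text returned by conn.append(mailbox, None, None, raw_message) would be the natural input, on the assumption that the server advertises UIDPLUS. Storing the UIDVALIDITY value alongside the UID matters because a UID is only meaningful while the mailbox's UIDVALIDITY stays the same.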

Receive mail in Ruby/Rails

How would I go about receiving mail in a Ruby on Rails application without going through a mail server like Postfix or fetching it by POP3, etc.?
What I want to do is catch all mail sent to @mydomain.com and just do something with it in my application. I don't need to store the mail or anything like that.
Is this possible?

I just implemented this for my SAAS to autoprocess mailer-bounce notification messages.
Call me, call you?
You call me
You can set up a local mail server. It would then respond to an incoming email, and start up a rails executable to process the email. This method is NOT recommended since starting up rails is a big task (takes multiple secs and lots of memory). You don't want a Rails bad boy started up just because you received an email. You'd be writing your own DDOS attack. (Attacking yourself.)
I call you
Instead, poll for email on your own schedule by using a single job to process all currently waiting emails. You need to set up a background job handler since stock Rails is focused on responding to web requests. I use delayed_job, but there are other alternatives, including kicking off a cron job every so often.
Another benefit is that you don't need to manage a mail server. Leave that headache to someone else. Then use the Ruby library Net::IMAP to read the incoming mail and process it.
If your process doesn't recognize the email format, then forward the msg to a human for processing.
And be sure that if the process sends mail in addition to reading/processing it, that the process uses a different email address as its From address. Otherwise, odds are good that sometime along the way, you'll end up in an email loop and many gigabytes of messages going back and forth. For example, your process receives a message, responds to it, but in the meantime the sender (a human) has switched on vacation response. And your robot then responds to the vacation response..... oops....
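A complementary safeguard, not part of the advice above, is to avoid replying to messages that identify themselves as automatic. The sketch below (in Python, for illustration only) checks the Auto-Submitted header from RFC 3834 and the conventional Precedence header; the is_automated helper is hypothetical, and not every vacation responder sets these headers.

```python
from email import message_from_bytes, policy


def is_automated(msg):
    """Return True if the message looks like an auto-reply or bulk mail."""
    # RFC 3834: any Auto-Submitted value other than "no" marks an automatic message.
    auto_submitted = (msg.get("Auto-Submitted") or "no").strip().lower()
    if auto_submitted != "no":
        return True
    # Precedence is a non-standard but widely used hint.
    precedence = (msg.get("Precedence") or "").strip().lower()
    return precedence in {"bulk", "junk", "list"}


vacation = message_from_bytes(
    b"From: human@example.com\r\n"
    b"Subject: Out of office\r\n"
    b"Auto-Submitted: auto-replied\r\n"
    b"\r\n"
    b"I am away.\r\n",
    policy=policy.default,
)
normal = message_from_bytes(
    b"From: human@example.com\r\nSubject: Hi\r\n\r\nHello.\r\n",
    policy=policy.default,
)
```

Here is_automated(vacation) is true and is_automated(normal) is false, so the robot would respond only to the second message.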
Writing your own mail server
Re:
How would I go about receiving mail in a Ruby on Rails application without going through a mail server like Postfix or fetching it by POP3, etc.?
What I want to do is catch all mail sent to @mydomain.com and just do something with it in my application. I don't need to store the mail or anything like that.
Direct answer: Yes, you could do this by writing an smtp server and setting up dns so your machine will be the mail destination for the domain. Your smtp server would process the messages on the fly, they would not be stored on your system at any point.
Is this a good idea? No, not at all. While appearances may be to the contrary, email is a store and forward system. Trying to avoid storing the messages before your app processes them is not smart. It would be a very very poor "optimization." However, using an access protocol (POP3 or IMAP) is a good way to avoid the costs of installing, configuring and managing a mail server.

You can do this if you write your own mail server, or if your mail server supports hooks to run external programs upon receipt of mail (e.g. procmail).
If you don't have procmail available (or, if on something like Exchange Server, don't feel like writing custom rules or extensions), you're simply better off using a pop3 library to fetch mail.
Obviously, writing a mail server is more difficult than any of the alternatives.
If you're mostly worried about checking potentially hundreds of email accounts, that's solvable by configuring your email server properly. If you're on a hosted provider, ask your server administrator about creating a "catch-all" account that routes all mail to unknown addresses to a single account.

If you're aiming to avoid having to poll a server, consider the IMAP IDLE command. I've successfully written a Ruby client that opens a connection to an IMAP server, and gets told by the server when new mail arrives.
