Search email addresses in pcap - parsing

I have an idea for a program that searches for email addresses in the "mail from", "rcpt to", "to" and "from" fields of the SMTP, IMAP and POP3 protocols. I'm interested only in those fields and no others.
I used ngrep, but I couldn't write a regexp that picks out one email address among several (as happens when the "to" field lists multiple addresses).
The solution I found was to open the pcap files one by one using fopen and search with memmem for the fields mentioned above. Then I would parse the matched string to separate the address from any alias. The problem is that I have a lot of files to parse, so I don't have the time to open each file, allocate memory, read, search, retrieve the email addresses and delete the file after processing.
My question is: are there any other robust and fast methods in C or C++ to get these email addresses, or could you help me write a regex that matches addresses like the ones below?
I found that the "To" field can be followed by strings like this:
To: "Some Name" some.name@domain.com, Some Other , johndoe@somewhere.in,
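For the header-parsing part, one option is to let an address-list parser do the splitting instead of a hand-written regex. The following is only a minimal sketch in Python (not the C or C++ asked about), using the standard library's `email.utils.getaddresses`. The sample header value is an assumption, written with the angle brackets such headers normally carry, and `extract_addresses` is a hypothetical helper name.

```python
from email.utils import getaddresses


def extract_addresses(header_value):
    """Return the bare addresses found in a To/From/Cc header value."""
    # Drop surrounding whitespace and a dangling trailing comma before parsing.
    cleaned = header_value.strip().rstrip(",")
    # getaddresses() takes a list of header values and returns
    # (display name, address) pairs; keep only non-empty addresses.
    return [addr for _name, addr in getaddresses([cleaned]) if addr]


# Assumed sample value (not taken from a real capture).
sample = '"Some Name" <some.name@domain.com>, Some Other <other@example.org>, johndoe@somewhere.in,'
print(extract_addresses(sample))
```

The header value itself still has to be pulled out of the reassembled TCP stream first; this sketch covers only the step of separating addresses from display names.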

Related

alter email address in DB to render it un-deliverable

I need to alter email addresses of users to render them undeliverable (not ever a real address), however it needs to be reversible so that the original is visible or at least retrievable (without storing it elsewhere).
For example john@example.com -> NONAME_john@exampleNOTHING.com might work as it can be changed back.
However, the problem is that I cannot KNOW that the above-resulting address is not a real email address. Maybe there is a real address out there called NONAME_john@exampleNOTHING.com.
The requirements are that the address needs to be valid (in terms of having '@' and '.com' etc) but won't send.
Maybe my requirements are a contradiction and hence not possible? Does anyone know?
Your requirements are something of a contradiction. The only way to make an address unworkable is to render it practically invalid.
Note that "invalid" can mean at least three things:
1. it doesn't parse as an e-mail address (e.g. replace @ by ⓐ)
2. the user name is invalid at that domain
3. the domain name is invalid
The point is that for 2 and 3 there is no easy way to know for sure whether that is the case.
What you could also do is obfuscate the e-mail address by applying a reversible transformation, e.g. base64-encoding it. This falls under category 1 above.
For example in Python 3:
In [1]: import base64
In [2]: base64.b64encode('john@example.com'.encode('utf-8'))
Out[2]: b'am9obkBleGFtcGxlLmNvbQ=='
In [3]: base64.b64decode(b'am9obkBleGFtcGxlLmNvbQ==').decode('utf-8')
Out[3]: 'john@example.com'

GET by Email fails when looking for Email with a "+"

I have an application that stores information about a person in a database, but when I use the URL to GET a user by email address, users with a + in their email cannot be found.
Example URL that returns person:
https://www.someURL.com/api/people/johnsmith@someemail.com
Example URL that does not return person (returns null):
https://www.someURL.com/api/people/jane+doe@someemail.com
Both emails are in the database exactly as written in the URL, so it does not appear to be a typo issue, and I am using Postman to test the GET method. Why am I not able to find them, and how can I make it so that they can be found even with the + character?
[Screenshot: working Postman request]
[Screenshot: NOT working Postman request]
When I search with id I am able to find the person so I know the person exists.
[Screenshot: verification that person exists]
My suggestion would be: change your server implementation from GET to POST and pass the email as a String parameter in the body of the request. That prevents this and any similar issue with escaping special characters in the URI.
If that's not possible, try wrapping the email address in single ' or double " quotes; depending on how your web server treats the incoming request, it may help as well.
It is worth knowing that many email providers do not accept "+" when you create an address. For instance, Gmail will not let you create an email address with anything but alphanumeric ([A-Za-z0-9]) and dot (.) characters. I'm pretty sure they were tired of validating input emails with a complex regular expression and just limited it to the basic ones.
'+' is a reserved character in URIs, so in order to prevent it being interpreted as a space character you would need to percent-encode it. In your example, replace '+' with '%2B'.
https://www.someURL.com/api/people/jane%2Bdoe@someemail.com
There are other characters that are allowed in email addresses but are reserved characters in URIs, so it would be best to percent-encode the whole email address, just in case.
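As an illustration of percent-encoding the whole address, here is a minimal Python sketch using `urllib.parse.quote` from the standard library. The helper name `people_url` is hypothetical, and the base URL is the placeholder from the question; the server is assumed to decode the path segment before looking up the person.

```python
from urllib.parse import quote, unquote


def people_url(base, email):
    """Build the lookup URL with the email percent-encoded as one path segment."""
    # safe="" means no reserved character is left as-is,
    # so "+", "@" and "/" are all percent-encoded.
    return f"{base}/{quote(email, safe='')}"


url = people_url("https://www.someURL.com/api/people", "jane+doe@someemail.com")
print(url)
# Decoding the last path segment gives back the original address.
print(unquote(url.rsplit("/", 1)[1]))
```

Encoding on the client side like this keeps the GET endpoint unchanged, which may be preferable to switching the route to POST.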

IMAP search with a full string

I am trying to search my Inbox for emails containing a string such as "not working anymore", but the problem is that IMAP returns the list of emails containing the 3 words rather than the exact string "not working anymore".
Any idea how to resolve this? I have no messages containing this string, but IMAP returns 2 results because of the problem explained above.
a1 SEARCH BODY "not working anymore"
* SEARCH 4090 7752
Filter/verify the server's result in your client code.
The longer answer is that IMAP servers used to do what you want, but as users wanted smarter, fuzzier, faster searching, servers started implementing it. For example, users wanted to find messages like this:
foo foo foo foo foo foo foo not working
anymore foo foo foo foo
The whitespace does not match your search term. Do you think the server should return that message?
The IMAP specification is silentish on the issue. As I read it, if a message contains the exact string, that specification says the message must match the search, but if the message doesn't, then... well, then server and client should act to satisfy the user and bring about world peace. This implies that the search result you get from the server contains all the messages you want, so filtering won't miss any results.
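The client-side filtering suggested above could look roughly like the following Python sketch. It assumes the candidate message bodies have already been fetched and decoded to text (for example with `imaplib`, not shown here); `filter_exact_phrase` and the sample bodies are hypothetical. The `fold_whitespace` flag makes the line-wrapping question raised above an explicit choice.

```python
def filter_exact_phrase(bodies, phrase, fold_whitespace=True):
    """Return the ids of messages whose body really contains the phrase.

    bodies maps a message id to its decoded body text.
    """
    # Compare case-insensitively, as IMAP string search keys do.
    needle = phrase.lower()
    if fold_whitespace:
        # Treat any run of spaces or newlines as a single space.
        needle = " ".join(needle.split())
    hits = []
    for msg_id, text in bodies.items():
        haystack = text.lower()
        if fold_whitespace:
            haystack = " ".join(haystack.split())
        if needle in haystack:
            hits.append(msg_id)
    return hits


# Assumed sample data: ids as returned by SEARCH, bodies invented for illustration.
bodies = {
    4090: "It is not really working for me anymore.",
    7752: "foo foo not working\nanymore foo foo",
    9001: "This is not working anymore.",
}
print(filter_exact_phrase(bodies, "not working anymore"))
print(filter_exact_phrase(bodies, "not working anymore", fold_whitespace=False))
```

With whitespace folding the wrapped message counts as a match; without it only the message with the phrase on a single line does.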

generate email address that links to a message thread like Facebook in ruby on rails app

Facebook sends email notifications when a new message has arrived in a facebook message thread. The email allows you to reply on it without going to Facebook.
I think Facebook does this by generating a reply-to email address that is linked to the message thread.
Example of such a reply-to email address from a Facebook email notification (I modified some characters, so it won't work):
m+51r6w8e000000bu1jfpbziio6jmfnvvtkaevxrgojnel8qv@reply.facebook.com
I'm trying to implement a similar feature in my rails app.
I'm still a newbie in rails and wondering how I should approach this issue.
I was trying to encrypt the id of my message thread using the encryptor gem, then using this as an email address in the form: encryptedId@mydomain.com. The issue is that the encrypted output contains characters that are not allowed in an email address.
As I know little about encryption I googled and found the possibility to base64 encode the encrypted output. This is common practice for URLs. But still, this has characters (for example %) that are not allowed in an email address.
I found that RC4 should be an encryption algorithm that has hexadecimal output. But the encryptor gem gives me 1 non-hexadecimal character when using this algorithm, so it doesn't seem to work. Conclusion: I'm a bit stuck.
Maybe I'm looking too far. Are there other approaches that I could consider?
EDIT: extra info: I'm trying to make the email address non-guessable.
Thanks!
If you are trying to keep your response email addresses non-predictable, you can create your email address out of a concatenation of:
some unique aspect of the message thread such as a row ID
a similar unique attribute of the user being sent the email
an MD5 hash of both of those items plus a secret string known only to your system
a random salt to the MD5
So if user 7812 posts in thread 8299 you could make your base string
u7812t8299
then take that string "u7812t8299" plus the time the email was sent (say 12:31), and a string known to your system like "purpleumbrella"
Your result string is "u7812t82991231purpleumbrella". Using:
Digest::MD5.hexdigest("u7812t82991231purpleumbrella")
we get an MD5 hash of:
5822aceca1f70afdb06f53b5c7e4df99
Now send the user an e-mail with a return address of
u7812t8299-1231-5822aceca1f70afdb06f53b5c7e4df99@yoursite
When you get an e-mail back to that address, your system will know that it's for user 7812 posting in thread 8299, and because only your system knows the password required to create the MD5 sum for this combination that would result in an MD5 string starting with 5822aceca1, you can verify to a certain extent that this is not a randomly generated email by someone trying to spam your system.
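The scheme described above can be sketched as follows. This is a Python illustration rather than Rails code, and it is only a sketch under stated assumptions: the function names are hypothetical, the domain `yoursite.example` is a placeholder, and the secret is the example string from the answer.

```python
import hashlib
import hmac
import re

SECRET = "purpleumbrella"  # secret known only to the server (example value from above)


def _digest(base, sent_hhmm):
    # MD5 over base string + send time + secret, as in the answer.
    return hashlib.md5(f"{base}{sent_hhmm}{SECRET}".encode("utf-8")).hexdigest()


def make_reply_address(user_id, thread_id, sent_hhmm, domain="yoursite.example"):
    """Build a reply-to address of the form u<user>t<thread>-<time>-<md5>@<domain>."""
    base = f"u{user_id}t{thread_id}"
    return f"{base}-{sent_hhmm}-{_digest(base, sent_hhmm)}@{domain}"


def verify_reply_address(address):
    """Return (user_id, thread_id) if the checksum matches, otherwise None."""
    local = address.split("@", 1)[0]
    match = re.fullmatch(r"(u(\d+)t(\d+))-(\d+)-([0-9a-f]{32})", local)
    if match is None:
        return None
    base, user_id, thread_id, sent_hhmm, digest = match.groups()
    expected = _digest(base, sent_hhmm)
    # Constant-time comparison of the recomputed and received checksums.
    if not hmac.compare_digest(expected.encode("ascii"), digest.encode("ascii")):
        return None
    return int(user_id), int(thread_id)


address = make_reply_address(7812, 8299, "1231")
print(address)
print(verify_reply_address(address))
```

The sketch follows the answer in using plain MD5 over the concatenated string; a keyed construction such as `hmac.new(key, msg, hashlib.sha256)` from the same standard library would be a more conservative choice for the checksum.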

What do the flags in a Maildir message filename mean?

I'm cleaning up some old Maildir folders, and finding messages with names like:
1095812260.M625118P61205V0300FF04I002DC537_0.redoak.cise.ufl.edu,S=2576:2,ST
They don't show up in my IMAP client, so I presume there's some semaphore indicating the message already got moved somewhere else. Is that the case, and can the files be deleted without remorse?
The 'M' is just part of the unique filename and has nothing to do with the fact that the mail doesn't show up in mail clients.
The 'T' at the end of the filename, after the ':' sign, however tells the IMAP server that this message is Trashed.
See http://cr.yp.to/proto/maildir.html
IMAP is a protocol for communicating with a message store; the storage format itself is standardised separately. The filename looks like a Maildir filename, and I think Maildir does not attach any meaning to the first part of the filename, but you should check your software's manual.
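To make the flag part concrete, here is a small Python sketch that reads the flags after the ":2," separator, using the flag letters listed on the Maildir page linked above. The helper name `parse_maildir_flags` is hypothetical.

```python
# Flag letters defined for the ":2," info section of a Maildir filename.
MAILDIR_FLAGS = {
    "P": "passed",
    "R": "replied",
    "S": "seen",
    "T": "trashed",
    "D": "draft",
    "F": "flagged",
}


def parse_maildir_flags(filename):
    """Return the flag names that follow the ':2,' separator, in order."""
    _unique, sep, flags = filename.rpartition(":2,")
    if not sep:
        # No info section at all, so the message carries no flags.
        return []
    return [MAILDIR_FLAGS.get(ch, f"unknown({ch})") for ch in flags]


name = "1095812260.M625118P61205V0300FF04I002DC537_0.redoak.cise.ufl.edu,S=2576:2,ST"
print(parse_maildir_flags(name))
```

For the filename from the question this reports the message as seen and trashed, which is consistent with it not showing up in the IMAP client.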

Resources