I run a small email client built with Delphi and Indy 10. Some of the mails I receive are in MIME or HTML format. With the current code I just copy the body lines to a memo's lines:
MyMailMemo.Lines.AddStrings(TIdMessage(Msg).Body);
How do I copy the content of mime emails?
MIME-encoded emails do not use the TIdMessage.Body property. They use the TIdMessage.MessageParts property instead, where textual MIME parts are stored as TIdText objects and attachments are stored as TIdAttachment-derived objects. You have to look at the TIdMessage.ContentType property to know whether you are working with an HTML email or a MIME email. Even then, chances are that HTML emails are actually MIME encoded, as they usually include an alternative plain-text MIME part for non-HTML email readers. You can loop through the TIdMessage.MessageParts looking for a TIdText object whose ContentType is HTML, then copy the TIdText.Body content into your TMemo.
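The loop described above is not specific to Indy. As a rough, language-neutral illustration (not Indy code), here is a minimal sketch using Python's standard email package; the function name find_html_body is made up for this example, and it assumes you already have the raw message bytes:

```python
from email import message_from_bytes, policy


def find_html_body(raw_message: bytes):
    """Return the decoded text of the first text/html part, or None."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    # walk() visits the message itself and every nested MIME part.
    for part in msg.walk():
        is_html = part.get_content_type() == "text/html"
        is_attachment = part.get_content_disposition() == "attachment"
        if is_html and not is_attachment:
            # get_content() undoes the transfer encoding and charset.
            return part.get_content()
    return None
```

In Indy the equivalent loop would run over TIdMessage.MessageParts and check each TIdText's ContentType, as the answer says.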
Do I have to specify a MIME type if the uploaded file has no extension?
In other words is there a default general MIME type?
You can use application/octet-stream for unknown types.
RFC 2046 states in section 4.5.1:
The "octet-stream" subtype is used to
indicate that a body contains
arbitrary binary data.
RFC resources:
We should use RFC-7231 (HTTP/1.1 Semantics and Content) as the reference instead of RFC-2046 (Media Types), because the question was clearly about the HTTP Content-Type.
Also, RFC-2046 does not clearly define how to handle unknown types, but RFC-7231 does.
Short answer:
Do not send a MIME type for unknown data.
To be more clear: do not send the Content-Type header at all.
References:
RFC-7231, Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content, 3.1.1.5. Content-Type
A sender that generates a message containing a payload body SHOULD
generate a Content-Type header field in that message unless the
intended media type of the enclosed representation is unknown to the
sender.
That section clearly tells you to leave it out if you don't know the type for sure.
It also says that the receiver may assume the type is application/octet-stream, but the point is that it might also be something else.
What's different then?
RFC-2046, 4.5.1. Octet-Stream Subtype
The recommended action for an implementation that receives an
"application/octet-stream" entity is to simply offer to put the data
in a file, with any Content-Transfer-Encoding undone, or perhaps to
use it as input to a user-specified process.
And, as already stated above:
RFC-7231, 3.1.1.5. Content-Type
If a Content-Type header field is not present, the recipient
MAY either assume a media type of "application/octet-stream"
([RFC2046], Section 4.5.1) or examine the data to determine its type.
Conclusion:
If you define it as "application/octet-stream", then you are saying that you know it is "application/octet-stream".
If you don't define it, then you are saying that you don't know what it is, and you leave the decision to the receiver, who could then check if it walks like a duck and...
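To make the receiver's side of that distinction concrete, here is a small sketch of a recipient following the RFC-7231 wording quoted above. The function name and the short signature table are assumptions made for illustration only; real content sniffing is far more involved.

```python
# Illustrative magic numbers only; a real sniffer knows many more.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
]


def effective_media_type(headers: dict, body: bytes) -> str:
    """Pick a media type the way a cautious recipient might."""
    # Header names are case-insensitive, so compare in lower case.
    for name, value in headers.items():
        if name.lower() == "content-type" and value.strip():
            # The sender declared a type: trust it and drop parameters.
            return value.split(";", 1)[0].strip().lower()
    # No Content-Type: the recipient MAY examine the data instead.
    for magic, media_type in SIGNATURES:
        if body.startswith(magic):
            return media_type
    # Or it MAY simply assume arbitrary binary data.
    return "application/octet-stream"
```

With this sketch, a PDF sent with an explicit application/octet-stream stays application/octet-stream, while the same bytes sent without a Content-Type header are recognised as application/pdf.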
I prefer application/unknown, but the result will surely be the same as with application/octet-stream.
I am using Microsoft Graph to fetch mail and I recently noticed when an email has a .eml attachment, there are two cases:
If the sender attaches that email by dragging emails into the composer, the attachment will be of the item attachment type. <-- I can handle this case
If the sender attaches a .eml file through clicking "attach file", that attachment will be a type of file attachment. Up to this point, I think it is fine to be a file attachment. But when I try to fetch that attachment, the attachment content type is application/octet-stream which is wrong. Shouldn't it be message/rfc822? With application/octet-stream, I cannot create that attachment from our server.
It isn't wrong, application/octet-stream simply represents a generic/unknown binary file. From RFC 2046 § 4.5.1:
The "octet-stream" subtype is used to indicate that a body contains arbitrary binary data.
Your application can make its own determination on how to handle the file. In this case, the .eml is just a text file. You can simply fetch the attachment and treat it as raw text.
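As a minimal sketch of "treat it as raw text": once you have the attachment bytes, any RFC 822 parser can read them regardless of the declared content type. The example below assumes the content arrives as a base64 string (which is how a file attachment's content is typically delivered in a JSON response; check your own payload), and parse_eml_attachment is a name invented for this example.

```python
import base64
from email import message_from_bytes, policy


def parse_eml_attachment(content_b64: str):
    """Decode a base64-encoded .eml attachment and parse it as an email."""
    raw = base64.b64decode(content_b64)
    # The declared type (application/octet-stream) is ignored here;
    # the bytes themselves are an ordinary RFC 822 message.
    return message_from_bytes(raw, policy=policy.default)
```

The returned object exposes the headers (for example msg["Subject"]) and the nested parts of the attached message, so you can re-create it as message/rfc822 on your side if you need to.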
I am working on an email client, and I wonder what is the correct algorithm for deciding whether an attachment is a regular attachment (a downloadable file like pdf, video, audio, etc...) or an inline attachment (which is just an embedded part of an HTML letter).
Until recently, I checked whether the body type (assuming the message part is not multipart, otherwise I would recursively parse it further) is not TEXT. That is, whether it's APPLICATION, IMAGE, AUDIO or VIDEO. If that's the case, I looked at whether the ninth element is equal to ATTACHMENT or INLINE. I thought that if it's INLINE, then it is an embedded HTML particle, rather than a regular attachment.
However, recently I came across an email that contained an HTML message body and regular attachments. The problem is that its body structure looked like this:
1. multipart/mixed
1.1. multipart/alternative
1.1.1. text/plain
1.1.2. multipart/related
1.1.2.1. text/html
1.1.2.2. Inline jpeg
1.1.2.3. Inline jpeg
1.2. pdf inline (why 'inline'? Should be 'attachment')
1.3. pdf inline (why 'inline'? Should be 'attachment')
The question is, why are downloadable pdf files of type INLINE? And what is the appropriate algorithm for determining whether a file is an embedded HTML particle or a downloadable file? Should I look at the parent subtype to see whether it's related or not, and disregard the inline vs attachment parameters?
There really is no defined one-size-fits-all algorithm. inline or attachment is something the sender sets, and is a hint on whether they want it to be displayed inline (automatically rendered), as an attachment (displayed in a list), or neither (no preference).
There is also what is sometimes called "embedded" attachments, which are attachments with a Content-ID (this is in the body structure response) and is referenced by a cid: reference in an <img> tag or the like.
So, this pretty much has to be done heuristically.
It really depends on your needs and your client's capabilities, but here is a list of heuristics you may consider using in some combination (some of these are mutually exclusive):
If it is marked 'attachment', treat it as an attachment.
If it is marked inline, and it is something you can treat as inline (image/*, maybe text/* if you like), then it is inline.
If it has a Content-ID, treat it inline.
If it has a Content-ID, and the HTML section references it, treat it as embedded (that is, the HTML viewer will render it); if it was not referenced, treat it as inline (or attachment) as your requirements dictate.
If it is neither, and it is something you want to treat as inline, then treat it as inline.
If nothing applies, treat it as an attachment.
Ignore the disposition, and treat it as inline if you wish (such as making all images always inline)
Also, the original version of inline only meant the sender wanted it automatically rendered; this is often conflated with referenced by the HTML section (which I've called embedded). These are not quite the same.
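One possible combination of the heuristics above can be sketched as follows. This is only one ordering among several reasonable ones; the function name, the three labels, and the choice to render only image/* inline are assumptions for the example, not a standard.

```python
# Types this hypothetical client is able to render in the message view.
INLINE_RENDERABLE = ("image/",)


def classify_part(content_type, disposition, content_id, referenced_cids):
    """Return 'attachment', 'embedded' or 'inline' for a non-text part.

    referenced_cids is the set of ids found in cid: references of the HTML body.
    """
    ctype = content_type.lower()
    disp = (disposition or "").lower()
    cid = (content_id or "").strip("<>")

    if disp == "attachment":
        return "attachment"      # the sender asked for an attachment
    if cid and cid in referenced_cids:
        return "embedded"        # the HTML viewer will render it
    if ctype.startswith(INLINE_RENDERABLE):
        return "inline"          # marked inline or unmarked, and renderable
    return "attachment"          # nothing applies, e.g. a PDF marked inline
```

With this ordering, the PDFs marked inline in the question end up as attachments, while the JPEGs referenced from the HTML part are reported as embedded.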
So I have an attachment on my incoming POP3 message,
Msg.MessageParts.Items[msgpart] as TIdAttachmentFile
but is it possible to get the content of this attached file in the format specified by Msg.MessageParts.Items[msgpart].ContentTransfer (so base64 ASCII) instead of creating a temp file, calling SaveToFile, and then re-reading the file and re-converting it back to base64?
If TIdMessage.NoDecode is set to False, then no, it is not possible. When NoDecode is False, TIdMessageClient decodes the email as it is being read off the socket and places decoded binary data into attachment objects. The only way to get the original base64 data is to set TIdMessage.NoDecode to True and parse the raw email data manually (it is stored as-is in the TIdMessage.Body) as you would effectively be disabling TIdMessageClient's entire decoding system.
On the other hand, if you just want to avoid the temp file, you can use the TIdMessage.OnCreateAttachment event to have Indy create TIdAttachmentMemory objects instead of the default TIdAttachmentFile objects. The base64 will still be auto-decoded and stored as binary in the attachment, but at least the attachment would be solely in memory so your re-encode would be faster.
Wondering if there's a way I can avoid fetching the attachments as well.
Yes, it returns the entire email source. The attachments are encoded as email parts.
You can use the Ruby Mail library to extract all the attachments.
If all you want to do is download the email body (or bodies), there isn't a clean way of doing this (at least none that I discovered).
What I had to do, was first download the headers and the body structures. Once I had the headers, I could determine what type of email it was (multi-part, alternative, or just a single bodied email).
Once I knew the structure, I could download the plain text or html body part as a body section.
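The two steps described above can be sketched with Python's imaplib roughly like this. It assumes conn is an already authenticated imaplib.IMAP4 (or IMAP4_SSL) connection with a mailbox selected; fetch_body_section and the default section "1" are placeholders for this example, and in real code the section number would be chosen from the BODYSTRUCTURE response.

```python
def fetch_body_section(conn, uid, section="1"):
    """Fetch the body structure first, then a single body section."""
    # Step 1: the structure only, no message content is transferred.
    status, structure = conn.uid("FETCH", uid, "(BODYSTRUCTURE)")
    if status != "OK":
        raise RuntimeError("BODYSTRUCTURE fetch failed")

    # Step 2: just the wanted part. BODY.PEEK avoids setting \Seen.
    status, data = conn.uid("FETCH", uid, f"(BODY.PEEK[{section}])")
    if status != "OK":
        raise RuntimeError("body section fetch failed")

    # imaplib returns literals as (envelope, payload) tuples.
    for item in data:
        if isinstance(item, tuple):
            return structure, item[1]
    return structure, None
```

The payload is still in its transfer encoding (for example base64 or quoted-printable), so it has to be decoded according to the encoding reported in the structure.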
Does that help?
--Dave
I don't know about Ruby, but it can be done by fetching only the email headers.
I am fetching the email headers in Python like this:
resp, data = obj.uid('FETCH', ','.join(map(str, uid_lst)), '(RFC822.HEADER RFC822.SIZE)')
where uid_lst is the list of UIDs of the emails you want to fetch.
Note: an email whose Content-Type header is multipart/mixed usually has an attachment.
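A minimal sketch of that check, assuming header_bytes is the RFC822.HEADER literal of one message taken from the fetch response (the function name is made up for this example). Treat the result as a hint only: multipart/mixed does not strictly guarantee an attachment, and other structures can carry one too.

```python
from email import message_from_bytes


def probably_has_attachment(header_bytes: bytes) -> bool:
    """Guess from the headers alone whether a message carries attachments."""
    msg = message_from_bytes(header_bytes)
    # get_content_type() lower-cases the value and drops parameters.
    return msg.get_content_type() == "multipart/mixed"
```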