Data URIs: are PNG and JPEG MIME types interchangeable in modern browsers?

I have noticed that if you take a base64 string representing the raw bytes of either a JPG or a PNG, call it <B>, and send a data URI to the browser using either:
data:image/png;base64,<B>
or
data:image/jpeg;base64,<B>
all four combinations work (by "work" I mean Chrome renders them). The four combinations are:
<B> is raw png image, and the data uri uses the png type
<B> is raw png image, and the data uri uses the jpeg type (was expecting a failure!)
<B> is raw jpeg image and the data uri uses the png type (was expecting a failure!)
<B> is raw jpeg image, and the data uri uses the jpeg type
Why is this? The binary encodings of JPEG and PNG are not the same. I was expecting that if <B> was the raw bytes of a PNG, the JPEG data URI would fail to render, and vice versa.

Data URLs are described in the RFC 2397 proposed standard (The "data" URL scheme) from August 1998:
data:[<mediatype>][;base64],<data>
This document doesn't really go into implementation details such as error handling.
It's worth noting that the media type part is optional and defaults to plain text in 7-bit US-ASCII:
If <mediatype> is omitted, it defaults to text/plain;charset=US-ASCII.
Now, from the context of your question I assume you're really talking about HTML documents and <img> tags. Whether the information inside the src attribute comes from an inline data URL string or an HTTP network request is possibly secondary and I suspect that raw binary data is handled by the same routines inside the browser no matter its source.
You can emulate the same behaviour by sending arbitrary Content-Type headers with regular image files. This can be accomplished by (mis)configuring your web server, writing a download script in a server-side language or just renaming the files. In fact, you don't even need HTML: opening such a file directly in the browser behaves the same way.
In short, it all comes down to the ability of browsers to recover from error conditions, which stems from the fundamental design philosophy of the World Wide Web, a platform that evolved organically.
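To illustrate the idea, here is a minimal Python sketch of content sniffing: it compares the type declared in a data URI with the type suggested by the leading "magic" bytes of the payload. This is not the browser's actual algorithm, and the function names are made up for this example; it only shows why a mislabelled payload can still be identified.

```python
import base64

# Leading "magic" bytes that identify each format regardless of any label.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def sniff_image_type(data: bytes) -> str:
    """Guess the real image type from the first bytes of the payload."""
    if data.startswith(PNG_SIGNATURE):
        return "image/png"
    if data.startswith(JPEG_SIGNATURE):
        return "image/jpeg"
    return "application/octet-stream"


def effective_type(data_uri: str) -> tuple[str, str]:
    """Return (declared type, sniffed type) for a base64 data URI."""
    header, _, payload = data_uri.partition(",")
    # RFC 2397: an omitted media type defaults to text/plain.
    declared = header[len("data:"):].split(";")[0] or "text/plain"
    return declared, sniff_image_type(base64.b64decode(payload))


# A PNG payload (signature only, for illustration) labelled as JPEG.
mislabelled = "data:image/jpeg;base64," + base64.b64encode(PNG_SIGNATURE).decode("ascii")
```

Calling effective_type(mislabelled) reports a declared type of image/jpeg and a sniffed type of image/png, which is the mismatch described in the question.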

Control over format when using RequestImageFileAsync in Blazor WebAssembly

Blazor WebAssembly has a convenience method that converts an IBrowserFile containing an image into a resized version, which is handy for resizing large images prior to uploading them.
This method takes a format as a string which determines the format of the output file.
Is there anywhere a list of valid formats that this property will accept? Can you specify the compression or bit depth values on the resulting file?
Currently, if I take an existing .jpg file and convert it using a format string of "jpg", the resulting file, although smaller in pixel dimensions, is more than double the size on disk. A 4000x3000 image at about 2.8MB can be "reduced" to a 2000x1500 image that's 7.7MB in size, which is obviously not helping when the purpose is to reduce upload size. I could easily upload the 2.8MB file and resize it more efficiently on the server.
var imageFile = await file.RequestImageFileAsync("jpg", 2000, 2000);
This suggests I'm using the method incorrectly, but Microsoft's documentation on this method gives no clues as to what valid "Format" strings might be, only stating that it is a string type. I've tried ".jpg", "JPEG", "jpg", all of which seem to produce the same valid image file. What should I be passing here to actually reduce the file size?
See https://developer.mozilla.org/en-US/docs/Web/Media/Formats/Image_types
It's actually not "image/jpg" but "image/jpeg". If you specify a non-existent format, the fallback (at least for me) seems to be "image/png". That's why you always got a valid image but with the same file size.
I think this method uses HTML media (MIME) types.
Try "image/jpeg".
Be careful, though, this is a request to the browser, and the browser can send back whatever it wants. I believe this will work fine on all browsers, but you'd better check some of the common culprits (Hi, Opera!) to confirm.

IMAP - rule for differentiating between inline and regular attachments

I am working on an email client, and I wonder what is the correct algorithm for deciding whether an attachment is a regular attachment (a downloadable file like pdf, video, audio, etc...) or an inline attachment (which is just an embedded part of an HTML letter).
Until recently, I've checked whether the body type (assuming the message part is not multipart, otherwise I would recursively parse it further) is not TEXT. That is, whether it's APPLICATION, IMAGE, AUDIO or VIDEO. If that's the case, I looked at whether the ninth element is equal to ATTACHMENT or INLINE. I thought that if it's INLINE, then it is an embedded HTML particle, rather than a regular attachment.
However, recently I have come across an email that contained an HTML message body and regular attachments. The problem is that its body structure looked like this:
1. multipart/mixed
1.1. multipart/alternative
1.1.1. text/plain
1.1.2. multipart/related
1.1.2.1. text/html
1.1.2.2. Inline jpeg
1.1.2.3. Inline jpeg
1.2. pdf inline (why 'inline'? Should be 'attachment')
1.3. pdf inline (why 'inline'? Should be 'attachment')
The question is, why are downloadable PDF files of type INLINE? And what is the appropriate algorithm for determining whether a file is an embedded HTML particle or a downloadable file? Should I look at the parent subtype to see whether it's related or not, and disregard the inline vs attachment parameters?
There really is no defined one-size-fits-all algorithm. inline or attachment is something the sender sets, and is a hint on whether they want it to be displayed inline (automatically rendered), as an attachment (displayed in a list), or neither (no preference).
There are also what are sometimes called "embedded" attachments, which are attachments that have a Content-ID (this is in the body structure response) and are referenced by a cid: reference in an <img> tag or the like.
So, this pretty much has to be done heuristically.
It really depends on your needs and your client's capabilities, but here is a list of heuristics you may consider using in some combination (some of these are mutually exclusive):
If it is marked 'attachment', treat it as an attachment.
If it is marked inline, and it is something you can treat as inline (image/*, maybe text/* if you like), then it is inline.
If it has a Content-ID, treat it inline.
If it has a Content-ID, and the HTML section references it, treat it as embedded (that is, the HTML viewer will render it); if it was not referenced, treat it as inline (or attachment) as your requirements dictate.
If it is neither, and it is something you want to treat as inline, then treat it as inline.
If nothing applies, treat it as an attachment.
Ignore the disposition, and treat it as inline if you wish (such as making all images always inline).
Also, the original version of inline only meant the sender wanted it automatically rendered; this is often conflated with being referenced by the HTML section (which I've called embedded). These are not quite the same.
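One possible combination of these heuristics can be sketched in Python as follows. This is only an illustration under stated assumptions: classify_part and INLINE_RENDERABLE are hypothetical names, the inputs are assumed to have already been extracted from the BODYSTRUCTURE response, and referenced_cids is assumed to hold the Content-IDs found in cid: references in the HTML part.

```python
from typing import Optional, Set

# Assumption: the types this particular client is able to render inline.
INLINE_RENDERABLE = ("image/",)


def classify_part(content_type: str,
                  disposition: Optional[str],
                  content_id: Optional[str],
                  referenced_cids: Set[str]) -> str:
    """Return 'embedded', 'inline' or 'attachment' for one non-multipart part."""
    content_type = content_type.lower()
    disposition = disposition.lower() if disposition else None
    cid = content_id.strip("<>") if content_id else None

    # Referenced from the HTML via cid: so the HTML viewer renders it.
    if cid and cid in referenced_cids:
        return "embedded"
    # The sender explicitly asked for a downloadable attachment.
    if disposition == "attachment":
        return "attachment"
    # Marked inline (or not marked at all) and something we can show inline.
    if content_type.startswith(INLINE_RENDERABLE):
        return "inline"
    # Nothing else applied, e.g. a PDF that the sender marked inline.
    return "attachment"
```

With this ordering, the PDFs marked INLINE in the question end up as attachments, because the client cannot render them inline, while the JPEGs referenced by the HTML part are treated as embedded.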

Getting the raw base64 content out of an attachment in Indy?

So I have an attachment on my incoming POP3 message,
Msg.MessageParts.Items[msgpart] as TIdAttachmentFile
but is it possible to get the content of this attached file in the format specified by Msg.MessageParts.Items[msgpart].ContentTransfer (so base64 ASCII) instead of creating a temp file, calling SaveToFile, and then re-reading the file and re-converting it back to base64?
If TIdMessage.NoDecode is set to False, then no, it is not possible. When NoDecode is False, TIdMessageClient decodes the email as it is being read off the socket and places decoded binary data into attachment objects. The only way to get the original base64 data is to set TIdMessage.NoDecode to True and parse the raw email data manually (it is stored as-is in the TIdMessage.Body) as you would effectively be disabling TIdMessageClient's entire decoding system.
On the other hand, if you just want to avoid the temp file, you can use the TIdMessage.OnCreateAttachment event to have Indy create TIdAttachmentMemory objects instead of the default TIdAttachmentFile objects. The base64 will still be auto-decoded and stored as binary in the attachment, but at least the attachment would be solely in memory so your re-encode would be faster.

Display response of Dropbox API thumbnail() in Rails app

I have successfully returned a thumbnail() request (using the Dropbox SDK) in my Rails app, but I don't understand how to process the response. I would like to show the thumbnail on a webpage.
I also tried to save the response to a tmp file, but I get an UndefinedConversionError ("\xFF" from ASCII-8BIT to UTF-8).
I'm actually doing exactly what you are asking for. What I did was to convert the returned bytes into a base64 string. In C# it's quite easy as there is a convert function to do that.
On the webpage you then have to set the src attribute of an img tag to
<img src="data:image/jpeg;base64,PlaceBase64StringHere"...../>
There is some overhead in the converted string (base64 output is about a third larger than the raw bytes), but it's very easy to handle and you use the power of the client browser to render the image.
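The same conversion can be sketched in a few lines of Python, used here purely for illustration since the question is about Rails. The helper name to_data_uri and the sample bytes are made up, and the JPEG media type is an assumption about what the thumbnail request returns.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URI suitable for an <img> src."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Stand-in for the bytes returned by the thumbnail request.
thumbnail = b"\xff\xd8\xff\xe0"
uri = to_data_uri(thumbnail)
```

The resulting string goes straight into the src attribute, so the binary data never has to pass through a text-encoded temp file.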

Translate binary characters to a human readable string?

So let's say we have a string that is like this:
‰û]M§Äq¸ºþe Ø·¦ŸßÛµÖ˜eÆÈym™ÎB+KºªXv©+Å+óS—¶ê'å‚4ŒBFJF󒉚Ү}Fó†ŽxöÒ&‹¢ T†^¤( OêIº ò|<)ð
How do I turn it into a human-readable string of characters? It was weird output from a web server that I think is text, because half the web page loaded correctly. Do I need to read it with C or Python or something? That's only a snippet of the string.
If that is in fact supposed to be a human-readable string, you'll need to figure out what character encoding it uses and translate. It's also possible that the string is compressed, encrypted, or represents binary data. It would be helpful to know where you got your string from.
I'm guessing your web server isn't sending the correct MIME type. I'd suggest taking a look at the HTTP headers using Firefox's Live Headers plugin. If a web server decides to send you a PDF but doesn't set the MIME type, you'll just see garbage on your screen. Alternatively, save the page to a file, and then run these commands from Cygwin or a Unix shell:
file mypage.htm
strings mypage.htm
The first will tell you if the header bytes follow any recognizable pattern. The second will strip out and display all the human readable text.
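If you would rather do this from Python, the following is a rough sketch of what strings does: it pulls out runs of printable ASCII from binary data. The function name and the sample bytes are invented for this example, and the minimum run length of 4 mirrors the usual strings default.

```python
import re


def printable_runs(data: bytes, min_length: int = 4) -> list[bytes]:
    """Return runs of printable ASCII at least min_length bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return re.findall(pattern, data)


# Made-up sample: binary noise with two readable fragments in it.
sample = b"\x89\xfb]M\xa7Hello, world\x00\x01ab\xffPDF-1.4"
runs = printable_runs(sample)
```

Here runs contains the two readable fragments, b"Hello, world" and b"PDF-1.4"; the shorter runs are dropped as noise.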
