My Rails application allows users to reply to certain emails in order to post a comment to the app. When an email is replied to, the entire body, including the original email, gets included. Some clients put a line between the original message and the message details and some don't. I was wondering if there is a standard way to parse the email and not include the message details in the post body.
Thanks!
Ideally, all mail is composed according to the RFC standards so that every mail client and webmail interface can parse it correctly. There are many RFCs that cover this.
MIME is specified in six linked RFC memoranda: RFC 2045, RFC 2046, RFC 2047, RFC 4288, RFC 4289 and RFC 2049, which together define the specifications.
reference: http://en.wikipedia.org/wiki/MIME
But that is the ideal scenario; some clients append the original message in different formats. I am not aware of any ready-to-use software, but one approach, which even rediffmail/yahoo use, is to parse the mail body and search for different patterns like
---------- Forwarded message ----------
From: xyza@gmail.com
Date: 17 Feb 2013 18:50
Subject: Re: Openings
To: abc@rediffmail.com
or
From: xyz [xyz@tcs.com]
Sent: Wednesday, February 20, 2013 4:18 PM
To: Anshul
Subject: RE: Hi
Getting the desired output comes down to regex checks and pattern matching against markers like these.
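As a rough illustration of that pattern-matching approach, here is a minimal Ruby sketch. The constant QUOTE_MARKERS and the method strip_quoted_reply are hypothetical names, and the marker list is an assumption: only the "Forwarded message" and "From:" patterns come from the examples above, while the "Original Message", "On ... wrote:" and ">" patterns are added guesses about other clients. It keeps everything before the first line that looks like the start of quoted material.

```ruby
# Patterns that commonly mark the start of quoted or forwarded material.
# Illustrative only: real clients vary, so this list is an assumption.
QUOTE_MARKERS = [
  /^-+\s*Forwarded message\s*-+\s*$/i,
  /^-+\s*Original Message\s*-+\s*$/i,
  /^On .+ wrote:\s*$/i,
  /^From:\s.+$/i,
  /^>/
].freeze

# Returns the part of +body+ that precedes the first quote marker.
def strip_quoted_reply(body)
  kept = []
  body.each_line do |line|
    break if QUOTE_MARKERS.any? { |marker| line.chomp =~ marker }
    kept << line
  end
  kept.join.rstrip
end

reply = "Thanks, that works for me.\n\n" \
        "From: xyz [xyz@tcs.com]\n" \
        "Sent: Wednesday, February 20, 2013 4:18 PM\n" \
        "Subject: RE: Hi\n"
puts strip_quoted_reply(reply)
```

This is a heuristic: a reply that itself contains a line starting with "From:" would be cut short, so the patterns should be tuned against real mail from the clients you care about.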
Recently I started learning web scraping. For this purpose I need to focus on URLs and their basic structure. I considered two URLs, from Amazon and Priceline, as a homework exercise.
Some basic URL concepts:
A query string comes at the end of a URL, starting with a single question mark, “?”.
Parameters are provided as key-value pairs and separated by an ampersand, “&”.
The key and value are separated using an equals sign, “=”.
Most web frameworks allow us to define “nice-looking” URLs that just include the parameters in the path of the URL.
Amazon URL
https://www.amazon.com/books-used-books-textbooks/b/?ie=UTF8&node=283155&ref_=nav_cs_books_788dc1d04dfe44a2b3249e7a7c245230
As per my understanding:
Parameters
ie=UTF8
node=283155
ref_=nav_cs_books_788dc1d04dfe44a2b3249e7a7c245230
Key Values
ie UTF8
node 283155
ref_ nav_cs_books_788dc1d04dfe44a2b3249e7a7c245230
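The manual breakdown above can be checked with Ruby's standard uri library. This is a small sketch, not the only way to do it: URI.parse splits the URL into its components and URI.decode_www_form turns the query string into key-value pairs. The variable names are illustrative.

```ruby
require 'uri'

url = 'https://www.amazon.com/books-used-books-textbooks/b/' \
      '?ie=UTF8&node=283155&ref_=nav_cs_books_788dc1d04dfe44a2b3249e7a7c245230'

uri = URI.parse(url)

# decode_www_form returns [[key, value], ...]; to_h makes it a Hash.
params = URI.decode_www_form(uri.query).to_h

puts uri.host   # the domain part
puts uri.path   # everything between the host and the "?"
params.each { |key, value| puts "#{key} => #{value}" }
```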
Priceline URL
https://www.priceline.com/relax/in/3000005381/from/20210310/to/20210317/rooms/1?vrid=16e829a6d7ee5b5538fe51bb7e6925e8
This url is based on the hotel booking in Chicago from 03/10/2021 to 03/17/2021.
As per my understanding:
key values
from 20210310 (2021-03-10)
to 20210317 (2021-03-17)
rooms 1
I did not find anything more than that. I just want to make sure I am not missing something. Can these URLs be analyzed more precisely?
Tips that may help are:
Data can be sent via GET or POST. What you are describing with URLs is GET, where the parameters appear in the URL itself. With POST, the data travels in the request body, so you don't see it in the URL.
In both cases getting familiar with using your browser's developer console will help you explore how websites work. In Chrome, you can hit F12 or right click any element and select "inspect element." This is especially helpful when trying to inspect data that is passed using POST, since you can't see them in the url. Use the "network" tab while clicking around to see what the website is doing in the background.
Lastly, just play around with websites. For example, when you browse Amazon you might notice the URLs look like https://www.amazon.com/Avalon-Organics-Creme-Radiant-Renewal/dp/B082G172GL/?_encoding=UTF8 but if you experiment you will notice you can delete the title and the URL still works, like this: https://www.amazon.com/dp/B082G172GL
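Returning to the Priceline URL from the question, where the parameters live in the path rather than the query string, a sketch like the following pairs up the path segments. The helper path_pairs is hypothetical, and treating the first segment ("relax") as a route name to skip is an assumption; the meaning of the "in" value is not stated anywhere in the question.

```ruby
require 'uri'
require 'date'

# Splits a URL path into segments and pairs them up as key/value,
# skipping the first +skip+ segments (assumed to be a route name).
def path_pairs(url, skip: 1)
  segments = URI.parse(url).path.split('/').reject(&:empty?)
  segments.drop(skip)
          .each_slice(2)
          .select { |pair| pair.size == 2 } # ignore a dangling last segment
          .to_h
end

url = 'https://www.priceline.com/relax/in/3000005381/from/20210310/' \
      'to/20210317/rooms/1?vrid=16e829a6d7ee5b5538fe51bb7e6925e8'

pairs = path_pairs(url)
check_in  = Date.strptime(pairs['from'], '%Y%m%d')
check_out = Date.strptime(pairs['to'], '%Y%m%d')

puts pairs.inspect
puts "#{check_in.iso8601} to #{check_out.iso8601}"
```

The vrid value after the "?" is still an ordinary query parameter and can be read with URI.decode_www_form.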
This is a more general question: I am trying to understand why there appear to be 3 different permutations of essentially the same key-value pair in the vanilla Twilio API response, either when a message resource is returned or when it is sent via webhook to a third-party application.
Here is the abridged JSON response returned to a requesting client:
//mock message id values
{ "SmsMessageSid": "MM2fb7744c9a752cb554b4a6371c6756d8",
//... abridged
"SmsSid": "MM2fb7744c9a752cb554b4a6371c6756d8",
//...abridged
"MessageSid": "MM2fb7744c9a752cb554b4a6371c6756d8"
}
Google surfaces a support doc on MessageSid, but nothing clear turns up on how the other two properties are defined in the data dictionary.
My questions are:
Why are there 3 keys with the same value?
Are there any instances in which those values would be different for any of the keys?
Which key-value should I persist if I want to save the id for this specific message?
This may help:
SMS Request Parameters
MessageSid A 34 character unique identifier for the message. May be used to later retrieve this message from the REST API.
SmsSid Same value as MessageSid. Deprecated and included for backward compatibility.
Use MessageSid
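In a webhook handler, that advice could look like the sketch below. The method name message_sid and the params hash are hypothetical, and the fallback to the older keys is an assumption added for defensiveness, not something the documentation quoted above requires.

```ruby
# Picks the identifier to persist from a Twilio webhook payload.
# MessageSid is preferred; the fallbacks are a defensive assumption.
def message_sid(params)
  params['MessageSid'] || params['SmsSid'] || params['SmsMessageSid']
end

payload = {
  'SmsMessageSid' => 'MM2fb7744c9a752cb554b4a6371c6756d8',
  'SmsSid'        => 'MM2fb7744c9a752cb554b4a6371c6756d8',
  'MessageSid'    => 'MM2fb7744c9a752cb554b4a6371c6756d8'
}

puts message_sid(payload)
```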
Twilio Developer Evangelist here. Alan's answer is correct. We ended up writing a blog post based on this question! You can check it out here: https://www.twilio.com/blog/programmable-messaging-sids
I'm currently building a Ruby SDK for the Graph API.
I'm working with delta queries on the message resource endpoints, specifically list-messages.
I need to specify two preferences utilizing the Prefer header(s):
allow unsafe HTML - "outlook.allow-unsafe-html"
maximum items per page/request - "odata.maxpagesize={num}"
There aren't any examples in the docs showing how this can be accomplished. I'm not sure whether they need to be concatenated into a single value or whether to specify multiple HTTP headers (or if this is even supported). Clarification here would be super helpful.
According to RFC 7240:
A client MAY use multiple instances of the Prefer header field in a single message, or it MAY use a single Prefer header field with multiple comma-separated preference tokens. If multiple Prefer header fields are used, it is equivalent to a single Prefer header field with the comma-separated concatenation of all of the tokens.
So you can use multiple Prefer header fields defining distinct preferences:
POST /foo HTTP/1.1
Host: example.org
Prefer: respond-async, wait=100
Prefer: handling=lenient
Date: Tue, 20 Dec 2011 12:34:56 GMT
Or you may use a single Prefer header field with a comma-separated list of values:
POST /foo HTTP/1.1
Host: example.org
Prefer: handling=lenient, wait=100, respond-async
Date: Tue, 20 Dec 2011 12:34:56 GMT
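Applied to the two preferences from the question, both forms can be built with Ruby's standard Net::HTTP. This is a sketch that only constructs the request objects and sends nothing; the endpoint URL and the page size of 50 are illustrative assumptions.

```ruby
require 'net/http'
require 'uri'

# Illustrative endpoint and page size; adjust to the resource you query.
uri = URI.parse('https://graph.microsoft.com/v1.0/me/mailFolders/inbox/messages/delta')
preferences = ['outlook.allow-unsafe-html', 'odata.maxpagesize=50']

# Form 1: a single Prefer field with comma-separated tokens.
single = Net::HTTP::Get.new(uri)
single['Prefer'] = preferences.join(', ')

# Form 2: one Prefer field value per preference.
multi = Net::HTTP::Get.new(uri)
preferences.each { |preference| multi.add_field('Prefer', preference) }

puts single['Prefer']
puts multi.get_fields('Prefer').inspect
```

Reading multi['Prefer'] back gives the same comma-joined string as the first form, which mirrors the equivalence the RFC describes.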
I'm getting a body hash mismatch when the POST body of an XML web service request contains international characters.
From what I've read, it sounds like international characters in a POST body have to be encoded before calculating the OAuth body hash. Encoding CAFÉ as UTF-8, "CAF%c3%89", doesn't seem to work with the MasterCard Match web service. I'm having trouble with the tool we're using (iWay Service Manager) re-interpreting "CAFÉ" back to "CAFÉ". Before I figure out how to squeeze an encoder in before the OAuth step, I was hoping to confirm with someone who has dealt with this issue. What is the proper encoding to use on a POST body with international characters (or is my problem likely to be something else)?
For calculating the MasterCard OAuth body hash, the recommended encoding is UTF-8. The Core SDK made available by MasterCard also uses a UTF-8 encoder when computing oauth_body_hash.
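To see why the byte encoding matters, here is a minimal sketch that hashes the same text in three byte representations. It assumes the SHA-1 plus Base64 construction from the OAuth request body hash extension; the digest your MasterCard integration actually expects is an assumption to verify against their documentation. The helper name oauth_body_hash and the sample XML are hypothetical.

```ruby
require 'digest/sha1'

# Hashes the raw bytes of the request body exactly as given.
# SHA-1 + Base64 is an assumption taken from the OAuth body hash extension.
def oauth_body_hash(body_bytes)
  Digest::SHA1.base64digest(body_bytes)
end

body = "<Name>CAF\u00C9</Name>" # "CAFÉ" inside a hypothetical XML element

utf8_hash    = oauth_body_hash(body.encode('UTF-8'))
latin1_hash  = oauth_body_hash(body.encode('ISO-8859-1'))
percent_hash = oauth_body_hash('<Name>CAF%c3%89</Name>')

puts "UTF-8 bytes:     #{utf8_hash}"
puts "ISO-8859-1:      #{latin1_hash}"
puts "percent-encoded: #{percent_hash}"
```

The three hashes differ, so the hash must be computed over the same UTF-8 bytes that are actually sent on the wire. Percent-encoding the characters produces a different body altogether, and a tool that re-encodes the body after hashing would produce a mismatch.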
I have a website written in Ruby 1.8.5 and Rails 1.2.6.
There is a feedback page.
I have a model class:
class Feedback::Notify < ActionMailer::Base
  # Notifies managers that a question has been answered.
  def answer_for_managers(question)
    recipients "test@test.com"
    from "feedback@test.com"
    subject "Обратная связь: ответ на вопрос"
    body "question" => question
    content_type "text/html"
  end
end
Then I have a controller:
class Feedback::QuestionController < Office::BaseController
  def update
    Feedback::Notify.deliver_answer_for_managers(@question)
  end
end
The problem is when a message is sent its subject looks like: =?utf-8?Q?=d0=9e=d0=b1=d1=80=d0=b0=d1=82=d0=bd=d0=b0=d1=8f_=d1=81=d0=b2=d1=8f=d0=b7=d1=8c=3a_=d0=a1=d0=be=d1=82=d1=80=d1=83=d0=b4=d0=bd=d0=b8=d0=ba_=d0=be=d1=82=d0=b2=d0=b5=d1=82=d0=b8=d0=bb_=d0=bd=d0=b0_=d0=b2=d0=be=d0=bf=d1=80=d0=be=d1=81_=d0=ba=d0=bb=d0=b8=d0=b5=d0=bd=d1=82=d0=b0_=23=35=36_=d0=be=d1=82_=32=36=2e=30=38=2e=32=30=31=31_=31=31=3a=33=33?=
so it's URL-encoded.
Is there any way to prevent the subject text from being converted to URL encoding?
All files are in UTF-8 encoding.
If you put unescaped UTF-8 characters into header fields, you would violate the respective standards, RFC 822 and RFC 5322, which state that header fields can only be composed of (7-bit) ASCII characters.
Thus, ActionMailer does the right thing here and escapes the UTF-8 characters. As nothing in the headers states that another encoding is to be used, the recipient (and all intermediate servers) have no choice but to follow that standard, as they have no other clue which encoding might have been used.
As RFC 822 is rather old (but still authoritative for email), UTF-8 simply didn't exist when it was specified. The escaping is a workaround specified by RFC 2047, which describes exactly what you see in the header: this is not URL encoding but an RFC 2047 encoded-word. MUAs are expected to unescape the text and display the proper glyphs on rendering.
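To illustrate what a mail client does with such a header, here is a minimal Ruby sketch that decodes RFC 2047 encoded-words. The names ENCODED_WORD and decode_encoded_words are hypothetical, and the sketch is simplified: it does not drop the whitespace between adjacent encoded-words as the RFC asks. The sample input is the first part of the subject from the question.

```ruby
# Matches =?charset?Q?payload?= or =?charset?B?payload?=
ENCODED_WORD = /=\?([^?\s]+)\?([QqBb])\?([^?\s]*)\?=/

# Replaces every encoded-word in +header+ with its decoded UTF-8 text.
def decode_encoded_words(header)
  header.gsub(ENCODED_WORD) do
    charset, scheme, payload = Regexp.last_match.captures
    bytes =
      if scheme.casecmp?('Q')
        # Q encoding: "_" means space, "=XX" is a hex-encoded byte.
        payload.b.tr('_', ' ').gsub(/=(\h\h)/) { Regexp.last_match(1).hex.chr }
      else
        # B encoding is plain Base64.
        payload.unpack1('m')
      end
    bytes.force_encoding(charset).encode('UTF-8')
  end
end

raw = '=?utf-8?Q?=d0=9e=d0=b1=d1=80=d0=b0=d1=82=d0=bd=d0=b0=d1=8f_' \
      '=d1=81=d0=b2=d1=8f=d0=b7=d1=8c=3a?='
puts decode_encoded_words(raw)
```

Decoding this prefix yields "Обратная связь:", which shows that the header carries the original text intact and only looks garbled when viewed raw.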
Note that it is entirely possible to send Unicode text inside the message body (most of the time inside a MIME container). There it is possible to specify the actual data encoding and the transport encoding using additional headers. See RFC 2045 ff. for more details.
Please read either the RFCs or have a look at the Wikipedia entry on Unicode and e-mail.
I solved my problem by adding default 'Content-Transfer-Encoding' => '7bit' in my ActionMailer.
Have a look at the API docs.