ASCII characters in ElasticSearch with Tire - ruby-on-rails

I have a website in Brazilian Portuguese, and I'm using Elasticsearch to run our site search.
When visitors search from our site, everything works, but codebasehq reports exceptions like this: Tire::Search::SearchRequestFailed
nested: JsonParseException[Invalid UTF-8 middle byte 0x72\n at [Source: [B#42dcdefd; line: 1, column: 46]]; }]","status":500}
These errors only come from URLs whose origin I can't identify, for example:
?q=Acess%F3rios (error)
?q=Acessórios (ok)
?q=Acess%C3%B3rios (ok)
I don't know how to fix this error. I want to stop these errors from being reported in codebasehq.

The error seems to be coming from Elasticsearch, which trips on the invalid JSON received.
In general, Tire handles accented characters in searches just fine:
# encoding: UTF-8
require 'tire'
s = Tire.search do
query { string 'Žluťoučký' }
end
p s.results
You should enable the Tire logging with:
Tire.configure { logger STDERR, level: "debug" }
or with the Rails logger, to find the offending JSON, debug it, and possibly post more information here.
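As for the failing URL itself: %F3 is the ISO-8859-1 byte for ó, whereas UTF-8 encodes it as %C3%B3, which would explain the "Invalid UTF-8 middle byte" complaint. Below is a minimal sketch of normalizing the query before handing it to Tire. The helper name normalize_query is hypothetical, and treating every invalid string as ISO-8859-1 is an assumption about where those links come from.

```ruby
# Hypothetical helper: returns the search query as valid UTF-8.
def normalize_query(raw)
  query = raw.dup.force_encoding(Encoding::UTF_8)
  return query if query.valid_encoding?

  # Assumption: anything that is not valid UTF-8 was sent as ISO-8859-1.
  query.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# "Acess\xF3rios" is what ?q=Acess%F3rios looks like after percent-decoding.
puts normalize_query("Acess\xF3rios".b)
```

The result could then be passed to the string query in place of the raw parameter.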

Related

Rails Invalid request parameters: Invalid encoding for parameter with backticks

Hope someone can lend some insight here. Getting some rails errors for:
Invalid request parameters: Invalid encoding for parameter: Breaking Up Was Easy In The 90�s
This happens when I receive POST data containing backticks. An example would be a song title or artist where someone used a backtick instead of an apostrophe, like this:
{
"TITLE": "Breaking Up Was Easy In The 90`s",
"ARTIST": "Sam Hunt"
}
All my searching is coming up with fixing rails query parameters, not request parameters. Is there a middleware solution I can use to intercept this and fix it?
You can rescue from this error by rescuing StandardError.
(Since most error classes inherit from StandardError, rescuing it is sufficient here.)
rescue StandardError => e
  # Put the title string in a variable (here: title) and replace the backtick with an apostrophe
  title.gsub('`', "'")
end
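Regarding the middleware part of the question, here is a rough sketch of a Rack middleware that re-encodes a request body that is not valid UTF-8 before Rails parses it. The class name RepairBodyEncoding is hypothetical, and the guess that the client sends Windows-1252 bytes (where a curly apostrophe is a single byte) is an assumption, not something the question confirms.

```ruby
require 'stringio'

# Hypothetical Rack middleware: repairs request bodies that are not valid UTF-8.
class RepairBodyEncoding
  def initialize(app)
    @app = app
  end

  def call(env)
    input = env['rack.input']
    if input
      body = input.read.to_s.force_encoding(Encoding::UTF_8)
      unless body.valid_encoding?
        # Assumption: the client sent Windows-1252; unmappable bytes are replaced.
        body = body.force_encoding(Encoding::Windows_1252)
                   .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
      end
      # Hand the (possibly repaired) body to the rest of the stack.
      env['rack.input'] = StringIO.new(body)
      env['CONTENT_LENGTH'] = body.bytesize.to_s
    end
    @app.call(env)
  end
end
```

In a Rails app this would have to be inserted into the middleware stack ahead of parameter parsing; where exactly depends on the Rails version.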

ActionMailer in Ruby on Rails breaks when given an email address that has special characters

I'm using ActionMailer to send emails. It works when given a normal email address. Here is my code snippet:
logger = Logger.new('logfile.log')
logger.info(@user.email)
mail(
  charset: "UTF-8",
  to: @user.email,
  from: @from,
  subject: @subject
)
In the log, the email address shows up just fine, special characters included.
But when I look at my development log, I see this in the mail object (everything else is right):
To: =?UTF-8?B?w7FAw7EuY29tPg==?=
I've tried wrapping it in quotes and using a different format, like so:
("\"#{@user.name}\" <#{@user.email}>")
which translates to:
"name name" <test_ñ@yahoo.com>
No luck with these either; I just get similar gibberish:
=?UTF-8?Q?=22name_name=22_<test=5F=C3=B1@yahoo.com>?=
I also tried "test_ñ"@yahoo.com
with the same result:
=?UTF-8?Q?=22test=5F=C3=B1=22@yahoo.com>?=
What am I missing here? Is it something in the encoding config?
That's not gibberish. That's MIME encoding (RFC 2047). Everything's fine.
You can paste that line into an online MIME decoder and confirm that it decodes to To: my_user@exampl.com (email address redacted).
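The decoding can also be checked locally. Here is a minimal sketch that decodes a single RFC 2047 encoded-word using only core Ruby; the helper name decode_encoded_word is hypothetical, and it only handles input that consists of exactly one well-formed encoded-word.

```ruby
# Hypothetical helper: decodes one RFC 2047 encoded-word such as =?UTF-8?B?...?=
def decode_encoded_word(word)
  match = word.match(/\A=\?([^?]+)\?([BbQq])\?([^?]*)\?=\z/)
  return word unless match

  charset, kind, payload = match.captures
  bytes = if kind.upcase == 'B'
            payload.unpack1('m')               # Base64
          else
            payload.tr('_', ' ').unpack1('M')  # Q encoding: "_" is a space, =XX is a byte
          end
  bytes.force_encoding(charset).encode(Encoding::UTF_8)
end

puts decode_encoded_word('=?UTF-8?Q?test=5F=C3=B1?=')
```

Anything that does not look like an encoded-word is returned unchanged.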

The twitter API seems to escape ampersand, <, > but nothing else?

I wrote the following test tweet:
&“”‘’®©™¶•·§–—
Then fetched it using the 'user_timeline' api call. The following json was returned:
...
"text": "&amp;“”‘’®©™¶•·§–—",
...
It seems strange the ampersand is the only escaped symbol.
Are there any other escaped symbols? I can't find a definitive list in the docs.
Alternatively, is it possible to specify if the api should return escaped/ unescaped characters?
Edit
Test tweet:
<>=+
Returns:
...
'text': '&lt;&gt;=+'
...
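If only the ampersand and the angle brackets come back entity-escaped, as the title suggests, one possible way to get the original text back on the client side is to unescape those entities after parsing the JSON. A small sketch using Ruby's standard library; unescape_tweet is a hypothetical helper name:

```ruby
require 'cgi'

# Hypothetical helper: turns &amp;, &lt; and &gt; back into literal characters.
def unescape_tweet(text)
  CGI.unescapeHTML(text)
end

puts unescape_tweet('&lt;&gt;=+')
```

Characters that were never escaped, such as the curly quotes, pass through untouched.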

invalid byte sequence in UTF-8 with Rails flash messages

We use Rails 3 and Spree Commerce for our online shop, and our payment provider returns errors in a redirect URL when they occur. When an error occurs, we show that string to the user in a flash message.
Yesterday something went wrong, and the payment provider returned this string in the redirect URL, to be shown to the user in a flash message:
errormsg=Bitte+versuchen+Sie+es+sp%E4ter+nochmals.
I debugged a little bit, and the string looks like this when decoded (e.g. is written to flash[:error]):
Bitte versuchen Sie es sp\xE4ter nochmals.
And after that, an error is raised, when rails tries to render the flash message:
invalid byte sequence in UTF-8
Can someone tell me how to fix this? The message should contain a German ä and not \xE4. I tried adding # encoding: utf-8 at the top of the controller and the view, but that doesn't help.
Apparently, your payment provider uses ISO-8859-1 or a similar encoding to send German umlauts.
As your Rails app uses UTF-8, you can convert the provider's message:
utf_msg = params[:errormsg].force_encoding('ISO-8859-1').encode('UTF-8')
You can also check whether the resulting encoding is valid:
utf_msg.valid_encoding?
and output a different message if it is not, to avoid errors.
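An alternative is to decode the raw, still percent-encoded value with the right charset in the first place. A minimal sketch, assuming the provider always sends ISO-8859-1; decode_latin1_param is a hypothetical helper, and in a Rails app the raw value would have to be taken from the query string rather than from params:

```ruby
require 'uri'

# Hypothetical helper: decodes one raw query value and converts it to UTF-8.
# Assumption: the provider always sends ISO-8859-1.
def decode_latin1_param(raw_value)
  URI.decode_www_form_component(raw_value, Encoding::ISO_8859_1)
     .encode(Encoding::UTF_8)
end

puts decode_latin1_param('Bitte+versuchen+Sie+es+sp%E4ter+nochmals.')
```

This avoids mutating the string stored in params, which force_encoding would do.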

How to debug HTTP AUTH params in Rails?

Rubyists,
something's wrong with my HTTP AUTH params that are coming into my Rails 3 app. The password has some whitespace at the end. I was debugging my client app and it looks like it is sending it correctly.
I am doing this in my app:
params[:auth_username], params[:auth_password] = user_name_and_password(request)
Then I am sending this into Warden.
I would like to see the raw data to see if the whitespace is there. How to do that?
Edit: I have debugged the wire between httpd and thin process and I am pretty sure the data are coming correctly. It must be something wrong in my Rails 3.0.10. I was able to decode the base64 string that is coming in the headers and it did not contain any whitespace.
This really looks like BASE64 decoder issue. Maybe a padding problem. My string is:
Qmxvb21iZXJnOnRjbG1lU1JT
which decodes to
Bloomberg:tclmeSRS
correctly using non-Ruby base64 decoders. Even in Ruby:
>> Base64.decode64 "Qmxvb21iZXJnOnRjbG1lU1JT"
=> "Bloomberg:tclmeSRS"
I don't get it. Searching for a bugreport in Rails or something like that.
Edit: So it turns out our Apache httpd proxy adds something to the header:
Authorization: Basic Qmxvb21iZXJnOnRjbG1lU1JT, Basic
This leads to the incorrect characters at the end of the password, because:
>> Base64.decode64('Basic Qmxvb21iZXJnOnRjbG1lU1JT, Basic'.split(' ', 2).last || '')
=> "Bloomberg:tclmeSRS\005\253\""
The question is now - is this correct? Is it a bug in httpd or rails?
Rails user_name_and_password method makes a call to decode_credentials that performs the following, then splits using ":" :
::Base64.decode64(request.authorization.split(' ', 2).last || '')
Applied to your data :
::Base64.decode64("Qmxvb21iZXJnOnRjbG1lU1JT".split(' ', 2).last || '').split(/:/, 2)
=> ["Bloomberg", "tclmeSRS"]
Everything seems to be OK here, so in my opinion the problem lies elsewhere. To dump the authorization data from your controller:
render :text => "Authorization: #{request.authorization}"
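If the proxy cannot be fixed, one possible workaround is to decode only the first Base64 token after the Basic scheme and ignore whatever follows it. A rough sketch; basic_credentials is a hypothetical helper, not part of Rails:

```ruby
# Hypothetical helper: extracts [user, password] from an Authorization header,
# ignoring anything after the first Base64 token (such as a trailing ", Basic").
def basic_credentials(header)
  match = header.to_s.match(%r{\ABasic\s+([A-Za-z0-9+/]+=*)})
  return nil unless match

  # unpack1('m') performs the same decoding as Base64.decode64.
  match[1].unpack1('m').split(':', 2)
end

p basic_credentials('Basic Qmxvb21iZXJnOnRjbG1lU1JT, Basic')
```

With the header from the question, the trailing ", Basic" is never fed to the decoder, so no stray bytes end up in the password.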
