I'm parsing a webpage with Nokogiri and iterating through CSS selectors until I find the one I'm looking for. Then I run a regex to match only the JavaScript portion and try to parse it with JSON.parse, but that returns an "unexpected token" error at ',{ ... If I run puts on the matched data, it shows it without the prepended comma, but the error occurs when I run JSON.parse:
JSON::ParserError: 822: unexpected token at ',{"skuAttr":"200007763:201336106;491:200004763#145cm","skuPropIds":"
file = File.open('product.html')
doc = Nokogiri::HTML.parse(file)
doc.css("script").each do |page|
  if page.text =~ /skuProducts/
    skudata = page.text[/var skuProducts=\[(.+?)\];/, 1]
    puts skudata
    parsed = JSON.load(skudata)
  end
end
If you're consistently seeing that comma prefixed and the rest of the JSON string looks valid... then why not just remove that leading comma and then try the JSON parse?
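A possible explanation for the comma, sketched under assumptions: the regex in the question captures only what sits between the square brackets, so the captured text is a run of objects like `{...},{...}` with no enclosing array. The parser reads the first object and then stops at the `,{` that follows it, which would also explain why puts shows no leading comma. The `script_text` value below is a hypothetical stand-in for the real script contents, not data from the page.

```ruby
require 'json'

# Hypothetical stand-in for the text of the matching <script> tag.
script_text = 'var skuProducts=[{"skuAttr":"a","skuPropIds":"1"},' \
              '{"skuAttr":"b","skuPropIds":"2"}];'

# The question's regex: the capture group excludes the brackets,
# so the result is "{...},{...}", which is not a single JSON value.
inner = script_text[/var skuProducts=\[(.+?)\];/, 1]

# Moving the brackets inside the capture group keeps the array intact.
array_json = script_text[/var skuProducts=(\[.+?\]);/m, 1]
skus = JSON.parse(array_json)

puts skus.length
puts skus[1]["skuAttr"]
```

Wrapping the original capture as `JSON.parse("[#{inner}]")` should work just as well; either way the parser is handed one complete array.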
In my Rails app, when I add %dd or %ff to a URL parameter, why does it return "invalid byte sequence in UTF-8"?
I have a regex, ^[a-zA-Z0-9_]+$, to check that a string contains only letters, numbers, and underscores. When I add %dd or %ff to my URL parameter, it returns an "invalid byte sequence in UTF-8" error.
What do %dd and %ff mean?
UPDATE:
My controller:
def search
  alpha_num_under_regex = '^[a-zA-Z0-9_]+$'
  @search = params[:search]
  unless @search.match(alpha_num_under_regex).nil?
    @users = User.find_by_name(@search)
    render 'api/v1/users/show', status: 200, formats: :json
  else
    @users = []
    render 'api/v1/users/show', status: 422, formats: :json
  end
end
My URL:
localhost:3000/api/v1/users/show?search=%dd
When the parameter is search=%d, it returns Bad Request, which is fine. But when I add another d (search=%dd or search=a%dd), it returns "Action Controller: Exception caught - invalid byte sequence in UTF-8".
The question is, how can I get past the "invalid byte sequence in UTF-8" error?
From Wiki:
Percent-encoding, also known as URL encoding, is a mechanism for encoding information in a Uniform Resource Identifier (URI) under certain circumstances. Although it is known as URL encoding it is, in fact, used more generally within the main Uniform Resource Identifier (URI) set, which includes both Uniform Resource Locator (URL) and Uniform Resource Name (URN). As such, it is also used in the preparation of data of the application/x-www-form-urlencoded media type, as is often used in the submission of HTML form data in HTTP requests.
According to the above, the query search=%dd is interpreted as search=<BYTE_WITH_ORD_VALUE_0xDD>. Ruby expects this string to be UTF-8, but a lone 0xDD byte is not valid UTF-8: it is the lead byte of a two-byte sequence and has no continuation byte after it. (0xFF never appears in valid UTF-8 at all.)
To avoid this problem and pass what was intended, URL-escape the search query explicitly by substituting % ⇒ %25 (the latter is the percent-encoded percent sign itself):
localhost:3000/api/v1/users/show?search=%25dd
The above will send the query %dd to Rails.
NB: to be safe, build URL queries according to the common rule specified in the article quoted above:
[List of reserved characters]
Other characters in a URI must be percent encoded.
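To make the decoding step concrete, here is a minimal sketch using Ruby's standard URI module. It shows that %dd decodes to a single byte that is not valid UTF-8, and that escaping the percent sign first lets the literal text through. The `safe_search_term` helper is hypothetical (it is not part of the controller in the question) and only illustrates one way to reject bad input before the regex runs.

```ruby
require 'uri'

# "%dd" decodes to the single byte 0xDD, tagged as UTF-8.
decoded = URI.decode_www_form_component("%dd")
puts decoded.bytes.inspect     # the raw byte values
puts decoded.valid_encoding?   # false: a lone 0xDD is not valid UTF-8

# Escaping the percent sign first lets the literal text "%dd" through.
escaped = URI.encode_www_form_component("%dd")
puts escaped                                  # %25dd
puts URI.decode_www_form_component(escaped)   # %dd

# Hypothetical guard: check the encoding before matching, because
# matching a regex against an invalidly encoded string raises.
def safe_search_term(raw)
  return nil if raw.nil? || !raw.valid_encoding?
  raw =~ /\A[a-zA-Z0-9_]+\z/ ? raw : nil
end

puts safe_search_term(decoded).inspect
puts safe_search_term("abc_1").inspect
```

With a guard like this, a controller could respond with 422 for malformed input instead of raising, though where to put the check is a design choice the original answer does not cover.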
I'm using HTTParty to send data to a remote API, however the API is complaining because the JSON being sent by HTTParty appears to be being escaped, and is thus deemed invalid.
Here's what I'm doing:
query = {"count"=>1,
"workspaces"=>
{123445=>
{"title"=>"Test Project",
"description"=>"",
"start_date"=>"2015-06-01T00:00:00.000Z",
"due_date"=>"2015-08-31T00:00:00.000Z",
"price_in_cents"=>8000,
"currency"=>"USD",
"status_key"=>130,
"custom_field_values_attributes"=>[],
"workspace_groups_attributes"=>
[{"created_at"=>"2015-07-13T11:06:36-07:00",
"updated_at"=>"2015-07-13T11:06:36-07:00",
"name"=>"Test Customer",
"company"=>true,
"contact_name"=>nil,
"email"=>nil,
"phone_number"=>nil,
"address"=>nil,
"website"=>nil,
"notes"=>nil,
"id"=>"530947",
"custom_field_values_attributes"=>[]}],
"id"=>123445}},
"results"=>[{"key"=>"workspaces", "id"=>123445}]}
Calling to_json on query escapes the JSON too:
"{\"count\":1,\"workspaces\":{\"123445\":{\"title\":\"Test Project\",\"description\":\"\",\"start_date\":\"2015-06-01T00:00:00.000Z\",\"due_date\":\"2015-08-31T00:00:00.000Z\",\"price_in_cents\":8000,\"currency\":\"USD\",\"status_key\":130,\"custom_field_values_attributes\":[],\"workspace_groups_attributes\":[{\"created_at\":\"2015-07-13T11:06:36-07:00\",\"updated_at\":\"2015-07-13T11:06:36-07:00\",\"name\":\"Test Customer\",\"company\":true,\"contact_name\":null,\"email\":null,\"phone_number\":null,\"address\":null,\"website\":null,\"notes\":null,\"id\":\"530947\",\"custom_field_values_attributes\":[]}],\"id\":123445}},\"results\":[{\"key\":\"workspaces\",\"id\":123445}]}"
Is this expected behavior to escape the JSON? Or I'm wondering if the hash I'm building for query is invalid for JSON purposes?
Any help would be greatly appreciated.
Calling to_json on query doesn't yield escaped JSON.
Try puts query.to_json to see that.
You see backslashes because the console displays values using String#inspect, which encloses the string in double quotes and therefore has to escape the quotes inside the string itself.
Your problem is probably a missing Content-Type header. You should do something like this:
result = HTTParty.post(url, body: query.to_json, headers: {'Content-Type' => 'application/json'})
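The difference between the string itself and its inspected form can be checked with a small sketch. The `query` hash here is a shortened, hypothetical version of the one in the question.

```ruby
require 'json'

# Shortened stand-in for the query hash from the question.
query = { "count" => 1, "title" => "Test Project" }
json = query.to_json

puts json           # the raw JSON text, with no backslashes
puts json.inspect   # the quoted, escaped form that the console displays
```

The backslashes exist only in the second form; the bytes sent as the request body are those of the first.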
data = {
"CEO": "William Hummel",
"CFO": "Carla Work"
}
I'm trying to parse the JSON data above with JSON.parse(data) in IRB, but it won't work.
I'm getting the following error: "SyntaxError: (irb):44: syntax error, unexpected ':', expecting tASSOC"
JSON.parse takes a string argument. You are trying to construct a Hash using JSON syntax. Use a string instead:
data = '{"CEO": "William Hummel", "CFO": "Carla Work"}'
JSON.parse(data)
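As a short follow-up sketch, the parsed result is an ordinary Hash with string keys. The `symbolize_names` option of JSON.parse returns symbol keys instead, if those are preferred; the variable names below are illustrative only.

```ruby
require 'json'

data = '{"CEO": "William Hummel", "CFO": "Carla Work"}'

# Default behaviour: keys come back as strings.
officers = JSON.parse(data)
puts officers["CEO"]

# With symbolize_names, keys come back as symbols.
officers_sym = JSON.parse(data, symbolize_names: true)
puts officers_sym[:CFO]
```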
I saved a file named array.json in my Dropbox folder and I access it via the Dropbox API. Everything works fine, but when I retrieve the JSON content I cannot JSON.parse that string!
session = DropboxSession.new(APP_KEY, APP_SECRET)
session.set_access_token(ACCESS_TOKEN_KEY, ACCESS_TOKEN_SECRET)
client = DropboxClient.new(session, ACCESS_TYPE)
json = client.get_file(DIRECTORY + '/array.json')
@json = JSON.parse json
Error:
743: unexpected token at '{"Nome" : "Mario Rossi",
"C.F." : "ABCDEFGHILMNOP",
"Booking Assistance" : "MARIO",
"Status of reservation" : "25/11/2011"}'
The JSON string is valid! If I copy this string and paste it manually as the argument to JSON.parse(), it is parsed correctly. So I think it is an encoding problem, but where am I going wrong?
We have abandoned the JSON parsing backend that is the default in Rails. The default backend is YAML-based and, IMO, a useless mess. After several gotchas parsing Unicode, and dates in some cases, we discovered that the backend can be replaced via configuration.
You can substitute the parsing backend in an initializer:
ActiveSupport::JSON.backend = "JSONGem"
There are several gems that can be used as the backend; we just use the json gem:
gem 'json'
I am using Ruby on Rails 3 and I would like to solve an issue with the following code, where a web client application receives JSON data from a web service application that uses a Rack middleware to respond.
In the web client app model I have
response_parsed = JSON.parse(response.body)
if response_parsed["account"]
...
else
return response
end
In the above code, response.body comes back from the web service app, which uses a Rack middleware to respond to the web client:
accounts = Account.where(:id => ids)
[200, {'Content-Type' => 'application/json'}, accounts.to_json] # That is, response.body = accounts.to_json
Data transmission is ok, but I get the following error
TypeError
can't convert String into Integer
*Application Trace*
lib/accounts.rb:107:in `[]'
The line 107 corresponds to
if response_parsed["account"]
...
Where and what is the problem? How to solve that?
If I try to debug response.body I get
# Note: this is an array!
"[{\"account\":{\"firstname\":\"Semio\",\"lastname\":\"Iaven\"}}]"
If I'm saying something you already realize, forgive me.
It looks like your response is a one-element array with a hash as its first element. Because the response is an array, [] expects an integer index for the item you'd like to access, and that is what the error message means: it expected the integer position of the item you wanted, but you gave it a string instead.
If you instead do:
response_parsed[0]['account']
It seems like you'd get what you want.
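The behaviour described above can be reproduced with a small sketch. The `body` string is a cleaned-up copy of the debugged response from the question; the rest is illustrative and not taken from the original application.

```ruby
require 'json'

# Stand-in for response.body: a one-element array wrapping the account hash.
body = '[{"account":{"firstname":"Semio","lastname":"Iaven"}}]'
response_parsed = JSON.parse(body)

# Indexing an Array with a String raises TypeError, as in the question.
begin
  response_parsed["account"]
rescue TypeError => e
  puts "TypeError: #{e.message}"
end

# Take the first element before looking up the key.
first = response_parsed.first
account = first && first["account"]
puts account["firstname"] if account
```

Using `first` with a nil check rather than a bare `[0]` keeps the lookup from raising when the service returns an empty array.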