I have a third-party JSON feed which is huge, with lots of data. E.g.:
{
"data": [{
"name": "ABC",
"price": "2.50"
},
...
]
}
I am required to strip the quotation marks from the price as the consumer of the JSON feed requires it in this way.
To do this I am performing a regex to find the prices and then iterating over the prices and doing a string replace using gsub. This is how I am doing it:
price_strings = json.scan(/(?:"price":")(.*?)(?:")/).uniq
price_strings.each do |price|
json.gsub!("\"#{price.reduce}\"", price.reduce)
end
json
The main bottleneck appears to be in the each block. Is there a better way of doing this?
If this JSON string is going to be parsed into a Hash at some point, either in your application or in a third-party dependency of your code (i.e. consumed by your colleagues or other modules), I suggest negotiating with them to convert the price value from String to Numeric on demand once the JSON is already a Hash. This is more efficient, and it allows them to handle edge cases such as "price": "", for which my code below will not work: it would remove the "" and leave a JSON syntax error.
However, if you do not have control over this, or are doing a one-off mutation of the whole JSON data, you can try the code below:
json =
<<-eos
{
"data": [{
"name": "ABC",
"price": "2.50",
"somethingsomething": {
"data": [{
"name": "DEF",
"price": "3.25", "someprop1": "hello",
"someprop2": "world"
}]
},
"somethinggggg": {
"price": "123.45" },
"something2222": {
"price": 9.876, "heeeello": "world"
}
}]
}
eos
new_json = json.gsub(/("price":.*?)"(.*?)"(.*?,|})/, '\1\2\3')
puts new_json
# =>
# {
# "data": [{
# "name": "ABC",
# "price": 2.50,
# "somethingsomething": {
# "data": [{
# "name": "DEF",
# "price": 3.25, "someprop1": "hello",
# "someprop2": "world"
# }]
# },
# "somethinggggg": {
# "price": 123.45 },
# "something2222": {
# "price": 9.876, "heeeello": "world"
# }
# }]
# }
DISCLAIMER: I am not a Regexp expert.
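The parse-then-convert alternative mentioned at the top of this answer can be sketched as follows. This is only a minimal sketch: the helper name numeric_prices and the sample feed are made up for illustration, and it assumes the feed is small enough to parse in memory. Strings that do not look like a plain decimal (such as "") are left untouched, so the output stays valid JSON.

```ruby
require 'json'

# Hypothetical helper: walk parsed JSON and convert string values stored
# under "price" into numbers, leaving everything else as it was.
def numeric_prices(node)
  case node
  when Hash
    node.each_with_object({}) do |(key, value), out|
      out[key] = if key == 'price' && value.is_a?(String) && value.match?(/\A-?\d+(\.\d+)?\z/)
                   Float(value)
                 else
                   numeric_prices(value)
                 end
    end
  when Array
    node.map { |item| numeric_prices(item) }
  else
    node
  end
end

# Made-up sample feed, including the empty-string edge case.
feed = '{"data":[{"name":"ABC","price":"2.50"},{"name":"DEF","price":""}]}'
converted = JSON.generate(numeric_prices(JSON.parse(feed)))
puts converted
# => {"data":[{"name":"ABC","price":2.5},{"name":"DEF","price":""}]}
```

Note that the number is re-serialised as 2.5 rather than 2.50, so this only fits if the consumer does not care about the trailing zero.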
This is truly a fool's errand.
JSON.parse('{ "price": 2.50 }')
> {price: 2.5}
As you can see from this JavaScript example, the parser on the consuming side will truncate the float to whatever it wants.
Use a string if you want to provide a formatted number or leave formatting up to the client.
In fact, using floats to represent money is widely known to be a really bad idea, since floats and doubles cannot accurately represent the base-10 fractions that we use for money. JSON only has a single number type that represents both floats and integers.
If the client is going to do any kind of calculation with the value, you should use an integer in the lowest monetary denomination (cents for euros and dollars) or a string that is interpreted as a BigDecimal-equivalent type by the consumer.
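As a rough illustration of the integer-cents idea, here is a minimal sketch. The helper names to_cents and format_cents are made up, it assumes non-negative amounts in a currency with two decimal places, and it uses Ruby's built-in Rational (instead of BigDecimal) only so that the decimal text is parsed exactly without extra requires.

```ruby
# Hypothetical helper: turn a decimal price string into integer cents
# without ever going through Float. Rational parses "2.50" exactly as 5/2.
def to_cents(price_string)
  (Rational(price_string) * 100).round
end

# Hypothetical helper: format integer cents back into a two-decimal string.
# Assumes a non-negative amount.
def format_cents(cents)
  format('%d.%02d', *cents.divmod(100))
end

cents = to_cents('2.50')
puts cents               # => 250
puts format_cents(cents) # => 2.50
```

The integer can then be put into the JSON as a plain number, and the formatting is left to whoever displays it.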
I'm having an issue like this. Not all zones are returning with:
ActiveSupport::TimeZone.all.sort_by {|t| t.name}.map { |tz|
#symbol = tz.tzinfo.identifier.gsub(/[^_a-zA-Z0-9]/, '_').squeeze('_').upcase!
tz.to_s #> (GMT+00:00) Edinburgh for example
}
I need to use the .to_s to get the UTC (GMT+00:00). With the above, London is missing and I assume others. This one works great:
ActiveSupport::TimeZone::MAPPING.sort_by {|k,v| k}.map { |k,v|
#symbol = k.gsub(/[^_a-zA-Z0-9]/, '_').squeeze('_').upcase!
k #> London London is included with this method
}
I cannot use this method because I do not know how to get the (GMT+00:00) prefix in "(GMT+00:00) London".
Has the bug returned? How can I get all the zones to show up with the first example?
Edit.
I'm using GraphQL-ruby. I've created an enum to return a list of time zones:
# Taken from: https://gist.github.com/pedrocarmona/f41d25e631c1144045971c319f1c9e17
class Types::TimeZoneEnumType < Types::BaseEnum
ActiveSupport::TimeZone.all.sort_by {|t| t.name}.map { |tz|
symbol = tz.tzinfo.identifier.gsub(/[^_a-zA-Z0-9]/, '_').squeeze('_').upcase
value("TZ_#{symbol}", tz.to_s)
}
end
Then inside query_type.rb
[..]
field :time_zones, Types::TimeZoneEnumType, null: false
[..]
Next, inside graphiql, I make the query:
query timeZones{
__type(name: "TimeZoneEnum") {
enumValues {
name
description
}
}
}
Which returns something like the following, with London missing:
[
[..]
{
"name": "TZ_AMERICA_LA_PAZ",
"description": "(GMT-04:00) La Paz"
},
{
"name": "TZ_AMERICA_LIMA",
"description": "(GMT-05:00) Lima"
},
{
"name": "TZ_EUROPE_LISBON",
"description": "(GMT+00:00) Lisbon"
},
{
"name": "TZ_EUROPE_LJUBLJANA",
"description": "(GMT+01:00) Ljubljana"
},
{
"name": "TZ_EUROPE_MADRID",
"description": "(GMT+01:00) Madrid"
},
[..]
]
After Ljubljana I should see "London" but it's not there.
If I run
ActiveSupport::TimeZone.all.sort_by {|t| t.name}.map { |tz|
[ tz.tzinfo.identifier.gsub(/[^_a-zA-Z0-9]/, '_').squeeze('_').upcase, tz.to_s ]
}.sort
the result includes the entries ["EUROPE_LONDON", "(GMT+00:00) Edinburgh"], ["EUROPE_LONDON", "(GMT+00:00) London"], i.e. EUROPE_LONDON is duplicated.
I don't know how the GraphQL library is operating, but I'm assuming it's deduplicating the data and returning a single entry for EUROPE_LONDON (enums are normally unique). Moscow is the same - it has values for Moscow and St Petersburg - so you could test by looking at the results for EUROPE_MOSCOW.
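To make the collision visible without Rails, here is a minimal sketch using a small, hand-written sample of [zone name, tzinfo identifier] pairs (the sample data and the helper name enum_symbol are assumptions for illustration, not output from ActiveSupport). It groups the zones by the enum name derived from the identifier, and also shows one possible workaround: deriving the enum name from the zone name instead, which is distinct for each entry in this sample.

```ruby
# Hand-written sample of [zone name, tzinfo identifier] pairs (assumed data).
zones = [
  ['Edinburgh', 'Europe/London'],
  ['Lisbon', 'Europe/Lisbon'],
  ['London', 'Europe/London'],
  ['Moscow', 'Europe/Moscow'],
  ['St. Petersburg', 'Europe/Moscow']
]

# Same sanitising as in the question's enum definition.
def enum_symbol(text)
  text.gsub(/[^_a-zA-Z0-9]/, '_').squeeze('_').upcase
end

# Enum names built from the identifier collide for zones sharing a tz entry.
by_identifier = zones.group_by { |_name, identifier| enum_symbol(identifier) }
duplicates = by_identifier.select { |_symbol, entries| entries.size > 1 }.keys
puts duplicates.inspect # => ["EUROPE_LONDON", "EUROPE_MOSCOW"]

# Enum names built from the zone name stay distinct in this sample.
by_name = zones.map { |name, _identifier| "TZ_#{enum_symbol(name)}" }
puts by_name.inspect
# => ["TZ_EDINBURGH", "TZ_LISBON", "TZ_LONDON", "TZ_MOSCOW", "TZ_ST_PETERSBURG"]
```

Whether names derived this way are unique across the full ActiveSupport list would still need checking against the real data.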
This is my column:
[
{ id: 1, value: 1, complete: true },
{ id: 2, value: 1, complete: false },
{ id: 3, value: 1, complete: true }
]
First, is there a "correct" way to work with a jsonb schema? Should I redesign to work with a single JSON object instead of the array of hashes?
I have about 200 entries in the database, and the status column has 200 of those items.
How would I perform a query to get the count of true/false?
How can I query for ALL complete items? I can query for the database rows in which the JSON has a complete item, but I can't query for all the items, across all rows of the database, that are complete.
Appreciate the help, thank you
Aha! I found it here:
https://levelup.gitconnected.com/how-to-query-a-json-array-of-objects-as-a-recordset-in-postgresql-a81acec9fbc5
Say your dataset is like this:
[{
"productid": "3",
"name": "Virtual Keyboard",
"price": "150.00"
}, {
"productid": "1",
"name": "Dell 123 Laptop Computer",
"price": "1300.00"
},
{
"productid": "8",
"name": "LG Ultrawide Monitor",
"price": "190.00"
}]
The proper way to count it is like this:
select items.name, count(*) as num from
purchases,jsonb_to_recordset(purchases.items_purchased) as items(name text)
group by items.name
order by num Desc
Works like a charm and is extremely fast.
To do it in Rails, you need to use Model.find_by_sql(....) and put your SELECT statement in there. I'm sure there are probably better ways to do it.
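For a table as small as the one in the question (about 200 rows), the same count can also be done in plain Ruby once the column values are loaded. This is only a sketch under assumptions: status_columns stands in for the raw JSON text of the jsonb column from each row, and the sample values are made up. For large tables the SQL approach above is likely the better fit.

```ruby
require 'json'

# Assumed input: the JSON text of the jsonb column, one string per row.
status_columns = [
  '[{"id":1,"value":1,"complete":true},{"id":2,"value":1,"complete":false}]',
  '[{"id":3,"value":1,"complete":true}]'
]

# Flatten the items of all rows into a single array of hashes.
items = status_columns.flat_map { |raw| JSON.parse(raw) }

# Count how many items are complete and how many are not.
counts = items.map { |item| item['complete'] }.tally
puts counts.inspect # => {true=>2, false=>1}

# Collect every complete item across all rows.
complete_items = items.select { |item| item['complete'] }
puts complete_items.map { |item| item['id'] }.inspect # => [1, 3]
```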
I have a performance issue in my application. I would like to gather some ideas on what I can do to improve it. The application is very simple: I need to add values inside a nested table to get the total a user wants to pay out of all the pending payments. The user chooses a number of payments and I calculate how much they will pay.
This is what I have:
local jsonstr = [[{ "name": "John",
"surname": "Doe",
"pending_payments": [
{
"month": "january",
"amount": 50
},
{
"month": "february",
"amount": 40
},
{
"month": "march",
"amount": 45
}
]
}]]
local lunajson = require 'lunajson'
local t = lunajson.decode(jsonstr)
local limit -- I get this from the user
local total = 0;
for i=1, limit, 1 do
total = total + t.pending_payments[i].amount;
end;
It works. At the end I get what I need. However, I notice that it takes ages to do the calculation. Each JSON has only twelve pending payments (one per month), yet it takes two to three seconds to come up with a result! I tried on different machines and with Lua 5.1, 5.2 and 5.3, and the result is the same.
Can anyone please suggest how I can implement this better?
Thank you!
For this simple string, try the test code below, which extracts the amounts directly from the string, without a JSON parser:
jsonstr = [[{ "name": "John",
"surname": "Doe",
"pending_payments": [
{
"month": "january",
"amount": 50,
},
{
"month": "february",
"amount": 40,
},
{
"month": "march",
"amount": 45,
},
]
}]]
for limit=0,4 do
local total=0
local n=0
for a in jsonstr:gmatch('"amount":%s*(%d+)') do
n=n+1
if n>limit then break end
total=total+tonumber(a)
end
print(limit,total)
end
I found the delay had nothing to do with the calculation in Lua. It was caused by a configurable delay in retrieving the limit variable.
I have nothing to share here related to the question asked, since the problem was actually in an external element.
Thanks @lfh for your replies.
I'm using a Ruby script to interface with an application API and the results being returned are in a JSON format. For example:
{
"incidents": [
{
"number": 1,
"status": "open",
"key": "abc123"
}
{
"number": 2,
"status": "open",
"key": "xyz098"
}
{
"number": 3,
"status": "closed",
"key": "lmn456"
}
]
}
I'm looking to search each block for a particular "key" value (xyz098 in this example) and return the associated "number" value.
Now, I'm very new to Ruby and I'm not sure if there's already a function to help accomplish this. However, a couple days of scouring the Googles and Ruby resource books hasn't yielded anything that works.
Any suggestions?
First of all, the JSON should be as below (note the commas):
{
"incidents": [
{
"number": 1,
"status": "open",
"key": "abc123"
},
{
"number": 2,
"status": "open",
"key": "xyz098"
},
{
"number": 3,
"status": "closed",
"key": "lmn456"
}
]
}
Store the above JSON in a variable
s = '{"incidents": [{"number": 1,"status": "open","key": "abc123"},{"number": 2,"status": "open","key": "xyz098"},{"number": 3,"status": "closed","key": "lmn456"}]}'
Parse the JSON
require 'json'
h = JSON.parse(s)
Find the required number using map
h["incidents"].map {|h1| h1['number'] if h1['key']=='xyz098'}.compact.first
Or you could also use find as below
h["incidents"].find {|h1| h1['key']=='xyz098'}['number']
Or you could also use select as below
h["incidents"].select {|h1| h1['key']=='xyz098'}.first['number']
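One caveat with the find and select variants above is that they raise NoMethodError when no incident has the requested key, because ['number'] is then called on nil. A minimal sketch of a nil-safe lookup follows; the helper name number_for and the lookup hash numbers_by_key are made up for illustration.

```ruby
require 'json'

s = '{"incidents": [{"number": 1,"status": "open","key": "abc123"},{"number": 2,"status": "open","key": "xyz098"},{"number": 3,"status": "closed","key": "lmn456"}]}'
incidents = JSON.parse(s)['incidents']

# Hypothetical helper: return the number for a key, or nil when nothing matches.
# The safe-navigation operator (&.) skips the fetch when find returns nil.
def number_for(incidents, key)
  incidents.find { |incident| incident['key'] == key }&.fetch('number')
end

puts number_for(incidents, 'xyz098').inspect # => 2
puts number_for(incidents, 'nope').inspect   # => nil

# For many lookups, building a key => number hash once avoids rescanning.
numbers_by_key = incidents.to_h { |incident| [incident['key'], incident['number']] }
puts numbers_by_key['lmn456'] # => 3
```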
Do as below
# to get numbers from `'key'`.
json_hash["incidents"].map { |h| h['key'][/\d+/].to_i }
json_hash["incidents"] gives you the value of the key "incidents", which is just an array of hashes.
map iterates through each hash and collects a value from it: Hash#[] fetches the value of "key" from each inner hash, str[regexp] extracts only the digit string (like '098' from "xyz098"), and finally to_i converts it to an actual integer (98 for '098').
If what you have is actually a JSON string, first parse it with JSON.parse to convert it to a hash, then iterate as described above.
require 'json'
json_hash = JSON.parse(json_string)
# to get values from the key `"number"`.
json_hash["incidents"].map { |h| h['number'] } # => [1, 2, 3]
# to search and get all the numbers for a particular key match and take the first
json_hash["incidents"].select { |h| h['key'] == 'abc123' }.first['number'] # => 1
# or to search and get only the first number for a particular key match
json_hash["incidents"].find { |h| h['key'] == 'abc123' }['number'] # => 1
I need to extract some data from a JSON response I'm fetching with curb.
Previously I wasn't calling symbolize_keys, but I thought that would make my attempt work.
The controller action:
http = Curl.get("http://api.foobar.com/thing/thing_name/catalog_items.json?per_page=1&page=1") do|http|
http.headers['X-Api-Key'] = 'georgeBushSucks'
end
pre_keys = http.body_str
@foobar = ActiveSupport::JSON.decode(pre_keys).symbolize_keys
In the view (this raises undefined method `current_price'):
@foobar.current_price
I also tried @foobar.data[0]['current_price'] with the same result
JSON response from action:
{
"data": {
"catalog_items": [
{
"current_price": "9999.0",
"close_date": "2013-05-14T16:08:00-04:00",
"open_date": "2013-04-24T11:00:00-04:00",
"stuff_count": 82,
"minimum_price": "590000.0",
"id": 337478,
"estimated_price": "50000.0",
"name": "This is a really cool name",
"current_winner_id": 696969,
"images": [
{
"thumb_url": "http://foobar.com/images/93695/thumb.png?1365714300",
"detail_url": "http://foobar.com/images/93695/detail.png?1365714300",
"position": 1
},
{
"thumb_url": "http://foobar.com/images/95090/thumb.jpg?1366813823",
"detail_url": "http://foobar.com/images/95090/detail.jpg?1366813823",
"position": 2
}
]
}
]
},
"pagination": {
"per_page": 1,
"page": 1,
"total_pages": 131,
"total_objects": 131
}
}
Note that method-style access to attributes works on models in Rails, not on a plain Hash. To get it on a hash, you have to wrap it in an OpenStruct object, which is part of Ruby's standard library.
Assuming @foobar holds the decoded JSON as in your code:
obj = OpenStruct.new(@foobar)
obj.data
#=> Hash
But note that obj.data.catalog_items won't work, because obj.data is a Hash and not an OpenStruct object. For this there is the recursive-open-struct gem, which will do the job for you.
Alternative solution [1]:
@foobar[:data]['catalog_items'].first['current_price']
But, ugly.
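A somewhat tidier variant of the same idea is Hash#dig, which walks nested hashes and arrays and returns nil instead of raising when a level is missing. The sketch below is an assumption-laden illustration: it uses the standard library's JSON.parse with symbolize_names: true (which symbolizes keys at every level, unlike symbolize_keys, which only converts the top level) rather than ActiveSupport::JSON.decode, and the body string is a trimmed-down stand-in for the real response.

```ruby
require 'json'

# Trimmed-down stand-in for the API response body.
body = '{"data":{"catalog_items":[{"current_price":"9999.0","id":337478}]},"pagination":{"page":1}}'

# symbolize_names: true converts keys to symbols at every nesting level.
foobar = JSON.parse(body, symbolize_names: true)

# dig follows the path key by key; 0 is the index into the catalog_items array.
price = foobar.dig(:data, :catalog_items, 0, :current_price)
puts price.inspect # => "9999.0"

# A missing level yields nil rather than a NoMethodError.
missing = foobar.dig(:data, :no_such_key, 0, :current_price)
puts missing.inspect # => nil
```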
Alternative solution [2]:
Reopen the Hash class and use method_missing:
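class Hash
  def method_missing(name, *args)
    return self[name.to_s] if key?(name.to_s)
    return self[name] if key?(name)
    super
  end

  def respond_to_missing?(name, include_private = false)
    key?(name.to_s) || key?(name) || super
  end
end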
Hope it helps. :)