Generating JSON response with PostgreSQL for Rails - ruby-on-rails

I'm currently executing a SQL statement in Rails, and while it works, I've come to realize I need a different format and am attempting to accomplish this in PostgreSQL. This is my query:
sql = "SELECT one_month_high - one_month_low as one_month,
three_month_high - three_month_low as three_month,
six_month_high - six_month_low as six_month,
twelve_month_high - twelve_month_low as twelve_month,
ytd_high - ytd_low as ytd,
saved_on
FROM daily_high_lows
ORDER BY saved_on DESC;"
Which returns:
#<PG::Result:0x007fdb4aea1fa0 status=PGRES_TUPLES_OK ntuples=380 nfields=6 cmd_tuples=380>
...
{"one_month"=>"544", "three_month"=>"214", "six_month"=>"9","twelve_month"=>"122",
"ytd"=>"143", "saved_on"=>"2016-06-09 00:00:00"}
{"one_month"=>"1283", "three_month"=>"475", "six_month"=>"22","twelve_month"=>"189",
"ytd"=>"517", "saved_on"=>"2016-06-08 00:00:00"}
I've come to realize that I require this format:
[
  {
    name: "One Month",
    data: {
      2016-06-09 00:00:00: 544,
      2016-06-08 00:00:00: 1283
    }
  },
  {
    name: "Three Month",
    data: {
      2016-06-09 00:00:00: 214,
      2016-06-08 00:00:00: 475
    }
  }, etc...
]
I've been trying to research how to do this but it's a bit beyond me currently so I could use some direction.

You should be able to use the to_json method available in Rails, so response.to_json should yield what you need.
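If the result set is small, you can also reshape the rows in Ruby before serializing. A minimal sketch, assuming sql is the query from the question, that execute returns each row as a hash of strings, and using series labels taken from the desired output (the label-to-alias hash below is illustrative):
rows = ActiveRecord::Base.connection.execute(sql).to_a

series = {
  "One Month"    => "one_month",
  "Three Month"  => "three_month",
  "Six Month"    => "six_month",
  "Twelve Month" => "twelve_month",
  "YTD"          => "ytd"
}.map do |name, column|
  {
    name: name,
    # build { saved_on => value } pairs for this series
    data: rows.each_with_object({}) { |row, data| data[row["saved_on"]] = row[column].to_i }
  }
end

series.to_json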

I think if you have a lot of records, building the JSON inside Postgres is a great approach. Postgres didn't really get very useful JSON-building functions until 9.4 though, so I recommend being on at least that version. Anyway, this seems to give you what you want:
WITH seqs AS (
    SELECT json_object_agg(saved_on::text, one_month_high ORDER BY saved_on) one_month_highs,
           json_object_agg(saved_on::text, one_month_low ORDER BY saved_on) one_month_lows
    FROM daily_high_lows
)
SELECT json_agg(j)
FROM (
    SELECT json_build_object('name', 'One Month Highs', 'data', one_month_highs)
    FROM seqs
    UNION ALL
    SELECT json_build_object('name', 'One Month Lows', 'data', one_month_lows)
    FROM seqs
) x(j);
Tested like so:
t=# create table daily_high_lows (one_month_high integer, one_month_low integer, three_month_high integer, three_month_low integer, six_month_high integer, six_month_low integer, twelve_month_high integer, twelve_month_low integer, ytd_high integer, ytd_low integer, saved_on timestamp);
t=# insert into daily_high_lows (one_month_high, one_month_low, three_month_high, three_month_low, saved_on) values (1, 10, 3, 6, '2016-06-08');
t=# insert into daily_high_lows (one_month_high, one_month_low, three_month_high, three_month_low, saved_on) values (2, 9, 3, 8, '2016-03-09');
t=# with seqs as (select json_object_agg(saved_on::text, one_month_high order by saved_on) one_month_highs, json_object_agg(saved_on::text, one_month_low order by saved_on) one_month_lows from daily_high_lows) select json_agg(j) from (select json_build_object('name', 'One Month Highs', 'data', one_month_highs) from seqs union all select json_build_object('name', 'One Month Lows', 'data', one_month_lows) from seqs) x(j);
json_agg
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[{"name" : "One Month Highs", "data" : { "2016-03-09 00:00:00" : 2, "2016-06-08 00:00:00" : 1 }}, {"name" : "One Month Lows", "data" : { "2016-03-09 00:00:00" : 9, "2016-06-08 00:00:00" : 10 }}]
(1 row)
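To consume this from Rails, a minimal sketch (assuming the statement above is assigned to sql):
json = ActiveRecord::Base.connection.select_value(sql)  # the query returns a single json value
series = JSON.parse(json)                                # => array of {"name" => ..., "data" => ...} hashes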

Related

functional construct for flattening array for multiple row insert sql query

Is there a way to generate a multiple-row SQL query (values only) using some functional constructs on an array?
I have an array of Roles that I want to insert into an SQLite database.
struct Role {
    var id: Int32
    var name: String?
}
func updateByUserId(_ id: Int32, _ roles: [Role]) {
    let sql = "INSERT INTO user_role(user_id, role_id) VALUES( \(id), \(roles.map..) )"
}
Expectation:
for instances if id is 1 and roles has an array [10, 11, 14, 15]
Generated SQL should be
INSERT INTO user_role(user_id, role_id) VALUES(1, 10), (1, 11), (1, 14), (1, 15)
The SQL syntax for a multiple-row insert is:
INSERT INTO MyTable ( Column1, Column2 ) VALUES(Value1, Value2),
(Value1, Value2)
You can map each role to the string (id, role), then join the array of strings with the separator ,:
let values = roles.map { "(\(id), \($0.id))" }.joined(separator: ", ")
let sql = "INSERT INTO user_role(user_id, role_id) VALUES \(values)"
Although for this particular scenario the SQL string computation is not problematic, it's good practice to use parametrized statements for every DB query.
Working exclusively with parametrized statements avoids vulnerabilities like SQL injection, or malformed queries that fail to execute (when dealing with strings instead of ints).
So, I'd recommend going the parametrized route by writing something like this:
func updateByUserId(_ id: Int32, _ roles: [Role]) -> (statement: String, params: [Int32]) {
    let statement = "INSERT INTO user_role(user_id, role_id) VALUES " + Array(repeating: "(?, ?)", count: roles.count).joined(separator: ", ")
    let params = roles.flatMap { [id, $0.id] }
    return (statement, params)
}
For your example in the question, the output would be something like this:
(statement: "INSERT INTO user_role(user_id, role_id) VALUES (?, ?), (?, ?), (?, ?), (?, ?)", params: [1, 10, 1, 11, 1, 14, 1, 15])
You can then use the SQLite functions to create the parametrized statement and bind the given values to it.
P.S. There is also the matter of validating that the array of roles is not empty, in which case you'd get invalid SQL as output. To handle this, you can make the function return an optional, and nil will signal an empty array. Doing this also enables a small performance improvement, as you'll be able to use String(repeating:count:), which is a little bit faster than creating an array and joining it later on:
func updateByUserId(_ id: Int32, _ roles: [Role]) -> (statement: String, params: [Int32])? {
    guard !roles.isEmpty else { return nil }
    return (statement: "INSERT INTO user_role(user_id, role_id) VALUES (?, ?)" + String(repeating: ", (?, ?)", count: roles.count - 1),
            params: roles.flatMap { [id, $0.id] })
}

accessing list from json and iterating in postgres

I have JSON as mentioned below:
{
  "list": [
    {
      "notificationId": 123,
      "userId": 444
    },
    {
      "notificationId": 456,
      "userId": 789
    }
  ]
}
I need to write a Postgres procedure which iterates through the list and performs either an update or an insert based on whether the notification id is already present in the DB.
I have a notification table which has notificationid and userID as columns.
Can anyone please tell me how to do this using the Postgres JSON operators?
Try this query:
SELECT *
FROM yourTable
WHERE col->'list' @> '[{"notificationId":123}]';
You may replace the value 123 with whatever notificationId you want to search for.
Assuming you have a unique constraint on notificationid (e.g. because it's the primary key), there is no need for a stored function or a loop:
with data (j) as (
  values ('{
      "list": [
        {"notificationId": 123, "userId": 444},
        {"notificationId": 456, "userId": 789}
      ]
    }'::jsonb)
)
insert into notification (notificationid, userid)
select (e.r ->> 'notificationId')::int, (e.r ->> 'userId')::int
from data d, jsonb_array_elements(d.j -> 'list') as e(r)
on conflict (notificationid) do update
  set userid = excluded.userid;
The first step in that statement is to turn the array into a list of rows; this is what
select e.*
from data d, jsonb_array_elements(d.j -> 'list') as e(r)
does. Given your sample JSON, this returns two rows with a JSON value in each:
r
--------------------------------------
{"userId": 444, "notificationId": 123}
{"userId": 789, "notificationId": 456}
This is then split into two integer columns:
select (e.r ->> 'notificationId')::int, (e.r ->> 'userId')::int
from data d, jsonb_array_elements(d.j -> 'list') as e(r)
So we get:
int4 | int4
-----+-----
123 | 444
456 | 789
And this result is used as the input for an INSERT statement.
The on conflict clause then does an insert or update depending on the presence of the row identified by the column notificationid, which must have a unique index.
Meanwhile, I tried this:
CREATE OR REPLACE FUNCTION insert_update_notifications(notification_ids jsonb) RETURNS void AS
$$
DECLARE
    allNotificationIds text[];
    indJson jsonb;
    notIdCount int;
    i json;
BEGIN
    FOR i IN SELECT * FROM jsonb_array_elements(notification_ids)
    LOOP
        select into notIdCount count(notification_id) from notification_table where notification_id = i->>'notificationId';
        IF (notIdCount = 0) THEN
            insert into notification_table(notification_id, userid) values (i->>'notificationId', i->>'userId');
        ELSE
            update notification_table set userid = i->>'userId' where notification_id = i->>'notificationId';
        END IF;
    END LOOP;
END;
$$
language plpgsql;
select * from insert_update_notifications('[{
"notificationId": "123",
"userId": "444"
},
{
"notificationId": "456",
"userId": "789"
}
]');
It works. Please review this.

Show top 3 ranked values in a column tooltip

I want to make a tooltip that, when an item in a chart is hovered over, shows the top 3 names in a column, ranked by the number of times they appear.
I found two pieces of code that each do half of the trick but have tried and failed to combine them.
Code to find the most commonly occurring value:
TopIssue =
FIRSTNONBLANK (
    TOPN (
        1,
        VALUES ( FlagReport[Cat 3] ),
        RANKX ( ALL ( FlagReport[Cat 3] ), [IssueCount], , ASC )
    ),
    1
)
Where IssueCount = COUNT(FlagReport[Ref No])
This works fine, but when I change the 1 -> 2 -> 3 it doesn't correlate correctly with the ranking; when I change it to 2 it doesn't show the correct value.
Code to show the first 3 string values that occur:
List of Cat 3 values =
VAR __DISTINCT_VALUES_COUNT = DISTINCTCOUNT('FlagReport'[Cat 3])
VAR __MAX_VALUES_TO_SHOW = 3
RETURN
    IF(
        __DISTINCT_VALUES_COUNT > __MAX_VALUES_TO_SHOW,
        CONCATENATE(
            CONCATENATEX(
                TOPN(
                    __MAX_VALUES_TO_SHOW,
                    VALUES('FlagReport'[Cat 3]),
                    'FlagReport'[Cat 3],
                    ASC
                ),
                'FlagReport'[Cat 3],
                ", ",
                'FlagReport'[Cat 3],
                ASC
            ),
            ", etc."
        ),
        CONCATENATEX(
            VALUES('FlagReport'[Cat 3]),
            'FlagReport'[Cat 3],
            ", ",
            'FlagReport'[Cat 3],
            ASC
        )
    )
This code shows me the first 3 string values but doesn't let me rank them.
I have been trying and failing with this for far too long considering it sounds like a theoretically simple thing to do.
I think you're making it a bit more complex than it needs to be. The first code gets you most of the way there. You just need to wrap it in a concatenate function and make sure you have the ordering set correctly.
Top3 = CONCATENATEX(
TOPN(3,
VALUES(FlagReport[Cat 3]),
RANKX(ALL(FlagReport[Cat 3]), [IssueCount], ,ASC)),
FlagReport[Cat 3], ", ", [IssueCount], DESC)
The TOPN function finds the top 3 ranked items. Then we concatenate the Cat 3 column using [IssueCount] as the order by expression.

Temporary table creation takes a long time on RDS server

The query below takes a long time to create a temporary table, even though there are only 228,000 distinct records.
DECLARE todate,fromdate DATETIME;
SET fromdate=DATE_SUB(UTC_TIMESTAMP(),INTERVAL 2 DAY);
SET todate=DATE_ADD(UTC_TIMESTAMP(),INTERVAL 14 DAY);
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
DROP TEMPORARY TABLE IF EXISTS tempabc;
SET max_heap_table_size = 1024*1024*1024;
CREATE TEMPORARY TABLE IF NOT EXISTS tempabc
-- (index using BTREE(id))
ENGINE=MEMORY
AS
(
SELECT SQL_NO_CACHE DISTINCT id
FROM abc
WHERE StartTime BETWEEN fromdate AND todate
);
I already created an index on the StartTime column, but it still takes 20 seconds to create the table. Kindly help me reduce the creation time.
More info:
I changed my query. Earlier I was using the tempabc temporary table to get my output; now I am using an IN clause instead of the temporary table, and it takes 12 seconds to execute, which is still more than the expected time.
Earlier (taking 20-30 sec):
DECLARE todate,fromdate DATETIME;
SET fromdate=DATE_SUB(UTC_TIMESTAMP(),INTERVAL 2 DAY);
SET todate=DATE_ADD(UTC_TIMESTAMP(),INTERVAL 14 DAY);
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
DROP TEMPORARY TABLE IF EXISTS tempabc;
SET max_heap_table_size = 1024*1024*1024;
CREATE TEMPORARY TABLE IF NOT EXISTS tempabc
-- (index using BTREE(id))
ENGINE=MEMORY
AS
(
SELECT SQL_NO_CACHE DISTINCT id
FROM abc
WHERE StartTime BETWEEN fromdate AND todate
);
SELECT DISTINCT p.xyzID
FROM tempabc s
JOIN xyz_tab p ON p.xyzID=s.ID AND IFNULL(IsGeneric,0)=0;
Now (taking 12-14 sec):
DECLARE todate,fromdate Timestamp;
SET fromdate=DATE_SUB(UTC_TIMESTAMP(),INTERVAL 2 DAY);
SET todate=DATE_ADD(UTC_TIMESTAMP(),INTERVAL 14 DAY);
SELECT p.xyzID FROM xyz_tab p
WHERE id IN (
SELECT DISTINCT id FROM abc
WHERE StartTime BETWEEN fromdate AND todate )
AND IFNULL(IsGeneric,0)=0 GROUP BY p.xyzID;
But we need to achieve 3-5 sec of execution time.
This is my explain output.
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: abc
partitions: NULL
type: index
possible_keys: ix_starttime_id,IDX_Start_time,IX_id_starttime,IX_id_starttime_prgsvcid
key: IX_id_starttime
key_len: 163
ref: NULL
rows: 18779876
filtered: 1.27
Extra: Using where; Using index; Using temporary; Using filesort; LooseScan
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: p
partitions: NULL
type: eq_ref
possible_keys: PRIMARY,IX_seriesid
key: PRIMARY
key_len: 152
ref: onconnectdb.abc.ID
rows: 1
filtered: 100.00
Extra: Using where
Explain in JSON format
EXPLAIN: {
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "10139148.44"
},
"grouping_operation": {
"using_temporary_table": true,
"using_filesort": true,
"cost_info": {
"sort_cost": "1.00"
},
"nested_loop": [
{
"table": {
"table_name": "abc",
"access_type": "index",
"possible_keys": [
"ix_starttime_tmsid",
"IDX_Start_time",
"IX_id_starttime",
"IX_id_starttime_prgsvcid"
],
"key": "IX_id_starttime",
"used_key_parts": [
"ID",
"StartTime",
"EndTime"
],
"key_length": "163",
"rows_examined_per_scan": 19280092,
"rows_produced_per_join": 264059,
"filtered": "1.37",
"using_index": true,
"loosescan": true,
"cost_info": {
"read_cost": "393472.45",
"eval_cost": "52812.00",
"prefix_cost": "446284.45",
"data_read_per_join": "2G"
},
"used_columns": [
"ID",
"StartTime"
],
"attached_condition": "(`onconnectdb`.`abc`.`StartTime` between <cache>(fromdate#1) and <cache>(todate#0))"
}
},
{
"table": {
"table_name": "p",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY",
"IX_seriesid"
],
"key": "PRIMARY",
"used_key_parts": [
"ID"
],
"key_length": "152",
"ref": [
"onconnectdb.abc.ID"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 1,
"filtered": "100.00",
"cost_info": {
"read_cost": "9640051.00",
"eval_cost": "0.20",
"prefix_cost": "10139147.44",
"data_read_per_join": "2K"
},
"used_columns": [
"ID",
"xyzID",
"IsGeneric"
],
"attached_condition": "(ifnull(`onconnectdb`.`p`.`IsGeneric`,0) = 0)"
}
}
]
}
}
}
Please suggest.

How to find records where a column value = [given array] in rails

Setup: Rails + Postgres.
I have table A with columns
(id: int, name: string, address: string, s_array: varchar[], i_array: int[], created_at: datetime)
For a given row, I need to find all the rows which have the same values.
In Rails, the query would look like:
row = A.find(1) # any random row
ignore_columns = %w[id created_at]
A.where(row.attributes.except(*ignore_columns))
This works if we don't have columns of array type.
How do I find all records where a value = [given array]?
Edit:
To be clear, I want to pass multiple columns in the where clause, where some columns are of array type. For the values in the where clause, I am passing a hash (row.attributes.except(*ignore_columns) is a hash).
Edit 2: Example:
Let's say I have a Query table:
Query(id: Int, name: String, terms: varchar[], filters: int[], city: string, created_at: datetime)
id = primary key/integer
terms = array of string
filters = array of integer (it is an enum and we can select multiple which is saved as array)
other fields = self explanatory
Suppose I have the following rows:
(1, "query1", ["john"], [0], "wall", <some_date>)
(1, "query2", ["eddard", "arya"], [0, 1], "Winterfell", <some_date>)
(1, "query3", ["sansa", "arya"], [1, 2], "Winterfell", <some_date>)
Now when I add a new row
row = ActiveRecord of (1, "query4", ["eddard", "arya"], [0, 1], "Winterfell", <some_date>)
What I want is to search already existing records like this
ignore_attributes = %w[id name created_at]
Query.where(row.attributes.except(*ignore_attributes))
This query should return the already existing query2 (which has the same terms, filters, and city), so that I won't need to add a new row named query4.
The problem is that because some column types are array types, passing them as hash conditions in the where clause does not work.
Change find_by to find_all_by and it will return all matching results.
Try this example:
## blacklist of attributes
ignore_attributes = %w[id name created_at]
MyModel.where(active: true).select(MyModel.attribute_names - ignore_attributes)
The above query can also be chained:
##you have a city column too
MyModel.where(active: true).select(MyModel.attribute_names - ignore_attributes).where.not(:city=>["Sydney","London"])
If you need this as a permanent fix, you can add this line in your model.rb file, but it's dangerous:
self.ignored_columns = %w(id name created_at)
Hope it helps :)
Your query in Rails would look like below:
row = A.find(1)
where_clauses = row.attributes.reject { |k, v| %w[id created_at].include?(k) }
A.where(where_clauses)
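For the array-typed columns specifically, one option is to compare them explicitly in SQL instead of through the attributes hash. A minimal sketch, assuming the Query model from the question with terms varchar[] and filters int[] columns, and non-empty arrays:
scalar_conditions = row.attributes.except("id", "name", "created_at", "terms", "filters")

Query.where(scalar_conditions)
     .where("terms = ARRAY[?]::varchar[]", row.terms)    # element-wise array equality
     .where("filters = ARRAY[?]::int[]", row.filters)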
