How do I fill a column with a specific fixed value for a table using SELECT - psql

I want to select some data, but one of the columns contains NULL values. I want the query to show a specific fixed value instead of NULL, without changing the data stored in the database.
For example:
V_fruits
number | fruit | three
1 | apple | <null>
2 | pinapple | <null>
3 | grape | <null>
4 | lemon | <null>
I tried this:
Select "number","fruit",case when "three" is null then three='ofcourse' from V_fruits
I would like some guidance on this, please. I am using psql.
Expected:
V_fruits
number | fruit | three
1 | apple | Of course
2 | pinapple | Of course
3 | grape | Of course
4 | lemon | Of course
Obtained:
V_fruits
number | fruit | three
1 | apple | false
2 | pinapple | false
3 | grape | false
4 | lemon | false

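Use this instead:
Select "number","fruit",case when "three" is null then 'ofcourse' END AS three from V_fruits
The difference is that three='ofcourse' is a comparison, not an assignment. It is evaluated as a boolean expression, and since three is null it can never be equal to 'ofcourse', so the query returns the result of that comparison rather than the text 'ofcourse'. THEN only needs the value to return.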
Alternatively, you could use COALESCE, which returns its first non-null argument:
SELECT "number", "fruit", COALESCE(three, 'ofcourse') AS three FROM v_fruits;
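Both forms can be tried without a PostgreSQL server. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in (an assumption; the question is about psql) and recreates V_fruits in memory. The ELSE branch is added here so that non-null values would be kept.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE v_fruits ("number" INTEGER, fruit TEXT, three TEXT)')
conn.executemany(
    "INSERT INTO v_fruits VALUES (?, ?, NULL)",
    [(1, "apple"), (2, "pinapple"), (3, "grape"), (4, "lemon")],
)

# CASE form: THEN holds the value to return, not a comparison.
case_rows = conn.execute(
    """
    SELECT "number", fruit,
           CASE WHEN three IS NULL THEN 'ofcourse' ELSE three END AS three
    FROM v_fruits
    ORDER BY 1
    """
).fetchall()

# COALESCE form: returns the first non-null argument.
coalesce_rows = conn.execute(
    """
    SELECT "number", fruit, COALESCE(three, 'ofcourse') AS three
    FROM v_fruits
    ORDER BY 1
    """
).fetchall()

# The stored data is untouched: every row still has NULL in "three".
stored_nulls = conn.execute(
    "SELECT COUNT(*) FROM v_fruits WHERE three IS NULL"
).fetchone()[0]

print(case_rows)
print(coalesce_rows)
print(stored_nulls)
```

Both queries produce the same rows, and the count of NULL values in the table stays at 4, so only the query output changes.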

Related

Hive DB - struct datatype join - with different structure elements

I am pretty new to working with Hive and struct data types. I have used only basic SELECT statements until now.
I need to join two tables to combine them in my SELECT statement.
Both tables have a struct column with the same name, but with different elements inside. This is what the tables look like:
TABLE 1
table_one(
eventid string,
new struct<color:string, size:string, weight:string, number:string, price:string>,
date string
)
11 | {"color":"yellow", "size":"xl", "weight":"10", "number":"1111", "price":"1"} | 08-21-2004
12 | {"color":"yellow", "size":"xxl", "weight":"12", "number":"2111", "price":"2"} | 08-21-2004
TABLE 2
table_two(
eventid string,
new struct<number:string, price:string>,
date string,
person string)
11 | {"number":"31", "price":"1"} | 08-21-2004 | john
12 | {"number":"32", "price":"2"} | 08-21-2004 | joe
With a SELECT query I need to get the value of element 'color' from table_one, but instead I am getting the value of element 'number' from table_two. The query is the following:
select
s.eventid,
v.date,
s.new.color,
s.new.size
from table_one s join table_two v where s.eventid = v.eventid;
With s.new.color, instead of getting, for example, the value 'yellow' from table_one, I am getting the value '31' from table_two. How am I supposed to get the wanted value from table_one?
Expected result:
11 | 08-21-2004 | yellow | xl
But I got:
11 | 08-21-2004 | 31 | 1
So how can I select the proper value from the struct column of the desired table?
(Please keep in mind that this is just a simplified example of my problem. I didn't provide the exact code or table structures, to make this clearer for whoever tries to answer. I need to use a join because I need proper values for some columns from table_two.)

Using Crosstab to Generate Data for Charts

I'm trying to write an efficient query to create a view that will contain counts of successful logins by day and by type of user, with no duplicate users per day.
I have 3 tables involved in this query. One table that contains all successful login attempts, one table for standard user accounts, and one table for admin user accounts. All user_id values are unique across the entire database so there are no user accounts that will share the same user_id with an admin account:
TABLE 1: user_account
user_id | username
---------|----------
1 | user1
2 | user2
TABLE 2: admin_account
user_id | username
---------|----------
6 | admin6
7 | admin7
TABLE 3: successful_logins
user_id | timestamp
---------|------------------------------
1 | 2022-01-23 14:39:12.63798-07
1 | 2022-01-28 11:16:45.63798-07
1 | 2022-01-28 01:53:51.63798-07
2 | 2022-01-28 15:19:21.63798-07
6 | 2022-01-28 09:42:36.63798-07
2 | 2022-01-23 03:46:21.63798-07
7 | 2022-01-28 19:52:16.63798-07
2 | 2022-01-29 23:12:41.63798-07
2 | 2022-01-29 18:50:10.63798-07
The resulting view I would like to generate would contain the following information from the above 3 tables:
VIEW: login_counts
date_of_login | successful_user_logins | successful_admin_logins
---------------|------------------------|-------------------------
2022-01-23 | 2 | 0
2022-01-28 | 2 | 2
2022-01-29 | 1 | 0
I'm currently reading up on how crosstabs work but having trouble figuring out how to write the query based on my table setups.
I actually was able to get the values I needed by using the following query:
SELECT
to_char(s.timestamp, 'YYYY-MM-DD') AS login_date,
count(distinct u.user_id) AS successful_user_logins,
count(distinct a.user_id) AS successful_admin_logins
FROM successful_logins s
LEFT JOIN user_account u ON u.user_id= s.user_id
LEFT JOIN admin_account a ON a.user_id= s.user_id
GROUP BY login_date
However, I was told it would be even quicker using crosstabs, especially considering the successful_logins table contains millions of records. So I'm also trying to create a version of the query using crosstabs and then compare both execution times.
Any help would be greatly appreciated. Thanks!
Turns out it isn't possible to do what I was asking about using crosstabs, so the original query I have will have to do.
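The join-and-count query can be checked against the sample rows without a PostgreSQL server. This is a minimal sketch using Python's built-in sqlite3 module as a stand-in (an assumption). Since SQLite has no to_char, substr() on a text timestamp takes its place, and the fractional seconds and time zone offset are dropped from the sample values.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_account (user_id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE admin_account (user_id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE successful_logins (user_id INTEGER, timestamp TEXT);

    INSERT INTO user_account VALUES (1, 'user1'), (2, 'user2');
    INSERT INTO admin_account VALUES (6, 'admin6'), (7, 'admin7');
    INSERT INTO successful_logins VALUES
        (1, '2022-01-23 14:39:12'),
        (1, '2022-01-28 11:16:45'),
        (1, '2022-01-28 01:53:51'),
        (2, '2022-01-28 15:19:21'),
        (6, '2022-01-28 09:42:36'),
        (2, '2022-01-23 03:46:21'),
        (7, '2022-01-28 19:52:16'),
        (2, '2022-01-29 23:12:41'),
        (2, '2022-01-29 18:50:10');
    """
)

# Same shape as the query in the question: LEFT JOIN both account tables
# and count distinct ids per day. COUNT ignores the NULLs that the
# LEFT JOIN produces for non-matching rows.
login_counts = conn.execute(
    """
    SELECT substr(s.timestamp, 1, 10) AS login_date,
           count(DISTINCT u.user_id) AS successful_user_logins,
           count(DISTINCT a.user_id) AS successful_admin_logins
    FROM successful_logins s
    LEFT JOIN user_account u ON u.user_id = s.user_id
    LEFT JOIN admin_account a ON a.user_id = s.user_id
    GROUP BY login_date
    ORDER BY login_date
    """
).fetchall()

for row in login_counts:
    print(row)
```

On the sample rows, 2022-01-23 has logins from user_id 1 and 2 only, so that day counts two users and no admins.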

With a composite index, what column order do ActiveRecord queries use to decide which composite index to search?

Rails v. 5.2.4
ActiveRecord v5.2.4.3
I have a Rails app with a MySQL database, and my app has a Skill model and a SkillAdjacency model. The SkillAdjacency model has the following attributes:
requested_skill_id, table_name: 'Skill'
adjacent_skill_id, table_name: 'Skill'
score, integer
SkillAdjacencies are used to determine how "similar" two instances of Skill are to each other.
One of the app's constraints is that you can't create more than one instance of SkillAdjacency for each combination of requested_skill and adjacent_skill, and I plan to enforce this both with ActiveModel validations and with a composite index which employs a uniqueness constraint. So far I have the following:
add_index :skill_adjacencies, [:requested_skill_id, :adjacent_skill_id], unique: true, name: 'index_adjacencies_on_requested_then_adjacent', using: :btree
However, I know that the order in which the composite columns are declared is important, so I'm considering adding this 2nd composite index to account for the other possible order:
add_index :skill_adjacencies, [:adjacent_skill_id, :requested_skill_id], unique: true, name: 'index_adjacencies_on_adjacent_then_requested', using: :btree
But because writing to an index isn't free, I only want to add the 2nd index if it will actually result in a performance benefit. The problem is that whether this 2nd index will be beneficial depends on whether ActiveRecord starts with adjacent_skill_id or requested_skill_id when choosing which composite index to search.
How can I determine what order ActiveRecord uses? Does it just use the same order that's specified in the query? For example, if I query SkillAdjacency.where(requested_skill: Skill.last, adjacent_skill: Skill.first), will it always search for a composite index composed of requested_skill 1st and adjacent_skill 2nd? If that's the case, should I cover all my bases by creating that additional composite index?
Alternately, is there some under-the-hood magic which determines if the relevant composite index exists regardless of the order provided in the query?
EDIT:
I ran EXPLAIN and saw the following:
irb(main):013:0> SkillAdjacency.where(requested_skill_id: 1, adjacent_skill_id: 200).explain
SkillAdjacency Load (0.3ms) SELECT `skill_adjacencies`.* FROM `skill_adjacencies` WHERE `skill_adjacencies`.`requested_skill_id` = 1 AND `skill_adjacencies`.`adjacent_skill_id` = 200
=> EXPLAIN for: SELECT `skill_adjacencies`.* FROM `skill_adjacencies` WHERE `skill_adjacencies`.`requested_skill_id` = 1 AND `skill_adjacencies`.`adjacent_skill_id` = 200
+----+-------------+-------------------+------------+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------+---------+-------------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------------------+------------+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------+---------+-------------+------+----------+-------+
| 1 | SIMPLE | skill_adjacencies | NULL | const | index_adjacencies_on_requested_then_adjacent,index_adjacencies_on_adjacent_then_requested,index_skill_adjacencies_on_requested_skill_id,index_skill_adjacencies_on_adjacent_skill_id | index_adjacencies_on_requested_then_adjacent | 10 | const,const | 1 | 100.0 | NULL |
+----+-------------+-------------------+------------+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------+---------+-------------+------+----------+-------+
1 row in set (0.00 sec)
irb(main):014:0> SkillAdjacency.where(adjacent_skill_id: 200, requested_skill: 1).explain
SkillAdjacency Load (0.3ms) SELECT `skill_adjacencies`.* FROM `skill_adjacencies` WHERE `skill_adjacencies`.`adjacent_skill_id` = 200 AND `skill_adjacencies`.`requested_skill_id` = 1
=> EXPLAIN for: SELECT `skill_adjacencies`.* FROM `skill_adjacencies` WHERE `skill_adjacencies`.`adjacent_skill_id` = 200 AND `skill_adjacencies`.`requested_skill_id` = 1
+----+-------------+-------------------+------------+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------+---------+-------------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------------------+------------+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------+---------+-------------+------+----------+-------+
| 1 | SIMPLE | skill_adjacencies | NULL | const | index_adjacencies_on_requested_then_adjacent,index_adjacencies_on_adjacent_then_requested,index_skill_adjacencies_on_requested_skill_id,index_skill_adjacencies_on_adjacent_skill_id | index_adjacencies_on_requested_then_adjacent | 10 | const,const | 1 | 100.0 | NULL |
+----+-------------+-------------------+------------+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------+---------+-------------+------+----------+-------+
1 row in set (0.00 sec)
In both cases, I see that the value in the key column is index_adjacencies_on_requested_then_adjacent, despite each query passing in a different order for the query params. Can I assume this means the order of those params doesn't matter?
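The EXPLAIN output above suggests that the database's query planner, not the order of the arguments passed to where, picks the index. The following is a minimal sketch of the same experiment using Python's built-in sqlite3 module instead of MySQL (an assumption; the two planners differ in detail, so this only illustrates the idea). The plan helper is a hypothetical name introduced for this example.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE skill_adjacencies (
        id INTEGER PRIMARY KEY,
        requested_skill_id INTEGER,
        adjacent_skill_id INTEGER,
        score INTEGER
    );
    CREATE UNIQUE INDEX index_adjacencies_on_requested_then_adjacent
        ON skill_adjacencies (requested_skill_id, adjacent_skill_id);
    """
)


def plan(where_clause):
    """Return the planner's description for a SELECT with this WHERE clause."""
    sql = "EXPLAIN QUERY PLAN SELECT * FROM skill_adjacencies WHERE " + where_clause
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    return " | ".join(row[-1] for row in conn.execute(sql))


# Same two equality conditions, written in opposite orders.
requested_first = plan("requested_skill_id = 1 AND adjacent_skill_id = 200")
adjacent_first = plan("adjacent_skill_id = 200 AND requested_skill_id = 1")

print(requested_first)
print(adjacent_first)
```

In this sketch both plans name the single composite index, regardless of how the conditions are ordered in the WHERE clause.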

Rails: sum should return nil if the given column values are null

Currently the sum function in Rails returns 0.0 if the provided column data is null
============================================================================
For example:
Tablename: Price
id | name | Cost
-----------------
1 | A | 1200
2 | A | 2500
3 | A | 3000
4 | B | 5000
5 | B | 7000
6 | C |
Now,
Price.group(:name).sum(:cost)
returns 6700, 12000, 0.0 instead of 6700, 12000, nil.
So here I want nil if the given column's values are null or empty.
SUM ignores NULL values. For a group in which every cost is NULL, the database itself returns NULL, but Rails type casts that result to 0.
To work around this, I have used a condition like:
Price.where("cost IS NOT NULL").group(:name).sum(:cost)
This query sums only the non-null cost values, so a name whose costs are all NULL is missing from the result and can be filled in with nil afterwards.
This way I can make sure that if the costs actually add up to 0.0, I get sum(cost) as 0.0 instead of nil.
Because of how sum is implemented in Rails, it is impossible to get the values before they are type cast.
However, you can get the expected values by selecting the values with raw SQL:
Price.group(:name).select('name, sum(cost) AS total_cost')
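What the raw SQL returns, before any type casting by Rails, can be seen directly in a database shell. This is a minimal sketch using Python's built-in sqlite3 module (an assumption; the question does not name the database) with a hypothetical prices table holding the sample rows.

```python
import sqlite3

# In-memory SQLite database; the table name "prices" is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (id INTEGER PRIMARY KEY, name TEXT, cost INTEGER)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [
        (1, "A", 1200),
        (2, "A", 2500),
        (3, "A", 3000),
        (4, "B", 5000),
        (5, "B", 7000),
        (6, "C", None),  # the only row for C has no cost
    ],
)

# Plain grouped SUM: a group with only NULL costs sums to NULL, not 0.
raw_sums = conn.execute(
    "SELECT name, sum(cost) AS total_cost FROM prices GROUP BY name ORDER BY name"
).fetchall()

# With the IS NOT NULL filter, the all-NULL group disappears from the result.
filtered_sums = conn.execute(
    "SELECT name, sum(cost) AS total_cost FROM prices "
    "WHERE cost IS NOT NULL GROUP BY name ORDER BY name"
).fetchall()

print(raw_sums)       # Python shows SQL NULL as None
print(filtered_sums)
```

The first result keeps group C with a NULL total, which is the value the select-based query exposes. The second result omits C entirely.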

select distinct records based on one field while keeping other fields intact

I've got a table like this:
table: searches
+------------------------------+
| id | address | date |
+------------------------------+
| 1 | 123 foo st | 03/01/13 |
| 2 | 123 foo st | 03/02/13 |
| 3 | 456 foo st | 03/02/13 |
| 4 | 567 foo st | 03/01/13 |
| 5 | 456 foo st | 03/01/13 |
| 6 | 567 foo st | 03/01/13 |
+------------------------------+
And want a result set like this:
+------------------------------+
| id | address | date |
+------------------------------+
| 2 | 123 foo st | 03/02/13 |
| 3 | 456 foo st | 03/02/13 |
| 4 | 567 foo st | 03/01/13 |
+------------------------------+
But ActiveRecord seems unable to achieve this result. Here's what I'm trying:
Model has a 'most_recent' scope: scope :most_recent, order('date_searched DESC')
Model.most_recent.uniq returns the full set (SELECT DISTINCT "searches".* FROM "searches" ORDER BY date DESC) -- obviously the query is not going to do what I want, but neither is selecting only one column. I need all columns, but only rows where the address is unique in the result set.
I could do something like Model.select('distinct(address), date, id'), but that feels...wrong.
You could do a
select max(id), address, max(date) as latest
from searches
group by address
order by latest desc
According to sqlfiddle that does exactly what I think you want.
It's not quite the same as your requirement output, which doesn't seem to care about which ID is returned. Still, the query needs to specify something, which is here done by the "max" aggregate function.
I don't think you'll have any luck with ActiveRecord's autogenerated query methods for this case. So just add your own query method using that SQL to your model class. It's completely standard SQL that'll also run on basically any other RDBMS.
Edit: One big weakness of the query is that it doesn't necessarily return actual records. If the highest ID for a given address doesn't correlate with the highest date for that address, the resulting "record" will be different from the one actually stored in the DB. Depending on the use case, that might or might not matter. For MySQL, simply changing max(id) to id would fix that problem, but IIRC Oracle has a problem with that.
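One way to avoid that weakness is to pick, for each address, the id of its most recent row and return that row whole. The following is a minimal sketch using Python's built-in sqlite3 module (an assumption; the question does not name the database). It stores the dates in ISO format so that they sort correctly as text, and it breaks ties on the lowest id, both of which are choices made for this example rather than part of the original question.

```python
import sqlite3

# In-memory SQLite database with the sample rows; dates rewritten
# as ISO strings (assumption) so text ordering matches date ordering.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE searches (id INTEGER PRIMARY KEY, address TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO searches VALUES (?, ?, ?)",
    [
        (1, "123 foo st", "2013-03-01"),
        (2, "123 foo st", "2013-03-02"),
        (3, "456 foo st", "2013-03-02"),
        (4, "567 foo st", "2013-03-01"),
        (5, "456 foo st", "2013-03-01"),
        (6, "567 foo st", "2013-03-01"),
    ],
)

# For every row, the correlated subquery finds the id of the newest row
# with the same address; only that row is kept, so each result is a
# real stored record rather than a mix of aggregates.
latest_per_address = conn.execute(
    """
    SELECT s.id, s.address, s.date
    FROM searches s
    WHERE s.id = (
        SELECT s2.id
        FROM searches s2
        WHERE s2.address = s.address
        ORDER BY s2.date DESC, s2.id ASC
        LIMIT 1
    )
    ORDER BY s.date DESC, s.id
    """
).fetchall()

for row in latest_per_address:
    print(row)
```

On the sample data this returns the rows with ids 2, 3 and 4, matching the result set asked for in the question.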
To show unique addresses:
Searches.group(:address)
Then you can select columns if you want:
Searches.group(:address).select('id,date')
