I'm confused about something in Rails (using Rails 5). I have this model
class MyEventActivity < ApplicationRecord
  belongs_to :event_activity
end
and what I want to do is get a list of all the associated objects, in other words, all the "event_activity" objects. I thought this would do the trick:
my_event_activities = MyEventActivity.all.pluck(:event_activity)
but it's giving me this SQL error:
(2.3ms) SELECT "event_activity" FROM "my_event_activities"
ActiveRecord::StatementInvalid: PG::UndefinedColumn: ERROR: column "event_activity" does not exist
LINE 1: SELECT "event_activity" FROM "my_event_activities"
How do I get the objects linked to the MyEventActivity objects? Note that I don't want just the IDs, I want the whole object.
Edit: This is the Postgres table, as requested:
eactivit=# \d event_activities;
Table "public.event_activities"
Column | Type | Modifiers
--------------------------+-----------------------------+----------------------------------------------------------------
id | integer | not null default nextval('event_activities_id_seq'::regclass)
name | character varying |
abbrev | character varying |
attendance | bigint |
created_at | timestamp without time zone | not null
updated_at | timestamp without time zone | not null
EventActivity.joins(:my_event_activities).distinct
Returns all EventActivity objects that have associated MyEventActivity records
Or more along the lines of what you've already tried:
EventActivity.where(id: MyEventActivity.all.pluck(:event_activity_id).uniq)
But the first one is preferable for its brevity and performance.
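Note that both approaches assume the inverse association is declared on EventActivity; a minimal sketch of the models this presumes (the has_many name is inferred from the table names):

class EventActivity < ApplicationRecord
  # assumes my_event_activities has an event_activity_id foreign key
  has_many :my_event_activities
end

class MyEventActivity < ApplicationRecord
  belongs_to :event_activity
end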
Update to explain why the first option should be preferred
TL;DR much faster and more readable
Assume we have 100 event_activities, and all but the last (id: 100) have 100 my_event_activities for a total of 9900 my_event_activities.
EventActivity.where(id: MyEventActivity.all.pluck(:event_activity_id).uniq) performs two SQL queries:
SELECT "my_event_activities"."event_activity_id" FROM "my_event_activities" which will return an Array of 9900 non-unique event_activity_ids. We want to reduce this to unique ids to optimize the second query, so we call Array#uniq which has its own performance cost on large arrays, reducing 9900 down to 99. Then we can call the second query: SELECT "event_activities".* FROM "event_activities" WHERE "event_activities"."id" IN (1, 2, 3, ... 97, 98, 99)
EventActivity.joins(:my_event_activities).distinct performs only one SQL query: SELECT DISTINCT "event_activities".* FROM "event_activities" INNER JOIN "my_event_activities" ON "my_event_activities"."event_activity_id" = "event_activities"."id". Once we drop into the database we never have to switch back to Ruby to perform some expensive process and then make a second trip back to the database. joins is designed for performing these types of chainable and composable queries in situations like this.
The performance difference can be checked with a simple benchmark. With an actual Postgres database loaded with 100 event_activities, 99 of which have 100 my_event_activities:
require 'benchmark/ips'
require_relative 'config/environment'
Benchmark.ips do |bm|
  bm.report('joins.distinct') do
    EventActivity.joins(:my_event_activities).distinct
  end
  bm.report('pluck.uniq') do
    EventActivity.where(id: MyEventActivity.all.pluck(:event_activity_id).uniq)
  end
  bm.compare!
end
And the results:
Warming up --------------------------------------
joins.distinct 5.922k i/100ms
pluck.uniq 7.000 i/100ms
Calculating -------------------------------------
joins.distinct 71.504k (± 3.5%) i/s - 361.242k in 5.058311s
pluck.uniq 73.459 (±13.6%) i/s - 364.000 in 5.061892s
Comparison:
joins.distinct: 71503.9 i/s
pluck.uniq: 73.5 i/s - 973.38x slower
973x slower :-O! The joins method is meant to be used for exactly this kind of thing, and this is one of the happy cases in Ruby where the more readable option is also the more performant one.
Related
My code depends on the order of records in the table. My assumption was that a table can be considered a list, so that the records maintain their order. I have a small piece of update code, shown below, that updates the record at a particular index in the table.
p = pieces[index]
p.position = 0
p.save
I check the order of records before and after this update, and I see that after the update the updated record has moved to the end of the list. I print Piece.all to inspect the list. The order is maintained in MySQL, but when I deploy to Heroku, which uses Postgres, the order is not maintained, so this was a surprising find for me.
Is there no guarantee of order in tables, and should one therefore not depend on the order? Please correct my misunderstanding; thanks for the clarification.
You should NEVER depend on the order in my honest opinion.
Rows are returned in an unspecified order, per the SQL spec, unless you add an ORDER BY clause. In Postgres, that means you'll get rows in, roughly, the order in which live rows are read from disk.
MySQL tends to return rows in the order they were inserted, which is why you see the difference in behavior.
If you want them always returned in the order they were created, you can use Piece.order("created_at").
You state:
My assumption was that a table can be considered a list so that the records maintain order.
This is incorrect. A table represents an unordered set. There is no inherent ordering in the table. A result set similarly lacks ordering. The only way to guarantee the ordering of a result set is to use ORDER BY in the query.
So, an update changes values in one or more columns in one or more rows. It does not change the "ordering" of rows, because they are not ordered.
Note: Under some circumstances, a query may appear to return results in a particular order. You really should not depend on this behavior, unless the query has an explicit ORDER BY.
Tables are normally unordered and should be presumed to be unordered unless they have a CLUSTERed index; understanding clustered indexes is useful here. That said, what you receive back from a query, the result set, should always be presumed to be unordered, because the join order is undefined.
So if order matters, always be explicit and use ORDER BY. Now, for illustration, let's have some fun.
CREATE TABLE bar ( qux serial PRIMARY KEY, asdf text );
INSERT INTO bar (asdf) VALUES ('z'), ('x'), ('g'), ('a');
Now we've got this,
SELECT * FROM BAR;
qux | asdf
-----+------
1 | z
2 | x
3 | g
4 | a
Now we create an index and CLUSTER the table on it,
CREATE INDEX asdfidx ON bar (asdf);
CLUSTER bar USING asdfidx;
Now the rows come back in index order (keep in mind that CLUSTER is a one-time operation, so even this order is not guaranteed for future queries without ORDER BY),
SELECT * FROM bar;
qux | asdf
-----+------
4 | a
3 | g
2 | x
1 | z
I have a Rails 4 app using ActiveRecord and PostgreSQL with two tables: stores and open_hours. A store has many open_hours:
stores:
Column |
--------------------+
id |
name |
open_hours:
Column |
-----------------+
id |
open_time |
close_time |
store_id |
The open_time and close_time columns represent the number of seconds since midnight of Sunday (i.e. beginning of the week).
I would like to get a list of store objects ordered by whether the store is open or not, so that stores that are open are ranked ahead of stores that are closed. This is my query in Rails:
Store.joins(:open_hours).order("#{current_time} > open_time AND #{current_time} < close_time desc")
Note that current_time is the number of seconds since midnight on the previous Sunday.
This gives me a list of stores with the currently open stores ranked ahead of the closed ones. However, I'm getting a lot of duplicates in the result.
I tried using the distinct, uniq and group methods, but none of them work:
Store.joins(:open_hours).group("stores.id").group("open_hours.open_time").group("open_hours.close_time").order("#{current_time} > open_time AND #{current_time} < close_time desc")
I've read a lot of the questions/answers already on Stack Overflow, but most of them don't address the order method. This question seems to be the most relevant one, but the MAX aggregate function does not work on booleans.
Would appreciate any help! Thanks.
Here is what I did to solve the issue:
In Rails:
is_open = "bool_or(#{current_time} > open_time AND #{current_time} < close_time)"
Store.select("stores.*, CASE WHEN #{is_open} THEN 1 WHEN #{is_open} IS NULL THEN 2 ELSE 3 END AS open").group("stores.id").joins("LEFT JOIN open_hours ON open_hours.store_id = stores.id").uniq.order("open asc")
Explanation:
The is_open variable is just there to shorten the select statement.
The bool_or aggregate function is needed to group the open_hours records; otherwise there will likely be two results for each store (one open and one closed), which is why using the uniq method alone doesn't eliminate the duplicates.
LEFT JOIN is used instead of INNER JOIN so we can include the stores that don't have any open_hours records.
A store can be open (i.e. true), closed (i.e. false) or not determined (i.e. nil), so the CASE WHEN statement is needed: 1 if the store is open, 2 if not determined, and 3 if closed.
Ordering the results ASC will show open stores first, then the not determined ones, then the closed stores.
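If it helps readability, the same query could be wrapped in a class method on Store; a rough, untested sketch (the method name is hypothetical, and current_time follows the seconds-since-Sunday-midnight convention from the question):

class Store < ActiveRecord::Base
  has_many :open_hours

  def self.by_open_status(current_time)
    is_open = "bool_or(#{current_time.to_i} > open_time AND #{current_time.to_i} < close_time)"
    select("stores.*, CASE WHEN #{is_open} THEN 1 WHEN #{is_open} IS NULL THEN 2 ELSE 3 END AS open")
      .joins("LEFT JOIN open_hours ON open_hours.store_id = stores.id")
      .group("stores.id")
      .uniq
      .order("open ASC")
  end
end

# Usage: Store.by_open_status(current_time)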
This solution works but doesn't feel very elegant. Please post your answer if you have a better solution. Thanks a lot!
Have you tried the uniq method? Just append it at the end:
Store.joins(:open_hours).order("#{current_time} > open_time AND #{current_time} < close_time desc").uniq
I am using Ruby on Rails 4 and MySQL. I have three types. One is Biology, one is Chemistry, and another is Physics. Each type has unique fields, so I created three tables in the database, each with unique column names. However, the unique column names may not be known beforehand; the user will be required to create the column names associated with each type. I don't want to create a serialized hash, because that can become messy. I notice some other systems enable users to create user-defined columns named like column1, column2, etc.
How can I achieve these custom columns in Ruby on Rails and MySQL and still maintain all the ActiveRecord capabilities, e.g. validation, etc?
Well, you don't have many options; your best solution is to use a NoSQL database (at least for those classes).
Let's see how you can work around this using SQL. You can have a base Course model with a has_many :attributes association, in which an attribute is just a combination of a key and a value.
# attributes table
| id | key | value |
| 10 | "column1" | "value" |
| 11 | "column1" | "value" |
| 12 | "column1" | "value" |
It's going to be difficult to determine datatypes and to write queries that cover multiple attributes at the same time.
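A rough sketch of that key/value approach, with hypothetical model names (the association is renamed here because an :attributes association would clash with ActiveRecord's built-in attributes method):

class Course < ActiveRecord::Base
  has_many :custom_attributes
end

class CustomAttribute < ActiveRecord::Base
  # table columns: course_id, key, value
  belongs_to :course
  validates :key, presence: true
end

# course.custom_attributes.create(key: "column1", value: "some value")
# course.custom_attributes.find_by(key: "column1").try(:value)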
Ruby on Rails' ORM (object-relational mapping) has a feature called polymorphic associations that allows a foreign key to reference two or more other tables. This is achieved by creating an additional "type" column that specifies the table with which the foreign key is associated.
Does this implementation have a name from a database point of view? And is it good or bad practice?
thanks
Yes, using multiple keys to reference a unique record is known as a composite key. Whether it's good or bad practice depends on your database schema.
Example Scenario
Let's pretend that we have 4 tables: A, B, C and Z. Z maintains a reference to A, B, and C. Each record contains a reference to a single table. Below are two potential schemas for Z.
Single Foreign Key
We need a column to store the reference for each of the tables. That means we'll end up with NULL values for the unused columns. In the future, if we introduce a D table, we'll be required to add a new column to Z.
id | a_id | b_id | c_id
-----------------------
1 | 1 | NULL | NULL
2 | NULL | 1 | NULL
3 | NULL | NULL | 1
Composite Foreign Key
We start off with two columns for building a reference to the other tables. However, when we introduce D we do not need to modify the schema. In addition, we'll never have columns with NULL values.
id | z_id | z_type
------------------
1 | 1 | 'A'
2 | 1 | 'B'
3 | 1 | 'C'
Therefore, we can achieve some level of normalisation by using composite foreign keys. Provided that both columns are indexed, querying should be very fast. While it may be slower than using a single foreign key, the difference is insignificant.
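In Rails, the z_id/z_type pair above is what a polymorphic belongs_to produces; a minimal sketch, with hypothetical model and column names:

class Z < ActiveRecord::Base
  # assumes the zs table has owner_id (integer) and owner_type (string) columns
  belongs_to :owner, polymorphic: true
end

class A < ActiveRecord::Base
  has_many :zs, as: :owner
end

class B < ActiveRecord::Base
  has_many :zs, as: :owner
end

# Z.create(owner: A.first)  # stores owner_id = 1 and owner_type = "A"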
Often it's tempting to use Rails' polymorphic associations whenever you have data that appears to be the same (e.g. Address). You should always exercise caution when coupling many models together. A good indicator that you've gone too far is when you notice yourself switching based on the association type. A potential solution is to refactor the common code out into a module and mix that into the models you care about instead, as sketched below.
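A hedged sketch of that module approach (the module, method and schema names are hypothetical, and it assumes each including model has its own address association):

module HasAddress
  extend ActiveSupport::Concern

  included do
    has_one :address
    validates :address, presence: true
  end

  # shared behaviour lives here instead of switching on an association type
  def formatted_address
    "#{address.street}, #{address.city}"
  end
end

class User < ActiveRecord::Base
  include HasAddress
end

class Company < ActiveRecord::Base
  include HasAddress
end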
Not all databases allow a composite foreign key, and personally I'd shoot anyone who tried to do that to my database. Foreign keys MUST be maintained by the database, not by something like Rails. There are other processes that typically hit a database where this critical relationship must be checked, and they may not use an ORM (I certainly wouldn't use one to import a 10,000,000 record file, update a million price records, or fix a data integrity problem).
I have a query used for statistical purposes. It breaks down the number of users that have logged in a given number of times. A User has_many installations, and an installation has a login_count.
select total_login as `logins`, count(*) as `users`
from (select u.user_id, sum(login_count) as total_login
      from user u
      inner join installation i on u.user_id = i.user_id
      group by u.user_id) g
group by total_login;
+--------+-------+
| logins | users |
+--------+-------+
| 2 | 3 |
| 6 | 7 |
| 10 | 2 |
| 19 | 1 |
+--------+-------+
Is there some elegant ActiveRecord style find to obtain this same information? Ideally as a hash collection of logins and users: { 2=>3, 6=>7, ...
I know I can use SQL directly, but I wanted to know how this could be solved in Rails 3.
# Our relation variables (RelVars)
U = Table(:user, :as => 'U')
I = Table(:installation, :as => 'I')

# perform operations on relations
G = U.join(I) # (implicit) will reference the final joined relationship

# (explicit) predicate = Arel::Predicates::Equality.new U[:user_id], I[:user_id]
G = U.join(I).on(U[:user_id].eq(I[:user_id]))

# Keep in mind you MUST PROJECT for this to make sense
G.project(U[:user_id], I[:login_count].sum.as('total_login'))

# Now you can group
G = G.group(U[:user_id])

# from this group you can project and group again (or group and project)
# for the final relation
TL = G.project(G[:total_login].as('logins'), G[:id].count.as('users')).group(G[:total_login])
Keep in mind this is VERY verbose because I wanted to show you the order of operations, not just say "here is the code". It could actually be written in about half as many lines.
The hairy part is Count()
As a rule, any attribute in the SELECT that is not used in an aggregate should appear in the GROUP BY, so be careful with count().
Why would you group by the total_login count?
At the end of the day, I would simply ask why you don't just do a count of the total logins across all installations, since the user information is made irrelevant by the outermost count grouping.
I don't think you'll find anything as efficient as having the db do the work. Remember that you don't want to have to retrieve the rows from the db; you want the db itself to compute the answer by grouping the data.
If you want to push the SQL further into the database, you can create the query as a view in the database and then use a Rails ActiveRecord class to retrieve the results.
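A hedged sketch of that view-backed approach (the view names, migration and model class are hypothetical; the view SQL simply mirrors the query from the question, split into two views so older MySQL versions, which reject subqueries in a view's FROM clause, can cope):

# db/migrate/xxx_create_login_histograms_views.rb (hypothetical migration)
class CreateLoginHistogramsViews < ActiveRecord::Migration
  def up
    execute <<-SQL
      CREATE VIEW user_login_totals AS
      SELECT u.user_id, SUM(i.login_count) AS total_login
      FROM user u
      INNER JOIN installation i ON u.user_id = i.user_id
      GROUP BY u.user_id
    SQL

    execute <<-SQL
      CREATE VIEW login_histograms AS
      SELECT total_login AS logins, COUNT(*) AS users
      FROM user_login_totals
      GROUP BY total_login
    SQL
  end

  def down
    execute "DROP VIEW login_histograms"
    execute "DROP VIEW user_login_totals"
  end
end

# app/models/login_histogram.rb -- a read-only model over the view
class LoginHistogram < ActiveRecord::Base
end

# Build the { logins => users } hash the question asked for:
LoginHistogram.all.each_with_object({}) { |row, h| h[row.logins.to_i] = row.users.to_i }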
In the end, IMO the SQL syntax is way more readable. This Arel stuff just slows me down whenever I need a tiny bit more complexity. It's just another syntax you have to learn, and not worth it IMO. I'd stick to SQL in these cases.