I have read the Ruby docs on the query method "group", but I am having a hard time understanding how to use it.
Let's say I have a table called users, with the fields name, email, and gender.
I am able to type User.group(:name).count, which returns a hash with key-value pairs of {name: count}.
Why does User.group(:name) not work?
Is there a way of grouping similar names, and accessing those records?
ex. User.group(:name).first or User.group(:name).each
It seems to me that I am thinking of using "group" incorrectly.
Why does User.group(:name) not work?
When you use GROUP BY in SQL, the SELECT list has to be compatible with the grouping. User.group(:name) on its own selects every column of users while grouping only by name, which a database such as PostgreSQL rejects with an error.
In your first case the query was SELECT COUNT(*) FROM users GROUP BY name, which is why it worked.
As per your last sentence you need:
User.group(:name).select(:name).each do |record|
  # work with record
end
I don't know which database you are using, but here is the idea from the PostgreSQL GROUP BY documentation.
GROUP BY will condense into a single row all selected rows that share the same values for the grouped expressions. expression can be an input column name, or the name or ordinal number of an output column (SELECT list item), or an arbitrary expression formed from input-column values. In case of ambiguity, a GROUP BY name will be interpreted as an input-column name rather than an output column name.
Aggregate functions, if any are used, are computed across all rows making up each group, producing a separate value for each group (whereas without GROUP BY, an aggregate produces a single value computed across all the selected rows). When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions, since there would be more than one possible value to return for an ungrouped column.
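To make the "condense into a single row" idea concrete, here is a minimal plain-Ruby sketch that needs no database. The User struct and the sample rows are hypothetical stand-ins for the users table. It shows the hash shape that User.group(:name).count produces, and how Enumerable#group_by keeps the individual records of each group, which SQL GROUP BY does not return.

```ruby
# Hypothetical stand-in for the users table (not an ActiveRecord model).
User = Struct.new(:name, :email, :gender)

users = [
  User.new("Ann", "ann@example.com", "f"),
  User.new("Bob", "bob@example.com", "m"),
  User.new("Ann", "ann2@example.com", "f")
]

# Same shape as User.group(:name).count: one entry per distinct name.
counts = users.group_by(&:name).transform_values(&:size)

# group_by keeps every record, bucketed by name, so each group can be
# iterated over. SQL GROUP BY would have collapsed each bucket to one row.
by_name = users.group_by(&:name)
by_name.each do |name, group|
  puts "#{name}: #{group.map(&:email).join(', ')}"
end
```

In a Rails app the same Enumerable#group_by call can be made on loaded records, for example User.all.group_by(&:name), at the cost of fetching every row first.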
Related
In Rails, one uses ActiveRecord for querying the database. An ActiveRecord query results in an ActiveRecord::Relation object. Since we can call ActiveRecord::Relation#select and specify an arbitrary SQL select clause, the records returned by the database sometimes contain columns which do not exist in the database.
If the relation contains at least one row, one can get its column names from the keys of the_relation.first.attributes. When the query returns no records, however, this approach does not work.
Question
Is there any way to get the Query's resulting column names of an ActiveRecord::Relation even if no rows were returned?
The motivation
For example, when you're building a Daru::DataFrame instance or some other relational data, you'd want to obtain the attribute names even if there are no records in the result.
Yes, you can get the column names.
If the result is an ActiveRecord::Relation, you can use something like this:
the_relation.column_names
I have been using different methods to get specific fields from Active Record. Which one is faster and preferred, and how do they differ from one another?
User.all.collect(&:name)
User.all.pluck(:name)
User.all.select(:name)
User.all.map(&:name)
Thanks for your help in advance.
Each of these methods suits a different use case:
Both select and pluck issue a SQL SELECT for only the specified columns (SELECT "users"."name" FROM "users"). Hence, if you have not already fetched the users and are not going to, these methods will be more performant than map/collect.
The difference between select and pluck:
Performance: negligible when using on a reasonable number of records
Usage: select returns the list of models with the column specified, pluck returns the list of values of the column specified. Thus, again, the choice depends on the use case.
collect and map are actually aliases, so there's no difference between them. To iterate over models, however, they fetch the whole model (not just the specific column): they issue a SELECT "users".* FROM "users" request, convert the relation to an array, and map over it.
This might be useful when the relation has already been fetched. If so, it won't make additional requests, which may end up more performant than using pluck or select. But, again, this must be measured for the specific use case.
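As a rough illustration of the already-fetched case, here is a plain-Ruby sketch with no database involved. LoadedUser and its sample rows are hypothetical stand-ins for User records that are already in memory. It shows that map and collect give identical results, and that the result has the same shape pluck(:name) returns: a plain array of values rather than model objects.

```ruby
# Hypothetical stand-in for User records that were already loaded,
# as if by SELECT "users".* FROM "users".
LoadedUser = Struct.new(:id, :name, :email)

loaded = [
  LoadedUser.new(1, "Ann", "ann@example.com"),
  LoadedUser.new(2, "Bob", "bob@example.com")
]

# map and collect do the same work on the in-memory array;
# neither needs another trip to the database.
names_via_map = loaded.map(&:name)
names_via_collect = loaded.collect(&:name)

# The result is a plain array of strings, not an array of records.
puts names_via_map.inspect
```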
pluck: retrieves just the names from users, puts them in an array as strings (in this case), and gives it to you.
select: retrieves all the users from the db with just the 'name' column and returns a relation.
collect/map (alias): retrieves all the users from the db with all columns, puts them in an array of User objects with all the fields, then transforms every object into just its name and gives you this array of names.
I have listed these in what I consider their order of performance.
I'm looking to speed up queries to my SQL-backed Core Data instance (displaying records sorted by date). I know that indexing can help decrease query time, but what's the difference between:
Highlighting the entity that an attribute belongs to, then adding a comma separated list of attributes into the indexes field as seen here:
Or highlighting the attribute, then checking the indexed box as seen here:
Adding a row with a single attribute to the Indexes list is equivalent to selecting Indexed for that attribute: It creates an index for the attribute to speed up searches in query statements.
The Indexes list is meant for compound indexes. Compound indexes are useful when you know that you will be searching for values of these attributes combined in the WHERE clause of a query:
SELECT * FROM customer WHERE surname = 'Doe' AND firstname = 'Joe';
This statement could make use of a compound index surname, firstname. That index would also be useful if you just search for surname, but not if you only search for firstname. Think of the index as if it were a phone book: It is sorted by surname first, then by first name. So the order of attributes is important.
In your case you should go for the single indexes first (that is, select Indexed for the attributes you like to search for). The compound index you showed could never be used if you just search for babyId, for example.
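The phone-book analogy can be sketched in plain Ruby. This is only a model of the idea, not how Core Data or SQLite implement indexes, and the sample entries are hypothetical. A sorted array of [surname, firstname] pairs plays the role of the compound index: lookups on the leading column can use binary search, while a lookup on firstname alone has to scan everything.

```ruby
# Hypothetical sample rows: [surname, firstname].
entries = [
  %w[Smith Anna],
  %w[Doe Joe],
  %w[Adams Joe],
  %w[Doe Alice],
  %w[Miller Bob]
]

# A compound index keeps rows ordered by surname first, then firstname.
index = entries.sort

# Searching by surname (the leading column) can use binary search,
# because all "Doe" rows sit next to each other in the index.
first_doe = index.bsearch_index { |surname, _| surname >= "Doe" }
does = index.drop(first_doe).take_while { |surname, _| surname == "Doe" }

# Searching by surname AND firstname also works: compare the whole pair.
joe_doe = index.bsearch { |pair| (pair <=> %w[Doe Joe]) >= 0 }

# Searching by firstname alone cannot use the ordering: the "Joe" rows
# are scattered through the index, so every entry must be inspected.
joes = index.select { |_, firstname| firstname == "Joe" }
```

Swapping the column order in the sort would make firstname lookups fast and surname-only lookups slow, which is why the order of attributes matters.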
At WWDC 2017, Apple updated this so that it is instead done with a Fetch Index (see: https://developer.apple.com/videos/play/wwdc2017/210/?time=997)
To add it, select the entity and then go to Editor -> Add Fetch Index
I have this query in a project model:
report = self.reports.group(:key_id)
report.select('key_id, count(*) as count')
What do I need to add in order to get another column (level) from reports table?
I tried adding my column to select, but that means that I have to group by it as well, and I only want to get the unique records by key_id.
Thank you
If you want to include information about another field, then you have to include that field in the group expression or as part of an aggregate field. That's a fundamental aspect of SQL.
For example, if you want to count the number of occurrences of various values of level associated with each key_id then you can add a count(level) column. The aggregation field can get arbitrarily "fancy", such as counting up the number of occurrences of level within various bands as you've mentioned in your comment.
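Here is a plain-Ruby sketch of what such aggregate columns compute per group, without a database. The Report struct, the integer levels, and the sample rows are hypothetical. Each key_id collapses to one summary row, and level can only appear through an aggregate such as a count or a maximum.

```ruby
# Hypothetical stand-in for rows of the reports table.
Report = Struct.new(:key_id, :level)

reports = [
  Report.new(1, 3),
  Report.new(1, 1),
  Report.new(1, nil),
  Report.new(2, 2)
]

# One summary row per key_id, mirroring GROUP BY key_id.
summary = reports.group_by(&:key_id).map do |key_id, rows|
  levels = rows.map(&:level).compact # aggregates such as count(level) skip NULLs
  {
    key_id: key_id,
    count: rows.size,         # like count(*)
    level_count: levels.size, # like count(level)
    max_level: levels.max     # like max(level)
  }
end

summary.each { |row| puts row.inspect }
```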
I'd like to use PostgreSQL's rank() function on one of my columns.
Character.select("*, rank() OVER (ORDER BY points DESC)")
But since I don't have a rank column in my table, Rails doesn't include it with the query. What would be the correct way to get the rank included in my ActiveRecord object?
Try this:
Character.find_by_sql("SELECT *, rank() OVER (ORDER BY points DESC) FROM characters")
It should return Character objects with a rank attribute, as documented here. However, this may not be database-agnostic and tends to get messy if you pass the objects around.
Another (expensive) solution is to add a rank column to your table, and have a callback recalculate all records' rank using .order whenever a record is saved or destroyed.
Edit:
Another idea, suitable for single-record queries, can be seen here.
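For reference, the values that rank() OVER (ORDER BY points DESC) assigns can be reproduced in plain Ruby. This is only a sketch of the ranking rule, not a replacement for the query; the RankedCharacter struct and the sample data are hypothetical. Tied rows share a rank, and the next rank skips ahead by the size of the tie.

```ruby
# Hypothetical stand-in for rows of the characters table.
RankedCharacter = Struct.new(:name, :points)

characters = [
  RankedCharacter.new("Ayla", 50),
  RankedCharacter.new("Brom", 80),
  RankedCharacter.new("Cira", 50),
  RankedCharacter.new("Dorn", 20)
]

# Order by points descending; the name is only a tie-breaker that keeps
# the output order deterministic.
sorted = characters.sort_by { |c| [-c.points, c.name] }

# rank() rule: a row's rank is 1 plus the number of rows with strictly
# more points, so ties share a rank and the following rank is skipped.
ranked = sorted.map do |c|
  rank = 1 + characters.count { |other| other.points > c.points }
  [c.name, c.points, rank]
end

ranked.each { |name, points, rank| puts "#{rank}. #{name} (#{points})" }
```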