Rails: 4.1.2
Database: PostgreSQL
For one of my queries, I am using methods from both the textacular gem and Active Record. How can I chain some of the following queries with an "OR" instead of an "AND":
people = People.where(status: status_approved).fuzzy_search(first_name: "Test").where("last_name LIKE ?", "Test")
I want to chain the last two scopes (fuzzy_search and the where after it) together with an "OR" instead of an "AND." So I want to retrieve all People who are approved AND (whose first name is similar to "Test" OR whose last name contains "Test"). I've been struggling with this for quite a while, so any help would be greatly appreciated!
I digged into fuzzy_search and saw that it will be translated to something like:
SELECT "people".*, COALESCE(similarity("people"."first_name", 'test'), 0) AS "rankxxx"
FROM "people"
WHERE (("people"."first_name" % 'abc'))
ORDER BY "rankxxx" DESC
That says if you don't care about preserving order, it will just filter the result by WHERE (("people"."first_name" % 'abc'))
Knowing that and now you can simply write the query with similar functionality:
People.where(status: status_approved)
.where('(first_name % :key) OR (last_name LIKE :key)', key: 'Test')
In case you want order, please specify what would you like the order will be after joining 2 conditions.
After a few days, I came up with the solution! Here's what I did:
This is the query I wanted to chain together with an OR:
people = People.where(status: status_approved).fuzzy_search(first_name: "Test").where("last_name LIKE ?", "Test")
As Hoang Phan suggested, when you look in the console, this produces the following SQL:
SELECT "people".*, COALESCE(similarity("people"."first_name", 'test'), 0) AS "rank69146689305952314"
FROM "people"
WHERE "people"."status" = 1 AND (("people"."first_name" % 'Test')) AND (last_name LIKE 'Test') ORDER BY "rank69146689305952314" DESC
I then dug into the textacular gem and found out how the rank is generated. I found it in the textacular.rb file and then crafted the SQL query using it. I also replaced the "AND" that connected the last two conditions with an "OR":
# Generate a random number for the ordering
rank = rand(100000000000000000).to_s
# Create the SQL query
sql_query = "SELECT people.*, COALESCE(similarity(people.first_name, :query), 0)" +
" AS rank#{rank} FROM people" +
" WHERE (people.status = :status AND" +
" ((people.first_name % :query) OR (last_name LIKE :query_like)))" +
" ORDER BY rank#{rank} DESC"
I took out all of quotation marks in the SQL query when referring to tables and fields because it was giving me error messages when I kept them there and even if I used single quotes.
Then, I used the find_by_sql method to retrieve the People object IDs in an array. The symbols (:status, :query, :query_like) are used to protect against SQL injections, so I set their values accordingly:
# Retrieve all the IDs of People who are approved and whose first name and last name match the search query.
# The IDs are sorted in order of most relevant to the search query.
people_ids = People.find_by_sql([sql_query, query: "Test", query_like: "%Test%", status: 1]).map(&:id)
I get the IDs and not the People objects in an array because find_by_sql returns an Array object and not a CollectionProxy object, as would normally be returned, so I cannot use ActiveRecord query methods such as where on this array. Using the IDs, we can execute another query to get a CollectionProxy object. However, there's one problem: If we were to simply run People.where(id: people_ids), the order of the IDs would not be preserved, so all the relevance ranking we did was for nothing.
Fortunately, there's a nice gem called order_as_specified that will allow us to retrieve all People objects in the specific order of the IDs. Although the gem would work, I didn't use it and instead wrote a short line of code to craft conditions that would preserve the order.
order_by = people_ids.map { |id| "people.id='#{id}' DESC" }.join(", ")
If our people_ids array is [1, 12, 3], it would create the following ORDER statement:
"people.id='1' DESC, people.id='12' DESC, people.id='3' DESC"
I learned from this comment that writing an ORDER statement in this way would preserve the order.
Now, all that's left is to retrieve the People objects from ActiveRecord, making sure to specify the order.
people = People.where(id: people_ids).order(order_by)
And that did it! I didn't worry about removing any duplicate IDs because ActiveRecord does that automatically when you run the where command.
I understand that this code is not very portable and would require some changes if any of the people table's columns are modified, but it works perfectly and seems to execute only one query according to the console.
Related
In our application, the Recipe model has many ingredients (many-to-many relationship implemented using :through). There is a query to return all the recipes where at least one ingredient from the list is contained (using ILIKE or SIMILAR TO clause). I would like to pose two questions:
What is the cleanest way to write the query which will return this in Rails 6 with ActiveRecord. Here is what we ended up with
ingredients_clause = '%(' + params[:ingredients].map { |i| i.downcase }.join("|") + ')%'
recipes = recipes.where("LOWER(ingredients.name) SIMILAR TO ?", ingredients_clause)
Note that recipes is already created before this point.
However, this is a bit dirty solution.
I also tried to use ILIKE = any(array['ing1', 'ing2',..]) with the following:
ingredients_clause = params[:ingredients].map { |i| "'%#{i}%'" }.join(", ")
recipes = recipes.where("ingredients.name ILIKE ANY(ARRAY[?])", ingredients_clause)
This won't work since ? automatically adds single quotes so it would be
ILIKE ANY (ARRAY[''ing1', 'ing2', 'ing3'']) which is of course wrong.
Here, ? is used to sanitise parameters for SQL query, so avoid possible SQL injection attacks. That is why I don't want to write a plain query formed from params.
Is there any better way to do this?
What is the best approach to order results by the number of ingredients that are matched? For example, if I search for all recipes that contains ingredients ing1 and ing2 it should return those which contains both before those which contains only one ingredient.
Thanks in advance
For #1, a possible solution would be something like (assuming the ingredients table is already joined):
recipies = recipies.where(Ingredients.arel_table[:name].lower.matches_any(params[:ingredients]))
You can find more discussion on this kind of topic here: Case-insensitive search in Rails model
You can access a lot of great SQL query features via #arel_table.
#2 If we assume all the where clauses are applied to recipies already:
recipies = recipies
.group("recipies.id")
# Lets Rails know you meant to put a raw SQL expression here
.order(Arel.sql("count(*) DESC"))
Assuming this simplified schema:
users has_many discount_codes
discount_codes has_many orders
I want to grab all users, and if they happen to have any orders, only include the orders that were created between two dates. But if they don't have orders, or have orders only outside of those two dates, still return the users and do not exclude any users ever.
What I'm doing now:
users = User.all.includes(discount_codes: :orders)
users = users.where("orders.created_at BETWEEN ? AND ?", date1, date2).
or(users.where(orders: { id: nil })
I believe my OR clause allows me to retain users who do not have any orders whatsoever, but what happens is if I have a user who only has orders outside of date1 and date2, then my query will exclude that user.
For what it's worth, I want to use this orders where clause here specifically so I can avoid n + 1 issues later in determining orders per user.
Thanks in advance!
It doesn't make sense to try and control the orders that are loaded as part of the where clause for users. If you were to control that it'd have to be part of the includes (which I think means it'd have to be a part of the association).
Although technically it can combine them into a single query in some cases, activerecord is going to do this as two queries.
The first query will be executed when you go to iterate over the users and will use that where clause to limit the users found.
It will then run a second query behind the scenes based on that includes statement. This will simply be a query to get all orders which are associated with the users that were found by the previous query. As such the only way to control the orders that are found through the user's where clause is to omit users from the result set.
If I were you I would create an instance method in User model for what you are looking for but instead of using where use a select block:
def orders_in_timespan(start, end)
orders.select{ |o| o.between?(start, end) }
end
Because of the way ActiveRecord will cache the found orders from the includes against the instance then if you start off with an includes in your users query then I believe this will not result in n queries.
Something like:
render json: User.includes(:orders), methods: :orders_in_timespan
Of course, the easiest way to confirm the number of queries is to look at the logs. I believe this approach should have two queries regardless of the number of users being rendered (as likely does your code in the question).
Also, I'm not sure how familiar you are with sql but you can call .to_sql on the end of things such as your users variable in order to see the sql that would be generated which might help shed some light on the discrepancies between what you're getting and what you're looking for.
Option 1: Write a custom query in SQL (ugly).
Option 2: Create 2 separate queries like below...
#users = User.limit(10)
#orders = Order.joins(:discount_code)
.where(created_at: [10.days.ago..1.day.ago], discount_codes: {user_id: users.select(:id)})
.group_by{|order| order.discount_code.user_id}
Now you can use it like this ...
#users.each do |user|
orders = #orders[user.id]
puts user.name
puts user.id
puts orders.count
end
I hope this will solve your problem.
You need to use joins instead of includes. Rails joins use inner joins and will reject all the records which don't have associations.
User.joins(discount_codes: :orders).where(orders: {created_at: [10.days.ago..1.day.ago]}).distinct
This will give you all distinct users who placed orders in a given period of time.
user = User.joins(:discount_codes).joins(:orders).where("orders.created_at BETWEEN ? AND ?", date1, date2) +
User.left_joins(:discount_codes).left_joins(:orders).group("users.id").having("count(orders.id) = 0")
I have a set-up with multiple contests and objects. They are tied together with a has_many :through arrangement with contest_objs. contest_objs also has votes so I can have several contests including several objects. I have a complex SQL setup to calculate the current ranking. However, I need to specify the contest in the SQL select statement for the ranking. I am having difficulty doing this. This is what I got so far:
#objects = #contest.objects.select('"contest_objs"."votes" AS v, name, "objects"."id" AS id,
(SELECT COUNT(DISTINCT "oi"."object_id")
FROM contest_objs oi
WHERE ("oi"."votes") > ("contest_objs"."votes"))+1 AS vrank')
Is there any way in the selection of vrank to specify that WHERE also includes "oi"."contest_id" = #contest.id ?
Since #contest.id is an integer and does not present any risk of an SQL Injection, you could do the following using string interpolation :
Model.select("..... WHERE id = #{#contest.id}")
Another possible solution would be to build your subquery using ActiveRecord, and then call .to_sql in order to get the generated SQL, and insert it in your main query.
Use sanitize_sql_array:
sanitize_sql_array('select ? from foo', 'bar')
If you're outside a model, because the method is protected you have to do this:
ActiveRecord::Base.send(:sanitize_sql_array, ['select ? from foo', 'bar'])
http://apidock.com/rails/ActiveRecord/Sanitization/ClassMethods/sanitize_sql_array
You can insert variables into sql commands like this:
Model.select("...... WHERE id = ?", #contest.id)
Rails will escape the values for you.
Edit:
This does not work as stated by Intrepidd in the comments, use string interpolation like he suggested in his answer. That is safe for integer parameters.
If you find yourself inserting several strings in a query, you could consider using find_by_sql, which gives you the above mentioned ? replacement, but you can't use it with chaining, so rewriting the whole query would be needed.
I already have a working solution, but I would really like to know why this doesn't work:
ratings = Model.select(:rating).uniq
ratings.each { |r| puts r.rating }
It selects, but don't print unique values, it prints all values, including the duplicates. And it's in the documentation: http://guides.rubyonrails.org/active_record_querying.html#selecting-specific-fields
Model.select(:rating)
The result of this is a collection of Model objects. Not plain ratings. And from uniq's point of view, they are completely different. You can use this:
Model.select(:rating).map(&:rating).uniq
or this (most efficient):
Model.uniq.pluck(:rating)
Rails 5+
Model.distinct.pluck(:rating)
Update
Apparently, as of rails 5.0.0.1, it works only on "top level" queries, like above. Doesn't work on collection proxies ("has_many" relations, for example).
Address.distinct.pluck(:city) # => ['Moscow']
user.addresses.distinct.pluck(:city) # => ['Moscow', 'Moscow', 'Moscow']
In this case, deduplicate after the query
user.addresses.pluck(:city).uniq # => ['Moscow']
If you're going to use Model.select, then you might as well just use DISTINCT, as it will return only the unique values. This is better because it means it returns less rows and should be slightly faster than returning a number of rows and then telling Rails to pick the unique values.
Model.select('DISTINCT rating')
Of course, this is provided your database understands the DISTINCT keyword, and most should.
This works too.
Model.pluck("DISTINCT rating")
If you want to also select extra fields:
Model.select('DISTINCT ON (models.ratings) models.ratings, models.id').map { |m| [m.id, m.ratings] }
Model.uniq.pluck(:rating)
# SELECT DISTINCT "models"."rating" FROM "models"
This has the advantages of not using sql strings and not instantiating models
Model.select(:rating).uniq
This code works as 'DISTINCT' (not as Array#uniq) since rails 3.2
Model.select(:rating).distinct
Another way to collect uniq columns with sql:
Model.group(:rating).pluck(:rating)
If I am going right to way then :
Current query
Model.select(:rating)
is returning array of object and you have written query
Model.select(:rating).uniq
uniq is applied on array of object and each object have unique id. uniq is performing its job correctly because each object in array is uniq.
There are many way to select distinct rating :
Model.select('distinct rating').map(&:rating)
or
Model.select('distinct rating').collect(&:rating)
or
Model.select(:rating).map(&:rating).uniq
or
Model.select(:name).collect(&:rating).uniq
One more thing, first and second query : find distinct data by SQL query.
These queries will considered "london" and "london " same means it will neglect to space, that's why it will select 'london' one time in your query result.
Third and forth query:
find data by SQL query and for distinct data applied ruby uniq mehtod.
these queries will considered "london" and "london " different, that's why it will select 'london' and 'london ' both in your query result.
please prefer to attached image for more understanding and have a look on "Toured / Awaiting RFP".
If anyone is looking for the same with Mongoid, that is
Model.distinct(:rating)
Some answers don't take into account the OP wants a array of values
Other answers don't work well if your Model has thousands of records
That said, I think a good answer is:
Model.uniq.select(:ratings).map(&:ratings)
=> "SELECT DISTINCT ratings FROM `models` "
Because, first you generate a array of Model (with diminished size because of the select), then you extract the only attribute those selected models have (ratings)
You can use the following Gem: active_record_distinct_on
Model.distinct_on(:rating)
Yields the following query:
SELECT DISTINCT ON ( "models"."rating" ) "models".* FROM "models"
In my scenario, I wanted a list of distinct names after ordering them by their creation date, applying offset and limit. Basically a combination of ORDER BY, DISTINCT ON
All you need to do is put DISTINCT ON inside the pluck method, like follow
Model.order("name, created_at DESC").offset(0).limit(10).pluck("DISTINCT ON (name) name")
This would return back an array of distinct names.
Model.pluck("DISTINCT column_name")
edit 2
If you stumble across this, check both answers as I'd now use pluck for this
I have a fairly large custom dataset that I'd like to return to be echoe'd out as json. One part is:
l=Location.find(row.id)
tmp[row.id]=l
but I'd like to do something like:
l=Location.find(row.id).select("name, website, city")
tmp[row.id]=l
but this doesn't seem to be working. How would I get this to work?
thx
edit 1
alternatively, is there a way that I can pass an array of only the attributes I want included?
pluck(column_name)
This method is designed to perform select by a single column as direct SQL query Returns Array with values of the specified column name The values has same data type as column.
Examples:
Person.pluck(:id) # SELECT people.id FROM people
Person.uniq.pluck(:role) # SELECT DISTINCT role FROM people
Person.where(:confirmed => true).limit(5).pluck(:id)
see http://api.rubyonrails.org/classes/ActiveRecord/Calculations.html#method-i-pluck
Its introduced rails 3.2 onwards and accepts only single column. In rails 4, it accepts multiple columns
In Rails 2
l = Location.find(:id => id, :select => "name, website, city", :limit => 1)
...or...
l = Location.find_by_sql(:conditions => ["SELECT name, website, city FROM locations WHERE id = ? LIMIT 1", id])
This reference doc gives you the entire list of options you can use with .find, including how to limit by number, id, or any other arbitrary column/constraint.
In Rails 3 w/ActiveRecord Query Interface
l = Location.where(["id = ?", id]).select("name, website, city").first
Ref: Active Record Query Interface
You can also swap the order of these chained calls, doing .select(...).where(...).first - all these calls do is construct the SQL query and then send it off.
My answer comes quite late because I'm a pretty new developer. This is what you can do:
Location.select(:name, :website, :city).find(row.id)
Btw, this is Rails 4