I am not sure if I am going about this the best way, but I will try to explain what I am trying to do.
I have the following domain classes
class User {
static hasMany = [goals: Goal]
}
So each User has a list of Goal objects. I want to be able to take an instance of User and return 5 Users with the highest number of matching Goal objects (with the instance) in their goals list.
Can someone kindly explain how I might go about doing this?
The easiest and most efficient way to achieve this is using plain SQL. Assuming you have these tables
users [id]
goals [id, description]
user_goals [user_id, goal_id]
You can have the following query to do what you need:
set @userId=123;
select user_id, count(*) as matched from user_goals
where user_id!=@userId
and goal_id in (select ug.goal_id from user_goals ug where ug.user_id=@userId)
group by user_id order by matched desc limit 5;
This takes a user id and returns a list of other users with matching goals, sorted by the number of matches. Wrap it up in a GoalService and you're done!
class GoalService {
def findUsersWithSimilarGoals(user) {
// ...
}
}
It may also be possible to do this with criteria or HQL, but with queries like this it's usually easier to use SQL.
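If it helps to see what that query does step by step, here is a minimal in-memory sketch in Ruby. The sample rows and the similar_users helper are made up for illustration, and the tie-break on user id is my own assumption (the SQL leaves ties unordered). It only mirrors the filter, group, order and limit steps and is not meant to replace running the SQL in the database.

```ruby
# Hypothetical rows of the user_goals join table: [user_id, goal_id].
user_goals = [
  [1, 10], [1, 11], [1, 12],
  [2, 10], [2, 11],
  [3, 12],
  [4, 10], [4, 11], [4, 12],
  [5, 99]
]

# Returns [[user_id, matched], ...] for the users sharing the most goals
# with user_id, best match first (illustrative helper, not a GORM API).
def similar_users(rows, user_id, limit = 5)
  own_goals = rows.select { |uid, _| uid == user_id }.map { |_, gid| gid }
  rows
    .reject { |uid, _| uid == user_id }            # where user_id != @userId
    .select { |_, gid| own_goals.include?(gid) }   # and goal_id in (subquery)
    .group_by { |uid, _| uid }                     # group by user_id
    .map { |uid, matches| [uid, matches.size] }    # count(*) as matched
    .sort_by { |uid, matched| [-matched, uid] }    # order by matched desc (ties by id: assumption)
    .first(limit)                                  # limit 5
end

p similar_users(user_goals, 1) # => [[4, 3], [2, 2], [3, 1]]
```

User 5 shares no goals with user 1, so it never shows up, just as it would produce no row in the SQL result.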
If you're looking for a simple match, perhaps the easiest way would be to do a findAll for each Goal and then count the number of results that each other User appears in:
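Map user2Count = [:]
for (goal in myUser.goals) {
    // findAllByGoal stands for "all users that have this goal"; with a hasMany
    // this may need a criteria or HQL query rather than a dynamic finder
    for (u in User.findAllByGoal(goal)) {
        if (u.id == myUser.id) continue // skip the user being matched
        user2Count[u] = (user2Count[u] ?: 0) + 1
    }
}
// get the top 5 users (take copes with fewer than 5 matches)
def topUsers = user2Count.entrySet().sort { -it.value }.take(5)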
This may be too slow, depending on your needs, but it is simple. If many users share the same goals then you could cache the results of findAllByGoal.
Assuming this simplified schema:
users has_many discount_codes
discount_codes has_many orders
I want to grab all users, and if they happen to have any orders, only include the orders that were created between two dates. But if they don't have orders, or have orders only outside of those two dates, still return the users and do not exclude any users ever.
What I'm doing now:
users = User.all.includes(discount_codes: :orders)
users = users.where("orders.created_at BETWEEN ? AND ?", date1, date2).
  or(users.where(orders: { id: nil }))
I believe my OR clause allows me to retain users who do not have any orders whatsoever, but what happens is if I have a user who only has orders outside of date1 and date2, then my query will exclude that user.
For what it's worth, I want to use this orders where clause here specifically so I can avoid n + 1 issues later in determining orders per user.
Thanks in advance!
It doesn't make sense to try and control the orders that are loaded as part of the where clause for users. If you were to control that it'd have to be part of the includes (which I think means it'd have to be a part of the association).
Although technically it can combine them into a single query in some cases, ActiveRecord is going to do this as two queries.
The first query will be executed when you go to iterate over the users and will use that where clause to limit the users found.
It will then run a second query behind the scenes based on that includes statement. This will simply be a query to get all orders which are associated with the users that were found by the previous query. As such the only way to control the orders that are found through the user's where clause is to omit users from the result set.
If I were you I would create an instance method in User model for what you are looking for but instead of using where use a select block:
def orders_in_timespan(from, to)
  # `end` is a reserved word in Ruby, so the bounds are named from/to
  orders.select { |o| o.created_at.between?(from, to) }
end
Because ActiveRecord caches the orders loaded by includes on each user instance, starting your users query with an includes should mean this select does not result in n queries.
Something like:
render json: User.includes(:orders), methods: :orders_in_timespan
Of course, the easiest way to confirm the number of queries is to look at the logs. I believe this approach should have two queries regardless of the number of users being rendered (as likely does your code in the question).
Also, I'm not sure how familiar you are with SQL, but you can call .to_sql on the end of things such as your users variable to see the SQL that would be generated, which might help shed some light on the discrepancies between what you're getting and what you're looking for.
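To make the "filter in Ruby, not in the where clause" idea concrete, here is a small stand-alone sketch. Plain hashes stand in for the ActiveRecord models, and the field names and sample data are assumptions for illustration only. The point it tries to show is that every user survives and only each user's order list gets narrowed.

```ruby
# Plain hashes stand in for the ActiveRecord models (assumed shapes).
# `end` is a reserved word in Ruby, so the bounds are named from/to.
def orders_in_timespan(user, from, to)
  user[:orders].select { |o| o[:created_at].between?(from, to) }
end

users = [
  { name: "ann", orders: [{ id: 1, created_at: Time.utc(2024, 1, 10) }] },
  { name: "bob", orders: [{ id: 2, created_at: Time.utc(2023, 6, 1) }] },
  { name: "cid", orders: [] }
]

from = Time.utc(2024, 1, 1)
to   = Time.utc(2024, 1, 31)

# Every user is kept; only the per-user order list is narrowed.
users.each do |u|
  ids = orders_in_timespan(u, from, to).map { |o| o[:id] }
  puts "#{u[:name]}: #{ids.inspect}"
end
```

Here bob has an order outside the window and cid has none at all, and both still appear with an empty list, which is the behaviour the question asks for.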
Option 1: Write a custom query in SQL (ugly).
Option 2: Create 2 separate queries like below...
@users = User.limit(10)
@orders = Order.joins(:discount_code)
  .where(created_at: 10.days.ago..1.day.ago, discount_codes: { user_id: @users.select(:id) })
  .group_by { |order| order.discount_code.user_id }
Now you can use it like this ...
@users.each do |user|
  # users without orders in the range have no key in the hash
  orders = @orders[user.id] || []
  puts user.name
  puts user.id
  puts orders.count
end
I hope this will solve your problem.
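The part of the two-query approach that is easy to get wrong is the lookup for users who have no orders in the range. A rough stand-alone sketch of just that step, with made-up hashes in place of the query results and a hypothetical order_counts helper:

```ruby
# Hypothetical result of the second query: orders already filtered by date,
# each carrying the id of the user who owns its discount code.
orders = [
  { id: 1, user_id: 1 },
  { id: 2, user_id: 1 },
  { id: 3, user_id: 3 }
]
users = [{ id: 1, name: "ann" }, { id: 2, name: "bob" }, { id: 3, name: "cid" }]

# Maps each user name to the number of orders found for that user.
def order_counts(users, orders)
  by_user = orders.group_by { |o| o[:user_id] }
  # fetch with a default keeps users that have no orders in the window
  users.to_h { |u| [u[:name], by_user.fetch(u[:id], []).size] }
end

p order_counts(users, orders)
```

bob has no entry in the grouped hash, so without the default the lookup would return nil and calling a method on it would fail.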
You need to use joins instead of includes. Rails joins use inner joins and will reject all the records which don't have associations.
User.joins(discount_codes: :orders).where(orders: {created_at: [10.days.ago..1.day.ago]}).distinct
This will give you all distinct users who placed orders in a given period of time.
user = User.joins(:discount_codes).joins(:orders).where("orders.created_at BETWEEN ? AND ?", date1, date2) +
User.left_joins(:discount_codes).left_joins(:orders).group("users.id").having("count(orders.id) = 0")
Say I'm modeling Students, Lessons, and Teachers. Given a single student enrolled in many lessons, how would I find all of their teachers of classes that are level 102? For that matter, how would I find all of their lessons' teachers? Right now, I have this:
s = Mongoid::Student.find_by(name: 'Billy')
l = s.lessons.where(level: 102)
t = l.map { |lesson| lesson.teachers }.flatten
Is there a way to turn the second two lines into one query?
Each collection requires at least one query, there's no way to access more than one collection in one query (i.e. no JOINs) so that's the best you can do. However, this:
t = l.map { |lesson| lesson.teachers }.flatten
is doing l.length queries to get the teachers per lesson. You can clean that up by collecting all the teacher IDs from the lessons:
teacher_ids = l.map(&:teacher_ids).flatten.uniq # Or use `inject` or ...
and then grab the teachers based on those IDs:
t = Teacher.find(teacher_ids)
# or
t = Teacher.where(:id.in => teacher_ids).to_a
If all those queries don't work for you then you're stuck with denormalizing something so that you have everything you need embedded in a single collection; this would of course mean that you'd have to maintain the copies as things change and periodically sanity check the copies for consistency problems.
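As a quick illustration of the id-collecting step on its own, here is a sketch with plain hashes standing in for the lesson documents (the data and the teacher_ids_for helper are made up). flat_map does the map plus flatten in one pass, and uniq makes sure each teacher is looked up only once.

```ruby
# Hypothetical lesson documents, each holding an array of teacher ids.
lessons = [
  { level: 102, teacher_ids: [1, 2] },
  { level: 102, teacher_ids: [2, 3] },
  { level: 101, teacher_ids: [4] }
]

# Collects the distinct teacher ids of all lessons at the given level.
def teacher_ids_for(lessons, level)
  lessons
    .select { |l| l[:level] == level }
    .flat_map { |l| l[:teacher_ids] }
    .uniq
end

p teacher_ids_for(lessons, 102) # => [1, 2, 3]
```

The resulting list is what would be handed to a single Teacher lookup, so the total stays at one query per collection.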
Is it possible to use an order by on a property together with a where clause and still know the index/position of each result?
I mean, when sorting with order by, we need to be able to know the position of a result in the sort.
Imagine a scoreboard with 1 million user nodes. I do an order by on user.score with a where "name = user_name", and I want to know the current rank of that user. I can't find how to do this using order by ...
start game=node(1)
match game-[:has_child_user]->user
with user
order by user.score
with user
where user.name = "my_user"
return user , "the position in the sort";
the expected result would be :
node_user | rank
(i don't want to fetch one million entries at client side to know the current rank/position of a node in the ORDER BY!)
This functionality does not exist today in Cypher. Do you have an example of what this would look like in SQL? Would the below be something that fits the bill? (just a sketch, not working!)
(your code)
start game=node(1)
match game-[:has_child_user]->user
with user
order by user.score
(+ this code)
with user, index() as rank
return user.name, rank;
If you have more thoughts or want to start hacking on this please open an issue at https://github.com/neo4j/neo4j/issues
For the time being there is a work around that you can do:
start n=node(0),rank_node=node(1)
match n-[r:rank]->rn
where rn.score <= rank_node.score
return rank_node,count(*) as pos;
For live example see: http://console.neo4j.org/?id=bela20
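The idea behind the workaround is that a position in a sort can be obtained by counting instead of sorting: the position of a node in an ascending order by score equals the number of nodes whose score is less than or equal to its own. A tiny sketch of that equivalence in Ruby, with made-up scores and a hypothetical position_by_count helper:

```ruby
# Position in an ascending sort = number of entries with score <= own score.
def position_by_count(scores, own_score)
  scores.count { |s| s <= own_score }
end

scores = [50, 10, 30, 40] # made-up user scores, in no particular order

puts position_by_count(scores, 30)   # => 2 (10 and 30 are <= 30)
puts scores.sort.index(30) + 1       # => 2, same answer via a full sort
```

For a leaderboard where the highest score ranks first, the comparison would presumably be flipped to >=. With tied scores the count gives every tied entry the last position of the group.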
I have a domain class called Order and that class has hasMany relation with Item class.
When I am querying for the list of orders with certain restrictions I am getting as many instances of Order as there are items.
So, for example, if an Order instance has references to 3 instances of Item, then the criteria call on Order returns 3 duplicate instances of that Order. I am not sure it's worth mentioning, but the domain class Order has fetchMode set to "eager".
I am really puzzled with what's going on there. Any help in this regard will be greatly appreciated. Snippet of code is attached:
def clazz = "cust.Order"
def criteria = clazz.createCriteria()
println("clazz == "+Order.list())// returning correct data i.e unique instance of order
def filter = {
// trimmed down all filtering criteria for debugging
}//close filter
List results = criteria.list(max:params?.max,offset:params?.offset,filter)
results.each{Object data->
println(data.getClass())
}
println("results == "+results)
Thanks again
One solution is to use this inside your query:
resultTransformer org.hibernate.Criteria.DISTINCT_ROOT_ENTITY
If you call criteria.listDistinct instead of criteria.list, duplicates will be eliminated.
Criteria API is just a wrapper for constructing a SQL query. In your case, the query in question has JOINs in it (because of the eager fetching), and returns one row for each pair of an Order and a matching Item. Each row returned is included in results as a separate Order instance.
The easiest way to remove duplicates is to put all the results in a Set, like this:
def resultSet = new HashSet()
resultSet.addAll(results)
println("results == " + resultSet)
You could also use dynamic finders, as in Order.findAllBy*. Depending on how complicated your filter is, this could be easy or tough :)
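To picture why the join produces duplicates and what collapsing them amounts to, here is a small sketch in Ruby. The rows are made up and stand for what an Order-to-Item join would return, one row per item. Grouping by the order id folds them back into one entry per order, which is roughly the effect of the distinct-root approaches mentioned in this thread.

```ruby
# Hypothetical rows from an Order JOIN Item query: one row per item.
rows = [
  { order_id: 1, item: "pen" },
  { order_id: 1, item: "ink" },
  { order_id: 1, item: "pad" },
  { order_id: 2, item: "cup" }
]

# Folds the joined rows back into one entry per order, keeping its items.
def distinct_orders(rows)
  rows
    .group_by { |r| r[:order_id] }
    .map { |id, group| { id: id, items: group.map { |r| r[:item] } } }
end

puts rows.size                    # => 4 rows, order 1 appears three times
puts distinct_orders(rows).size   # => 2 distinct orders
```

Keep in mind that removing duplicates after the query runs means max and offset were applied to the joined rows, so a page can end up with fewer orders than requested.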
I have a User class that hasMany = [friends:User]
Now I am trying to display the friends list - ${user.friends}
However, I'd like to be able to apply parameters like I can with, for example, User.findAllBy(user, [max:10, sort: 'dateCreated', order: 'desc'])
Can someone kindly tell me how to do this on a one-to-many in Grails / Groovy ?
I'd use HQL:
String hql = '''
select u from User u, User u2
where u in elements(u2.friends) and u2=:user
order by u.dateCreated desc
'''
int max = ...
int offset = ...
def friends = User.executeQuery(hql, [user: user], [max: max, offset: offset])
You could try to filter the friends collection, but as soon as you do anything with it it'll get fully loaded from the database, so if you only want 10 instances you'll have wasted loading all of the rest.
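For reference, this is what max, offset and the descending sort mean when applied to a list that is already in memory. It is a Ruby sketch with made-up data and a hypothetical friends_page helper. It is exactly the in-memory filtering warned about above, so treat it as a description of the semantics rather than something to do with a large collection.

```ruby
# Returns one page of friends, newest first (in-memory illustration only).
def friends_page(friends, max:, offset: 0)
  friends
    .sort_by { |f| f[:date_created] }
    .reverse          # order by dateCreated desc
    .drop(offset)     # offset
    .first(max)       # max
end

friends = [
  { name: "ann", date_created: Time.utc(2024, 1, 1) },
  { name: "bob", date_created: Time.utc(2024, 3, 1) },
  { name: "cid", date_created: Time.utc(2024, 2, 1) }
]

p friends_page(friends, max: 2).map { |f| f[:name] }             # => ["bob", "cid"]
p friends_page(friends, max: 2, offset: 2).map { |f| f[:name] }  # => ["ann"]
```

The HQL version gives the same pages but lets the database do the sorting and slicing, so only the requested rows are loaded.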
I just came across the GORM Labs plugin for Grails and it offers the following functionality
"For every "hasMany" property, there is now an associated method that takes pagination properties ("offset" and "max") and produces the appropriate page. There is also a "countBars" instance property that will provide the total size of the "bars" collection. This is all done with a separate database query if the collection has not been initialized previously, so you can avoid loading all the elements of the collection"
http://www.grails.org/plugin/gorm-labs