Removing “duplicate objects” with same attributes using Array.map - ruby-on-rails

As you can see in the current code below, I am finding the duplicate based on the attribute recordable_id. What I need to do is find the duplicate based on four matching attributes: user_id, recordable_type, hero_type, recordable_id. How must I modify the code?
heroes = User.heroes
for hero in heroes
hero_statuses = hero.hero_statuses
seen = []
hero_statuses.sort! {|a,b| a.created_at <=> b.created_at } # sort by created_at
hero_statuses.each do |hero_status|
if seen.map(&:recordable_id).include? hero_status.recordable_id # check if the id has been seen already
hero_status.revoke
else
seen << hero_status # if not, add it to the seen array
end
end
end

Try this:
HeroStatus.all(:group => "user_id, recordable_type, hero_type, recordable_id",
:having => "count(*) > 1").each do |status|
status.revoke
end
Edit 2
To revoke all the later duplicate entries (keeping the earliest one in each group), do the following:
HeroStatus.all(:joins => "(
SELECT user_id, recordable_type, hero_type,
recordable_id, MIN(created_at) AS created_at
FROM hero_statuses
GROUP BY user_id, recordable_type, hero_type, recordable_id
HAVING COUNT(*) > 1
) AS A ON A.user_id = hero_statuses.user_id AND
A.recordable_type = hero_statuses.recordable_type AND
A.hero_type = hero_statuses.hero_type AND
A.recordable_id = hero_statuses.recordable_id AND
A.created_at < hero_statuses.created_at
").each do |status|
status.revoke
end

Using straight Ruby (not the SQL server):
heroes = User.heroes
for hero in heroes
hero_statuses = hero.hero_statuses
seen = {}
hero_statuses.sort_by!(&:created_at)
hero_statuses.each do |status|
key = [status.user_id, status.recordable_type, status.hero_type, status.recordable_id]
if seen.has_key?(key)
status.revoke
else
seen[key] = status # if not, record it in the seen hash under its key
end
end
remaining = seen.values
end
For lookups, always use a Hash (or a Set; here I used a Hash so that the statuses that were kept stay available through seen.values).
Note: I used sort_by!, which is new in Ruby 1.9.2; on older versions use sort_by (or require "backports").
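The same idea can be tried outside Rails. Below is a minimal sketch, assuming a Struct stand-in for the HeroStatus model (the Struct, its revoked flag, and the revoke_duplicates helper are hypothetical names for illustration, not part of the original code). It groups by the four-attribute key and keeps the earliest status in each group:

```ruby
# Stand-in for the HeroStatus model; only the fields used here are modeled.
HeroStatus = Struct.new(:user_id, :recordable_type, :hero_type,
                        :recordable_id, :created_at, :revoked) do
  # Hypothetical revoke: just flags the record instead of touching a database.
  def revoke
    self.revoked = true
  end
end

# Returns [kept, revoked]; the earliest status per composite key is kept.
def revoke_duplicates(statuses)
  groups = statuses.sort_by(&:created_at).group_by do |s|
    [s.user_id, s.recordable_type, s.hero_type, s.recordable_id]
  end
  kept = groups.values.map(&:first)
  revoked = groups.values.flat_map { |group| group.drop(1) }
  revoked.each(&:revoke)
  [kept, revoked]
end

statuses = [
  HeroStatus.new(1, "Post", "gold", 10, 3),
  HeroStatus.new(1, "Post", "gold", 10, 1),
  HeroStatus.new(1, "Post", "silver", 10, 2)
]
kept, revoked = revoke_duplicates(statuses)
puts "kept #{kept.size}, revoked #{revoked.size}"
```

Using an array as the hash key works because Ruby arrays with equal elements are equal as keys, so no string concatenation of the attributes is needed.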

Related

How to "merge" "duplicate" active record object?

So, I'm building a map where i put pins (Google map).
I have a query that gives me position of my pins, however I can have multiple pins with the exact same position but a different description.
I've been banging my head against the desk on this since this morning for some reason I can't get a working solution,
Here is what I have so far:
building_permits = BuildingPermit.select('latitude, longitude, street, city, state, permit_number, description, type, id').where(:contractor_id => params[:nid])
@bp = Array.new
building_permits.each do |bp|
@bp.push({"lat" => bp.latitude, "lng" => bp.longitude, "desc" => "<p><b>#{bp.street} #{bp.city}, #{bp.state}</b></p><p><b>Description:</b>#{bp.description}</p><p><b>Permit #{bp.permit_number}</b></p><p><b>#{bp.type}</b></p>", "id" => bp.id})
end
nb_rm = 0
building_permits.each_with_index do |bp, index|
index1 = index
building_permits.each_with_index do |bp2, index|
if bp.longitude == bp2.longitude && bp.latitude == bp2.latitude && bp.id != bp2.id
#debugger
if @bp[index1].present?
@bp[index1]["desc"] << "<br /><br /><b>Description:</b>#{bp2.description}</p><p><b>Permit #{bp2.permit_number}</b></p><p><b>#{bp2.type}</b></p>"
@bp.delete_at(index-nb_rm)
nb_rm += 1
end
end
end
end
I'm sure there is something really stupid that's screwing the whole thing, but can't find it.
Try to duplicate your pin location:
# rails < 3.1
new_record = old_record.clone
# rails >= 3.1
new_record = old_record.dup
# and then
new_record.save
After that, change the description of the pin.
I had a similar problem and wanted to post my solution -- I had duplicate authors and wanted to find any authors where the name attribute was equal, and move related objects (books and recommendations) onto the first of each duplicate group, then delete the others.
Author.all.each do |a|
@name = a.name
if Author.where(:name => @name).count > 1
first = Author.where(:name => @name).first
a.books.each{|b| b.update_attributes(author_id: first.id)}
a.recommendations.each{|r| r.update_attributes(author_id: first.id)}
end
end
Author.all.each do |a|
if a.books.count == 0 && a.recommendations.count == 0
a.delete
end
end
Hopefully that can be helpful to someone!
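For the map-pin question at the top of this thread, another option is to avoid deleting from an array while iterating over it and instead group the records by position first. This is a minimal sketch in plain Ruby, assuming a Struct stand-in for BuildingPermit (the Permit Struct and the merged_pins helper are hypothetical names, and only a few of the original fields are modeled):

```ruby
# Stand-in for the BuildingPermit model with a reduced set of fields.
Permit = Struct.new(:id, :latitude, :longitude, :description)

# Builds one pin per distinct [latitude, longitude] pair and
# concatenates the descriptions of all permits at that position.
def merged_pins(permits)
  permits.group_by { |p| [p.latitude, p.longitude] }.map do |(lat, lng), group|
    {
      "lat"  => lat,
      "lng"  => lng,
      "desc" => group.map { |p| "<p><b>Description:</b> #{p.description}</p>" }.join,
      "ids"  => group.map(&:id)
    }
  end
end

permits = [
  Permit.new(1, 40.0, -75.0, "Roof"),
  Permit.new(2, 40.0, -75.0, "Deck"),
  Permit.new(3, 41.5, -74.2, "Fence")
]
merged_pins(permits).each { |pin| puts pin.inspect }
```

Because every pin is built from a complete group, there is no index bookkeeping like nb_rm to get wrong.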

How to query many fields, allowing for NULL

I have a Rails site that logs simple actions such as when people upvote and downvote information. For every new action, an EventLog is created.
What if the user changes his or her mind? I have an after_create callback that looks for complementary actions and deletes both if it finds a recent pair. For clarity, I mean that if a person upvotes something and soon cancels, both event_logs are deleted. What follows is my callback.
# Find duplicate events by searching nearly all the fields in the EventLog table
@duplicates = EventLog.where("user_id = ? AND event = ? AND project_id = ? AND ..., ).order("created_at DESC")
if @duplicates.size > 1
@duplicates.limit(2).destroy_all
end
The above code doesn't quite work because if any of the fields happen to be nil, the query returns [].
How can I write this code so it can handle null values, and/or is there a better way of doing this altogether?
If I understood this correctly, some of the fields can be nil, and you want to find activity logs that have the same user_id and either the same project_id or a nil project_id.
So I guess this query should work for you:
ActivityLog.where(user_id: <some_id> AND activity: <complementary_id> AND :project_id.in => [<some_project_id>, nil] ....)
This way you would get the complementary event logs where user_id is same and project id may or may not be present
class ActivityLog
QUERY_HASH = Proc.new{ {user_id: self.user_id,
activity: complementary_id(self.id),
and so on....
} }
How about:
# event_log.rb
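def duplicate_attr_map
# attributes that must match for two logs to count as duplicates
[:user_id, :project_id]
end
def duplicates
# leave out attributes whose value on this record is blank
attribs = duplicate_attr_map.reject { |attr| send(attr).blank? }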
query = attribs.map { |attr| "#{attr} = ?" }.join(' AND ')
values = attribs.map { |attr| self.send(attr) }
EventLog.where(query, *values).order("created_at DESC")
end
def delete_duplicates(n)
duplicates.limit(n).delete_all if duplicates.size > 1
end
# usage:
# EventLog.find(1).delete_duplicates(2)
not tested, could be improved
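The underlying issue is that in SQL a comparison such as project_id = NULL never matches; NULL has to be tested with IS NULL. ActiveRecord's hash conditions (for example where(project_id: nil)) generate IS NULL for you, but with string conditions you have to build it yourself. Here is a minimal sketch in plain Ruby of doing that; null_safe_conditions is a hypothetical helper name, and the column names are taken from the question:

```ruby
# Builds a [sql, values] pair; nil attributes become "IS NULL"
# instead of "= ?", so they still take part in the match.
def null_safe_conditions(attrs)
  clauses = []
  values = []
  attrs.each do |column, value|
    if value.nil?
      clauses << "#{column} IS NULL"
    else
      clauses << "#{column} = ?"
      values << value
    end
  end
  [clauses.join(" AND "), values]
end

sql, values = null_safe_conditions(user_id: 7, event: "upvote", project_id: nil)
puts sql
puts values.inspect
```

The result could then be passed along the lines of EventLog.where(sql, *values), in the same way the answer above passes its query and values.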

Rails Array Conditions Query

I've written the following method to combine the References of a Section model and its children:
def combined_references
ids = []
ids << self.id
self.children.each do |child|
ids << child.id
end
Reference.where("section_id = ?", ids)
end
But section.combined_references returns the following error:
Mysql2::Error: Operand should contain 1 column(s): SELECT `references`.* FROM `references` WHERE (section_id = 3,4)
It seems to have collected the correct values for ids, have I structured the query incorrectly?
Transform last line to:
Reference.where(section_id: ids)
and it should produce:
SELECT `references`.* FROM `references` WHERE section_id IN (3,4)
And you can shorten your code by one line by changing:
ids = []
ids << self.id
to
ids = [self.id]
It's an invalid statement:
WHERE (section_id = 3,4)
The correct form would be:
WHERE (section_id in (3,4))
Please use:
Reference.where(:section_id => ids)
You can try something like this instead:
def combined_references
ids = self.children.map(&:id).push(self.id)
Reference.where(section_id: ids)
end
You can also query the database with:
Reference.where("section_id in (?)", ids)
The following has the most readability in my opinion:
def combined_references
Reference.where(section_id: self_and_children_ids)
end
private
def self_and_children_ids
self.children.map(&:id).push(self.id)
end

Problem sorting an array

I'm mixing 2 arrays and want to sort them by their created_at attribute:
@current_user_statuses = current_user.statuses
@friends_statuses = current_user.friends.collect { |f| f.statuses }
@statuses = @current_user_statuses + @friends_statuses
@statuses.flatten!.sort!{ |a,b| b.created_at <=> a.created_at }
The @current_user_statuses and @friends_statuses each sort correctly, but combined they sort incorrectly, with the @friends_statuses always showing up on top sorted by their created_at attribute and the @current_user_statuses on the bottom sorted by their created_at attribute.
This is the view:
<% @statuses.each do |d| %>
<%= d.content %>
<% end %>
Try:
(current_user.statuses + current_user.friends.collect(&:statuses)) \
.flatten.compact.sort_by(&:created_at)
You cannot daisy-chain the flatten! method like that. flatten! returns nil if no changes were made to the array, and calling sort! on nil raises a NoMethodError.
You need to separate them:
@statuses.flatten!
@statuses.sort! { ... }
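The return value of flatten! is easy to check in plain Ruby. The sketch below first shows the nil result and then a chain built from non-bang methods, which always return an array. The Status Struct and the newest_first helper are hypothetical stand-ins for the models in the question:

```ruby
# flatten! returns nil when the array contains no nested arrays.
flat_result = [1, 2, 3].flatten!
nested_result = [[3, 1], [2]].flatten!
puts flat_result.inspect    # nil
puts nested_result.inspect  # [3, 1, 2]

# Stand-in for a status record; created_at is a plain integer here.
Status = Struct.new(:content, :created_at)

# flatten and sort_by never return nil, so they can be chained safely.
def newest_first(*lists)
  lists.flatten.sort_by(&:created_at).reverse
end

mine = [Status.new("mine", 2)]
friends = [[Status.new("friend a", 3)], [Status.new("friend b", 1)]]
puts newest_first(mine, friends).map(&:content).inspect
```

sort_by sorts ascending, so the reverse at the end gives the newest-first order that the question's block (b.created_at <=> a.created_at) was aiming for.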
Here's how I'd do it:
Set up the classes:
class User
class Status
attr_reader :statuses, :created_at
def initialize(stats)
@statuses = stats
@created_at = Time.now
end
end
attr_reader :statuses, :friends
def initialize(stats=[], friends=[])
@statuses = Status.new(stats)
@friends = friends
end
end
Define some instances, with some time gaps just for fun:
friend2 = User.new(%w[yellow 2])
sleep 1
friend1 = User.new(%w[orange 1])
sleep 2
current_user = User.new(%w[green 1], [friend1, friend2])
Here's how I'd do it differently; Get the statuses in created_at order:
statuses = [
current_user.statuses,
current_user.friends.collect(&:statuses)
].flatten.sort_by(&:created_at)
Which looks like:
require 'pp'
pp statuses
# >> [#<User::Status:0x0000010086bd60
# >> @created_at=2011-07-02 10:49:49 -0700,
# >> @statuses=["yellow", "2"]>,
# >> #<User::Status:0x0000010086bc48
# >> @created_at=2011-07-02 10:49:50 -0700,
# >> @statuses=["orange", "1"]>,
# >> #<User::Status:0x0000010086bb30
# >> @created_at=2011-07-02 10:49:52 -0700,
# >> @statuses=["green", "1"]>]
I'm just building a temporary containing array to hold the current_user's status, plus the status of all the friends, then flattening it.
The (&:statuses) and (&:created_at) arguments are Symbol#to_proc shorthand (popularized by Rails and built into Ruby since 1.8.7) for calling the statuses or created_at method on each element.
@statuses = (@current_user_statuses + @friends_statuses).sort_by(&:created_at)
I know there are several solutions posted for your question, but all of them can kill your system as the number of statuses grows. For a dataset like this, you have to perform the sorting and pagination in the database layer and NOT in the Ruby layer.
Approach 1: Simple and concise
Status.find_all_by_user_id([id, friend_ids].compact, :order => :created_at)
Approach 2: Long and efficient
class User
def all_statuses
@all_statuses ||= Status.all( :joins => "JOIN (
SELECT friend_id AS user_id
FROM friendships
WHERE user_id = #{self.id}
) AS friends ON statuses.user_id = friends.user_id OR
statuses.user_id = #{self.id}",
:order => :created_at
)
end
end
Now you can get the sorted statuses in single query:
user.all_statuses
PPS: If this were my code, I would further optimize the SQL. Refer to this answer for some more details.

Ruby on Rails: Search on has_many association with inner join and other conditions

I've been searching for this for a long time now, and I can't get to find the answer anywhere.
What I have is a basic search function to find houses, everything works, and till so far the main piece for getting houses looks something like this (the conditions are for example):
@houses = House.all(
:conditions => ["houses.province_id = ? AND open_houses.date > ? AND houses.surface = ?", "4", "Tue, 08 Feb 2011", "125"],
:select => 'distinct houses.*',
:joins => "INNER JOIN open_houses ON open_houses.house_id = houses.id" )
Now, I have this has_many association for the specifications of a house (like balcony, swimming pool etc)..
For this I have the standard setup. A table for the spec-names, and a table with the house_id and the specification_id.
But now, I need to add these to the search function. So someone can find a house with Swimming pool AND a balcony.
I'm sure there is a solution, but I just don't know where to put it in the code, or how. Google just gets me to pages like it, but not to pages explaining exactly this.
This is what my params look like:
Parameters: {"dateto"=>"", "commit"=>"ZOEKEN!", "pricefrom"=>"", "location"=>"", "province_id"=>"4", "rooms"=>"", "datefrom"=>"", "surface"=>"125", "utf8"=>"✓", "priceto"=>"", "filters"=>{"by_specifications"=>["2", "5", "10"]}, "house_category_id"=>""}
Hope someone can help, if something is unclear please let me know.
Thanks
EDIT:
Okay, I've got it to work! Thanks very much!
There's only one little problem: A house shows up if either one of the specs exists for the house.. So it's an OR-OR query, but what I want is AND-AND..
So if a user searches for balcony and garage, the house must only show up if both of these exist for the house..
What I did now is this: for each specification searched, a query is being made.. (code is below)
I'm wondering, is this the right way to go?
Because it works, but I get double matches.. ( I filter them using "uniq" )
The problem with uniq is that I can't get "will_paginate" to work with it..
This is my final code:
def index
ActiveRecord::Base.include_root_in_json = false
# I'm creating conditions with a function I made, PM me for the code..
conditions = createConditions( createParameters( ) );
query = House.includes( :open_houses ).where( conditions )
unless params[:specifications].blank?
query = query.joins( :house_specifications )
query = query.group( 'open_houses.id' )
query = query.where( 'house_specifications.specification_id' => params[:specifications] )
query = query.order( 'count(open_houses.id) DESC' )
# query = query.having( 'count(open_houses.id) = ?', [params[:specifications].length.to_s] )
end
query = query.order( (params[:sort].blank?)? "open_houses.date ASC, open_houses.from ASC" : params[:sort] )
if params[:view] == "list"
page = params[:page] unless params[:page].blank?
@houses = query.all.uniq.paginate( :page => page || "1", :per_page => 5 )
else
@houses = query.all.uniq
end
respond_to do |format|
format.html
format.js
end
end
Thanks for the help, really appreciate it!
Try this (Rails 3):
House.joins(:houses_specifications).
where('houses_specifications.specification_id' => params[:by_specifications]).
where(...).all
You can have a bunch of if's and case's to add filters, group_by's, limits etc before you run that final .all on the query.
house_query = House.where(:pink => true)
unless params[:by_specifications].blank?
house_query = house_query.joins(:houses_specifications).
where('houses_specifications.specification_id' => params[:by_specifications])
end
...
@houses = house_query.all
Edit:
New and improved version that doesn't query directly on the join table for readability, group_by for distinctness and does an intersect to get houses with all specs.
query = House.joins(:open_houses).where(conditions)
unless params[:specifications].blank?
query = query.joins(:specifications).
group('houses.id').
where(:specifications => params[:specifications]).
having('count(*)=?', params[:specifications].count)
end
if params[:view] == "list"
@houses = query.all.paginate(:page => params[:page])
else
@houses = query.all
end
You can change the "having" to an "order" to get those with the most matching specs first.
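To see why the group/having combination gives AND semantics rather than OR, the same logic can be traced in plain Ruby. This is a sketch only, assuming the join table is represented as [house_id, specification_id] pairs; houses_with_all_specs and the sample data are made up for illustration:

```ruby
# join_rows: array of [house_id, specification_id] pairs.
# Returns the ids of houses that have every requested specification.
def houses_with_all_specs(join_rows, wanted_spec_ids)
  wanted = wanted_spec_ids.uniq
  # WHERE specification_id IN (...): keeps houses matching ANY spec
  matching = join_rows.uniq.select { |_house_id, spec_id| wanted.include?(spec_id) }
  # GROUP BY houses.id
  by_house = matching.group_by { |house_id, _spec_id| house_id }
  # HAVING COUNT(*) = number of requested specs: keeps houses matching ALL
  by_house.select { |_house_id, rows| rows.size == wanted.size }.keys
end

rows = [[1, 2], [1, 5], [2, 2], [3, 5], [3, 2], [3, 10]]
puts houses_with_all_specs(rows, [2, 5]).inspect
```

The count comparison relies on each house/specification pair appearing at most once, which is why the sketch calls uniq first; in SQL the equivalent safeguard would be a unique index on the join table or counting distinct specification ids.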
