Double-tally in Rails then format for upsert_all - ruby-on-rails

I have a table that looks like this:
# CategoryTransaction
category_id, transaction_id, buyer_id, seller_id
Both buyer_id and seller_id refer to the Person table (as a Person can buy and/or sell). I need to tally the combinations of [category_id, buyer_id] and [category_id, seller_id], and store them in this table so we know how often a given Person bought/sold a given Category:
# CategoryPerson
person_id, category_id, bought_count, sold_count
This is what I’ve got so far:
# 1. Collect transactions grouped by category:
category_transactions = CategoryTransaction.all.select(:category_id, :buyer_id, :seller_id).group_by(&:category_id)
# Looks like: { category_id: [CategoryTransaction, …], … }
# 2. Calculate the two sets of tallies:
tallies = category_transactions.collect{ |k,v| [k, v.collect(&:buyer_id).tally, v.collect(&:seller_id).tally] }
# Looks like: [[category_id, { buyer_id: count, buyer2_id: count, … }, { seller_id: count, … }], …]
What I’d like is to get this data into a suitable format for an upsert_all, which would look like: [{ category_id:, person_id:, bought_count: 0, sold_count: 0 }, {…}, …], so I can run something like:
CategoryPerson.upsert_all( tallies.map{ |cp| { category_id: cp.category_id, person_id: cp.person_id, bought_count: cp.bought_count, sold_count: cp.sold_count } }, unique_by: [:category_id, :person_id])
How can I get from my current category_transactions or tallies to this format for the upsert_all?
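One possible route, sketched in plain Ruby so it runs without a database: tally the [category_id, person_id] pairs once for buyers and once for sellers, then combine the two tallies over the union of their keys. The rows array below is made-up sample data standing in for something like CategoryTransaction.pluck(:category_id, :buyer_id, :seller_id), assuming the seller column is seller_id as in the table definition.

```ruby
# Made-up sample rows: [category_id, buyer_id, seller_id]
rows = [
  [1, 10, 20],
  [1, 10, 21],
  [2, 20, 10]
]

# Tally each [category_id, person_id] pair separately for buyers and sellers.
bought = rows.map { |category_id, buyer_id, _| [category_id, buyer_id] }.tally
sold   = rows.map { |category_id, _, seller_id| [category_id, seller_id] }.tally

# Build one hash per pair that appears in either tally, defaulting to 0.
upsert_rows = (bought.keys | sold.keys).map do |category_id, person_id|
  key = [category_id, person_id]
  {
    category_id: category_id,
    person_id: person_id,
    bought_count: bought.fetch(key, 0),
    sold_count: sold.fetch(key, 0)
  }
end
```

The resulting upsert_rows array has the shape described in the question, so it could then be passed to CategoryPerson.upsert_all(upsert_rows, unique_by: [:category_id, :person_id]), assuming a unique index on those two columns exists.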

Related

Is it better to use select or loop through an array of records in Rails?

My goal is to write the query below in the cleanest, most efficient way possible and minimize hitting the DB. Appreciate any guidance in advance.
I have retrieved some records that belong to a user, like below:
english_shows = @user.shows.where(language: 'english')
Let's say the shows belong to different categories (using a foreign key), so it looks like below:
<ActiveRecord::Relation [
<Show id: 1, category_id: 1, title: 'Rick and Morty'>,
<Show id: 2, category_id: 2, title: 'Black Mirror'>,
<Show id: 3, category_id: 3, title: 'Stranger Things'>,
<Show id: 4, category_id: 3, title: 'Game of Thrones'>,
...
]
If I want to get the titles of the shows for each category, I know I can use select like this. The same thing can be done with where, but this would cause an additional DB call. ([Edit] Actually, both would hit the DB twice).
# Using select
cartoons = english_shows.select { |show| show.category_id == Category.find_by(name: 'cartoon').id}.pluck(:title)
# Using where
cartoons = english_shows.where(category_id: Category.find_by(name: 'cartoon').id).pluck(:title)
However, the select method would still result in multiple lines of long code (in my actual use case I have more category types). Is it cleaner to loop through the records like this (taken from this SO answer)?
cartoons, science_fiction, fantasy = [], [], []
# Look up each category id once (names other than 'cartoon' are assumed)
@cartoon_id = Category.find_by(name: 'cartoon').id
@science_fiction_id = Category.find_by(name: 'science_fiction').id
@fantasy_id = Category.find_by(name: 'fantasy').id
english_shows.each do |show|
  cartoons << show if show.category_id == @cartoon_id
  science_fiction << show if show.category_id == @science_fiction_id
  fantasy << show if show.category_id == @fantasy_id
end
Try this:
english_shows
  .joins(:category)
  .select('shows.*, categories.name as category')
  .group_by(&:category)
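To see what the grouping step produces, here is a minimal in-memory sketch. ShowRow is a hypothetical stand-in for the records returned by the joined query, and the mapping of titles to category names is assumed for illustration. Using a distinct alias such as category_name may also help avoid ambiguity with a category association reader on the model.

```ruby
# Hypothetical stand-in for rows returned by the joined query
ShowRow = Struct.new(:id, :title, :category_name)

english_shows = [
  ShowRow.new(1, 'Rick and Morty', 'cartoon'),
  ShowRow.new(2, 'Black Mirror', 'science fiction'),
  ShowRow.new(3, 'Stranger Things', 'fantasy'),
  ShowRow.new(4, 'Game of Thrones', 'fantasy')
]

# Group once, then reduce each group to its titles.
titles_by_category = english_shows
  .group_by(&:category_name)
  .transform_values { |shows| shows.map(&:title) }
```

Each category's titles are then a single hash lookup, for example titles_by_category['fantasy'], with no per-category query or hand-written loop.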

rails array of hashes calculate one column

I have an array of hashes with many columns, and I want to change the value of one column in every element.
My array is:
[
{
id: 1,
Districts: "Lakhisarai",
Area: 15.87,
Production: 67.77,
Productivity: 4271,
Year: 2015,
Area_Colour: "Red",
Production_Colour: "Orange",
Productivity_Colour: "Dark_Green",
created_at: "2018-07-24T11:24:13.000Z",
updated_at: "2018-07-24T11:24:13.000Z"
},
{
id: 29,
Districts: "Begusarai",
Area: 18.53,
Production: 29.35,
Productivity: 1584,
Year: 2015,
Area_Colour: "Red",
Production_Colour: "Red",
Productivity_Colour: "Orange",
created_at: "2018-07-24T11:24:13.000Z",
updated_at: "2018-07-24T11:24:13.000Z"
},
...
]
This is a sample of the array. I want Productivity to be divided by 100; for that I am creating an empty array and pushing hashes onto it like this:
j = []
b.map do |k|
  if k.Productivity
    u = k.Productivity/100
    j.push({id: k.id, Productivity: u })
  else
    j.push({id: k.id, Productivity: k.Productivity })
  end
end
Is there a simpler way to generate this kind of array with the change applied to one column? Is there a way that doesn't require listing each column name in the push call?
I want to generate exactly the same array with one modification in Productivity.
Let's say your array is e, then:
e.each { |item| item[:Productivity] = item[:Productivity]/100}
Example:
e = [{p: 12, d: 13}, {p:14, d:70}]
e.each { |item| item[:p] = item[:p]/10}
output: [{:p=>1, :d=>13}, {:p=>1, :d=>70}]
You could use the map method here to create a new array from your original array, with the changes mentioned.
ary.map do |elem|
h = elem.slice(:id)
h[:productivity] = elem[:Productivity] / 100 if elem[:Productivity]
h
end
=> [{:id=>1, :productivity=>42}, {:id=>29, :productivity=>15}]
Note, Hash#slice returns a new hash with only the key-value pairs for the keys passed in argument e.g. here, it returns { id: 1 } for first element.
Also, we are assigning the calculated productivity to the output only when it is set on original hash. Hence, the if condition there.
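If the goal is the exact same hashes with only Productivity changed, and without mutating the original array, combining map with Hash#merge is another option. A minimal sketch, using a trimmed version of the sample data plus one hypothetical row with a nil value to show the guard:

```ruby
districts = [
  { id: 1,  Districts: 'Lakhisarai', Productivity: 4271, Year: 2015 },
  { id: 29, Districts: 'Begusarai',  Productivity: 1584, Year: 2015 },
  { id: 30, Districts: 'Unknown',    Productivity: nil,  Year: 2015 } # hypothetical row
]

# merge returns a copy of each hash with only :Productivity replaced;
# every other key is carried over without being named.
scaled = districts.map do |row|
  value = row[:Productivity]
  row.merge(Productivity: value && value / 100)
end
```

Because the values are integers this is integer division, as in the answers above; dividing by 100.0 instead would keep the fractional part.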

How to merge 2 activerecord records together and be left with 1? Rails

I have 2 apples:
{
id: 1,
rotten: true,
branch_on_tree: nil,
type: "red delicious"
},
{
id: 2,
rotten: nil,
branch_on_tree: 5,
type: "red delicious"
}
They are duplicate apples for red delicious. How do I merge the records together and then delete the one with missing data? Is there a convenient way to do this?
Note: There might be like 10 duplicates. I don't want any null values in my final record. Non-null values take precedence.
Not a very convenient way, but it will work.
assuming apples is an array:
[
{
id: 1,
rotten: true,
branch_on_tree: nil,
type: "red delicious"
},
# ...
]
that can come from:
apples = Apple.where(type: "red delicious")
apples_attrs = apples.map(&:attributes)
Then,
apple_attrs = apples_attrs.reduce do |apple, next_apple|
  apple.merge(next_apple) do |_, old_value, new_value|
    # Keep the first non-nil value; a plain || would also discard false
    old_value.nil? ? new_value : old_value
  end
end
apples.destroy_all
Apple.create(apple_attrs)
You might want to check this guide https://apidock.com/ruby/Hash/merge
Assuming type always has some value, you can use DISTINCT with where clause. The below should work
Apple.where('rotten IS NOT NULL AND branch_on_tree IS NOT NULL').select('DISTINCT ON (type) rotten,branch_on_tree,type').take
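The merge step on its own can be tried with plain hashes, without touching the database. This sketch uses the two apples from the question and checks for nil explicitly, so that a stored false would count as real data; it only illustrates the merging, not the deletion of the duplicates.

```ruby
duplicates = [
  { id: 1, rotten: true, branch_on_tree: nil, type: 'red delicious' },
  { id: 2, rotten: nil,  branch_on_tree: 5,   type: 'red delicious' }
]

# Fold all duplicates into one hash, keeping the first non-nil value per key.
merged = duplicates.reduce do |kept, candidate|
  kept.merge(candidate) do |_key, old_value, new_value|
    old_value.nil? ? new_value : old_value
  end
end
```

Here merged keeps id 1, rotten true and branch_on_tree 5. The same fold works for any number of duplicates, since reduce simply walks the list.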

Ruby Array#sort_by on array of ActiveRecord objects seems slow

I'm writing a controller index method that returns a sorted array of ActiveRecord Contact objects. I need to be able to sort the objects by attributes or by the output of an instance method. For example, I need to be able to sort by contact.email as well as contact.photos_uploaded, which is an instance method that returns the number of photos a contact has.
I can't use ActiveRecord's native order or reorder method because that only works with attributes that are columns in the database. I know from reading that normally array#sort_by is much faster than array#sort for complex objects.
My question is, how can I improve the performance of this block of code in my controller method? The code currently looks like this:
contacts = company.contacts.order(last_name: :asc)
if params[:order].present? && params[:order_by].present? && (Contact::READ_ONLY_METHOD.include?(params[:order_by].to_sym) || Contact::ATTRIBUTES.include?(params[:order_by].to_sym))
  contacts = contacts.sort_by do |contact|
    if params[:order_by] == 'engagement'
      contact.engagement.to_i
    else
      contact.method(params[:order_by].to_sym).call
    end
  end
  contacts.reverse! if params[:order] == 'desc'
end
The root problem here (I think) is that I'm calling sort_by on contacts, which is an ActiveRecord::Relation that could have several hundred contacts in it. Ultimately I paginate the results before returning them to the client, however they need to be sorted before they can be paginated. When I run the block of code above with 200 contacts, it takes an average of 900ms to execute, which could be a problem in a production environment if a user has thousands of contacts.
Here's my Contact model showing some relevant methods. The reason I have a special if clause for engagement is because that method returns a string that needs to be turned into an integer for sorting. I'll probably refactor that before I commit any of this to return an integer. Generally all the methods I might sort on return an integer representing the number of associated objects (e.g. number of photos, stories, etc that a contact has). There are many others, so for brevity I'm just showing a few.
class Contact < ActiveRecord::Base
  has_many :invites
  has_many :responses, through: :invites
  has_many :photos
  has_many :requests
  belongs_to :company
  ATTRIBUTES = self.attribute_names.map(&:to_sym)
  READ_ONLY_METHOD = [:engagement, :stories_requested, :stories_submitted, :stories_published]
  def engagement
    invites = self.invites.present? ? self.invites.count : 1
    responses = self.responses.present? ? self.responses.count : 0
    engagement = ((responses.to_f / invites).round(2) * 100).to_i.to_s + '%'
  end
  def stories_requested
    self.invites.count
  end
  def stories_submitted
    self.responses.count
  end
  def stories_published
    self.responses.where(published: true).count
  end
end
When I run a query to get a bunch of contacts and then serialize it to get the values for all these methods, it only takes ~80ms for 200 contacts. The vast majority of the slowdown seems to be happening in the sort_by block.
The output of the controller method should look like this after I iterate over contacts to build a custom data structure, using this line of code:
@contacts = Hash[contacts.map { |contact| [contact.id, ContactSerializer.new(contact)] }]
I've already benchmarked that last line of code so I know that it's not a major source of slowdown. More on that here.
{
contacts: {
79: {
id: 79,
first_name: "Foo",
last_name: "Bar",
email: "t@t.co",
engagement: "0%",
company_id: 94,
created_at: " 9:41AM Jan 30, 2016",
updated_at: "10:57AM Feb 23, 2016",
published_response_count: 0,
groups: {
test: true,
test23: false,
Test222: false,
Last: false
},
stories_requested: 1,
stories_submitted: 0,
stories_published: 0,
amplify_requested: 1,
amplify_completed: 1,
photos_uploaded: 0,
invites: [
{
id: 112,
email: "t@t.co",
status: "Requested",
created_at: "Jan 30, 2016, 8:48 PM",
date_submitted: null,
response: null
}
],
responses: [ ],
promotions: [
{
id: 26,
company_id: 94,
key: "e5cb3bc80b58c29df8a61231d0",
updated_at: "Feb 11, 2016, 2:45 PM",
read: null,
social_media_posts: [ ]
}
]
}
}
}
if params[:order_by] == 'stories_requested'
  contact_ids = company.contact_ids
  # Count invites per contact in one grouped query; returns { contact_id => count }
  invite_counts = Invite.where(contact_id: contact_ids).group(:contact_id).count
  # Contacts with no invites are missing from the hash, so default them to 0
  contact_ids.each { |id| invite_counts[id] ||= 0 }
  # Sort contact ids by invite count (add .reverse to the end of this for sort DESC)
  sorted_ids = invite_counts.sort_by { |_, count| count }.map(&:first)
  # The [0, 10] limits you to the lowest 10 results
  contacts = Contact.where(id: sorted_ids[0, 10])
  # The query does not preserve the id order, so re-sort in memory
  contacts = contacts.sort_by { |c| sorted_ids.index(c.id) }
end
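The slowdown in the original sort_by block likely comes from each instance method issuing its own COUNT query for every contact. The idea of fetching all counts up front and sorting against them can be sketched in memory as follows; ContactRow and the invite_counts hash are hypothetical stand-ins for the Contact records and for the result of a grouped count query.

```ruby
# Hypothetical stand-in for Contact records
ContactRow = Struct.new(:id, :last_name)

contacts = [
  ContactRow.new(1, 'Bar'),
  ContactRow.new(2, 'Baz'),
  ContactRow.new(3, 'Foo')
]

# Stand-in for a grouped count such as { contact_id => invite_count };
# contacts without invites are simply absent from the hash.
invite_counts = { 1 => 4, 3 => 1 }

# Sorting is now a hash lookup per contact instead of a query per contact.
ascending  = contacts.sort_by { |c| invite_counts.fetch(c.id, 0) }
descending = ascending.reverse
```

With this shape, the number of queries stays constant regardless of how many contacts are sorted.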

How do you sort an array alphabetically using sort_by in ruby?

I have an array of memberships. In each membership is a group. I need to sort this array of memberships by the name of the group. I've tried a bunch of different ways, and the latest way is this:
@memberships.sort_by! { |m| m.group.name }
However, this doesn't sort by the name. It appears to be randomly sorting the array.
Membership belongs_to :group
Group has_many :memberships
@memberships is equal to:
[
{
id: 2141,
user_id: 491,
group_id: 271,
member_type: "member",
group: {
id: 271,
name: "Derek's",
privacy: "open",
bio_image_url: "/bio_images/medium/missing.png?1340285189",
member_count: 1,
upcoming_checkins_count: 0
}
},
{
id: 2201,
user_id: 221,
group_id: 291,
member_type: "member",
group: {
id: 291,
name: "Rounded Developement",
privacy: "closed",
bio_image_url: "/groups/medium/291/bioimage.jpg?1340736175",
member_count: 7,
upcoming_checkins_count: 0
}
}
]
NOTE: This does work --> @memberships.sort_by! { |m| m.group.id }
It will order the array based on the group.id so maybe it has something to do with sorting alphabetically?
Any help would be much appreciated.
Wow, after struggling with this for an extremely long time, I realized my problem was a simple one. I was sorting by group.name but some of the group names were uppercase and some were lower, which was throwing it all off. Converting everything to downcase worked well.
@memberships.sort_by!{ |m| m.group.name.downcase }
Is the sort method an option?
ary.sort{ |a,b| a[:group][:name] <=> b[:group][:name] }
I don't see how your code is working. I can't access the hashes in the arrays using m.group.name
Here's a working syntax
@memberships.sort_by!{ |m| m[:group][:name] }
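The effect of letter case on the ordering is easy to reproduce with plain hashes. In this sketch the first two group names come from the question and the lowercase 'alpha club' entry is a hypothetical addition; Ruby compares strings byte by byte, so every uppercase letter sorts before any lowercase one.

```ruby
memberships = [
  { id: 2201, group: { name: 'Rounded Developement' } },
  { id: 2141, group: { name: "Derek's" } },
  { id: 9999, group: { name: 'alpha club' } } # hypothetical lowercase name
]

# Byte-wise comparison: 'D' and 'R' sort before 'a'.
case_sensitive = memberships.sort_by { |m| m[:group][:name] }

# Normalising the key gives the expected alphabetical order.
case_insensitive = memberships.sort_by { |m| m[:group][:name].downcase }
```

Only the sort key is downcased; the stored names are left untouched.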
