Ruby on Rails Unwanted Object Caching Behavior

I'm trying to run a simple loop that increments an attribute in a database table, but it isn't incrementing as expected. Imagine that our application is a daily deals site. I'm using Rails 3.0.1, for the record.
class Deal < ActiveRecord::Base
  has_many :orders
end

class Order < ActiveRecord::Base
  belongs_to :deal
end
An Order also has a "quantity" attribute - for example, if someone buys a few of the same thing.
If someone buys a bunch of orders in a shopping cart, we want to tally the total in each deal to track how many we sold of each deal.
@new_orders.each do |order|
  order.deal.update_attribute(:order_count, order.deal.order_count + order.quantity)
end
Firstly, ignore the fact that there may be a better way to write this. This is a contrived example in order to get to my point.
Now, imagine that our current case is one where someone bought 3 orders of the same deal in the shopping cart, and each order had a quantity of 1.
I would have expected the deal's order_count to be 3 at the end of the loop. But when I run this code, whether in tests, in the browser, or in the console, I get 1.
It seems as if, in each iteration of the loop, when it pulls order.deal - since each order happens to belong to the same deal - it's pulling the deal from a cache. So if the deal's order_count was 0 before this loop began, every iteration pulls a cached copy of the deal whose order_count is still 0.
Is this the expected behavior? Is there a way to turn off this type of caching? What's even stranger is that a colleague of mine tried to run a similar loop in his project and got the expected total. Or am I missing something entirely?
Thanks for your help!

You aren't saving the record. After the increment (but within the loop), you need:
order.deal.save
Based on your comment:
@new_orders.each do |order|
  order_quantity = order.quantity
  order.reload
  order.deal.update_attribute(:order_count, order.deal.order_count + order_quantity)
end
This will save the order quantity in a variable, reload the order from the database (which also clears its cached associations, including the stale deal), and then do the update.
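For what it's worth, the per-iteration reloads can be avoided entirely by tallying quantities in memory first, so each deal needs only one database update. Here is a minimal plain-Ruby sketch of the tallying step; the Order struct is a stand-in for the ActiveRecord model, and Deal.update_counters in the comment is just one possible way to apply the totals:

```ruby
# Tally quantities per deal_id in memory; the Order struct below is a
# stand-in for the ActiveRecord model, not the real class.
Order = Struct.new(:deal_id, :quantity)

new_orders = [Order.new(1, 1), Order.new(1, 1), Order.new(1, 1), Order.new(2, 5)]

totals = Hash.new(0)
new_orders.each { |order| totals[order.deal_id] += order.quantity }

totals # => {1 => 3, 2 => 5}

# In Rails this would then become one update per deal, e.g.:
#   totals.each { |deal_id, n| Deal.update_counters(deal_id, :order_count => n) }
```

Since each deal is touched only once, there is no stale cached copy to worry about in the first place.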

Related

sum method alternatives for active records

I am working on a legacy Ruby on Rails project (Rails v2.3.9).
I have a model class Product; in the database there is a products table.
class Product < ActiveRecord::Base
...
end
There is a price attribute which is an integer.
At some point I have all the products loaded (as ActiveRecord objects) and I need to calculate their total price. I know I can do:
total_price = all_products.sum(&:price)
It works.
But it also triggers a database query (SELECT sum(...)). Is there an alternative way to calculate the sum of the prices of all products without triggering any database query? I know I can use a for loop, but I wonder if there are other ways?
sum with a block is delegated to Enumerable, and it will always hit the database if all_products has not previously been loaded, so you have to make sure it is not being lazily loaded.
In terms of performance, a SUM query would be the fastest way to get the result, as it doesn't need to load all the records and performs the operation in the database rather than in memory.
In your case, if you have the collection loaded and it still triggers a query, you can use
total_price = all_products.map(&:price).sum
which essentially boils down to rockusbacchus's solution:
.inject { |sum, element| sum + element }
If you already have the products loaded into all_products, you can use Ruby's inject method to sum the prices like so:
total_price = all_products.inject(0) { |accumulator, product| accumulator + product.price }
However, you will likely find that this takes longer than just running the extra query. You might want to familiarize yourself with the other "*ect" methods in Ruby such as select and reject. Here's a decent article reviewing them:
https://blog.abushady.com/2014/08/27/select-reject-collect-detect-inject.html
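As a plain-Ruby illustration (no ActiveRecord involved): once the records are sitting in an Array, summing is a pure in-memory Enumerable operation with no query at all. Array#sum exists on Ruby >= 2.4; on the older Ruby that a Rails 2.3 app typically runs, inject is the way to go:

```ruby
# Stand-ins for product.price values already loaded into memory.
prices = [100, 250, 399]

# inject works on any Ruby version; start from 0 and accumulate.
total_inject = prices.inject(0) { |acc, price| acc + price }

# Enumerable#sum is available on Ruby >= 2.4 and gives the same result.
total_sum = prices.sum

total_inject # => 749
```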

Doing analytics on a large table in Rails / PostgreSQL

I have a "Vote" table in my database which is growing in size everyday, currently at around 100 million rows. For internal analytics / insights I used to have a rake task which would compute a few basic metrics, like the number of votes made daily in the past few days. It's just a COUNT with a where clause on the date "created_at".
This rake task was doing fine until I deleted the index on "created_at" because it seems that it had a negative impact on the app performance for all the other user-facing queries that didn't need this index, especially when inserting a new row.
Currently I don't have a lot of insights as to what is going on in my app and in this table. However I don't really want to add indexes on such a large table if it's only for my own use.
What else can I try ?
Alternately, you could sidestep the Vote table altogether and keep an external tally.
Every time a vote is cast, a separate tally class that keeps a running count of votes cast will be invoked. There will be one tally record per day. A tally record will have an integer representing the number of votes cast on that day.
Each increment call to the tally class will find a tally record for the current date (today), increment the vote count, and save the record. If no record exists, one will be created and incremented accordingly.
For example, let's have a class called VoteTally with two attributes: a date (date), and a vote count (integer), no timestamps, no associations. Here's what the model will look like:
class VoteTally < ActiveRecord::Base
  def self.tally_up!
    find_or_create_by_date(Date.today).increment!(:votes)
  end

  def self.tally_down!
    find_or_create_by_date(Date.today).decrement!(:votes)
  end

  def self.votes_on(date)
    find_by_date(date).try(:votes) || 0
  end
end
Then, in the Vote model:
class Vote < ActiveRecord::Base
  after_create :tally_up
  after_destroy :tally_down

  # ...

  private

  def tally_up   ; VoteTally.tally_up!   ; end
  def tally_down ; VoteTally.tally_down! ; end
end
These methods will get vote counts:
VoteTally.votes_on Date.today
VoteTally.votes_on Date.yesterday
VoteTally.votes_on 3.days.ago
VoteTally.votes_on Date.parse("2013-05-28")
Of course, this is a simple example and you will have to adapt it to suit. This will result in an extra query during vote casting, but it's a hell of a lot faster than a where clause on 100M records with no index. Minor inaccuracies are possible with this solution, but I assume that's acceptable given the anecdotal nature of daily vote counts.
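The find-or-create-then-increment logic can be sketched in plain Ruby with a Hash standing in for the tally table (this mirrors the VoteTally methods above; it is not a real database). One caveat worth flagging: increment! is a read-modify-write, so under heavily concurrent voting an atomic SQL update (something along the lines of VoteTally.update_all("votes = votes + 1", ...)) avoids lost updates.

```ruby
require "date"

# A Hash with a default of 0 plays the role of the tally table:
# a missing date behaves like a freshly created record with 0 votes.
tallies = Hash.new(0)

tally_up   = ->(date) { tallies[date] += 1 }
tally_down = ->(date) { tallies[date] -= 1 }

3.times { tally_up.call(Date.today) }
tally_down.call(Date.today)

tallies[Date.today] # => 2
```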
It's just a COUNT with a where clause on the date "created_at".
In that case the only credible index you can use is the one on created_at...
If write performance is an issue (methinks it's unlikely...) and you're using a composite primary key, clustering the table using that index might help too.
If the index really has an impact on write performance, and it's only a few people who run statistics now and then, you might consider another general approach:
You could separate your "transaction processing database" from your "reporting database".
You could update your reporting database on a regular basis and create reporting-only indexes only there. What's more, reporting queries will not conflict with transaction-oriented traffic, and it doesn't matter how long they run.
Of course, this introduces a certain delay and increases system complexity. On the other hand, if you roll forward your reporting database on a regular basis, you can verify that your backup scheme actually works.

Rails ActiveRecord Performance with many Selects and Inserts

I have a Rails 3.2 application that tracks mailings for subscription orders.
The basic model structure is:
Order has_many Subscriptions; Subscription has_many SubscriptionMailings.
Each month a record for each subscription mailing is generated and a csv file is exported from these records.
The mailing address is stored at the order level.
Basically I select all of the subscriptions that are valid to mail that month and loop through them getting the mailing address from the order object. Then I create a new subscription mailing record for each one.
Right now this works ok because there aren't a lot of subscriptions, but it is pretty slow.
How can I speed up this process?
In order to optimize, you need to step down from Ruby level onto the SQL level here.
Instead of doing N+1 selects (1 for fetching all subscriptions and N for fetching the order of each subscription), you may be able to do only 1 select with a join.
SubscriptionMailing.
  joins(:subscription => :order).
  where(Order.table_name => { :valid => true })
Sounds like you want to use includes to eager load the orders. Maybe something like this:
# Subscription.rb
scope :valid_for_month, lambda { |month| where(month: month) }
# Elsewhere
valid_subscriptions = Subscription.valid_for_month(Time.now.month).includes(:order)
valid_subscriptions.each do |subscription|
  subscription.generate_subscription_mailing
end
More on includes: http://api.rubyonrails.org/classes/ActiveRecord/Associations/ClassMethods.html
After some research I ended up wrapping my code in a transaction without making any other changes.
It sped things up quite a bit.
Before I added the transaction the code was taking over 1 minute to run, now it is down to roughly 10 seconds. This is plenty fast enough for my needs, so I didn't try and optimize any further.
ActiveRecord::Base.transaction do
# my db stuff here
end

Rails Caching DB Queries and Best Practices

The DB load on my site is getting really high so it is time for me to cache common queries that are being called 1000s of times an hour where the results are not changing.
So for instance on my city model I do the following:
def self.fetch(id)
  Rails.cache.fetch("city_#{id}") { City.find(id) }
end

def after_save
  Rails.cache.delete("city_#{self.id}")
end

def after_destroy
  Rails.cache.delete("city_#{self.id}")
end
So now when I call City.fetch(1), the first time I hit the DB, but the next 1000 times I get the result from memory. Great. But most of the calls to a city are not City.fetch(1) but @user.city.name, where Rails does not use fetch but queries the DB again... which makes sense, but it's not exactly what I want.
I can do City.fetch(@user.city_id), but that is ugly.
So my question to you guys: what are the smart people doing? What is the right way to do this?
With respect to the caching, a couple of minor points:
It's worth using a slash to separate object type and id, which is the Rails convention. Even better, ActiveRecord models provide the cache_key instance method, which gives a unique identifier combining table name and id, e.g. "cities/13".
One minor correction to your after_save filter: since you have the data on hand, you might as well write it back to the cache instead of deleting it. That saves you a trip to the database ;)
def after_save
  Rails.cache.write(cache_key, self)
end
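The fetch/write-back pattern above is the classic cache-aside scheme. A plain-Ruby sketch with a Hash standing in for Rails.cache (memcached in production) shows why the database is hit only once per key:

```ruby
# A Hash plays the role of Rails.cache; the block is only evaluated
# on a cache miss, exactly like Rails.cache.fetch.
cache = {}

fetch = lambda do |key, &block|
  cache.key?(key) ? cache[key] : (cache[key] = block.call)
end

db_hits = 0
find_city = ->(id) { db_hits += 1; "City ##{id}" }  # pretend DB lookup

3.times { fetch.call("cities/1") { find_city.call(1) } }

db_hits # => 1 - only the first call reached the "database"
```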
As to the root of the question, if you're continuously pulling @user.city.name, there are two real choices:
Denormalize the user's city name into the user row: @user.city_name (keep the city_id foreign key). This value should be written at save time.
-or-
Implement your User.fetch method to eager load the city. Only do this if the contents of the city row never change (i.e. name etc.), otherwise you can potentially open up a can of worms with respect to cache invalidation.
Personal opinion:
Implement basic id based fetch methods (or use a plugin) to integrate with memcached, and denormalize the city name to the user's row.
I'm personally not a huge fan of cached model style plugins, I've never seen one that's saved a significant amount of development time that I haven't grown out of in a hurry.
If you're getting way too many database queries it's definitely worth checking out eager loading (through :include) if you haven't already. That should be the first step for reducing the quantity of database queries.
If you need to speed up SQL queries on data that doesn't change much over time, you can use materialized views.
A matview stores the results of a query into a table-like structure of
its own, from which the data can be queried. It is not possible to add
or delete rows, but the rest of the time it behaves just like an
actual table. Queries are faster, and the matview itself can be
indexed.
At the time of this writing, matviews are natively available in Oracle
DB, PostgreSQL, Sybase, IBM DB2, and Microsoft SQL Server. MySQL
doesn’t provide native support for matviews, unfortunately, but there
are open source alternatives to it.
Here are some good articles on how to use matviews in Rails:
sitepoint.com/speed-up-with-materialized-views-on-postgresql-and-rails
hashrocket.com/materialized-view-strategies-using-postgresql
I would go ahead and take a look at Memoization, which is now in Rails 2.2.
"Memoization is a pattern of
initializing a method once and then
stashing its value away for repeat
use."
There was a great Railscast episode on it recently that should get you up and running nicely.
Quick code sample from the Railscast:
class Product < ActiveRecord::Base
  extend ActiveSupport::Memoizable
  belongs_to :category

  def filesize(num = 1)
    # some expensive operation
    sleep 2
    12345789 * num
  end

  memoize :filesize
end
More on Memoization
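Under the hood, memoize for a no-argument method amounts to the familiar ||= idiom: compute once, stash the result in an instance variable. A plain-Ruby sketch (this Product is a stand-in class, not the ActiveRecord model above; note that ||= would re-run the computation if the result were nil or false):

```ruby
class Product
  attr_reader :computations

  def initialize
    @computations = 0
  end

  def filesize
    # Compute once, stash the result in @filesize for repeat use.
    @filesize ||= begin
      @computations += 1  # counts how often the expensive work runs
      12345789
    end
  end
end

product = Product.new
2.times { product.filesize }
product.computations # => 1 - the expensive block ran only once
```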
Check out cached_model

Freezing associated objects

Does anyone know of a method in Rails by which an associated object may be frozen? The problem I am having is that I have an order model with many line items, which in turn belong to a product or service. When the order is paid for, I need to freeze the details of the ordered items so that when a price is changed, the order's totals are preserved.
I worked on an online purchase system before. What you want is an Order class and a LineItem class. LineItems store product details like price, quantity, and maybe other information you need to keep for your records. It's more complicated, but it's the only way I know to lock in the details.
An Order is simply made up of LineItems and probably contains shipping and billing addresses. The total price of the Order can be calculated by adding up the LineItems.
Basically, you freeze the data before the person makes the purchase. When items are added to an order, the data is frozen because LineItems duplicate the necessary product information. This way, when a product is removed from your system, you can still make sense of old orders.
You may want to look at a Rails plugin called 'AASM' (formerly acts_as_state_machine) to handle the state of an order.
Edit: AASM can be found here http://github.com/rubyist/aasm/tree/master
A few options:
1) Add a version number to your model. At the day job we do course scheduling. A particular course might be updated occasionally but, for business-rule reasons, it's important to know what it looked like on the day you signed up. Add :version_number to the model and a find_latest_course(course_id) finder, alter code as appropriate, stir a bit. In this case you don't "edit" models so much as save a new, updated version. (Then, obviously, your LineItems carry an item_id and an item_version_number.)
This generic pattern can be extended to cover, shudder, audit trails.
2) Copy data into LineItem objects at LineItem creation time. Just because you can slap has_a on anything, doesn't mean you should. If a 'LineItem' is supposed to hold a constant record of one item which appeared on an invoice, then make the LineItem hold a constant record of one item which appeared on an invoice. You can then update InventoryItem#current_price at will without affecting your previously saved LineItems.
3) If you're lazy, just freeze the price on the order object. Not really much to recommend this but, hey, it works in a pinch. You're probably just delaying the day of reckoning though.
"I ordered from you 6 months ago and now am doing my taxes. Why won't your bookstore show me half of the books I ordered? What do you mean their IDs were purged when you stopped selling them?! I need to know which I can get deductions for!"
Shouldn't the prices already be frozen when the items are added to the order? Say I put a widget into my shopping basket thinking it costs $1 and by the time I'm at the register, it costs $5 because you changed the price.
Back to your problem: I don't think it's a language issue, but a functional one. Instead of associating the prices with items, you need to copy the prices. If every item in the order has its own copy of the price, future price changes won't affect it, you can add discounts, etc.
Actually, to be clean, you need to add versioning to your prices. When an item's price changes, you don't overwrite the price; you add a newer version. The line items in your order will still be associated with the old price.
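The copy-the-data option can be sketched in plain Ruby with Structs standing in for the models: the line item snapshots the price at creation time, so a later price change on the product leaves the order's totals untouched.

```ruby
# Structs as stand-ins for the ActiveRecord models.
Product  = Struct.new(:name, :current_price)
LineItem = Struct.new(:product_name, :unit_price, :quantity) do
  def total
    unit_price * quantity
  end
end

widget = Product.new("Widget", 100)
item   = LineItem.new(widget.name, widget.current_price, 3)  # snapshot the price

widget.current_price = 500  # price change after the order was placed

item.total # => 300, still based on the price at purchase time
```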
