I have a Rails 3 application that currently shows a single "random" record with every refresh; however, it repeats records too often, or never shows a particular record at all. What would be a good way to loop through the records so that all of them are shown before any repeat? I was thinking of using cookies or session IDs to step through the record IDs sequentially, but I'm not sure whether that would work right, or exactly how to go about it.
The database consists of a single table with a single column, and currently only about 25 entries, but more will be added. IDs are generated automatically and are sequential.
Some suggestions would be appreciated.
Thanks.
The funny thing about 'random' is that it doesn't usually feel random when you get the same answer twice in short succession.
The usual answer to this problem is to generate a queue of responses, and make sure when you add entries to the queue that they aren't already on the queue. This can either be a queue of entries that you will return to the user, or a queue of entries that you have already returned to the user. I like your idea of using the record ids, but with only 25 entries, that repeating loop will also be annoying. :)
You could keep track of the queue of previous entries in memcached if you've already got one deployed or you could stuff the queue into the session (it'll probably just be five or six integers, not too excessive data transfer) or the database.
I think I'd avoid the database, because it sure doesn't need to be persistent, it doesn't need to take database bandwidth or compute time, and using the database just to keep track of five or six integers seems silly. :)
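A plain-Ruby sketch of that queue-of-previous-entries idea (the `session` hash and `all_ids` array here are stand-ins for the Rails session and the table's record IDs; the window size of 5 is an arbitrary choice):

```ruby
RECENT_WINDOW = 5

# Pick a random id that isn't among the last few shown, remembering
# the recent picks in the session-like hash.
def pick_id(session, all_ids)
  recent = session[:recent_ids] ||= []
  candidates = all_ids - recent
  candidates = all_ids if candidates.empty? # fewer ids than the window
  id = candidates.sample
  recent << id
  recent.shift while recent.size > RECENT_WINDOW # drop the oldest picks
  id
end
```

Because each pick excludes the recent window, the user never sees the same record twice in quick succession, and the session only ever carries a handful of integers.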
UPDATE:
In one of your controllers (maybe ApplicationController), add something like this to a method that you run in a before_filter:
class ApplicationController < ActionController::Base
  before_filter :find_quip

  def find_quip
    last_quip_id = session[:quip_id]
    # Advance to the next quip by id; nil if we've reached the end.
    next_quip = last_quip_id && Quip.where("id > ?", last_quip_id).order(:id).first
    next_quip ||= Quip.order(:id).first
    session[:quip_id] = next_quip.id if next_quip
  end
end
Note the wrap-around: when there is no quip with a higher id, we fall back to the first one. Comparing ids with where("id > ?", ...) rather than fetching find(id + 1) also copes with holes in the id sequence, which are probably going to appear someday. :)
If there are only going to be a few, as you say, you could store the entire array of IDs as a session variable, with another variable for the current index, and loop through them sequentially, incrementing the index.
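That approach can be sketched in plain Ruby (the `session` hash stands in for the Rails session, and `all_ids` for the array of record IDs):

```ruby
# Step through the ids sequentially, wrapping the index back to zero
# once every record has been shown.
def next_id(session, all_ids)
  session[:quip_ids] ||= all_ids
  session[:quip_index] ||= 0
  id = session[:quip_ids][session[:quip_index]]
  session[:quip_index] = (session[:quip_index] + 1) % session[:quip_ids].size
  id
end
```

Shuffling `all_ids` once before storing it would give a random-feeling order that still shows every record before any repeat.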
I am looking at a rather large database. Let's say I have an exported flag on the product records.
If I want an estimate of how many products I have with the flag set to false, I can make a call something like this:
Product.where(:exported => false).count
The problem I have is that even the count takes a long time, because the table of 1 million products is constantly being written to. More specifically, exports are happening, and the value I'm interested in counting is ever-changing.
So I'd like to do a dirty read on the table... Not a dirty read always. And I 100% don't want all subsequent calls to the database on this connection to be dirty.
But for this one call, dirty is what I'd like.
Oh, I should mention: Ruby 1.9.3, Heroku, and PostgreSQL.
Now, if I'm missing another way to get the count, I'd be excited to try it.
Oh, one last thing: this example is contrived.
PostgreSQL doesn't support dirty reads.
You might want to use triggers to maintain a materialized view of the count - but doing so will mean that only one transaction at a time can insert a product, because they'll contend for the lock on the product count in the summary table.
Alternately, use system statistics to get a fast approximation.
Or, on PostgreSQL 9.2 and above, ensure there's a primary key (and thus a unique index) and make sure vacuum runs regularly. Then you should be able to do quite a fast count, as PostgreSQL should choose an index-only scan on the primary key.
Note that even if Pg did support dirty reads, the read would still not return perfectly up-to-date results, because rows would sometimes be inserted behind the read pointer in a sequential scan. The only way to get a perfectly up-to-date count is to prevent concurrent inserts: LOCK TABLE thetable IN EXCLUSIVE MODE.
As soon as a query begins to execute, it runs against a frozen, read-only state; that's what MVCC is all about. The values are not changing in that snapshot, only in subsequent amendments to that state. It doesn't matter if your query takes an hour to run; it is operating on data that's locked in time.
If your queries are taking a very long time, it sounds like you need an index on your exported column, or whatever values you use in your conditions, as a COUNT against an indexed column is usually very fast.
I have an async Resque job that creates many associated objects inside a loop, and I can't seem to avoid Heroku's ever-popular R14 (memory quota exceeded) error with it.
has_many :associated_things
...
def populate_things
  reference_things = ReferenceThings.where(some_criteria).map(&:name) # usually between 10k and 20k strings
  reference_things.each do |rt|
    self.associated_things << AssociatedThing.create(name: rt)
  end
end
Some things I've tried:
wrapping the create loop in an ActiveRecord::Base.uncached block
manually running GC.start at the end of the loop
adding an each_slice before .each
Is there a way to rewrite this loop to minimize memory usage?
@Alex Peachey had some good suggestions, but ultimately @mu had the right idea in the first comment.
Transitioning to raw SQL is the only way I could find to make this work. Some suggested methods are here:
http://coffeepowered.net/2009/01/23/mass-inserting-data-in-rails-without-killing-your-performance/
I used the mass insert method and it works fine.
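For reference, the core of the mass-insert trick is simply building one multi-row INSERT statement instead of instantiating an AR object per row. A hypothetical helper (the table and column names are made up, and real code should use the database connection's quoting methods rather than this naive single-quote escape):

```ruby
# Build a single multi-row INSERT for one owner's associated rows.
# values is an array of strings; owner_id is assumed to be an integer.
def mass_insert_sql(table, owner_column, owner_id, column, values)
  rows = values.map { |v| "(#{owner_id}, '#{v.gsub("'", "''")}')" }
  "INSERT INTO #{table} (#{owner_column}, #{column}) VALUES #{rows.join(', ')}"
end
```

The resulting string would then be handed to the connection with a single execute call, so only one statement crosses the wire no matter how many names there are.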
It should be said that it's far from clear to me why this is necessary. Apparently instantiating hundreds of thousands of AR objects, even outside of a web request, asynchronously, causes a memory leak. Maybe this simply isn't the sort of thing Rails/AR was designed to do.
Related question, perhaps the same issue: ActiveRecord bulk data, memory grows forever
Some ideas that may help:
Since you are just pulling name from ReferenceThings, don't load the full objects only to take the name from each. Instead do something like this:
reference_things = ReferenceThings.where(some_criteria).pluck(:name)
That will issue a better query, grabbing just the names, and give you back a plain array. Much cheaper memory-wise.
I noticed you are putting all the AssociatedThings you create into the association as you go. If you don't actually need them afterwards, just creating them is better. If you do need them, then depending on what for, you could create them all and then query the database to grab them again, looping over them with find_each, which fetches them in batches.
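The batching idea behind each_slice and find_each can be sketched in plain Ruby; slicing bounds how many items are being worked on at once, so memory pressure stays flat (the counting block here is a stand-in for the create call):

```ruby
# Process names in fixed-size slices rather than all at once.
def process_in_batches(names, batch_size = 1000)
  processed = 0
  names.each_slice(batch_size) do |slice|
    slice.each { |_name| processed += 1 } # e.g. AssociatedThing.create(name: _name)
  end
  processed
end
```

In the real job, each slice could also be turned into one mass insert, combining both suggestions.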
I searched for this and was surprised not to find an answer, so I might be overcomplicating this. But, basically, I have a couple of RESTful models in my Rails 3 application that I would like to keep track of, in a pretty simple way, just for popularity tracking over time. In my case, I'm only interested in tracking hits on the GET/show action: users log in and view these two resources, and their number of visits goes up with each viewing/page load.
So, I have placed a "visits" column on the Books model:
== AddVisitsToBooks: migrating ===============================================
-- add_column(:books, :visits, :integer)
-> 0.0008s
== AddVisitsToBooks: migrated (0.0009s) ======================================
The column initializes to zero, then, basically, inside the books_controller,
def show
  # hypothetically, we won't let an owner "cheat" their way to being popular
  unless @book.owner == current_user
    @book.visits += 1
    @book.save
  end
end
And this works fine, except that now every time the show method is called, you've got not only a read of the object record but a write as well. And perhaps that gets to the heart of my question: is the overhead required just to write that single integer change a big deal in a small-to-midsize production app? Or is it a small deal, or basically nothing at all?
Is there a much smarter way to do it? Everything else I came up with still involved writing to a record every time the given page is viewed. Would indexing the field help, even if I'm rarely searching by it?
The database is PostgreSQL 9, by the way (running on Heroku).
Thanks!
What you described above has one significant con: once a process updates the database (increments the visit counter), the row is locked, and any other process has to wait. I would suggest using a DB sequence for this reason: http://www.postgresql.org/docs/8.1/static/sql-createsequence.html However, you will need to maintain the sequence yourself in your code: Ruby on Rails+PostgreSQL: usage of custom sequences
After some more searching, I decided to take the visits counter off the models themselves because, as MiGro said, it would block the row every time the page is shown, even if just for a moment. I think the DB sequence approach is probably the fastest, and I am going to research it more, but for the moment it is a bit beyond me and seems a bit cumbersome to implement in ActiveRecord. Thus,
https://github.com/charlotte-ruby/impressionist
seems like a decent alternative; keeping the view counts in an alternate table and utilizing a gem with a blacklist of over 1200 robots, etc, etc.
So far I have @comments.count, which gives me the number of all comments in the table, but I need another column to act as a previous_count to compare with @comments.count, and then do something like this in the view:
if @comments.count is greater than previous_count
  display NEW COMMENT
My question is: how do I record and save @comments.count into previous_count? I have thought of using the session, but I am not sure if that would be safe. Any help will be appreciated.
Consider using a datetime instead of a count. A count will be faulty if, say, earlier comments are deleted and then more are added. If you just store the previous datetime (instead of the count), then you can call @comments.where("created_at > ?", prev_datetime).count to get the count. As for storing the "last datetime", the session would be a fine place for that, unless you want it to persist across devices, in which case you'd want to save it as an attribute on e.g. your User model.
Might make more sense to look at the timestamps (created_at most likely) rather than the counts. Then each client could track the last timestamp they had and just ask for the comments newer than that as needed. This way you wouldn't have to store anything new or worry about different clients having different previous_count values, you could just keep track of a timestamp in the session or client-side JavaScript or whatever was convenient.
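Both answers boil down to the same comparison. A plain-Ruby sketch, with comment_times standing in for the created_at values the query would return and last_seen for the timestamp kept in the session:

```ruby
# Count how many comments are newer than the last-seen timestamp.
def new_comment_count(comment_times, last_seen)
  comment_times.count { |t| t > last_seen }
end
```

After rendering, the view (or controller) would write the current time back into the session as the new last_seen value.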
I am planning to have something like this for a website that is built on Ruby on Rails. A user comes and enters a bunch of names in a text field, and a queue gets created from all the names. From there, the website keeps asking for more details about each name from the queue until the queue is finished.
Is there any queue-management gem available in Ruby, or do I have to just create an array and keep incrementing an index in a session variable to emulate queue behaviour?
The easiest thing is probably to use the push and shift methods of ruby arrays.
Push sticks things on the end of the array; shift returns and removes the first element.
As you receive data about each of the names, you could construct a second list of the names: a done array. Or, if you're not concerned about that and just want to save each one and move on, just store the array in the session (assuming it's not going to be massive) and work through it.
If your array is massive, consider storing the names to be added in temporary rows in a table then removing them when necessary. If this is the route you take, be sure to have a regularly running cleanup routine that removes entries that were never filled out.
References
http://apidock.com/ruby/Array/push
http://apidock.com/ruby/Array/shift
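To make the push/shift behaviour concrete, a runnable sketch (the names here are made up):

```ruby
# Build the queue from the entered names.
queue = []
%w[alice bob carol].each { |name| queue.push(name) }

# Drain it front-to-back, keeping a "done" list as suggested above.
done = []
done.push(queue.shift) until queue.empty?
```

In the app, each shift would correspond to one "give me details for this name" page, with the remaining queue stored back into the session between requests.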
Try modeling a Queue with ActiveRecord:
Queue.has_many :tasks
  attributes: name, id, timestamps

Task.belongs_to :queue
  attributes: name, id, position, timestamps, completed
Use timestamps to set initial position. Once a task is completed, set position to [highest position]+1 (assuming the lower the position number, the higher up on the queue). Completed tasks will sink to the bottom of the queue and a new task will rise to the top.
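The reposition rule can be sketched without ActiveRecord (Task here is a plain struct standing in for the model, and the attribute names match the ones listed above):

```ruby
Task = Struct.new(:name, :position, :completed)

# Mark a task done and sink it to the bottom: [highest position] + 1.
def complete!(tasks, task)
  task.completed = true
  task.position = tasks.map(&:position).max + 1
  tasks.sort_by!(&:position) # lowest position = top of the queue
end
```

With the real models, the same move would be an update of the completed flag and position column, and the "top of the queue" is just the incomplete task with the lowest position.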
Hope this helps!