Enable random access to collection with MongoDB - ruby-on-rails

I have a Rails application. It has a feed that shows items from different users all mixed up. It would be something similar to Pinterest in the way you see these items.
Right now I show all these items ordered by their creation date. However, because users create items in batches, the feed does not look random (say you see the first 6 items from one user, then the next 5 from another, etc.).
The code that serves the items is this:
class Feeder
  def self.most_recent_created(watching_user=nil, current_cursor)
    next_cursor = nil
    feed = []
    influencers_ids = User.influencers.distinct(:_id)
    Rating.most_recent_from_influencers(watching_user, influencers_ids).scroll(current_cursor) do |rating, cursor|
      next_cursor = cursor
      feed << ImoPresenter.new(Imo.new(rating), watching_user)
    end
    feed << next_cursor.to_s
  end
end
scroll just gives a cursor pointing to each item in the iteration. Then I push the item into the feed.
The access to the database is done in Rating.most_recent_from_influencers(watching_user, influencers_ids), where most_recent_from_influencers(watching_user, influencers_ids) is a scope defined as follows:
scope :not_from, ->(user) { ne(user_id: user.id) }
scope :from, ->(user_ids) { any_in(user_id: user_ids) }
scope :most_recent_from_influencers, ->(watching_user, influencers_ids) {
  proxy = from(influencers_ids).over_zero.desc(:created_at).limit(IMOS_PER_PAGE)
  proxy = proxy.not_from(watching_user) if watching_user
  proxy
}
MongoDB does not support random access out of the box. The approach they suggest is to add a random field to every document and order the collection by that field. However, although the items would be in a random order, I would almost always show the same items, because my only options would be ordering by desc(:rand) or asc(:rand).
I would like suggestions on how to show the items in a truly random way. Is it possible?

Based on similar questions, I've come to the conclusion that having a random field is a valid solution if the collection is dynamic, meaning that documents are inserted frequently. The more dynamic the collection is, the more 'random' access you can have.
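One way to get more variety out of a random field than plain asc(:rand) or desc(:rand) is to pick a fresh random pivot on each request and read the documents whose field is at or above the pivot, wrapping around to the start if needed. The following is only a plain-Ruby sketch of that idea on an in-memory array: Doc, random_window and the rng parameter are hypothetical names, and no MongoDB or Mongoid call is involved.

```ruby
# In-memory stand-in for documents that carry a random field.
Doc = Struct.new(:id, :rand)

# Return `limit` docs starting at a random pivot in the `rand` ordering,
# wrapping around to the lowest values when the pivot is near the top.
def random_window(docs, limit, rng: Random.new)
  pivot = rng.rand
  sorted = docs.sort_by(&:rand)
  at_or_above = sorted.select { |d| d.rand >= pivot } # like "rand >= pivot", ascending
  below = sorted.select { |d| d.rand < pivot }        # wrap-around part
  (at_or_above + below).first(limit)
end

seed_rng = Random.new(42)
docs = (1..20).map { |i| Doc.new(i, seed_rng.rand) }

page = random_window(docs, 5, rng: Random.new(1))
puts page.map(&:id).inspect
```

Each request gets a different starting point, so the window changes even when the collection itself does not.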

Related

How to get nil object when there is no record for a specific value in where clause in rails

Currently, I am fetching user records by passing phone numbers to a where query in Rails, like below.
users = User.where(phone: ["123421341234", "123423144", "123423144", "444633333", ...])
I have user records for the first three mobile numbers. But for the fourth mobile number (444633333) there is no user record in the table, so for that number I want to get nil. (If the user exists, the user object should be returned.)
How can I change the above query? The resulting array should contain the objects in the same sequence as the search array.
I think the only way to do this without making many queries to the DB is to first load all users matching the phone numbers you already have, then map your phone numbers to the users found:
phones_numbers = [7799569, 7818111, 7820442, 78343033, 78347700, 7836863, 7837873, 7837898, 7838025, 7838442]
users = User.where(phone_number: phones_numbers)
users = phones_numbers.map { |number| users.find_by(phone_number: number) }
The loaded users will be cached in memory, so even if there are queries, their run time will be insignificant.
You can also use #detect if you want to do this at the Ruby/Rails level:
# same stuff as above
users = phones_numbers.map { |number| users.detect { |user| user.phone_number == number } }
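As a variation on the #detect approach, the users can be indexed by phone number once, so each lookup is a hash access and missing numbers naturally come back as nil. This is a plain-Ruby sketch: the User struct and the found_users array are stand-ins for the records a single where query would return.

```ruby
# Stand-in for the model; in the app these would be ActiveRecord objects.
User = Struct.new(:id, :phone_number)

# Pretend this came from one query: User.where(phone_number: phone_numbers)
found_users = [User.new(1, "123"), User.new(2, "456")]
phone_numbers = ["123", "456", "123", "999"]

# Build a phone_number => user lookup table once.
users_by_phone = found_users.each_with_object({}) do |user, lookup|
  lookup[user.phone_number] = user
end

# Map the requested numbers in their original order; unknown numbers give nil.
ordered = phone_numbers.map { |number| users_by_phone[number] }

puts ordered.map { |u| u && u.id }.inspect
```

The result keeps the order (and the duplicates) of the search array.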

Removing an item from an in-memory collection

I have a collection that contains a class like:
locations = Location.all
class Location < ActiveRecord::Base
end
The location class has a property: code
I want to remove an item from the collection if code == "unused".
How many different ways can I do this in ruby?
I am currently doing this:
locations = Location.all.select { |l| l.code != "unused" }
This works great, but I am wondering what other ways I could do this, just for learning purposes (if there are big performance advantages in another way, that would be good to know also).
Update
Please ignore the fact that I am loading my collection initially from the database, that wasn't the point. I want to learn how to remove things in-memory not simple where clauses :)
You can simply fetch records from your database what you need:
Rails 4 onwards:
locations = Location.where.not(code: "unused")
Before Rails 4:
locations = Location.where("code != ?", "unused")
If you have a collection and you want to reject some items from it, then you can try this:
locations.reject! { |location| location.code == "unused" }
You are doing this the wrong way. In your case, you are retrieving all records from DB and getting an array of records. Then you are looking for records you need in the array. Instead, you should get the records directly from DB:
Location.where("code != 'unused'")
# or in Rails 4 and latest
Location.where.not(code: "unused")
If you need to remove records from DB, you can do it like this:
Location.where.not(code: "unused").destroy_all
If you just want to know what is the best way to remove elements from an existing array, I think you are on the right track. Besides select there are reject, reject!, delete_if methods. You can learn more about them in the documentation http://ruby-doc.org/core-2.3.1/Array.html
There is a related post that might give more information: Ruby .reject! vs .delete_if
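To compare the in-memory options mentioned above side by side, here is a small plain-Ruby sketch. The Location struct and the sample helper are stand-ins for the real model, used only so the example runs on its own.

```ruby
# Stand-in for the model; only the `code` attribute matters here.
Location = Struct.new(:code)

def sample
  [Location.new("a"), Location.new("unused"), Location.new("b")]
end

# reject returns a new array and leaves the original untouched.
kept = sample.reject { |l| l.code == "unused" }

# delete_if mutates the array in place and always returns the array.
list = sample
list.delete_if { |l| l.code == "unused" }

# reject! also mutates in place, but returns nil when nothing was removed.
untouched = sample
result = untouched.reject! { |l| l.code == "no-such-code" }

# partition splits into [matching, non-matching] in one pass.
used, unused = sample.partition { |l| l.code != "unused" }

puts kept.map(&:code).inspect
puts result.inspect
```

The difference in return values between reject! and delete_if matters mostly when the call is chained or its result is assigned.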

.order("RANDOM()") with will_paginate gem

I am wondering if there is any way to still use the .order("RANDOM()") with will_paginate so that when a page loads and it orders the pages, all the post will stay the same on each page until the home page is reloaded.
so to have all the posts on localhost:3000/posts?page=1 stay the same until localhost:3000(root_path) is visited again.
The problem is that it paginates the posts but currently reorders them for each page selected, so you will often see posts from page 1 again on page 2.
One way to do this is to set the random seed which your database is ordering by, such that it returns the same sequence of random numbers each time. You can store this seed in your users' session, and reset it only when you want to. However, there's a complication -- even though setting the random seed produces the same ordering of random numbers each time, there's no guarantee your database will execute it on your rows in the same order each time, unless you force it to do so like so:
SELECT items.*
FROM (SELECT setseed(0.2)) t
, (SELECT name, rank() OVER (ORDER BY name DESC)
FROM foos ORDER BY name DESC) items
JOIN generate_series(1, (SELECT COUNT(*) FROM foos))
ON items.rank = generate_series
ORDER BY RANDOM()
LIMIT 10;
As you can tell, that's quite complicated, and it forces your database to materialize your entire table into memory. It'd work for smaller data sets, but if you've got a big data set, it's out of the question!
Instead, I'd suggest you go with a solution more like tadman suggested above: generate a page of results, store the ids into session, and when you need to generate the next page, simply ignore anything you've already shown the user. The code would look like:
class ThingsController < ApplicationController
  def index
    @page = params[:page].to_i
    session[:pages] ||= {}
    if ids = session[:pages][@page]
      # Grab the items we already showed, and ensure they show up in the same order.
      @things = Things.where(id: ids).sort_by { |thing| ids.index(thing.id) }
    else
      # Generate a new page of things, filtering anything we've already shown.
      @things = Things.where(["id NOT IN (?)", shown_thing_ids])
                      .order("RANDOM()")
                      .limit(30) # your page size
      # Save the IDs into our session so the above case will work.
      session[:pages][@page] = @things.map(&:id)
    end
  end

  private

  def shown_thing_ids
    session[:pages].values.flatten
  end
end
This method uses the session to store which IDs were shown on each page, so you can guarantee the same set of items and ordering will be shown if the user goes back. For a new page, it will exclude any items already displayed. You can reset the cache whenever you want with:
session.delete(:pages)
Hope that helps! You could also use Redis or Memcache to store your page data, but the session is a good choice if you want the ordering to be random per-user.
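The session bookkeeping described above can be tried outside Rails. The sketch below is plain Ruby under stated assumptions: session is an ordinary Hash, all_ids stands in for the table's ids, and page_ids is a hypothetical helper, not part of will_paginate or Rails.

```ruby
# Return the ids for `page`, remembering them in `session` so that
# revisiting the page gives the same ids in the same order.
def page_ids(session, all_ids, page, per_page, rng: Random.new)
  session[:pages] ||= {}
  session[:pages][page] ||= begin
    already_shown = session[:pages].values.flatten
    # New page: shuffle whatever has not been shown yet and take one page.
    (all_ids - already_shown).shuffle(random: rng).first(per_page)
  end
end

session = {}
all_ids = (1..10).to_a

first  = page_ids(session, all_ids, 1, 3)
second = page_ids(session, all_ids, 2, 3)
again  = page_ids(session, all_ids, 1, 3) # served from the session

puts first.inspect, second.inspect, again.inspect
```

Deleting the :pages key from the hash plays the same role as session.delete(:pages) in the controller.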

Design - How to structure filters taking advantage of OOP in Rails API app

My app shows items in a feed. These items are ratings made by different users. Ratings can have a value, a review, etc.
Usual case is logged in user asking for feed (/users/feed). The controller takes the action:
def feed
  if authenticate_with_token
    imos_and_cursor = Feed.new(RandomItemFeeder.new(params[:cursor], @current_user)).feed
    render json: {cursor: imos_and_cursor[imos_and_cursor.length-1], imos: imos_and_cursor[0..imos_and_cursor.length-2]}
  end
end
Feed is the boss here. It controls what is served: it responds with items, but it will also know how to respond to the people feed call (basically an index of users).
Here are some of the feeders I have:
FriendsFeeder
RandomItemFeeder
MostRecentItemsFeeder
The following is RandomItemFeeder, responsible for feeding random items:
class RandomItemFeeder < Feeder
  def feed
    influencers_ids = User.influencers.distinct(:id)
    friends = User.friends(@watching_user).distinct(:id) if @watching_user
    source_users = influencers_ids.concat(friends)
    Rating.random_from_influencers(@watching_user, source_users).scroll(@current_cursor) do |rating, cursor|
      @next_cursor = cursor
      @feed << ImoPresenter.new(Imo.new(rating), @watching_user)
    end
    @feed << @next_cursor.to_s
  end
end
Presenters
I've structured the rendering of jsons with presenters, so I have different presenters for different cases (feed, user profile, etc.).
Now, I want to incorporate filters. For example, I want RandomItemFeeder to feed only items from the last five years (obviously, the Item model has the corresponding fields).
The question is how I can incorporate this filtering using OOP best practices. In the end, it is just a scope and I can implement it in different ways, but I want to do it right now so that I don't have to come back and refactor everything.
Thanks in advance.

Ruby on Rails - ActiveRecord::Relation count method is wrong?

I'm writing an application that allows users to send one another messages about an 'offer'.
I thought I'd save myself some work and use the Mailboxer gem.
I'm following a test driven development approach with RSpec. I'm writing a test that should ensure that only one Conversation is allowed per offer. An offer belongs_to two different users (the user that made the offer, and the user that received the offer).
Here is my failing test:
describe "after a message is sent to the same user twice" do
  before do
    2.times { sending_user.message_user_regarding_offer! offer, receiving_user, random_string }
  end
  specify { sending_user.mailbox.conversations.count.should == 1 }
end
So before the test runs a user sending_user sends a message to the receiving_user twice. The message_user_regarding_offer! looks like this:
def message_user_regarding_offer! offer, receiver, body
  conversation = offer.conversation
  if conversation.nil?
    self.send_message(receiver, body, offer.conversation_subject)
  else
    self.reply_to_conversation(conversation, body)
    # I put a binding.pry here to examine in console
  end
  offer.create_activity key: PublicActivityKeys.message_received, owner: self, recipient: receiver
end
On the first iteration in the test (when the first message is sent) the conversation variable is nil therefore a message is sent and a conversation is created between the two users.
On the second iteration the conversation created in the first iteration is returned and the user replies to that conversation, but a new conversation isn't created.
This all works, but the test fails and I cannot understand why!
When I place a pry binding in the code in the location specified above I can examine what is going on... now riddle me this:
self.mailbox.conversations[0] returns a Conversation instance
self.mailbox.conversations[1] returns nil
self.mailbox.conversations clearly shows a collection containing ONE object.
self.mailbox.conversations.count returns 2?!
What is going on there? the count method is incorrect and my test is failing...
What am I missing? Or is this a bug?!
EDIT
offer.conversation looks like this:
def conversation
Conversation.where({subject: conversation_subject}).last
end
and offer.conversation_subject:
def conversation_subject
"offer-#{self.id}"
end
EDIT 2 - Showing the first and second iteration in pry
Also...
Conversation.all.count returns 1!
and:
Conversation.all == self.mailbox.conversations returns true
and
Conversation.all.count == self.mailbox.conversations.count returns false
How can that be if the arrays are equal? I don't know what's going on here, blown hours on this now. Think it's a bug?!
EDIT 3
From the source of the Mailboxer gem...
def conversations(options = {})
  conv = Conversation.participant(@messageable)
  if options[:mailbox_type].present?
    case options[:mailbox_type]
    when 'inbox'
      conv = Conversation.inbox(@messageable)
    when 'sentbox'
      conv = Conversation.sentbox(@messageable)
    when 'trash'
      conv = Conversation.trash(@messageable)
    when 'not_trash'
      conv = Conversation.not_trash(@messageable)
    end
  end
  if (options.has_key?(:read) && options[:read]==false) || (options.has_key?(:unread) && options[:unread]==true)
    conv = conv.unread(@messageable)
  end
  conv
end
The reply_to_conversation code is available here -> http://rubydoc.info/gems/mailboxer/frames.
Just can't see what I'm doing wrong! Might rework my tests to get around this. Or ditch the gem and write my own.
See this: Rails 3: Difference between Relation.count and Relation.all.count
In short, Rails ignores the select columns (if there is more than one) when you apply count to the query. This is because SQL's COUNT accepts at most one column as a parameter.
From the Mailboxer code:
scope :participant, lambda {|participant|
  select('DISTINCT conversations.*').
    where('notifications.type'=> Message.name).
    order("conversations.updated_at DESC").
    joins(:receipts).merge(Receipt.recipient(participant))
}
self.mailbox.conversations.count ignores the select('DISTINCT conversations.*') and counts the join table with receipts, essentially counting number of receipts with duplicate conversations in it.
On the other hand, self.mailbox.conversations.all.count first gets the records applying the select, which gets unique conversations and then counts it.
self.mailbox.conversations.all == self.mailbox.conversations since both of them query the db with the select.
To solve your problem you can use sending_user.mailbox.conversations.all.count or sending_user.mailbox.conversations.group('conversations.id').length
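The mismatch can be reproduced without a database. The sketch below imitates the join in plain Ruby: the Conversation and Receipt structs and the "sender" label are stand-ins, and the assumption (taken from the explanation above) is that the sending user ends up with one receipt per message in the same conversation.

```ruby
# Stand-ins for the two tables involved in the join.
Conversation = Struct.new(:id)
Receipt = Struct.new(:conversation_id, :recipient)

conversations = [Conversation.new(1)]
# Two messages in the same conversation give the sender two receipts.
receipts = [Receipt.new(1, "sender"), Receipt.new(1, "sender")]

# Imitate "conversations JOIN receipts": one row per matching receipt.
joined_rows = conversations.flat_map do |conversation|
  receipts
    .select { |r| r.conversation_id == conversation.id && r.recipient == "sender" }
    .map { conversation }
end

puts joined_rows.size      # what a COUNT over the join sees
puts joined_rows.uniq.size # what loading the DISTINCT rows and counting them sees
```

Counting join rows gives 2, while counting the distinct conversations gives 1, which matches the numbers seen in the failing test.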
I have tended to use the size method in my code. As per the ActiveRecord code, size will use a cached count if available and also returns the correct number when models have been created through relations and have not yet been saved.
# File activerecord/lib/active_record/relation.rb, line 228
def size
  loaded? ? @records.length : count
end
There is a blog on this here.
In Ruby, #length and #size are synonyms and both do the same thing: they tell you how many elements are in an array or hash. Technically #length is the method and #size is an alias to it.
In ActiveRecord, there are several ways to find out how many records are in an association, and there are some subtle differences in how they work.
post.comments.count - Determine the number of elements with an SQL COUNT query. You can also specify conditions to count only a subset of the associated elements (e.g. :conditions => {:author_name => "josh"}). If you set up a counter cache on the association, #count will return that cached value instead of executing a new query.
post.comments.length - This always loads the contents of the association into memory, then returns the number of elements loaded. Note that this won't force an update if the association had been previously loaded and then new comments were created through another way (e.g. Comment.create(...) instead of post.comments.create(...)).
post.comments.size - This works as a combination of the two previous options. If the collection has already been loaded, it will return its length just like calling #length. If it hasn't been loaded yet, it's like calling #count.
It is also worth mentioning to be careful if you are not creating models through associations, as the related model will not necessarily have those instances in its association proxy/collection.
# do this
mailbox.conversations.build(attrs)
# or this
mailbox.conversations << Conversation.new(attrs)
# or this
mailbox.conversations.create(attrs)
# or this
mailbox.conversations.create!(attrs)
# NOT this
Conversation.new(mailbox_id: some_id, ....)
I don't know if this explains what's going on, but the ActiveRecord count method queries the database for the number of records stored. The length of the Relation could be different, as discussed in http://archive.railsforum.com/viewtopic.php?id=6255, although in that example, the number of records in the database was less than the number of items in the Rails data structure.
Try
self.mailbox.conversations.reload; self.mailbox.conversations.count
or perhaps
self.mailbox.reload; self.mailbox.conversations.count
or, if neither of those work, just try reloading as many of the objects as possible to see if you can get it to work (self, mailbox, conversations, etc.).
My guess is that something is messed up between memory and the DB. This is definitely a really weird error though, might wanna put in an issue on Rails to see why this would be the case.
The result of mailbox.conversations is cached after the first call. To reload it write mailbox.conversations(true)
