How can I get the participants of a Vanity experiment? - ruby-on-rails

tl;dr
Is there any way to get something like Vanity.experiment(:landing).participants_for_option(:a) that returns an array of users?
The long story
I'm using the gem Vanity with a Rails 4.2 application and it is working nicely, but I want to inspect further the behaviour of participants.
I tested which kind of page converted more users: a classical signup page versus a signup-with-order page. The classical signup page led to almost three times more signups, but I'm still in the dark because I don't know how many of the signup-only users went on to order a product.

It sounds like you're trying to understand more about how an experiment affects different parts of your funnel.
At the aggregate level, one way to do that may be to use multiple metrics for your experiment at different parts of your funnel, e.g. calling track! for both signups and purchases.
Unfortunately, Vanity isn't set up very well to query for individual participants per alternative, because testing itself is aggregate. If you want to access alternatives per user, there are methods for that, for example, Vanity.playground.adapter.ab_showing(experiment, identity), see the docs.
If you're interested in doing more in-depth analytical queries, it might be worth using the SQL adapter: its schema tracks each participant, so you could join to other tables that hold data about purchases, etc.
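To make the join idea concrete, here is a minimal plain-Ruby sketch. It does not use Vanity's API or its real table and column names; the PARTICIPANTS and ORDERS rows, their keys, and the orders_by_alternative helper are all hypothetical, and it assumes the experiment identity is the same value as the user id on your orders.

```ruby
# Hypothetical per-participant rows (one per identity), standing in for
# whatever the SQL adapter stores. Column names are assumptions.
PARTICIPANTS = [
  { identity: 1, alternative: :a },
  { identity: 2, alternative: :a },
  { identity: 3, alternative: :a },
  { identity: 4, alternative: :b },
  { identity: 5, alternative: :b }
].freeze

# Hypothetical rows from the application's own orders table.
ORDERS = [
  { user_id: 2 },
  { user_id: 4 },
  { user_id: 4 },
  { user_id: 5 }
].freeze

# For each alternative, count participants and how many of them ordered.
def orders_by_alternative(participants, orders)
  buyers = orders.map { |order| order[:user_id] }.uniq
  participants.group_by { |row| row[:alternative] }.transform_values do |rows|
    ids = rows.map { |row| row[:identity] }
    { participants: ids.size, ordered: (ids & buyers).size }
  end
end

RESULT = orders_by_alternative(PARTICIPANTS, ORDERS)
```

In a real application the same grouping would more likely be a SQL JOIN with GROUP BY on the alternative, once you have checked the actual schema your Vanity version creates.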
Edit:
It looks like this has changed in the most recent version of Vanity:
https://github.com/alobato/vanity/blob/master/lib/vanity/playground.rb#L231
Vanity.playground.connection.ab_assigned(experiment_name, identity)
Vanity.playground.connection.ab_showing(experiment_name, identity)

Related

Rails associations when unique and common models combine

First of all, I want to apologize for my terminology. I am not entirely sure what to call what I am looking for, so I can’t google for answers. But here is my problem.
I am working on a Rails application that stores information about different websites and provides various services for them. I will call these services ‘Products.’ One website can be subscribed to several products, and a product can be served to various websites. So here is a very simple association scheme for these relationships:
At least, it would have been simple, but the problem is that the Settings model (shown in red on this diagram) is different for each product: for one product, it will have one number of fields and data types, for another it will have a different number of fields with different data types. On the other hand, the Faq and Description are the same, so if I redraw the diagram as follows:
I will get another problem: too much repetition (shown in blue on the diagram). Ideally, I want some kind of modification of the first diagram, where the Product model will choose different Settings models depending on a parameter that I pass to it:
So that a request website.products.find(1).settings will return the model Settings1, while a request website.products.find(2).settings will return a completely different model, Settings2.
Is this achievable in Rails? If not, how would you organize such data?

What is the best way to store a user's Facebook friends list in my database?

Overview
I'm creating a Ruby on Rails website which uses Facebook to login.
For each user I have a database entry which stores their Facebook User ID along with other basic information.
I'm also using the Koala gem in order to retrieve a user's friendlist from Facebook, but I'm unsure as to how I should store this data...
Option 1
I could store the user's friends as a serialized hash in the User table, then if I wanted to display a list of all the current user's friends, I could grab this hash and do something along the lines of SELECT * FROM Users WHERE facebook_user_id IN (the ids from the hash)
Each time the user logs in I could update this field to store the latest friends list.
Option 2
I could create a Friend table and store friendship information in here, where a User has many Friends. So there would be a row for each friendship, (User1 and User2 columns). Then to display a list of the current user's friends I could do something like SELECT User2 FROM Friends WHERE User1 = current_user
This seems like the better option to me, but...
It has the disadvantage that there would be many rows... If there were 100,000 users, each with 100 friends, that's now 10,000,000 rows in the Friends table.
It also means each time the user logs in, I'd need to loop over their Facebook friends list returned using Koala and create a Friend record if someone on their friendlist is in my User table and there isn't a corresponding entry in the Friends table. This seems like it'd be slow if a user has 1000 Facebook friends?
I'd appreciate any guidance on how it would be best to achieve this.
Apologies for the badly worded question, I'll try and reword/organise it shortly.
Thanks for any help in advance.
If you need to store a lot of data, then you need to store a lot of data. If you are like most, you probably won't run into that problem sooner than you have the cash to solve it. In other words, you are probably assuming you'll have more traffic and data than you'll get, at least in the short-term. So I doubt this is an issue, even though it is a good sign that you are thinking about it now rather than later.
As I mentioned in my comment below, the easiest solution is to have a tie table with a row for each side of the friend relationship (a has_many :friends, through: :facebook_friend_relationships, class_name: 'FacebookFriend' on FacebookFriend, per the design mentioned below). But your question seemed to be about how to reduce the number of records, so that is what the remainder of the answer will address.
If you have to store in the DB and you know for sure that you will absolutely have every FB user on the planet hitting your site because it is so awesome, but they won't all hit at once, then if you are limited in storage, you may want to use an LRU algorithm (remove the least recently used records), possibly with timed expiration also. You could just have a cron job that queries the DB and then deletes old/unused records to do this. It wouldn't be perfect, but it would be a simple solution.
You could also archive older data rather than throw it away. So, frequently used data could stay in the table of active users, and then you might offload older data to another table or even another database (and you might see the apartment and second_base gems for that). However, once you get to that size, you're probably looking at a number of other architectural solutions that have much less to do with ActiveRecord models/associations or schema design. Though it pays to plan ahead, I wouldn't worry about that excessively until you are sure that the application will get enough users to justify investing the time in that.
Even though ActiveRecord has some caching, you could just avoid the DB and cache friends in memory yourself in the beginning for speed, especially if you don't yet have many users, which you probably don't. If you think you'll run out of memory because of the high number of users, LRU might be a good option here also, and lru_redux looks interesting. Again, you might want to time the cache as well, so that it expires and friends are fetched again when it does. Even just storing the results in the user session may be adequate, i.e. in the controller action method, just do @friends ||= Something.find_friends(fb_user_id), and the latter is what most might do as a first shot at it while you're getting started.
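As a rough illustration of the LRU plus timed expiration idea, here is a small in-memory sketch in plain Ruby. FriendsCache, max_size, ttl and the now: keyword are names made up for this example; this is not the lru_redux API, and it is not thread-safe.

```ruby
# Minimal LRU cache with a time-to-live, for illustration only.
class FriendsCache
  def initialize(max_size:, ttl:)
    @max_size = max_size
    @ttl = ttl        # seconds an entry stays valid
    @store = {}       # Ruby hashes keep insertion order: first key = least recently used
  end

  # Returns the cached value for key, or runs the block to (re)load it
  # when the entry is missing or older than the TTL.
  def fetch(key, now: Time.now)
    entry = @store.delete(key)
    if entry.nil? || now - entry[:at] > @ttl
      value = yield
      entry = { value: value, at: now }
    end
    @store[key] = entry                          # re-insert as most recently used
    @store.shift while @store.size > @max_size   # evict least recently used entries
    entry[:value]
  end

  def keys
    @store.keys
  end
end
```

Usage would look something like cache.fetch(fb_user_id) { load_friends_from_facebook(fb_user_id) }, where the block is whatever slow lookup you want to avoid repeating (that helper name is hypothetical).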
If you use ActiveRecord, in your query in the controller (or on the association in the model) consider using includes (the include: option in older Rails versions) to avoid n+1 queries. That will speed things up.
For the schema design, maybe:
User - users table with email and authN info. Look at the Devise gem.
FacebookUser - info about the Facebook user.
FacebookFriendRelationship - a tie model with (id and) two columns, one for one FacebookUser id and one for the other.
By separating the authN info (User) from the FB data (FacebookUser and FacebookFriendRelationship), you make it easier to have other social media accounts, etc. each with information specific to those accounts in other tables.
The complexity comes in FacebookUser's relationship with friends if the goal is to minimize rows in the relationship table. To halve the number of rows, you'd have a single row for a relationship where the id of FacebookUser could be in either foreign key column. Either the user has a friend or is a friend, so you could have two has_many :through associations on FacebookFriend that each use a different foreign key in FacebookFriendRelationship. Or you could do HABTM without the model and use foreign_key and association_foreign_key options in each association. Either way, you could add a method to add both associations together (because they are arrays). Instead, you could use custom SQL in a single has_many if you didn't care about having to use ActiveRecord to remove associations the normal way. However, per your comments, I think you want to avoid this complexity, and I agree with you, unless you really must limit the number of relationship rows. However, it isn't the number of tie table rows that will eat the storage; it is going to be all of the user info you keep in the FacebookFriends table.
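To show what "one row per relationship, id in either column" means in practice, here is a plain-Ruby sketch that leaves ActiveRecord out entirely. FriendshipTable and its methods are hypothetical names; the two lookups in friends_of play the role of the two associations that get added together.

```ruby
# In-memory stand-in for a tie table that stores each friendship once.
class FriendshipTable
  def initialize
    @rows = [] # each row is [smaller_id, larger_id]
  end

  # Normalise the pair so (a, b) and (b, a) map to the same single row.
  def add(a, b)
    pair = [a, b].minmax
    @rows << pair unless @rows.include?(pair)
  end

  # A user may appear in either column, so look in both and combine.
  def friends_of(id)
    as_first  = @rows.select { |first, _| first == id }.map { |_, second| second }
    as_second = @rows.select { |_, second| second == id }.map { |first, _| first }
    (as_first + as_second).sort
  end

  def row_count
    @rows.size
  end
end
```

The cost of halving the rows is visible here: every read has to check both columns, which is the extra complexity the answer suggests avoiding unless row count really matters.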

MongoDB and embedded documents, good use cases

I am using embedded documents in MongoDB for a Rails 3 app. I like that I can use embedded documents and the values are all returned with one query and there is less load on the database server. But what happens if I want my users to be able to update properties that really should be shared across documents? Is this sort of operation feasible with MongoDB or would I be better off using normal ID-based relations? If ID-based relations are the way to go, would it affect performance to a great degree?
If you need to know anything else about the application or data I would be happy to let you know what I am working with.
Document that has many properties that all documents share.
Person
name: string
description: string
Document that wants to use these properties:
Post
(references many people)
body: string
This all depends on what you are going to do with your Person model later. I know of at least one working example (a blog using MongoDB) where the developer keeps user data inside the comments users make and uses one collection for the entire blog. Well, ok, he uses a second one for his "tag cloud" :) He just doesn't need to keep a centralized list of all commenters; he doesn't care. His blog contains consolidated data from all his previous sites/blogs, almost 6000 posts in total. Posts contain comments, comments contain users, users have emails, there is a "subscribe to comments" option for every user who comments on a post, and authorization is handled by an external OpenID service aggregator (Loginza). He keeps the user email from the Loginza response, and the "login token" lives in the user's cookies. So the functionality is pretty good.
So, the real question is: what are you going to do with your Users later? If you really feel like you need a separate collection (you're going to let users have centralized control panels, have site-based registration, build user-centric features and so on), make it separate. If not, keep it simple and have fun :)
It depends on what user info you want to share across documents. Let's say you have users and each user has emails. It does not make sense to move emails into a separate collection, since there will be no more than 10, 20, or 100 emails per user. But if a user has some big related information that is always growing, like blog posts, then it makes sense to move it into a separate collection.
So the answer depends on the user document structure. If you show your user document structure and what you are planning to move into a separate collection, I will help you make the decision.

Need advice on MongoDB Schema for Chat App. Embedded vs Related Documents

I'm starting a MongoDB project just for kicks and as a chance to learn MongoDB/NoSQL schemas. It'll be a live chat app and the stack includes: Rails 3, Ruby 1.9.2, Devise, Mongoid/MongoDB, CarrierWave, Redis, JQuery.
I'll be handling the live chat polling/message queueing separately. Not sure how yet, either Node.js, APE or custom EventMachine app. But in regards to Mongo, I'm thinking to use it for everything else in the app, specifically chat logs and historical transcripts.
My question is how best to design the schema, as all my previous experience has been with MySQL and relational DB schemas. And as a sub-question, when is it best to use embedded documents vs related documents?
The app will have:
Multiple accounts which have multiple rooms
Multiple rooms
Multiple users per room
List of rooms a user is allowed to be in
Multiple user chats per room
Searchable chat logs on a per room and per user basis
Optional file attachment for a given chat
Given Mongo (at least last time I checked) has a document limit of 4MB, I don't think having a collection for rooms and storing all room chats as embedded documents would work out so well.
From what I've thought about so far, I'm thinking of doing something like:
A collection for accounts
A collection for rooms
Each room relates back to an account
Related documents in chats collections for all chat messages in the room
Embedded Document listing all users currently in the room
A collection for users
Embedded Document listing all the rooms the user is currently in
Embedded Document listing all the rooms the user is allowed to be in
A collection for chats
Each chat relates back to a room in the rooms collection
Each chat relates back to a user in the users collection
Embedded document with info about optional uploaded file attachment.
My main concern is how far do I go until this ends up looking like a relational schema and I defeat the purpose? There is definitely more relating than embedding going on.
Another concern is that, from what I've heard, referencing related documents is much slower than accessing embedded documents.
I want to make generic queries such as:
Give me all rooms for an account
Give me all chats in a room (or filtered via date range)
Give me all chats from a specific user
Give me all uploaded files in a given room or for a given org
etc
Any suggestions on how to structure the schema efficiently in a way that scales? Thanks everyone.
I think you're pretty much on the right track. I'd use a capped collection for chat lines, with each line containing the user ID, room ID, timestamp, and what was said. This data would expire once the capped collection's "end" is reached, so if you needed a historical log you'd want to copy data out of the capped collection into a "log" collection periodically, but capped collections are specifically designed for logging-style applications where you aren't going to be deleting documents, and insertion order matters. In the case of chat, it's a perfect match.
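The behaviour described here can be sketched without a database. The class below is a plain-Ruby simulation, not MongoDB or Mongoid driver code: CappedChatLog, the seq counter and archive_new_lines are invented for illustration, and a real capped collection is bounded by bytes (optionally also by document count) rather than by a simple line count.

```ruby
# Simulates a fixed-size, insertion-ordered log plus a periodic copy
# of new lines into a permanent archive.
class CappedChatLog
  attr_reader :lines, :archive

  def initialize(capacity)
    @capacity = capacity
    @lines = []              # the "capped collection"
    @archive = []            # the permanent "log" collection
    @next_seq = 0
    @last_archived_seq = -1
  end

  def insert(user_id:, room_id:, text:, at: Time.now)
    @lines << { seq: @next_seq, user_id: user_id, room_id: room_id, at: at, text: text }
    @next_seq += 1
    @lines.shift while @lines.size > @capacity # oldest lines fall off the end
  end

  # Copy every line not archived yet; returns how many were copied.
  def archive_new_lines
    fresh = @lines.select { |line| line[:seq] > @last_archived_seq }
    @archive.concat(fresh)
    @last_archived_seq = fresh.last[:seq] unless fresh.empty?
    fresh.size
  end
end
```

One thing the sketch makes obvious: if more lines arrive between two archive runs than the cap can hold, the overflow is gone before it can be copied, so the cap and the archive interval have to be sized together.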
The only other change I'd suggest would be to maintain uploads in a separate collection, as well.
I am a big fan of MongoDB as a document database as well. But are you sure you are using MongoDB for the right reason? What is MongoDB powerful at?
It's a subjective question, but for me in-place (atomic) updates on documents are what make MongoDB powerful, and I can't really see you using them that much. On top of that, you are hitting the document size limit problem as well. (From experience, I can tell you that embedding files in MongoDB is not a good idea.) You also want to have a live chat application on top of the database.
Your document schema seems logical. But I wouldn't go with MongoDB for this kind of project, where your application depends heavily on inserts. I would go for CouchDB.
With CouchDB you wouldn't have to worry about the attachments problem; you can embed them easily. "_changes" would make your life much, much easier for building a live chat application, long polling, or feeding a search engine (if you want to implement one).
I also saw an open source showcase project at couchone that has some similarities with your goals: Anologue. You should check it out.
PS : Sorry it was a little off topic but I couldn't hold myself.

RESTful route for a list of members that are not in a collection

I'm trying to figure out the best way to show a list of members (users) that aren't in a collection (group).
/users
is my route for listing all of the users in the account
/group/:id/members
is my route for listing all of the users in the group
/users?not_in_group=:id
is my current option for showing a list of users NOT in the group. Is there a more RESTful way of displaying this?
/group/:id/non_members
seems sort of odd…
Either query parameters or paths can be used to get at the representation you want. But I'd follow Pete's advice and make sure your API is hypertext-driven. Not doing so introduces coupling between client and server that REST was intended to prevent.
The best answer to your question might depend on your application. For example, if your system is small enough, it may suffice to only support a representation consisting of a list of users and their respective groups (the resource found at /users). Then let the client sort out what they want to do with the information. If your system has lots of groups and lots of users, each of which belongs to only a couple of groups, your available_users representation for any group is likely to be only slightly smaller than the entire list of users anyway.
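For the small-system case, "letting the client sort it out" could be as simple as the following sketch. The USERS payload shape (an id, a name and a group_ids array per user) and the users_not_in_group helper are assumptions for illustration, not something the question specifies.

```ruby
# Hypothetical representation returned by /users: each user with its groups.
USERS = [
  { id: 1, name: "Ann",  group_ids: [10] },
  { id: 2, name: "Bob",  group_ids: [10, 20] },
  { id: 3, name: "Cara", group_ids: [] },
  { id: 4, name: "Dev",  group_ids: [20] }
].freeze

# Client-side filter: everyone who is not a member of the given group.
def users_not_in_group(users, group_id)
  users.reject { |user| user[:group_ids].include?(group_id) }
end

AVAILABLE_FOR_10 = users_not_in_group(USERS, 10).map { |user| user[:name] }
```

The same predicate would sit behind a server-side /users?not_in_group=:id filter; the only difference is which side of the wire does the rejecting.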
Creative design of media types can go a long way to solving problems like this.
Spoke with my partner. He suggested:
/group/:id/available_members
Seems much more positive.
The main precept of REST is "hypertext as the engine of application state". The form of the URI is irrelevant; what matters is that it is navigable from the representation returned at the application's entry point.
