A/B testing and stats solutions - ruby-on-rails

I've been looking for a good testing framework for months, not finding anything, so I've just been building my own.
This is what I want to do:
- track arbitrary behaviors (e.g. # of photos viewed, # of comments posted)
- track correlation between arbitrary variables and those behaviors
(e.g, how do different versions of this prompt affect average # of photos viewed?)
This kind of thing should be a core part of agile development. What's out there? I know Google Website Optimizer is one of the answers, but you can only track behaviors that end in a single "success" page.
It'd be great to have a plugin that can work within your code (Rails in my case) and feed into a nice hosted service with pretty graphs...

You may want to partition your problem into analytics (impressions, actions, and possible in-page events, reporting of your tests), and a framework for serving up your variations (how do you manage x variations both in practical terms of preparing them, do you need to store variations for future reference, turning on and off tests, optimize the effectiveness of your test etc). There is clearly an overlap, say, Google Website Optimizer can turn off a bad variation as soon as it has data to support it, but by thinking about this as different problems you may be able to reuse perhaps the Google Analytics component.

Yep, here:
http://github.com/paulmars/seven_minute_abs/tree/master

Related

How do "Indeed.com" and "hotelscombined.com" search other sites?

I'm trying to build a vertical (meta) search engine for a particular industry. I'm trying to do somthing similar to "indeed.com" (job search engine) and "hotelscombined.com" (hotel search engine). I would like to know how do these two search engines build up their search results?
1) Is it using APIs of the other websites they serve results from? (seems odd to me since some results come from small and primitive sites).
2) Do other website post updates to these search engines? (Also seems odd as above)
3) Do they internally understand and create a map for each website they serve results from? (if so, then maybe they need to constantly monitor the structure of these sites for any changes. Seems error prone to me).
4) Any other possibilities?
I don't know even where to start, so any pointers in the right direction is much appreciated. (books, tutorials, hints, ideas...)
Thanks
It is mostly a mix of 1 and 3. Ideally, the site will have some sort of API they expose and document. If not, you have to do data scraping. Basically, you reverse-engineer their page. If they get results asynchronously via an undocumented API, you can use that API as well as (until they make a breaking change). Otherwise, it's simply a matter of pulling the text straight out of the HTML.
I don't know of any more advanced techniques since I don't do this myself, but several of my acquaintances have gone on to work on mobile apps that need to do this sort of thing with sports scores and such (not for searching, but same requirements - get someone else's data into our database). The low tech "pull it from the HTML until they change the HTML and break everything" is standard practice where they work.
2 is possible, but to do it you have to either make business arrangements with every source of data you want to use, or gain enough market presence for everyone to want to upload their data.
Also, you don't do this while actually searching (unless you have other constraints as Charles Duffy points out in his comment). You run a process that regularly goes out, gets all the data it can find, and inserts it into your own database, which you then search. This allows you to decouple data gathering from data searching - your search page won't have to know about and handle errors from the scraper, and the scraper has to only "get all the data" from each source instead of being able to transform queries from your site to search each source.

Cucumber folder structure for web + mobile app

I have a few questions on appropriate folder structure in cucumber:
I think I am going to organize my feature folders according to type_of_user/type_of_feature.feature, i.e. main_admin/add_a_customer.feature or franchisee/schedule_job.feature. The only slight issue with this is that of the user types I have: cleaners, customers, franchisees and main admin/franchisor, the latter two users share many features. For example, both franchisees and franchisor have the ability to add new customers and schedule jobs, the only difference being that the franchisor has the ability to schedule a job for anyone, anywhere - i.e. the only real difference is permissions, not functionality. Does it matter that I will be essentially duplicating tests for these two users, given the proposed folder structure? Or should I be looking to seperate folders by functionality only, then type of user?
For my mobile app, should I have these feature folders separate from the web app or should these go in the root as well: mobile/ios/cleaner_login.feature, mobile/android/cleaner_login.feature etc?
Regarding user types:
Organizing at the top level by user type has worked well for me. However, I would only consider user types separate if they actually used different features, not just if they differed in permissions with respect to specific objects as in the example you gave. You could consider both franchisees and franchisors "administrators", make a top-level folder for those, and just write scenarios for franchisees and franchisors for features that had different permissions for those roles.
If you're a developer and writing RSpec specs in addition to Cucumber features, you might even just write specs instead of features to cover the difference between franchisees and franchisors. (I would only do that if the differences between franchisee and franchisor were fairly trivial and not worth exposing in Cucumber.) If you're QA and testing only from the outside of course it'll all have to be in Cucumber.
I would certainly not systematically duplicate entire scenarios for the sake of any organization. The extra work required to maintain the duplication and the errors when you forgot would be far worse than the bit of extra work required to follow a slightly more complicated system that minimizes duplication!
Regarding web and mobile: How to handle different platforms depends on how different they are.
If you have a web app and a native (Android, iOS) mobile app the step implementations will be completely different and your tests will need to be in different projects altogether. That probably won't mean that much duplication, since the users and features in the web and mobile apps will probably be rather different.
If you have two web apps, one for desktop and one for mobile, there are no technology issues. But again it will depend on how similar the two apps are. If they're different, separate them at the top level (even before users). If they're very similar, separate them only when necessary and only at the scenario level.

in a Rails apps, how to organize Cucumber .feature files?

when i got to this project there were cucumber tests in "features/enhanced", which ran with javascript and a few in "features/plain" which did not require js. with the later development of per-scenario #javascript, this doesn't make sense. and as the number of features files we have grows and grows, it'd be awesome if this stayed tidy.
so, in best practice land:
1) how long should .feature files be? i try to keep each narrow and specific with 1 or 2 "Scenarios".
2) what folder/file structure should one keep them in?
2a) how might one group similar features?
1) Once you've done them for a few months you'll soon find what works best for you. My advice is you should make them small ish. We have often split our earlier features down into smaller chunks, but have never ended up combining them. It's handy for making use of backgrounds etc...
2) We had a big problem with this and spent ages doing it one way then another. In the end we gunned to group them by the services that our company provides. e.g. payments, customer registration, stock management
Inconveniently, features don't always conform to a hierarchical tree view of the world, so make liberal use of tagging and your primary grouping of features is less important.
Have you tried yard? There's an example here We've just built it into our CI, it lets you pull together sets of scenarios based on tags, you can do unions, intersections etc... well worth it :)
I would keep the JavaScript and non-JavaScript versions of a scenario together, since they should be very similar.
Anything more than 8 scenarios in a feature file is probably too much.
A useful approach is to have a folder to represent the high-levels features (sometimes call epics or themes), and separate feature files within those folders for the different aspects of the behaviour.
For example, you may have a feature "Employee Directory" which would have separate feature files contains scenarios for a photograph, office location, job title, etc.
Depending on the size and complexity of your app, you could group those folders into other folders.
(Note that none of the above is specific to Rails apps).

"Business" Software Metrics

I'm looking for a service (or gem) that will enable me to create a track software-produced business metrics. I should clarify what I'm looking for, because this might be me failing at articulating what I'm looking for in Google. Basically, based on the context of my software, I want to be able to emit certain values and then have them accumulate as metrics. These are not performance or request metrics, per se, and certainly not code-quality metrics.
The quintessential use case is: suppose I have an if / else block in my code, I'd like to publish a metric that tells me how often I choose the true block vs the false block.
Or, suppose I'm using delayed_job, I'd like to publish how often jobs run and how many are in the queue on each run.
I can find all the metrics I want in code, I'm just not sure where to put them right now.
AWS cloudwatch has an api to publish your own metrics. New Relic does, too. However, both look expensive and give me a whole lots I'm not looking for (all the host metrics and code profiling).
Are there other services out that that offer this kind of functionality?
There are actually a couple of services that offer this functionality. My company's product, Instrumental sounds like it might be a great fit for you - we've got a Ruby client as well as some extra tools for measuring system level stuff among other things.
If you're up for hosting your own stats collecting services, many people use the Graphite/Statsd combination; it takes a bit to set up and maintain, but it can definitely accomplish what you're looking to do here as well.
If I understood you right, you're looking for some form of event tracking (as in, how often a part of your code gets used as opposed to another part). If that's right, you might want to give Mixpanel a look.
You should see NewRelic (APM or Insights modules) or Microsoft Insights, they allow you to create custom metrics (business metrics) inside your source code and monitor them online using dashboards.

Breaking up a Rails app into two

We have a very lage Rails app that has two distinct sections: the front end and the CMS/Admin. We would like to break up the app into two pieces (for maintenance, as we have distinct teams that work on the front end vs. back end and they could have different release cycles).
One thought was to start a new Admin 2.0 app that has access to the models/schema from the original application, but has its own controllers/views and its own models that extend the original models until it is safe to fully decouple. Is this advisable? If not, what would be an appropriate plan to migrate away from one monolithic codebase?
warning, this is a bit ranty, and does not go anywhere.
Having worked on a very large app that operates in the manor you describe (for scalability reasons), I still have mixed opinions (an no conclusive answers).
Currently we operate 3 major apps (+ one or two smaller ones that use a fragment of the schema).
RVW (our admin app): This is the only app that writes, runs on a single server, and is responsible for maintaining the schema.
reevoo.com: ecommerce, price comparison, stuff like that. This (for historic reasons runs on a slightly different schema, run on a read only slave of RVW, with database views to map the schemas. All writes are done by sticking things on queues that RVW picks up and acts on. This works very well, although the number of random db related issues (mostly related to the views) is an issue. The main problem with this app is the difficulty sharing code (gems work well, I've often dreamed of bringing the schemas into line and sharing the core models in a gem!). We share code between apps using ruby gems. And test using lots of integration tests that cross app boundaries (using drunit (presentation on this available)).
reevoomark: very high load b2b app. This has many servers each with a full stack (db server, app server one per node). These have their databases populated with a db export - import batch job. This works very well in the short term, the shear flexibility of it is just ace, but integration testing between apps is very hard.
My advice would be to avoid splitting the apps at all costs, keeping things DRY quickly becomes a major challenge. My advice would be to stick with one app, two sets of routes (selected at startup by environment variables).
This gives you all the advantages of the other solutions, while making code sharing implicit. Splitting your test packs out would make your test cycles shorter and make things more manageable for the two teams. I would avoid working on different code bases, as doing this promotes the apps drifting apart and making code sharing tricky (as in .com).
If you decide do split, have a good set of high level cross app tests. Custom (per app) extensions to a core set of models sounds like a good plan, although with distinct code bases and teams you may still end up with duplicate code. Rails engines should be a good way of sharing the models, but be prepared for model reloading to become a little schizophrenic.
Good luck!
Have you namespaced your admin controllers? That would be a relatively easy point of subdivision and also avoid many of the negative side effects of forking your code into two apps.
Have you considered looking at Rails Engines? Added to Rails in 2.3.

Resources