Create fake data to use as a tutorial

I want to create fake usage of my app to serve as a tutorial.
To do so, I need to create fake data (with the same structure and associations as the real data) to send to my templates, so they render the same view my users will see when using the app.
Unfortunately, I haven't found an easy way to do that, other than hard-coding the examples (fake usage) in a separate view that looks like the view generated from real data. But that is no help when the template has to change, one day or another.
And mixing fake data with real data in the database (and then retrieving only the fake data when needed) doesn't seem like a good idea.
Do you have an idea or a gem to suggest?

One option to consider is to use a factory gem (such as factory_girl) or similar (I personally like machinist) and then write some form of script that initializes objects in the test environment.
This may not be the intended or ideal use of a test environment, but it lets you exercise your app in a non-production environment with fake data while keeping that data separate from your real and/or development data.
The only "gotcha" I can think of is that you'll want to ensure that any automated testing tool (such as RSpec) drops the test database before your tests run. If it doesn't, you could see test failures caused by data already existing in the database.


2 rails apps - shared data using common engine

I started off with a single Rails application: a very simple, largely read-only front end for a line-of-business application (using view-backed models to retrieve data), with a few standard tables to augment the views.
I now need to use the same set of data in a new application (while the two applications share the same data, they work differently enough that it's not trivial to merge them into one application).
I figured it would be easiest to split the models I can reuse into their own engine and have the two applications share the database.
Adding an API and having both applications query it is an option, but in practice I'm not sure I can design an API that satisfies both applications properly, as they use the data in different ways.
On that basis I figured I'd give each application a table prefix, or use different schemas to namespace them. That way each application has its own distinct tables for the things they don't share, but I can easily reuse the existing views without having to duplicate them.
Both options seem to work great, except I forgot about migrations for the common views and data.
So the only options I can think of are:
1. Since the two applications are somewhat tightly coupled anyway, keep no migrations in my common data engine at all; any changes to the views/tables are dealt with by the "first" application. This seems a bit nasty, since the models are now contained in a separate engine. (I dislike the related idea of having migrations in the engine and then copying them into one application or the other, since that's basically the same thing.)
2. Use the Pivotal Labs advice, but add some code to detect whether the engine is running inside the "first" application, and apply the migrations only there. If I fail to do that, both applications include the engine's migrations, both try to run the same migrations, and nothing but pain follows.
3. Actually split the common data off into its own database. Application #1 uses db #1, application #2 uses db #2, and the common data is housed in db #3, accessible by both applications. With a bit of faffing, I'm guessing I can end up with 3 dbs and 3 schema_migrations tables, blindly leave my migrations in the engine, and include them in both applications as per Pivotal Labs, with the common models set up to connect to their own database rather than the application database (see the sketch after this list for the rough idea).
4. Stick with one database and multiple schemas, and somehow set up a task to run only the engine's migrations, using an account locked down to its own schema; that way the engine creates its own schema_migrations.
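For option 3, the usual mechanism is ActiveRecord's establish_connection. A minimal sketch, assuming a :common entry exists in each application's config/database.yml (the SharedView model name is illustrative):

    # inside the engine: an abstract base class that points at the shared database
    module CommonData
      class Base < ActiveRecord::Base
        self.abstract_class = true
        establish_connection :common   # named configuration from database.yml
      end
    end

    # shared models inherit the connection instead of using the host app's database
    class SharedView < CommonData::Base
    end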
I kinda feel paralyzed, as I'm not sure which is the least bad option. 3 or 4 feel "best", but not great.
I think option 3 is the best. However, I'd expose the common db #3 data via an API. That is, have an application dedicated to managing the shared data (an HTML admin interface, plus JSON feeds for the other apps to consume).
Keep the shared data app really simple: just CRUD actions on the data. The other two apps then grab data from the shared data app, manipulate it to match local requirements, and display it; or they accept input, manipulate it to match the shared structure, and persist it via the shared data app's API update/create actions.
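A controller in the shared data app can stay as thin as this sketch (Rails 3-style responders; the Article model is purely illustrative):

    class ArticlesController < ApplicationController
      respond_to :json

      def index
        respond_with Article.all
      end

      def show
        respond_with Article.find(params[:id])
      end

      def create
        respond_with Article.create(params[:article])
      end
    end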
As an update to this: given how tightly coupled the two applications are, and some other needs that came up yesterday, I've ended up pulling the migrations out of both and using active_record_migrations to manage and orchestrate the migrations for both applications.
Effectively I could have achieved the same thing with a dummy or master application, but this feels somewhat cleaner.
However, it comes at the expense of being able to use models effectively in the migrations. For my use case this doesn't matter at all.

What's wrong with Rspec, FactoryGirl and real sample data?

I've been learning some Ruby on Rails, and when I got to TDD, things started to get more complicated. At some point in my app, I thought it would be better to run my tests over the real data.
Real Data vs Sample Data
Searching the web, I found that tests are not meant to run over real data, but over sample data. Yet I couldn't agree with that.
Let's suppose my app has an alias system: when you access a random URL, it figures out what that fragment refers to and redirects to the proper canonical URL. And let's add that the alias dictionary is stored in some models. How would we test against that dictionary? Hard-code spec files for every alias/keyword?
Sticking with real data
The first two things I've realized, though very unsurely, are:
The RSpec testing environment won't access the development models' data.
FactoryGirl rules over my test database, so populating it myself isn't an option.
The best solution I could figure out, complete newbie that I am, is to create some classes in the spec/support folder and call them from inside my factories to get at that real data. Those classes hold a short, nested sample of my real database contents, so my tests can go "real".
What can the pros around here suggest to improve this?
I think you may want to look into building a seeds.rb file to populate your databases. This is usually used to initialize the development database so it can be used in your app (and queried in the rails console), but you can use it to seed your test database as described in this answer.
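As a rough illustration (the model names are hypothetical), db/seeds.rb is just Ruby that creates the baseline records, ideally idempotently:

    # db/seeds.rb
    User.find_or_create_by_name("anonymous")   # Rails 3-era dynamic finder

    %w[aliases keywords].each do |name|
      Dictionary.find_or_create_by_name(name)
    end

Running RAILS_ENV=test rake db:seed then loads the same baseline into the test database.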
You certainly should not use your development database for testing. You can either seed the test database, or create factories that reflect various scenarios.
FactoryGirl rules over my test database, so populating it myself isn't an option.
You can use more than one factory to represent a business entity, depending on the scenario under test. FactoryGirl makes this easy by allowing you to nest factories: define a factory with a basic set of valid attributes and use it in unit tests; for integration (feature) tests, use a nested factory that expands on the basic attributes to implement a particular scenario. You can have as many variations of these scenario-specific factories as you need.
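A sketch of what that nesting looks like (attribute and factory names are illustrative):

    FactoryGirl.define do
      factory :user do
        name  "Basic User"
        email "user@example.com"

        # the nested factory inherits every attribute above and adds its own
        factory :admin_user do
          admin true
        end
      end
    end

    # unit test:    FactoryGirl.build(:user)
    # feature test: FactoryGirl.create(:admin_user)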

What are best practices to handle database data in 'test' mode?

I am using Ruby on Rails 3.2.2, cucumber-rails 1.3.0, rspec-rails 2.8.1 and capybara 1.1.2. I have a problem, but I've started to think that maybe I'm doing something wrong, mostly about seeding data in the test database for testing purposes. Specifically, my issue is how to properly manage data in the test database when I have to test my application.
My doubt is: by seeding data into the test database (for your information, I use the db/seeds.rb file to inject that data), am I doing things as they should be done? That is, how should I populate the test database, given that the data in that database* is required for my application to work properly for testing purposes? Should I populate the test database at all?
In other words, what are the best practices for handling database data in test mode (in my case)? And, generally speaking, how should this situation be handled?
* For example, in order to work, my application requires at least data related to an "anonymous" user, to "basic" articles, to "basic" article categories, and so on.
You should use one of the following:
Fixtures. See the corresponding Rails documentation.
Factories. The most popular tool to create and manage factories is FactoryGirl. IMHO that's the best solution.
Make sure data is seeded into the test database. See this StackOverflow question.
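If you go the seeding route, one hedged way to wire it up is to load db/seeds.rb once before the suite runs:

    # spec/spec_helper.rb
    RSpec.configure do |config|
      config.before(:suite) do
        # assumes db/seeds.rb is written idempotently (find_or_create_by_...)
        load Rails.root.join('db', 'seeds.rb')
      end
    end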
I had a similar problem; associations made it necessary to have a bit of seed data.
Factories will make your tests really slow. They are perfectly OK for single objects, but not for a lot of seed data that has to be created for each test.
Fixture Builder - http://github.com/rdy/fixture_builder
I created a bunch of fixtures and just load them from the DB for each test, which cut my test time down by 40%. You can also load the seeds file instead.
But be careful: deleting or updating records will create unwanted side effects, so use factories for those specs.
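For reference, a fixture_builder definition looks roughly like this (a hedged sketch based on the gem's README; the model and factory names are illustrative):

    # spec/support/fixture_builder.rb
    FixtureBuilder.configure do |fbuilder|
      # regenerate the fixtures whenever these files change
      fbuilder.files_to_check += Dir["spec/factories/*.rb", "spec/support/fixture_builder.rb"]

      fbuilder.factory do
        # instance variable names become the fixture names
        @anonymous = Factory(:user, :name => "anonymous")
        @basic_category = Factory(:category, :name => "basic")
      end
    end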
Mock and stub everything, so that your tests rarely touch the DB.
This approach has become very unpopular, though: you can end up with passing specs that don't pick up on your actual errors.

Rspec testing against data from an external database

I'm currently creating RSpec test cases for a Rails 2.3 application. The problem I'm confronting is that the application depends on data from an external database, like this:
Main_User is an external application.
Node_User is the application I'm testing.
When a new Node_User is created, Node_User calls Main_User, which creates a user and returns some data that Node_User uses to create the Node_User.
Is there any way I can query the Main_User database to verify that the record created there contains the data that was passed to Node_User? Or is there a better way to test this?
Thank you!
Yes, you can theoretically query the remote database if you have access to the user/pass/hostname for that database. (Search for "Rails multiple databases.")
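Mechanically, that's just pointing a model at a second connection. A minimal sketch, assuming a main_user entry in config/database.yml (the class and table names are illustrative):

    # a window into the remote Main_User database
    class MainUserRecord < ActiveRecord::Base
      self.abstract_class = true
      establish_connection :main_user
    end

    class RemoteUser < MainUserRecord
      set_table_name "users"   # Rails 2.3-style; the table name is an assumption
    end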
But I wouldn't necessarily recommend taking that approach. Your tests would be modifying the state of the remote database, which strikes me as unsafe: you'd be creating bogus records, and possibly deleting or corrupting data if your tests do any mass deletes or exercise broken functionality. In other words, your tests might do bad things to a production database.
Therefore, I would recommend mocking the remote database in some way. One option is VCR. It's really cool, but it requires you to hit the remote service at least once, because it has to record the first run. Another option is to use one of the tools underlying VCR, such as FakeWeb.
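The VCR setup is small. A hedged sketch (the cassette name and the NodeUser call are illustrative, and older VCR versions use slightly different configuration method names):

    # spec/spec_helper.rb
    require 'vcr'

    VCR.configure do |c|
      c.cassette_library_dir = 'spec/cassettes'
      c.hook_into :fakeweb   # or :webmock, depending on your stubbing library
    end

    # in a spec: the first run records the real HTTP exchange, later runs replay it
    VCR.use_cassette('main_user_create') do
      NodeUser.create!(:name => 'test')   # code that calls out to Main_User
    end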
Of course, this only tests the Node_User side of things. You said you also need to verify that the remote record is correct. To do that, I would recommend writing tests on the Main_User side. Such tests would simulate an incoming RESTful request from Node_User and verify the correct result.
That only works if Main_User is your application. If it's a third-party service, then there may not be a safe way to test it. You may just have to trust its correctness.

Best practice for importing a partial database dump into a rails app daily?

The iTunes Enterprise Partner Feed is "a data feed of the complete set of metadata from iTunes and the App Store" and "is available in two different formats - either as the files necessary to build a relational database or as stand-alone flat files that are country and media dependent."
I need to consume the data from this feed (which is essentially exported as flat files) and allow linking of my own model objects (User, Activity, etc.) to data provided by the feed (App, Developer, etc.). The data is provided as a weekly full export and a daily incremental export.
I have two ideas for ways to implement this:
1. Create all of the models in my Rails app and write my own importer that inserts/updates records directly into my app's database daily via cron, using the models I've created (App, Developer, etc.).
2. Keep this database entirely separate and open up a REST API that my own app will consume.
My naive preference for #1, keeping everything in the Rails app, is based on the need to observe changes in the data I'm getting from the EPF. For example, if an App's description is updated, I want to be able to create an Activity object via an observer to track that update.
On one hand, #2 feels like a better approach because it creates a standalone API into the data that can be consumed by several different apps I create. On the other hand, I'm just not sure how I'd accomplish the data change notifications without using observers directly on my own models, or how I'd even consume the data in an object-oriented way that could still be used with my own models. It feels like a lot of duplicated work to query the API for, say, an App's data, create a proper ActiveRecord object for it, and then save it just so it can be linked to one of my own models.
Is there a standard way to do this that I'm totally missing? Any pointers?
EDIT: Rails engines sound interesting, but they would mean each app still has to consume and insert the data separately. That doesn't sound very DRY. It sounds more and more like a REST API is the way to go; I just don't know how to bridge the gap from the API to an ActiveRecord model.
Rails Engines might be a good fit for this. You can create a Rails engine gem and put all of the models and rake tasks that consume the data in it. Then you can include this gem in any app that uses it, and also create an API app which includes the gem. You should be able to create observers in your other apps that interact with the gem.
I have quite a few apps that interact with each other, and this approach works well for me. I have one central app that includes all the engines that consume data, and I run all of my cron jobs from this app. I use the use_db plugin, which allows my app to communicate with different databases. Each engine has use_db as a dependency, and I keep the database configuration inside the gem. One example:
Engine gem: transaction_core
This gem consumes transaction data from a source and inserts it into my transaction database.
The gem is included in my central app, and I pull the transaction data in via a rake task run from cron.
I include this gem in several other apps that need to use the transaction data. Since the engine automatically adds the models and database config to the app, no additional work is required to use the models in the app.
I have not used observers inside an app that includes my engines, but I see no reason why it would not work; with the engine, the models work as if they were in your app/models directory. Hope this helps!
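A skeletal version of such an engine, as a hedged sketch: here I've used plain establish_connection with a named entry in database.yml in place of the use_db call, since use_db's exact options vary by version, and the Transaction model is illustrative:

    # transaction_core/lib/transaction_core/engine.rb (a Rails 3-style engine)
    module TransactionCore
      class Engine < Rails::Engine
      end
    end

    # transaction_core/app/models/transaction.rb
    class Transaction < ActiveRecord::Base
      # point the shared model at its own database instead of the host app's
      establish_connection :transactions
    end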
The Modest Rubyist has a good four-part tutorial on Rails 3 plugins that includes engines:
http://www.themodestrubyist.com/2010/03/05/rails-3-plugins---part-2---writing-an-engine/
