I love Faker, I use it in my seeds.rb all the time to populate my dev environment with real-ish looking data.
I've also just started using Factory Girl, which saves a lot of time too, but when I sleuth around the web for code examples I don't see much evidence of people combining the two.
Q. Is there a good reason why people don't use faker in a factory?
My feeling is that by doing so I'd increase the robustness of my tests by seeding random - but predictable - data each time, which hopefully would increase the chances of a bug popping up.
But perhaps that's incorrect and there is either no benefit over hard coding a factory or I'm not seeing a potential pitfall. Is there a good reason why these two gems should or shouldn't be combined?
Some people argue against it, as in this quote:
DO NOT USE RANDOM ATTRIBUTE VALUES
One common pattern is to use a fake data library (like Faker or Forgery) to generate random values on
the fly. This may seem attractive for names, email addresses or
telephone numbers, but it serves no real purpose. Creating unique
values is simple enough with sequences:
FactoryGirl.define do
  sequence(:title) { |n| "Example title #{n}" }

  factory :post do
    title
  end
end

FactoryGirl.create(:post).title # => 'Example title 1'
Your randomised
data might at some stage trigger unexpected results in your tests,
making your factories frustrating to work with. Any value that might
affect your test outcome in some way would have to be overridden,
meaning:
Over time, you will discover new attributes that cause your test to
fail sometimes. This is a frustrating process, since tests might fail
only once in every ten or hundred runs – depending on how many
attributes and possible values there are, and which combination
triggers the bug. You will have to list every such random attribute in
every test to override it, which is silly. So, you create non-random
factories, thereby negating any benefit of the original randomness.
One might argue, as Henrik Nyh does, that random values help you
discover bugs. While possible, that obviously means you have a bigger
problem: holes in your test suite. In the worst case scenario the bug
still goes undetected; in the best case scenario you get a cryptic
error message that disappears the next time you run the test, making
it hard to debug. True, a cryptic error is better than no error, but
randomised factories remain a poor substitute for proper unit tests,
code review and TDD to prevent these problems.
Randomised factories are therefore not only not worth the effort, they
even give you false confidence in your tests, which is worse than
having no tests at all.
But there's nothing stopping you from doing it if you want to. Just do it.
Oh, and recent versions of FactoryGirl offer an even easier way to inline a sequence; that quote was written for an older version.
It's up to you.
In my opinion it is a very good idea to have random data in tests; it has always helped me discover bugs and corner cases I hadn't thought about.
I have never regretted having random data. All the points described by #jrochkind would be correct (and you should read the other answer before reading this one), but it's also true that you can (and should) put this in your spec_helper.rb:
config.before(:all) { Faker::Config.random = Random.new(config.seed) }
This gives you repeatable tests with repeatable data as well. If you don't do that, then you have all the problems described in the other answer.
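The effect of seeding can be sketched without Faker at all. The snippet below is only a plain-Ruby illustration, not Faker's API: fake_name and FIRST_NAMES are made-up stand-ins for a Faker call, and the seed 1234 stands in for whatever config.seed happens to be on a given run.

```ruby
# Plain-Ruby sketch: a seeded Random makes "random" data repeatable.
# FIRST_NAMES and fake_name are hypothetical stand-ins for a Faker call.
FIRST_NAMES = %w[Alice Bob Carol Dave Erin Frank].freeze

# Pick a name using the supplied generator instead of the global one.
def fake_name(rng)
  FIRST_NAMES.sample(random: rng)
end

# Build a list of names from a fresh generator created with the given seed.
def names_for_seed(seed, count = 5)
  rng = Random.new(seed)
  Array.new(count) { fake_name(rng) }
end

run_one = names_for_seed(1234)
run_two = names_for_seed(1234)

# Same seed, same sequence of "random" names on every run.
puts run_one == run_two
```

Re-running a failing spec with the seed RSpec printed should then regenerate the same data that triggered the failure, which is the point of tying the generator to config.seed.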
I like to use Faker and usually do so when working with larger code bases. I see the following advantages and disadvantages when using Faker with Factory Girl:
Possible disadvantages:
A bit harder to reproduce the exact same test scenario (at least RSpec works around this by displaying the random number generator seed every time and allows you to reproduce the exact same test with it)
Generating data wastes a bit of performance
Possible advantages:
Usually makes the displayed data easier for humans to understand. When creating test data manually, people tend to take all kinds of shortcuts to avoid the tedium.
Building factories with Faker for tests at the same time provides you with the means of generating nice demo data for presentations.
You could randomly discover edge case bugs when running the tests a lot
Currently, for each table in my database, I add columns in several steps (i.e. I add columns by migrating new files on multiple occasions). This results in a large number of migration files (~50 or so?). This seems very un-DRY.
I end up with large "add_details_to" files mixed with single-entry "add_(column_name)_to" files, making it difficult to tell which file was used to migrate which column.
Is there a way to DRY up the migration files so that I have a single migration file for each table?
For example, if I add multiple columns in a single migration, then decide I want to remove one of those columns, what is the best practice?
1) create a down migration for the one column I want to remove
2) rollback the entire multiple-column migration, then create a new up migration with only the columns I want.
I currently follow 1, but it seems to me that 2 would allow me to get rid of my initial mistake migration files, thereby avoiding the lots-of-migration-files-for-each-table problem.
Any thoughts would be appreciated!
I think in general it's a good option to just let your migration files grow and just manage the growing requirements through tests. shoulda-matchers is a great tool for this.
I definitely do not like the idea of down migrations, especially after its up has been run on the server (few exceptions if the down is against the immediate migration). I would rather create another migration to do what would have been done in the down. Though, I will admit there are times down is the way to go.
But in the end this all depends on where you are in your app. If you are working on a feature locally and want to consolidate, I could see you doing that, running db:migrate:redo until your current migration does what you need. However, once you push something up (especially to production) I'd add another migration.
I've joined a Rails project which has been going for the past two months. I saw that the developers are modifying existing migrations to change column types and names. When I run migrations nothing happens, and I get random errors like "method not found". When I debug and check the database, I find that the field names are different, and that's the reason for the errors.
As I understand it, we need to create a new migration for each modification of the database.
Is this behavior of modifying existing migrations acceptable?
Generally, once you check in a migration it should not be modified. For one project I was a part of, we reset the database on a regular basis and so people would modify migrations from time to time rather than create new ones, but it's not a practice I'm fond of.
So, if you want the default Rails migration behavior to work, no, don't change a migration that someone may have already used. However, if you are okay with working around the default behavior, it doesn't matter. Note: if you're running migrations on production to keep the database up to date, it's important that you never change a migration that has already been run on production!
It's all about good style. Of course you can change migrations, but this is bad style. According to the guidelines you should make a new migration for each change.
In my opinion this is not acceptable. Here are some reasons when and why it could be acceptable or not:
I'm developing 2 rails applications for myself only, and I am the only developer. When I have made an error in the migration, I sometimes just fix it, rollback the migration and redo it after fixing the error. This only works because
I know that I am the only one.
I do it right after the erroneous change
I know how and when to first rollback and then redo it.
In all other cases, it is not acceptable:
More than one developer.
Integrated by version control system like Subversion, Git, ...
Independent development done in different rooms or even locations.
So I think you have the hard job of changing the behavior of others. Good luck with that!
How reliable is this plugin for writing down migrations? Some people in the Rails community I have spoken with have told me they swear by it and others are telling me to just stay away. Any and all thoughts will be appreciated.
It is phenomenal, but I have had it not work quite right before. However, I would highly recommend doing a rake db:migrate:redo after running a migration for the first time anyways to make sure that the up and the down both work. Even if it only writes 90% of the down migration for you, I don't know why you would stay away.
From Rails 3.1 onwards, for most cases, you don't need to write a down method. The migrations will have one change method and Rails automatically does the down migration in case of rollbacks.
Refer: http://edgeguides.rubyonrails.org/migrations.html#writing-your-change-method
If you're just generating DDL changes (adding columns, etc), it has always been rock solid for me. However, if you're removing columns or generating DML statements such as copying data from one field to another, translating data, etc... :RInvert will not handle those. But there's still no reason I can think of to not use what they do generate as a starting point. If you don't like the generated down migration by :RInvert, just delete it and you're no worse off than before you ran it.
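Why additive changes reverse cleanly while column removals do not can be illustrated with a toy recorder. This is a sketch under stated assumptions, not how Rails or :RInvert is implemented: Recorder and INVERSES are hypothetical names, and in this toy model remove_column carries no column type, so there is nothing to rebuild the column from.

```ruby
# Toy model of reversing a change-style migration. Illustration only:
# Recorder and INVERSES are hypothetical names, not Rails internals.
class Recorder
  # Commands whose inverse can be derived from their arguments alone.
  INVERSES = {
    add_column: ->(table, column, _type) { [:remove_column, table, column] },
    create_table: ->(table) { [:drop_table, table] }
  }.freeze

  attr_reader :commands

  def initialize
    @commands = []
  end

  def create_table(table)
    @commands << [:create_table, table]
  end

  def add_column(table, column, type)
    @commands << [:add_column, table, column, type]
  end

  # No type is recorded here, so this command cannot be undone automatically.
  def remove_column(table, column)
    @commands << [:remove_column, table, column]
  end

  # Undo in reverse order; fail loudly when a command has no known inverse.
  def inverse
    @commands.reverse.map do |name, *args|
      inverter = INVERSES.fetch(name) do
        raise ArgumentError, "cannot automatically reverse #{name}"
      end
      inverter.call(*args)
    end
  end
end

up = Recorder.new
up.create_table(:posts)
up.add_column(:posts, :title, :string)
p up.inverse
```

Calling inverse on a recorder that has seen remove_column raises instead of guessing, which mirrors the advice to treat a generated down migration as a starting point and check it with a redo.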
As I move through the iterations on my application*(s) I accumulate migrations. As of just now there are 48 such files, spanning about 24 months' activity.
I'm considering taking my current schema.rb and making that the baseline.
I'm also considering deleting (subject to source control, of course) the existing migrations and creating a nice shiny new single migration from my current schema. Migrations tend to like symbols, but rake db:schema:dump uses strings: should I care?
Does that seem sensible?
If so, at what sort of interval would such an exercise make sense?
If not, why not?
And am I missing some (rake?) task that would do this for me?
* In my case, all apps are Rails-based, but anything that uses ActiveRecord migrations would seem to fit the question.
Yes, this makes sense. There is a practice of consolidating migrations. To do this, simply copy the current schema into a migration, and delete all the earlier migrations. Then you have fewer files to manage, and the tests can run faster. You need to be careful doing this, especially if you have migrations running automatically on production. I generally replace a migration that I know everyone has run with the new schema one.
Other people have slightly different ways to do this.
I generally haven't done this until we had over 100 migrations, but we can hit this after a few months of development. As the project matures, though, migrations come less and less often, so you may not have to do it again.
This does go against a best practice: Once you check in a migration to source control, don't alter it. I make a rare exception if there is a bug in one, but this is quite rare (1 in 100 maybe). The reason is that once they are out in the wild, some people may have run them. They are recorded as being completed in the db. If you change them and check in a new version, other people will not get the benefit of the change. You can ask people to roll back certain changes, and re-run them, but that defeats the purpose of the automation. Done often, it becomes a mess. It's better left alone.
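The bookkeeping behind this can be sketched in a few lines of plain Ruby. This is a toy model only: ToyMigrator is a hypothetical name, and the real versions are recorded in a database table rather than an in-memory set.

```ruby
require "set"

# Toy model of migration bookkeeping. Illustration only: ToyMigrator is a
# hypothetical class; real versions are stored in a database table.
class ToyMigrator
  attr_reader :applied, :log

  def initialize
    @applied = Set.new # versions already run against this database
    @log = []          # what actually executed, in order
  end

  # migrations: hash of version => callable. Only unseen versions run.
  def migrate(migrations)
    migrations.sort_by { |version, _body| version }.each do |version, body|
      next if @applied.include?(version)

      @log << body.call
      @applied << version
    end
  end
end

db = ToyMigrator.new
migrations = { 1 => -> { "add users.name" } }
db.migrate(migrations)

# Someone edits migration 1 after it has already run on this database...
migrations[1] = -> { "add users.full_name" }
# ...while the same change is also expressed as a new migration.
migrations[2] = -> { "rename users.name to users.full_name" }
db.migrate(migrations)

p db.log
```

In this model the edited version 1 never runs again because its version is already recorded, while version 2 is picked up normally. That is why an edit to a checked-in migration silently bypasses everyone who has already run it.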
I think that there are two kinds of migrations:
those you made during design/development, because you changed your mind on how your db should be like;
those you made between releases, reflecting some behaviour changes.
I get rid of the first kind of migrations as soon as I can, as they do not really represent working releases, and keep the second kind, so that it is possible, in theory, to update the app.
About symbols vs strings: many argue that only strings should be used in migrations, because symbols are meant to be "handles" to objects and should not be used to represent names (column and table names, in this case). This is a mere stylistic consideration, but it convinced me, and I no longer use symbols in migrations.
I've read of another point in favour of strings: "ruby symbols are memory leaks", meaning that, when you create a symbol, it never gets disposed of for the whole lifetime of the application. This seems quite pointless to me, as all your db columns will be used as symbols in a Rails (and ActiveRecord) app anyway; the migration task will not run forever either, so I don't think that this point actually makes sense.
The top of schema.rb declares:
# This file is auto-generated from the current state of the database. Instead of editing this file,
# please use the migrations feature of Active Record to incrementally modify your database, and
# then regenerate this schema definition.
#
# Note that this schema.rb definition is the authoritative source for your database schema. If you need
# to create the application database on another system, you should be using db:schema:load, not running
# all the migrations from scratch. The latter is a flawed and unsustainable approach (the more migrations
# you'll amass, the slower it'll run and the greater likelihood for issues).
#
# It's strongly recommended to check this file into your version control system.
I must endorse what [giorgian] said above about different migrations for different purposes. I recommend cleaning up development-oriented migrations along with the other tasks you do when you branch for a release. That works well for me, both on my own and in small teams. Of course, my main app sits atop and between two other databases with their own schemas, which I have to be careful of, so we use migrations (rather than a schema restore) for a new install, and those need to survive release engineering.
Having lots of migrations is a good thing. Combined with your version control system, they allow you to see which developer made a change to the database and why. This helps with accountability. Removing them just makes this a big hassle.
If you really want to get a new database up and running quickly, you can just load the schema with rake db:schema:load RAILS_ENV=your_environment, and if you want to set up your test database quickly, you can just use rake db:test:prepare
That being said, if you really want to consolidate your migrations then I'd create a new migration that checks to see if the very last migration in your set has been performed (ex: does the column you added exist?) and if not, then it will fire. Otherwise the migration will just add itself to the schema table as completed so it doesn't attempt to fire again.
Just communicate what you're doing to the rest of your team so that they understand what is going on lest they blindly fire off a rake db:migrate and screw up something they already had.
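The guarded consolidation described above can be sketched roughly as follows. This is a toy illustration, not an Active Record API: the schema is a plain hash, apply_baseline is a hypothetical helper, and the tables and marker column are invented for the example.

```ruby
# Toy sketch of a guarded baseline. Illustration only: the schema is a plain
# hash and apply_baseline is a hypothetical helper, not an Active Record API.
def apply_baseline(schema, marker_table:, marker_column:)
  # If the column added by the last old migration exists, this database
  # already ran the old migrations, so the baseline has nothing to do.
  already_migrated = schema.fetch(marker_table, []).include?(marker_column)
  return :skipped if already_migrated

  # Fresh database: build everything in one step (made-up tables).
  schema[:users] = %i[id email last_login_at]
  schema[:posts] = %i[id title]
  :loaded
end

fresh = {}
existing = { users: %i[id email last_login_at], posts: %i[id title] }

p apply_baseline(fresh, marker_table: :users, marker_column: :last_login_at)
p apply_baseline(existing, marker_table: :users, marker_column: :last_login_at)
```

Either way the baseline counts as done afterwards, so a fresh database and a long-lived one end up at the same recorded version.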
Although I'm sure everyone has their own practices, there are a few rules implied by the way the migration system works:
Never commit changes to migrations that may have been used by other developers or previous deployments. Instead, make an additional migration to adjust things as required.
Never put model-level dependencies in a migration. The model may be renamed or deleted at some point in the future and this would prevent the migration. Keep the migration as self-contained as possible, even if that means it's quite simplistic and low-level.
Of course there are exceptions. For example, if a migration doesn't work, for whatever reason, a patch may be required to bring it up to date. Even then, though, the nature of the changes effected by the migration shouldn't change, though the implementation of them may.
Any mature Rails project will likely have around 200 to 1000 migrations. In my experience it is unusual to see a project with fewer than 30 except in the planning stages. Each model, after all, typically needs its own migration file.
Collapsing multiple migrations into a single one is a bad habit to get into when working on an evolving piece of software. You probably don't collapse your source control history, so why worry about database schema history?
The only occasion I can see it as being reasonably practical is if you're forking an old project to create a new version or spin-off and don't want to have to carry forward with an extraordinary number of migrations.
You shouldn't be deleting migrations. Why create the extra work?
Migrations essentially are a set of instructions that define how to build the database to support your application. As you build your application the migrations record the iterative changes you make to the database.
IMHO by resetting the baseline periodically you are making changes that have the potential to introduce bugs/issues with your application, creating extra work.
In the case where a column is mistakenly added and then needs to be removed sometime later, just create a new migration to remove the extra column. My main reason for this is that when working in a team you don't want your colleagues to have to keep rebuilding their databases from scratch. With this simple approach you (and they) can carry on working in an iterative manner.
As an aside: when building a new database from scratch (without any data), migrations tend to run very quickly. A project I am currently working on has 177 migrations, and this causes no problems when building a new database.