Unit testing to protect against production data bugs - ruby-on-rails

We recently had a bug go into production where we added validates_presence_of to a model, but that column was nil for any records created before the change, so any code that instantiated the model on old data was broken.
This seems to be in a class of bugs that can only be caught when tested against the production data.
Is there a way to create unit tests that protect against these types of issues?

When talking about writing automated tests for unhappy paths or cases in your code (also called negative testing), you will end up writing tests for the things you expect to happen. Take your validates_presence_of situation: now you can write a test that validates the proper handling of the case where the column is nil, because you now know that it can happen. Writing these tests before issues arise is the tricky part, because you don't know that you have to.
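For instance, once you know about the bug above, the regression test is straightforward. A minimal sketch, assuming a hypothetical User model that gained validates_presence_of :name after old rows already existed:

# spec/models/user_spec.rb (model and column names are hypothetical)
require 'spec_helper'

describe User do
  it "handles legacy records created before the name validation existed" do
    user = User.new(name: nil)
    user.save!(validate: false) # mimic a row that predates the validation

    legacy = User.find(user.id)
    expect(legacy).not_to be_valid # it loads, but won't save until name is set
  end
end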
If you really want to prevent these cases before they happen, you could consider:
Testing against real data: get hold of a realistic database and run functional tests and database consistency checks against it (see the sketch after this list). Some things to consider with this approach:
If the database you are considering as a reference holds customer data, you'll probably want to anonymise it, for the sake of customer privacy.
This approach is hard to get right within the development loop, as it's costly to automate the set-up and orchestration. This strategy is more suitable for a one-off effort, say, before doing a massive migration or something like that.
As I pointed out earlier, you will still be writing tests for the things you expect to happen, but since you are increasing your dataset, there is a higher likelihood of stumbling upon something unexpected.
Exploratory testing: I know you mentioned unit tests in your question because you want to automate this process, but another good way to prevent these cases before they happen is to apply your full brain energy to coming up with unexpected but feasible scenarios, and manually test whether your code supports them (and if not: bug caught; patch it and write a test to ensure the bug never sees the light again).
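To illustrate the database consistency checks mentioned in the first option, here is a rough sketch of a rake task that sweeps every existing row through its model validations. This is one assumption about how you might wire it up, and it can be slow on large tables; the eager_load! call is needed so that descendants sees all models:

# lib/tasks/consistency.rake (sketch)
namespace :db do
  desc "Report records that fail their current model validations"
  task check_consistency: :environment do
    Rails.application.eager_load! # load all models so descendants is complete

    ActiveRecord::Base.descendants.each do |model|
      next if model.abstract_class?

      model.find_each do |record|
        unless record.valid?
          puts "#{model.name}##{record.id}: #{record.errors.full_messages.join(', ')}"
        end
      end
    end
  end
end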
An automated approach that might scale better would err on the side of detection instead. At some point we have to accept that software has bugs and software development is not a flawless process that prevents all of them. It's very likely that some of those bugs get into production, so let's minimise the impact they have.
We still want to write tests to catch as many issues as we can before releasing, so that the ones that reach our users are rare and not dangerous. When one does slip through, good logging practices, log aggregators, good monitoring of errors and good alerting schemes will let us know immediately, and we'll be able to react before it affects more users or gets worse.
This path becomes even more solid when canary releases are used, anomaly detection is applied to real-time behavioural data, and automated rollbacks kick in the moment things start going wrong.

Related

Should we be using Faker in Rails Factories?

I love Faker, I use it in my seeds.rb all the time to populate my dev environment with real-ish looking data.
I've also just started using Factory Girl, which also saves a lot of time - but when I sleuth around the web for code examples I don't see much evidence of people combining the two.
Q. Is there a good reason why people don't use faker in a factory?
My feeling is that by doing so I'd increase the robustness of my tests by seeding random - but predictable - data each time, which hopefully would increase the chances of a bug popping up.
But perhaps that's incorrect and there is either no benefit over hard coding a factory or I'm not seeing a potential pitfall. Is there a good reason why these two gems should or shouldn't be combined?
Some people argue against it, as here.
DO NOT USE RANDOM ATTRIBUTE VALUES
One common pattern is to use a fake data library (like Faker or Forgery) to generate random values on the fly. This may seem attractive for names, email addresses or telephone numbers, but it serves no real purpose. Creating unique values is simple enough with sequences:

FactoryGirl.define do
  sequence(:title) { |n| "Example title #{n}" }

  factory :post do
    title
  end
end

FactoryGirl.create(:post).title # => 'Example title 1'
Your randomised data might at some stage trigger unexpected results in your tests, making your factories frustrating to work with. Any value that might affect your test outcome in some way would have to be overridden, meaning:
Over time, you will discover new attributes that cause your test to fail sometimes. This is a frustrating process, since tests might fail only once in every ten or hundred runs, depending on how many attributes and possible values there are, and which combination triggers the bug. You will have to list every such random attribute in every test to override it, which is silly. So, you create non-random factories, thereby negating any benefit of the original randomness.
One might argue, as Henrik Nyh does, that random values help you discover bugs. While possible, that obviously means you have a bigger problem: holes in your test suite. In the worst case scenario the bug still goes undetected; in the best case scenario you get a cryptic error message that disappears the next time you run the test, making it hard to debug. True, a cryptic error is better than no error, but randomised factories remain a poor substitute for proper unit tests, code review and TDD to prevent these problems.
Randomised factories are therefore not only not worth the effort, they even give you false confidence in your tests, which is worse than having no tests at all.
But there's nothing stopping you from doing it if you want to, just do it.
Oh, and there is an even easier way to inline a sequence in recent FactoryGirl; that quote was written for an older version.
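Something like this, if I remember the newer syntax correctly:

factory :post do
  sequence(:title) { |n| "Example title #{n}" }
end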
It's up to you.
In my opinion it is a very good idea to have random data in tests; it has always helped me discover bugs and corner cases I didn't think about.
I have never regretted having random data. All the points described by #jrochkind are correct (and you should read the other answer before reading this one), but it's also true that you can (and should) put this in your spec_helper.rb:
config.before(:all) { Faker::Config.random = Random.new(config.seed) }
This gives you repeatable tests with repeatable data as well. If you don't do that, you get all the problems described in the other answer.
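With the seed pinned like that, combining the two gems is straightforward. A sketch with hypothetical attributes:

FactoryGirl.define do
  factory :user do
    name  { Faker::Name.name }       # regenerated per record, reproducible via the pinned seed
    email { Faker::Internet.email }
  end
end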
I like to use Faker and usually do so when working with larger code bases. I see the following advantages and disadvantages when using Faker with Factory Girl:
Possible disadvantages:
It's a bit harder to reproduce the exact same test scenario (at least RSpec works around this by displaying the random number generator seed on every run, allowing you to reproduce the exact same test with it)
Generating the data costs a bit of performance
Possible advantages:
It usually makes the displayed data more humanly comprehensible. When creating test data manually, people tend to take all kinds of shortcuts to avoid the tedium.
Building factories with Faker for tests at the same time provides you with the means of generating nice demo data for presentations.
You could randomly discover edge case bugs when running the tests a lot

rails integration testing why fixtures rather than db

I'm new to testing in rails and I don't understand why or when I should use fixtures (or factory) rather than just seeding my test db and querying it to run the tests?
In many cases, it should be faster and easier to have the same data in dev and test env.
For example, if I want to test an index page, should I create 100 records via a factory or should I seed the db with 100 records?
If someone could clarify this, it would be great.
Thanks!
This is actually a deeper question of how to test efficiently, and you will find a lot of different opinions.
The reason to avoid a database in your unit tests is merely speed. Database operations are slow. It might not seem slow with one test, but if you have continuous integration going (as you should) or when you made a quick change and just want to see what happens, those delays add up. So prefer mocks to truly unit test code.
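As an illustration, a unit test can exercise logic without touching the database by replacing persisted collaborators with doubles. A sketch, where Order, line_items and total are hypothetical names:

describe Order do
  it "sums line item prices without hitting the database" do
    order = Order.new
    items = [double(price: 5), double(price: 7)] # stand-ins for persisted rows
    allow(order).to receive(:line_items).and_return(items)

    expect(order.total).to eq(12)
  end
end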
Then your own integration tests should hit an in-memory database rather than your real database--for the same reason, speed. These will be slower than your mocked tests, but still faster than hitting the real database. When you're developing, the build-test-deploy cycle needs to be as fast as possible. Note that some people call these unit tests as well. I wouldn't, but I guess it is just semantics.
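With Rails and SQLite, for example, the test environment can point at an in-memory database. A config sketch; note that in-memory SQLite has caveats, such as the schema having to be loaded into each new connection:

# config/database.yml (test section only)
test:
  adapter: sqlite3
  database: ":memory:"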
These first two kinds of tests are by developers for developers.
Then the testers will hit the real database, which will be populated with test data defined by the testers and subject-matter experts. There are lots of clever ways to speed this up as well, but this will be the place where they test the integration of your code with the production-like database. If all your in-memory database tests passed and something goes wrong here, then you know it has to do with something like database configuration, vendor-specific SQL, etc. rather than something fundamentally bad. You will also get your first taste of what the performance is like.
Note that everything I've said here is a matter of debate. But hopefully it clarifies what you should consider about when to do certain things and why.

How to shift development of an existing MVC3 app to a TDD approach?

I have a fairly large MVC3 application, of which I have developed a small first phase without writing any unit tests, especially ones targeted at detecting things like regressions caused by refactoring. I know it's a bit irresponsible to say this, but it hasn't really been necessary so far, with very simple CRUD operations; still, I would like to move toward a TDD approach going forward.
I have basically completed phase 1, where I have written actions and views where members can register as authors and create course modules. Now I have more complex phases to implement where consumers of the courses and their trainees must register and complete courses, with academic progress tracking, author feedback and financial implications. I feel it would be unwise to proceed without a solid unit testing strategy, and based on past experience I feel TDD would be quite suitable to my future development efforts here.
Are there any known procedures for 'converting' a development effort to TDD, and for introducing unit tests to already written code? I don't need kindergarten level step by step stuff, but general strategic guidance.
BTW, I have included the web-development and MVC tags on this question as I believe these fields of development can significantly influence the unit testing requirements of project artefacts. If you disagree and wish to remove any of them, please be so kind as to leave a comment saying why.
I don't know of any existing procedures, but I can highlight what I usually do.
My approach for an existing system would be to attempt writing tests first to reproduce defects and then modify the code to fix it. I say attempt, because not everything is reproducible in a cost effective manner. For example, trying to write a test to reproduce an issue related to CSS3 transitions on a very specific version of IE may be cool, but not a good use of your time. I usually give myself a deadline to write such tests. The only exception may be features that are highly valued or difficult to manually test (like an API).
For any new features you add, first write the test (as if the class under test is an API), verify the test fails and implement the feature to satisfy the test. Repeat. When you're done with the feature, run it through something like Pex. It will often highlight things you never thought of. Be sensible about which issues to fix.
For existing code, I'll use code coverage to help me find features I do not have tests for. I comment out the code, write the test (which fails), uncomment the code, verify test passes and repeat. If needed, I'll refactor the code to simplify testing. PEX can also help.
Pay close attention to pain points, as it highlights areas that should be refactored. For example, if you have a controller that uses ObjectContext/IDbCommand/IDbConnection directly for data access, you may find that you require a database to be configured etc just to test business conditions. That is my hint that I need an interface to a data access layer so I can mock it and simulate those business conditions in my controller. The same goes for registry access and so forth.
But be sensible about what you write tests for. The value of TDD diminishes at some point, and it may actually cost more to write those tests than to have someone test the feature manually.

What not to test in Rails?

I've been writing tests for a while now and I'm starting to get the hang of things. But I've got some questions concerning how much test coverage is really necessary. The consensus seems pretty clear: more coverage is always better. But, from a beginner's perspective at least, I wonder if this is really true.
Take this totally vanilla controller action for example:
def create
  @event = Event.new(params[:event])
  if @event.save
    flash[:notice] = "Event successfully created."
    redirect_to events_path
  else
    render :action => 'new'
  end
end
Just the generated scaffolding. We're not doing anything unusual here. Why is it important to write controller tests for this action? After all, we didn't even write the code - the generator did the work for us. Unless there's a bug in Rails, this code should be fine. It seems like testing this action is not all too different from testing, say, collection_select - and we wouldn't do that. Furthermore, assuming we're using Cucumber, we should already have the basics covered (e.g. where it redirects).
The same could even be said for simple model methods. For example:
def full_name
  "#{first_name} #{last_name}"
end
Do we really need to write tests for such simple methods? If there's a syntax error, you'll catch it on page refresh. Likewise, Cucumber would catch this so long as your features hit any page that called the full_name method. Obviously, we shouldn't be relying on Cucumber for anything too complex. But does full_name really need a unit test?
You might say that because the code is simple the test will also be simple. So you might as well write a test since it's only going to take a minute. But it seems that writing essentially worthless tests can do more harm than good. For example, they clutter up your specs making it more difficult to focus on the complex tests that actually matter. Also, they take time to run (although probably not much).
But, like I said, I'm hardly an expert tester. I'm not necessarily advocating less test coverage. Rather, I'm looking for some expert advice. Is there actually a good reason to be writing such simple tests?
My experience in this is that you shouldn't waste your time writing tests for code that is trivial, unless you have a lot of complex stuff riding on the correctness of that triviality. I, for one, think that testing stuff like getters and setters is a total waste of time, but I'm sure that there'll be more than one coverage junkie out there who'll be willing to oppose me on this.
For me tests facilitate three things:
They guarantee unbroken old functionality. If I can check by running tests that nothing new I put in has broken my old things, it's a good thing.
They make me feel secure when I rewrite old stuff. The code I refactor is very rarely the trivial code. If, however, I want to refactor non-trivial code, having tests to ensure that my refactorings have not broken any behavior is a must.
They are the documentation of my work. Non-trivial code needs to be documented. If, however, you agree with me that comments in code are the work of the devil, having clear and concise unit tests that let you understand what the correct behavior of something is, is (again) a must.
Anything I'm sure I won't break, or that I feel is unnecessary to document, I simply don't waste time testing. Your generated controllers and model methods, then, I would say are all fine even without unit tests.
The only absolute rule is that testing should be cost-efficient.
Any set of practical guidelines to achieve that will be controversial, but here is some advice for avoiding tests that are generally wasteful, or do more harm than good.
Unit
Don't test private methods directly, only assess their effects indirectly through the public methods that call them.
Don't test internal states
Only test non-trivial methods, where different contexts may get different results (calculations, concatenation, regexes, branches...)
Don't assess things you don't care about, e.g. the full copy of some message or useless parts of complex data structures returned by an API...
Stub all the things in unit tests; they're called unit tests because you're only testing one class, not its collaborators. With stubs/spies, you test the messages you send them without testing their internal logic (see the sketch after this list).
Consider private nested classes as private methods
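For example, a unit test that checks the message sent to a collaborator without testing the collaborator itself. A sketch; OrderProcessor and the notifier interface are hypothetical:

describe OrderProcessor do
  it "asks the notifier to deliver a confirmation" do
    order    = double("order")
    notifier = double("notifier")
    expect(notifier).to receive(:deliver_confirmation).with(order)

    OrderProcessor.new(notifier).process(order)
  end
end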
Integration
Don't try to test all the combinations in integration tests. That's what unit tests are for. Just test happy-paths or most common cases.
Don't use Cucumber unless you really do BDD
Integration tests don't always need to run in the browser. To test more cases with less of a performance hit you can have some integration tests interact directly with model classes.
Don't test what you don't own. Integration tests should expect third-party dependencies to do their job, but should not substitute for those dependencies' own test suites.
Controller
In controller tests, only test controller logic: redirections, authentication, permissions, HTTP status. Stub the business logic. Treat filters etc. like private methods in unit tests, tested through public controller actions only.
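A sketch of what that can look like; the routes, the sign_in helper and the Event stub are assumptions:

describe EventsController do
  it "redirects guests who try to edit an event" do
    get :edit, id: "1"
    expect(response).to redirect_to(sign_in_path)
  end

  it "responds successfully for a signed-in user, with the business logic stubbed" do
    sign_in create(:user)
    allow(Event).to receive(:find).and_return(double("event"))

    get :show, id: "1"
    expect(response).to have_http_status(:ok)
  end
end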
Others
Don't write route tests, except if you're writing an API, for the endpoints not already covered by integration tests.
Don't write view tests. You should be able to change copy or HTML classes without breaking your tests. Just assess critical view elements as part of your in-browser integration tests.
Do test your client JS, especially if it holds some application logic. All those rules also apply to JS tests.
Ignore any of those rules for business-critical stuff, or when something actually breaks (no one wants to explain to their boss/users why the same bug happened twice; that's why you should probably write at least a regression test when fixing a bug).
See more details on that post.
More coverage is better for code quality, but it costs more. There's a sliding scale here: if you're coding an artificial heart, you need more tests. The less you pay upfront, the more likely it is you'll pay later, maybe painfully.
In the example, full_name, why have you placed a space between the names, and ordered by first_name then last_name? Does that matter? If you are later asked to sort by last name, is it OK to swap the order and add a comma? What if the last name is two words; will that additional space affect things? Maybe you also have an XML feed someone else is parsing? If you're not sure what to test for a simple undocumented function, think about the functionality implied by the method name.
I would think your company's culture is important to consider too. If you're doing more than others, then you're really wasting time. It doesn't help to have a well-tested footer if the main content is buggy. Causing the main build or other developers' builds to break would be worse, though. Finding the balance is hard; unless you are the decider, spend some time reading the test code written by other team members.
Some people take the approach of testing the edge cases and assume the main features will get worked out through usage. Considering getters/setters, I'd want a model class somewhere that has a few tests on those methods, maybe tests of the database column type ranges. This at least tells me the network is OK, a database connection can be made, I have write access to a table that exists, and so on. Pages come and go, so don't consider a page load to be a substitute for an actual unit test. (A testing-efficiency side note: with automated testing based on file update timestamps (autotest), that test wouldn't run, and you want to know ASAP.)
I'd prefer to have better-quality tests rather than full coverage. But I'd also want an automated tool pointing out what isn't tested. If it's not tested, I assume it's broken. As you find failures, add tests, even if it's simple code.
If you are automating your testing, it doesn't matter how long it takes to run. You benefit every time that test code is run: at that point, you know a minimum of your code's functionality is working, and you get a sense of how reliable the tested functionality has been over time.
100% coverage shouldn't be your goal; good testing should be. It would be misleading to think a single test of a regular expression was accomplishing anything. I'd rather have no tests than one, because my automated coverage report reminds me the RE is unreliable.
The primary benefit you would get from writing a unit test or two for this method would be regression testing. If, sometime in the future, something was changed that impacted this method negatively, you would be able to catch it.
Whether or not that's worth the effort is ultimately up to you.
The secondary benefit I can see by looking at it would be testing edge cases, like, what it should do if last_name is "" or nil. That can reveal unexpected behavior.
(i.e. if last_name is nil, and first_name is "John", you get full_name => "John ")
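If you do decide it's worth it, such an edge-case spec is short. A sketch, assuming the method lives on a User model:

describe "#full_name" do
  it "documents the trailing space left by a nil last name" do
    user = User.new(first_name: "John", last_name: nil)
    expect(user.full_name).to eq("John ") # current behaviour; arguably a bug worth deciding on
  end
end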
Again, the cost-vs-benefit is ultimately up to you.
For generated code, no, there's no need to have test coverage there because, as you said, you didn't write it. If there's a problem, it's beyond the scope of the tests, which should be focused on your project. Likewise, you probably wouldn't need to explicitly test any libraries that you use.
For your particular method, it looks like that's the equivalent of a setter (it's been a bit since I've done Ruby on Rails); testing that method would be testing the language features. If you were changing values or generating output, then you should have a test. But if you are just setting values or returning something with no computation or logic, I don't see the benefit of having tests cover those methods: if they are wrong, you should be able to detect the problem by visual inspection, or else the problem is a language defect.
As far as the other methods, if you write them, you should probably have a test for them. In Test-Driven Development, this is essential as the tests for a particular method exist before the method does and you write methods to make the test pass. If you aren't writing your tests first, then you still get some benefit to have at least a simple test in place should this method ever change.

How to get a positive mental attitude towards testing?

I want to write tests for my app, though each time I look at rspec.info, I really don't see a definite path to take towards "doing things right" and testing first. I watched the peepcode videos on rspec more than once, yet it doesn't take. I want to take more pride in my work, and I think that testing will help. How can I break through this mental block?
Find tools that will reward you for testing. For example, make it very easy to run all the tests and get a message like
73 tests passed.
Try random testing because you can test against a lot of values quickly and easily.
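Even without a property-testing library, plain Ruby can generate lots of inputs cheaply. A sketch, where add is a hypothetical pure function under test:

it "is commutative for arbitrary inputs" do
  100.times do
    a = rand(-1_000..1_000)
    b = rand(-1_000..1_000)
    expect(add(a, b)).to eq(add(b, a))
  end
end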
See if your language provides a test-coverage analysis tool that gives you percentage of statement coverage or percentage of block coverage. It is very rewarding to drive code coverage from 60% up to 90%---and if you are lucky, you will find bugs.
My key advice is to quantify your progress in testing so that you can see the numbers going up. That will make it a lot more motivating. (Gee, I wonder what other numbers that go up can be found on this site...)
I was hating it until I started creating a few testing macros. Like logging in or getting to the homepage. I found it fun to start poking at what my testing framework could really do.
It also helped to have someone else get me started by writing a few. Right away I found obvious improvements which made me want to get in there and start improving things.
"Test things you don't want to break."
It might be helpful to prioritise at first. I know that typing out the full three layers of model, view, and controller specs on top of the cucumber acceptance tests can be a chore. So one idea is to just test the most critical things in your app, and add tests as you run into bugs you don't want to see again.
"Always start with a failing test."
Cucumber features plain-text "stories" that are pretty awesome for getting some really concrete tests up and running. Maybe that would be one place where you could get started. Cucumber doesn't really work with an AJAX-based app, though; for that you'd have to use Selenium or Watir instead. You can start with a failing story before writing a single line of code, and quickly proceed from there to make that story pass.
"Don't test, specify."
Instead of thinking of tests, try to make a mental switch: you're not testing but SPECIFYING how your application will behave. This is design work, not nearly as boring as testing. :)
Think of it like this: if you don't test, your code is broken.
You need to see the value that testing will bring in refactoring and extending your code. Once you have a set of tests that define the behavior of your classes, you can then feel free to start making changes to improve the code. Your tests will provide the confidence that what you're doing isn't breaking the system. When you go to add new functionality to your code, running your existing tests will give you confidence that the new code you've added doesn't break anything else.
The key is to take that long term view. Testing is an investment. It takes a little bit away from the code you could be writing but eventually it will start paying off with interest. The capital that you have stored up will make it much easier to move ahead more quickly when adding new features.
Assuming you already have a list of bugs to fix, I always like to go back through and where ever possible create an automated test that demonstrates the bug. Then fix the bug and watch the test pass. Since you have to test the bug anyway, and the bug should already give you enough information to recreate it, you can see an immediate return on your tests.
Eventually, you'll start to get a feel for putting the tests together and how to write them, and you won't need the "blueprint" of an existing bug.
I wrote a motivation post about just this case a couple of days ago. Here is the summary:
Start writing tests whenever you have an opportunity to do it (i.e. whenever you write some code). Choose any tool that makes sense to you and write any test that you feel could cover at least some tiny behavior of your application (don't worry about coverage or any other scary terms from day one). Don't be afraid of primitive tests and trivial assertions: you'll gain more confidence as your test coverage grows, and you'll become happier and happier as you notice that you don't need to hit F5 that often anymore. Think about testing in other, positive terms: the better you are at it, the less time you need to spend on activities you don't like (watching the spinning refresh icon in the browser, debugging) and the more on things you love.
And here is the whole thing, if you are interested.
As has been mentioned previously, the easiest way to break into testing is with regression testing.
I'd also avoid doing controller specs - they are a PITA. Do heavy model testing, because that's where the logic should be in the first place.
Try spec'ing / testing a plain Ruby project before you move on to a Rails project.
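A plain-Ruby spec needs nothing but the rspec gem. A minimal sketch, runnable with "rspec calculator_spec.rb":

# calculator_spec.rb
class Calculator
  def add(a, b)
    a + b
  end
end

RSpec.describe Calculator do
  it "adds two numbers" do
    expect(Calculator.new.add(2, 3)).to eq(5)
  end
end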
Well, I'll tell you how!
First, do the following ten times manually, on different applications, before you try to automate: work through the negative scenarios, where the result should come out negative. It could be wrong data entered that still gives you correct outputs.
For example, a login screen: there could be many scenarios, such as correct user with wrong password, or wrong user with correct password. The most important thing is that you don't give up until you break it. This is your mantra.
Now you are thinking like a tester. Turn to your own system: write the negative tests and their expected results, and then the positive tests. Design it. Now develop the framework.
