We are currently using Coveralls for code coverage of our Rails projects. The coverage results it is giving us are really unreliable. I have on numerous occasions found classes which have not been spec'd, written specs for them and then watched coverage actually drop. This is because Coveralls only checks classes which are loaded up by your specs. So if there is no spec for a class it is excluded from coverage statistics. This is obviously not ideal. Is there a way to get around this behavior? I am trying to push for greater emphasis on testing in my team and it's pretty hard when this is creating a false sense of security.
I've seen two reasons why coverage statistics are unreliable in Coveralls (and the similar service Codecov):
When last I used it, Coveralls had to be told that a test suite was parallelized. If given no warning, Coveralls would display partial results. To avoid this, you can tell Coveralls that your parallelized test suite has completed. See the docs for details, but, briefly:
set the environment variable COVERALLS_PARALLEL=true on your CI server
POST { "payload": { "build_num": 1234, "status": "done" } } to https://coveralls.io/webhook?repo_token=(your repo token) at the end of your build. (Some hosted CI services figure out the payload automatically.)
If you run your test suite more than once to retry flaky tests, partial results from a retry run can overwrite almost-full results from the first run. The problem I saw was specific to a homemade retry setup, but, if you're doing something like that, think through whether it might confuse Coveralls.
Regarding completely untested classes missing from your coverage report, that's not specific to Coveralls. If you want to be sure that all classes are loaded, load them eagerly before running your tests. Untested classes and methods are often unused, so auditing your app with a dead code detector like debride can also be a good step on the road to better coverage.
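For example, with RSpec a minimal sketch of eager loading (assuming a standard Rails setup where the Rails environment is already loaded by your spec helper) would be:

    # spec_helper.rb (or rails_helper.rb on newer RSpec setups)
    RSpec.configure do |config|
      config.before(:suite) do
        # Force every class under app/ to load so unexercised files
        # still show up (as 0% covered) in the coverage report.
        Rails.application.eager_load!
      end
    end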
How can I make a distinction between critical tests that fail (and should be addressed immediately) and tests that fail but aren't too critical (e.g. a tab view with the wrong default tab open)? It seems like most services (I am using CircleCI) only show you red or green.
I feel like I need some intermediate "orange" color in addition to the green and red colors. Is there any add-on or trick that could help us make the distinction between critical test failures and acceptable ones? (For example with annotations #non-critical?)
I am using Cucumber for testing a Ruby on Rails application.
EDIT
Here are two ways that could make sense (feel free to suggest other approaches):
A single build alert that is not just "green" or "red" but could be "yellow/orange" depending on which tests fail
Multiple builds, each of which can only be green or red, but which would be labeled, for example:
Build of "critical tests" succeeded with 0 errors (green)
Build of "non-critical" tests failed with 10 errors (red)
A better way would be to separate the execution of critical and non-critical features. It would be quicker to detect a critical failure, and you'd be able to run the critical features more often.
To run the critical features, tagged with @critical (note that Cucumber tags are prefixed with @, not #):
cucumber --tags @critical
To run the non-critical features, not tagged with @critical:
cucumber --tags ~@critical
Documentation:
https://github.com/cucumber/cucumber/wiki/Tags
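If your builds are driven from Rake, a small pair of tasks can keep the two runs separate. This is just a sketch; the file and task names are arbitrary:

    # lib/tasks/cucumber_split.rake -- hypothetical file name
    namespace :cucumber do
      desc "Run only the features/scenarios tagged @critical"
      task :critical do
        sh "bundle exec cucumber --tags @critical"
      end

      desc "Run everything except the @critical features/scenarios"
      task :non_critical do
        # Newer Cucumber versions use: --tags 'not @critical'
        sh "bundle exec cucumber --tags ~@critical"
      end
    end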
I strongly recommend against having tests that are 'non-critical'. A binary pass/fail result is simple to manage: a suite either passes and all is well, or it fails and needs to be fixed. Whenever I've seen a test suite divided into levels of importance, the team immediately began paying attention only to the tests in the highest-importance group and more of the less-important tests were allowed to fail.
Instead, if a test isn't considered important enough to be fixed, delete it. Even better, if a feature isn't considered important enough to be tested, remove or simplify it so there is nothing to test.
Note that both the product owner and developers get to have input on what's important enough to be fixed. For example, if the developers have a standard of 100% code coverage and a test is the only test that provides some of that coverage, the developers would be right to insist that the test be kept passing even if the feature that it tests isn't considered critical by the product owner. Although that would then suggest that the feature should be removed or simplified so that the test wouldn't be needed by developers either.
I've been working on a project to which I want to add automated tests. I already added some unit tests, but I'm not confident about the process I've been using; I don't have much experience with automated tests, so I would like to ask for some advice.
The project is integrated with our web API, so it has a login process. Depending on the logged-in user, the API provides a configuration file which allows or disallows access to certain modules and permissions within the mobile application. We also have a sync process in which the app calls several API methods to download files (PDFs, HTML, videos, etc.) and also receives a lot of data through JSON files. The user basically doesn't have to enter data, just use the information received in the sync process.
What I've done so far to add unit tests in this scenario is to simulate a logged-in user, then add some fixture objects to that user and test them.
I was able to test the web service integration; I used Nocilla to return fake JSON and assert the result. So far I have only been able to test individual requests, but I still don't know how I should test the sync process.
I'm also having a hard time creating unit tests for my view controllers. Should I unit test just the business logic and cover the rest with tools like KIF / Calabash?
Is there an easy way to set up the fixture data and files?
Thanks!
Everybody's mileage may vary but here's what we settled on and why.
Unit tests: We use a similar strategy. The only difference is that we use OHHTTPStubs instead of Nocilla, because we saw some extra flexibility there that we needed and were happy to trade off Nocilla's easier syntax.
Doing more complicated (non-single-query) test cases quickly lost its luster because we were essentially rebuilding whole HTTP request/response flows, and that wasn't very "unit". For functional tests, we did end up adopting KIF (at least for dev-focused efforts, assuming you don't have a separate QA department) for a few reasons:
We didn't buy/need the multi-language abstraction layer afforded by Appium.
We wanted to be able to run tests on many devices per build server.
We wanted more whitebox testing, and while Subliminal was attractive, we didn't want to build hooks into our main app code.
Testing view controller logic (anything that's not unit-oriented) is definitely much more useful with KIF/Calabash or something similar, so that's the path I would suggest.
For bonus points, here are some other things we did. Goes to show what you could do I guess:
We have a basic proof of concept that binds KIF commands to a JSON-RPC server. So you can run a test target on a device and have that device respond to HTTP requests, which will then fire off test cases or KIF commands. One of the advantages of this is that you can reuse some of the test code you wrote for a single device in multi-device test cases.
Our CI server builds integration tests as a downstream build of our main build (which includes unit tests). When the build starts we use XCTool to precompile the tests, and then we have some scripts that start a QuickTime screen recording, run the KIF tests, export the results, and then archive everything on our CI server so we can see a live test run along with the test logs.
Not really part of this answer but happy to share that if you ping me.
I am building an educational service, and it is a rather process-heavy application. I have a lot of user actions triggering various other actions, some immediate, some in the future.
For example, when a student completes a lesson for his day, the following should happen:
updating the progress count on his user-module record
checking whether he has completed a particular module and progressing him to the next one (which in turn triggers more actions)
triggering immediate emails to other users
triggering FUTURE emails to himself (ongoing lesson plans)
creating a range of other objects (grading to-dos for teachers)
any other special-case events
All these triggers are built into the observers of various objects, and their execution is delayed using Sidekiq.
What is killing me is the testing, and the paranoia that I might break something whenever I push a change. In the past I did a lot of assertion and validation checks, and they were sufficient. For this project, I think that is not enough, given the elevated complexity.
As such, I would like to implement a testing framework, but after reading through the various options (RSpec, Cucumber), it is not immediately clear what I should be investing my effort in, given my rather specific needs, especially around the observers and scheduled events.
Any advice and tips on what approach and framework would be the most appropriate? Would probably save my ass in the very near future ;)
Not that it matters, but I am using Rails 3.2 / Mongoid. Happy to upgrade if it works.
Testing can be a very subjective topic, with different approaches depending on the problems at hand.
I would say that given your primary need for testing end-to-end processes (often referred to as acceptance testing), you should definitely check out something like Cucumber or Steak. Both allow you to drive a headless browser and run through your processes. This kind of testing will catch any big show-stoppers and allow you to modify the system and be notified of breaks caused by your changes.
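As a rough illustration, an acceptance test with RSpec and Capybara might look like the sketch below; the paths, button labels, and expected text are placeholders, not taken from your app:

    # spec/features/lesson_completion_spec.rb -- hypothetical example
    require 'rails_helper'   # or 'spec_helper' on older RSpec setups

    RSpec.feature "Completing a lesson" do
      # js: true assumes a JavaScript-capable (e.g. headless) driver is configured
      scenario "student finishes a lesson and sees confirmation", js: true do
        visit "/lessons/1"
        click_button "Complete lesson"
        expect(page).to have_content("Lesson completed")
      end
    end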
Unit testing, although very important and something that should always be used alongside acceptance tests, isn't for end-to-end testing. It's primarily for testing the output of specific methods in isolation.
A common pattern to use is Test-Driven Development (TDD). In this, you write your acceptance tests first, in the "outer" test loop, and then code your app with unit tests as part of the "inner" test loop. The idea is that when you've finished the code in the inner loop, the outer loop should also pass, and you should have built up enough test coverage to be confident that any future change to the code will pass or fail the tests depending on whether the original requirements are still met.
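For the inner loop, a unit test exercises one method in isolation. A sketch (the Lesson model and its #complete! method here are hypothetical stand-ins for your own domain objects):

    # spec/models/lesson_spec.rb -- hypothetical model and method names
    require 'rails_helper'

    RSpec.describe Lesson do
      describe "#complete!" do
        it "increments the stored progress count" do
          lesson = Lesson.new(progress: 0)
          lesson.complete!
          expect(lesson.progress).to eq(1)
        end
      end
    end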
And lastly, a test suite is something that should grow and change as your app does. You may find that whole sections of your test suite can (and maybe should) be rewritten depending on how the requirements of the system change.
Unit testing is a must. You can use RSpec or Test::Unit for that. It will give you at least 80% confidence.
Enable "render views" for controller specs. You will catch syntax errors and simple logical errors faster that way. There are ways to test Sidekiq jobs as well. Have a look at this.
Once you are confident that you have enough unit tests, you can start looking into using cucumber/capybara or rspec/capybara for feature testing.
We used to run our Grails integration tests against an in-memory HSQLDB database, but at the point of failure it was difficult to investigate because the data was lost. We migrated to running the tests against a physical database (Postgres), and all is well when the tests pass. At any point, if a test fails, we want the data to be committed in the database for postmortem analysis of why it failed.
To summarize, we want the tests to run in rollback mode as long as they pass, so that one test doesn't affect another, and on the first failure of a test, commit the data at that point and stop.
We spend a considerable amount of time investigating integration test failures and would like to know whether we have any option in Grails to stop at the first integration test failure with the data preserved in the database for investigation. I searched a little and didn't find any suitable pointers. If you follow any other practice for troubleshooting integration tests that is worth sharing, please let us know.
Simple hack you could try:
Set a global flag on failure and check for the flag in each test; if the flag is set, exit the test.
Recently I came across the Grails Guard plugin, and I think it can be useful in this case because, besides running integration tests faster, it preserves the data saved after the tests are run.
Curious, what are you folks doing as far as automating your unit tests with Ruby on Rails? Do you create a script that runs a rake job in cron and have it mail you the results? A pre-commit hook in git? Just manual invocation? I understand tests completely, but I'm wondering what the best practices are for catching errors before they happen. Let's take for granted that the tests themselves are flawless and work as they should. What's the next step to make sure they reach you with the potentially detrimental results at the right time?
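For the pre-commit-hook option, one low-tech approach is a small Ruby script in .git/hooks/pre-commit that runs the suite and blocks the commit on failure. A sketch; the rake task name is an assumption (it might be rake spec in an RSpec project):

    #!/usr/bin/env ruby
    # .git/hooks/pre-commit  (make it executable with chmod +x)
    puts "Running test suite before commit..."
    unless system("bundle exec rake test")
      warn "Tests failed; aborting commit. Use 'git commit --no-verify' to bypass."
      exit 1
    end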
Not sure exactly what you want to hear, but there are a couple of levels of automated codebase control:
While working on a feature, you can use something like autotest to get instant feedback on what's working and what's not.
To make sure that your commits really don't break anything, use a continuous integration server like cruisecontrolrb or Integrity (you can bind these to post-commit hooks in your SCM system).
Use some kind of exception notification system to catch all the unexpected errors that might pop up in production.
To get a more general view of what happened (what the user was doing when the exception occurred), you can use something like Rackamole.
Hope that helps.
If you are developing with a team, the best practice is to set up a continuous integration server. To start, you can run this on any developer's machine. But in general it's nice to have a dedicated box so that it's always up, is fast, and doesn't disturb a developer. You can usually start out with someone's old desktop, but at some point you may want it to be one of the faster machines so that you get an immediate response from tests.
I've used CruiseControl, Bamboo, and TeamCity, and they all work fine. In general, the less you pay, the more time you'll spend setting it up. I got lucky and did a full Bamboo setup in less than an hour (once); expect to spend at least a couple of hours the first time through.
Most of these tools will notify you in some way. The baseline is an email, but many offer IM, IRC, RSS, SMS (among others).