How can I make a distinction between critical tests that fail (and that should be addressed immediately) vs. tests that fail, but aren't too critical (e.g., a tab view with the wrong default tab open)? It seems like most services (I am using CircleCI) only show you red or green.
I feel like I need some intermediate "orange" color in addition to the green and red colors. Is there any add-on or trick that could help us make the distinction between critical test failures and acceptable ones? (For example with annotations #non-critical?)
I am using Cucumber for testing a Ruby on Rails application.
EDIT
Here are two ways that could make sense (feel free to suggest other approaches):
One single build alert that is not just "green" or "red" but could be "yellow/orange" depending on which tests fail
Many builds, that can only be green or red, but that would be labeled
Build of "critical tests" succeeded with 0 errors (green)
Build of "non-critical" tests failed with 10 errors (red)
A better way would be to run the critical and non-critical features separately. A critical failure would be detected sooner, and you'd be able to run the critical features more often.
To run the critical features tagged with @critical:
cucumber --tags @critical
To run the non-critical features not tagged with @critical:
cucumber --tags ~@critical
(Newer Cucumber versions that use tag expressions write the negation as --tags 'not @critical'.)
Documentation:
https://github.com/cucumber/cucumber/wiki/Tags
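One way to get the "orange" status from the question out of two such runs is a small wrapper script. This is only a minimal sketch, not a Cucumber or CircleCI feature: the helper names (build_status, run_cucumber, run_build) and the green/orange/red symbols are made up for illustration, and it assumes that cucumber is on the PATH and that features carry the @critical tag.

```ruby
# Combine the results of the two tagged runs into one three-level status.
# The colour names are illustrative; no CI service defines them.
def build_status(critical_passed, non_critical_passed)
  return :red unless critical_passed
  non_critical_passed ? :green : :orange
end

# Run Cucumber with the given tag filter.
# Returns true only when the command ran and exited with status 0.
def run_cucumber(tag_filter)
  system("cucumber", "--tags", tag_filter) == true
end

# Run both groups and return the combined status.
# The runner is injectable so the logic can be exercised without Cucumber.
def run_build(runner = method(:run_cucumber))
  critical     = runner.call("@critical")
  non_critical = runner.call("~@critical") # old-style negation, as in the answer
  build_status(critical, non_critical)
end
```

Ending the script with something like exit(run_build == :red ? 1 : 0) would break the build only on a critical failure, while the orange case could be reported through whatever notification channel you already use.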
I strongly recommend against having tests that are 'non-critical'. A binary pass/fail result is simple to manage: a suite either passes and all is well, or it fails and needs to be fixed. Whenever I've seen a test suite divided into levels of importance, the team immediately began paying attention only to the tests in the highest-importance group and more of the less-important tests were allowed to fail.
Instead, if a test isn't considered important enough to be fixed, delete it. Even better, if a feature isn't considered important enough to be tested, remove or simplify it so there is nothing to test.
Note that both the product owner and developers get to have input on what's important enough to be fixed. For example, if the developers have a standard of 100% code coverage and a test is the only test that provides some of that coverage, the developers would be right to insist that the test be kept passing even if the feature that it tests isn't considered critical by the product owner. Although that would then suggest that the feature should be removed or simplified so that the test wouldn't be needed by developers either.
Related
We are currently using Coveralls for code coverage of our Rails projects. The coverage results it is giving us are really unreliable. I have on numerous occasions found classes which have not been spec'd, written specs for them and then watched coverage actually drop. This is because Coveralls only checks classes which are loaded up by your specs. So if there is no spec for a class it is excluded from coverage statistics. This is obviously not ideal. Is there a way to get around this behavior? I am trying to push for greater emphasis on testing in my team and it's pretty hard when this is creating a false sense of security.
I've seen two reasons why coverage statistics are unreliable in Coveralls (and the similar service Codecov):
When last I used it, Coveralls had to be told that a test suite was parallelized. If given no warning, Coveralls would display partial results. To avoid this, you can tell Coveralls that your parallelized test suite has completed. See the docs for details, but, briefly:
set the environment variable COVERALLS_PARALLEL=true on your CI server
POST { "payload": { "build_num": 1234, "status": "done" } } to https://coveralls.io/webhook?repo_token=(your repo token) at the end of your build. (Some hosted CI services figure out the payload automatically.)
If you run your test suite more than once to retry flaky tests, partial results from a retry run can overwrite almost-full results from the first run. The problem I saw was specific to a homemade retry setup, but, if you're doing something like that, think through whether it might confuse Coveralls.
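The webhook call from the first point could be scripted roughly as follows. This is a sketch under assumptions: the URL and payload are the ones quoted above, while the helper names, the JSON content type and the way the token and build number are passed in are my own choices, so check them against the Coveralls docs before relying on them.

```ruby
require "json"
require "net/http"
require "uri"

# Build the "parallel build finished" request described above.
# repo_token and build_num are placeholders supplied by the caller.
def coveralls_done_request(repo_token, build_num)
  uri = URI("https://coveralls.io/webhook")
  uri.query = URI.encode_www_form(repo_token: repo_token)
  request = Net::HTTP::Post.new(uri)
  request["Content-Type"] = "application/json" # assumption: payload is sent as JSON
  request.body = JSON.generate(payload: { build_num: build_num, status: "done" })
  request
end

# Send the request over HTTPS and return the HTTP response.
def notify_coveralls_done(repo_token, build_num)
  request = coveralls_done_request(repo_token, build_num)
  Net::HTTP.start(request.uri.host, request.uri.port, use_ssl: true) do |http|
    http.request(request)
  end
end
```

You would call notify_coveralls_done once, after the last parallel job has finished, with the token and build number taken from your CI environment.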
Regarding completely untested classes missing from your coverage report, that's not specific to Coveralls. If you want to be sure that all classes are loaded, load them eagerly before running your tests. Untested classes and methods are often unused, so auditing your app with a dead code detector like debride can also be a good step on the road to better coverage.
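As a rough illustration of eager loading, the sketch below requires every Ruby file under the given directories. The helper name load_all_ruby_files is hypothetical. In a Rails app, calling Rails.application.eager_load! from the test helper serves the same purpose. Either way, the coverage tool has to be started before the application code is loaded, or the files will not be tracked.

```ruby
# Require every Ruby file under the given directories so that classes
# without any test still show up (as uncovered) in the coverage report.
# Files are loaded in alphabetical order, so this simple version assumes
# that no file depends on another one being loaded first.
def load_all_ruby_files(*dirs)
  dirs.flat_map { |dir| Dir.glob(File.join(dir, "**", "*.rb")).sort }
      .each { |path| require File.expand_path(path) }
end
```

For example, load_all_ruby_files("app", "lib") could run at the top of the test helper, right after the coverage tool is started.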
We currently use a Ruby and Cucumber setup. Some steps in our end-to-end regression tests are failing due to known bugs. Developers take some time to fix them, depending on their workload and the bug severity. How best to deal with these failing tests?
Should we tag them with bug ticket numbers and let those specific tests be skipped when the suite runs on CI?
Let them fail and mark the build unstable until the developers fix them, however many days that takes?
Is there any other way in Cucumber to say that these specific tests have a state other than pass or fail, to indicate that it's under control?
After searching further online, I found a better solution for this need. We can mark the test as "pending" so that it won't fail but goes yellow and is reported as pending.
https://phabricator.wikimedia.org/T58243
The beauty of this is that, once the bug is fixed and the step no longer fails, Cucumber reports it so that we can remove the pending status:
Expected pending 'bug jira-195' to fail. No Error was raised. No longer pending? (Cucumber::Pending exception)
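To make the two outcomes concrete, here is a standalone imitation of that behaviour in plain Ruby. It is not Cucumber's code: PendingStep and pending_known_bug are hypothetical names, and the sketch only mirrors what the message above implies, namely that a failing block is reported as pending and a passing block raises the "No longer pending?" error.

```ruby
# Stand-in for Cucumber's pending status in this sketch.
class PendingStep < StandardError; end

# Wrap assertions that are expected to fail because of a known bug.
def pending_known_bug(message)
  begin
    yield
  rescue StandardError
    # The step still fails because of the known bug: report it as pending.
    raise PendingStep, message
  end
  # The block passed, so the bug seems to be fixed: flag the stale marker.
  raise PendingStep,
        "Expected pending '#{message}' to fail. No Error was raised. No longer pending?"
end

# In a real step definition you would use Cucumber's own helper instead,
# wrapping the currently failing assertions in a block, for example:
#   pending('bug jira-195') { ... assertions for the buggy behaviour ... }
```

Keeping the ticket number in the message makes it easy to find every step to revisit once the bug is closed.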
I am building an educational service, and it is a rather process-heavy application. I have a lot of user actions triggering various actions, some present, some future.
For example, when a student completes a lesson for his day, the following should happen:
updating progress count for his user-module record
checking if he has completed a particular module and progressing him to the next one (which in turn triggers more actions)
triggering current emails to other users
triggering FUTURE emails to himself (ongoing lesson plans)
creating range of other objects (grading todos by teachers)
any other special case events
All these triggers are built into the observers of various objects, and the execution delayed using Sidekiq.
What is killing me is the testing, and the paranoia that I might break something whenever I push a change. In the past, I relied on a lot of assertions and validation checks, and they were sufficient. For this project, I think this is not enough, given the elevated complexity.
As such, I would like to implement a testing framework, but after reading through the various options (Rspec, Cucumber), it is not immediately clear what I should be investing my effort into, given my rather specific needs, especially for the observers and scheduled events.
Any advice and tips on what approach and framework would be the most appropriate? Would probably save my ass in the very near future ;)
Not that it matters, but I am using Rails 3.2 / Mongoid. Happy to upgrade if it works.
Testing can be a very subjective topic, with different approaches depending on the problems at hand.
I would say that, given your primary need for testing end-to-end processes (often referred to as acceptance testing), you should definitely check out something like Cucumber or Steak. Both allow you to drive a headless browser and run through your processes. This kind of testing will catch any big showstoppers and let you modify the system while being notified of breakage caused by your changes.
Unit testing is very important and should always be used alongside acceptance tests, but it isn't meant for end-to-end testing. It's primarily for testing the output of specific methods in isolation.
A common pattern to use is called Test Driven Development (TDD). In this, you write your acceptance tests first, in the "outer" test loop, and then code your app with unit tests as part of the "inner" test loop. The idea is that when you've finished the code in the inner loop, the outer loop should also pass, and you should have built up enough test coverage to be confident that any future change to the code will pass or fail the tests depending on whether the original requirements are still met.
And lastly, a test suite is something that should grow and change as your app does. You may find that whole sections of your test suite can (and maybe should) be rewritten depending on how the requirements of the system change.
Unit testing is a must. You can use RSpec or Test::Unit for that. It will give you at least 80% confidence.
Enable "render views" for controller specs. You will catch syntax errors and simple logical errors faster that way. There are ways to test Sidekiq jobs. Have a look at this.
Once you are confident that you have enough unit tests, you can start looking into using cucumber/capybara or rspec/capybara for feature testing.
I recently started working on a project that has all-passing Cucumber tests. But I would say 60% of the time they fail on timeouts, or on altogether random, intermittent errors. So roughly 1/4 times everything will pass and be green.
Are there common reasons for this sort of intermittence? Should I be concerned?
Acceptance tests can be tricky most of the time.
Check the asynchronous parts of your code (long database transactions, Ajax, message queues). Set timeouts that make sense for you, both for the tests and for the build as a whole. A long build time is not good: I think 10 minutes is acceptable, and beyond that it is worth reviewing your tests.
Another problem is the browser (if you're using one): it can take a long time to warm up and start all the tests.
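For the asynchronous parts, intermittent failures often come from asserting before the work has finished. A generic way to handle that is to poll for the condition with an explicit timeout instead of sleeping for a fixed time. The helper below is a plain-Ruby sketch with hypothetical names (wait_until, WaitTimeout); Capybara's own finders and matchers already retry in a similar way up to a configurable wait time, so a helper like this is mainly useful for checks that don't go through the browser.

```ruby
# Raised when the condition never becomes true within the timeout.
class WaitTimeout < StandardError; end

# Re-evaluate the block until it returns a truthy value
# or `timeout` seconds have passed.
def wait_until(timeout: 5, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise WaitTimeout, "condition not met within #{timeout} seconds"
    end
    sleep interval
  end
end
```

A step could then wait for a queue to drain or a record to appear by passing the check as the block, with a timeout chosen for that specific operation rather than one global value.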
Curious, what are you folks doing as far as automating your unit tests with Ruby on Rails? Do you create a script that runs a rake job in cron and have it mail you the results? A pre-commit hook in Git? Just manual invocation? I understand tests completely, but I'm wondering what the best practices are for catching errors before they happen. Let's take for granted that the tests themselves are flawless and work as they should. What's the next step to make sure the potentially detrimental results reach you at the right time?
Not sure exactly what you want to hear, but there are a couple of levels of automated codebase control:
While working on a feature, you can use something like autotest to get instant feedback on what's working and what's not.
To make sure that your commits really don't break anything, use a continuous integration server like cruisecontrolrb or Integrity (you can bind these to post-commit hooks in your SCM system).
Use some kind of exception notification system to catch all the unexpected errors that might pop up in production.
To get a more general view of what happened (what the user was doing when the exception occurred), you can use something like Rackamole.
Hope that helps.
If you are developing with a team, the best practice is to set up a continuous integration server. To start, you can run this on any developer's machine. But in general it's nice to have a dedicated box so that it's always up, is fast, and doesn't disturb a developer. You can usually start out with someone's old desktop, but at some point you may want it to be one of the faster machines so that you get an immediate response from tests.
I've used CruiseControl, Bamboo and TeamCity, and they all work fine. In general, the less you pay, the more time you'll spend setting it up. I got lucky and did a full Bamboo setup in less than an hour (once); expect to spend at least a couple of hours the first time through.
Most of these tools will notify you in some way. The baseline is an email, but many offer IM, IRC, RSS, SMS (among others).