Best practices for unit tests on custom functions for a drake workflow - drake-r-package

A drake workflow can have several custom functions stored in its R directory. It would make sense to develop unit tests for the custom functions. There is well-established tooling and structures for running testthat unit tests on an R package in RStudio (or from a command line).
But what are best practices for developing and running testthat unit tests for the custom functions in a drake workflow?
Any pointers to resources or examples would be greatly appreciated. Thanks!

The best practices for unit tests do not change much when drake enters the picture. Here are the main considerations.
If you are using drake, you are probably dealing with annoyingly long runtimes in your full pipeline. So one challenge is to construct tests that do not take forever. I recommend invoking your functions on a small dataset, a small number of iterations, or whatever will get the test done in a reasonable amount of time. You can run a lot of basic checks that way. To more thoroughly validate the answers that come from your functions, you can run an additional set of checks on the results of the drake pipeline.
If you are using testthat, you probably have your functions arranged in a package-like structure, or even a fully-fledged package, and you may even be loading your functions with devtools::load_all() or library(yourPackage). If you load your functions this way instead of individually sourcing your function scripts, be sure to call expose_imports() before make() so drake can analyze the functions for dependencies.

Related

Understanding how do the coverage modules work

I came across with a new dynamic language. I would like to create a coverage tool for that language. I started reading the source code of Perl 5 and Python coverage modules but it got complicated. It's a dynamic scripting language so I guess that source code of static languages (like Java & C++) won't help me here. Also, as I understand, each language was built in a different way and the same ideas won't work. But, the big concepts could be similar.
My question is as follows: how do I "attack" this task? What is the proper workflow I need to follow? What I need to investigate? Are there any books or blogs I can read about those kind of stuff?
There are two kinds of coverage collection mechanisms:
1) Real-time sampling of the program counter, typically by a clock running at 1-10ms. Difficulties: a) mapping an actual PC value back to a source line, b) sampling means you might not see execution of a rarely used bit of code, so your coverage reporting is inaccurate. Because of these issues, this approach isn't used very often.
2) Instrumenting the program so that it collects coverage as it runs. This is hard to do with object code... a) you have to decode the instructions to see where to put probes, and this can be very hard to do right, b) you have patch the source code to include the probes (this can be really awkward; a "probe" might consist of a 5 byte subroutine call but the probe has replace a single-byte instruction). c) you still have to figure out how to map a probe location back to a source code line. A more effective way is to instrument the source code, which requires pretty sophisticated machinery to read source, make probe patches, and regenerate the instrumented code for execution/compilation.
My technical paper Branch Coverage for Arbitrary Languages Made Easy provides explicit detail for how to do this in a general way. My company has built commercial test coverage tools for a wide variety of languages (C, Python, PHP, COBOL, Java, C++, C#, ProC,....) using this approach. This covers most static and dynamic languages. Some dynamic mechanisms are extremely difficult to instrument, e.g., eval() but that is true of every approach.
In addition to Ira's answer, there is a third coverage collection mechanism: the language implementation provides a callback that can inform you about program events. For example, Python has sys.settrace: you provide it a function, and Python calls your function for every function called or returned, and every line executed.

Unit Testing Strategy, Ideal Code Coverage Baseline

There's still not much information out there on the XCode7 and Swift 2.0 real-world experiences from a unit testing and code coverage perspective.
While there're plenty of tutorials and basic how-to guides available, I wonder what is the experience and typical coverage stats on different iOS teams that actually tried to achieve a reasonable coverage for their released iOS/Swift apps. I specifically wonder about this:
1) while code coverage percentage doesn't represent the overall quality of the code base, is this being used as an essential metric on your team? If not, what is the other measurable way to assess the quality of your code base?
2) For a bit more robust app, what is your current code coverage percentage? (just fyi, we have hard time getting over 50% for our current code base)
3) How do you test things like:
App life-cycle, AppDelegate methods
Any code related to push/local notifications, deep linking
Defensive programming practices, various piece-of-mind (hardly reproducible) safe guards, exception handling etc.
Animations, transitions, rendering of custom controls (CG) etc.
Popups or Alerts that may include any additional logic
I understand some of the above is more of a subject for actual UI tests, but it makes me wonder:
Is there a reasonable way to get the above tested from the UTs perspective? Should we be even trying to satisfy an arbitrary minimal code coverage percentage with UTs for the whole code base or should we define that percentage off a reasonably achievable coverage given the app's code base?
Is it reasonable to make the code base more inflexible in order to achieve higher coverage? (I'm not talking about a medical app where life would be in stake here)
are there any good practices on testing all the things mentioned above, other than with UI tests?
Looking forward to a fruitful discussion.
You do ask a very big and good question. Although your question includes:
I wonder what is the experience and typical coverage stats on different iOS teams ...
I think the issue is language/OS agnostic. Sure some languages and platform are more unit testable than others. So some are more expensive to unit test (as opposed to other forms of automated/coded testing). I think you are searching for a cost/benefit equation to maximize productivity. Ah the fun of software development processes.
To jump to the end to give you the quick sound grab answer:
You should unit test all code that you want to work and is appropriate to unit testing.
So now why the all and why the emphasis on unit testing ...
What is a unit test?
The language in the development community is corrupted, so please bear with me. Unit testing is just one type of automated testing. Others are Automated Acceptance Tests, Application tests, Integration Tests, and Components test. These all test different things. They have different purposes.
However, when I hear unit testing two things pop into mind:
What is a unit test?
As part of TDD (Test Driven Development)?
TDD is about writing tests before writing code. It is a very low level coding practice/process (XP - eXtreme Programming) as you write a test to write a statement and then another test. Very much a coding practice but not an application/requirements practice as it is about writing code that does what you intended, not what the product requirements are (oh gosh I feel the points being lost).
Writing code and then unit testing it is ... in my experience ... fun, short term team building, but not productive. Sure some defects are found, but not many. TDD leads to better "healthy" code.
My point here is that unit testing is:
A subset of automated/coded testing.
Is part of a coding process.
Is about code health (maintainability).
Does note prove that your application works (sound of falling points).
Why all?
If you're team delivers zero defect software (ZDFD is real and achievable .. but that a flat earth discussion) all the time without unit testing then this is nonsense and you would not be asking any questions here.
The only valid reason for a team to include unit testing as part of its coding process is to improve productivity. If all team members commit to team productivity then the only issue is identifying which code profits from unit testing. This is the context of the all.
The easiest way I think to illustrate this is to list types I do not unit test:
Factories - They only instantiate types.
Builders / writing (IoC) - Same as factories - No domain logic.
Third party libraries - We call 3rd party libraries as documented. If you want to test these then use integration/component tests.
Cyclomatic Complexity of one - Every method of of type has a CC of 1. That is, no conditions. Unit tests will tell you nothing useful, peer review is more useful.
The practical answer
My teams have expected 100% unit test coverage on all new code that should be unit tested. This is achieved by attributing code that does not meeting the unit testing criteria. All code must go through code review and the attributes must be specific to the why options listed above. -- Simple.
A long answer, and perhaps not easy to digest, nor what people want to hear. But, from long experience, I know it is the best answer that can lead to best profitability.
Post comment
My answer is aimed at the unit testing aspects of the question. As for defensive programming and other practices, TDD is a process that mitigates that by making it harder to do the wrong thing. But build system static code analysis tools may help you capture these before they get to peer review (they can fail a build on new issues). Look at others like SonarQube, Resharper, CppDepend, NDepend (yes language dependent).

What are all the pieces to an effective TDD strategy?

I'm really getting frustrated with learning how to properly develop software using TDD. It seems that everyone does it differently and in a different order. At this point, I'd just like to know what are all the considerations? This much is what I've come up with: I should use rspec, and capybara. With that said, what are all the different types of test I need to write, to have a well built and tested application. I'm looking for a list that comprises the area of my application being tested, the framework needed to test it, and any dependencies.
For example, it seems that people advise to start by unit testing your models, but when I watch tutorials on TDD it seems like they only write integration test. Am I missing something?
Well, the theme "how do you TDD" is as much out there in the open as the theme "how do you properly test?". In Ruby, and more specifically in Rails, rspec should be the tool to start with, but not be done with. RSpec allows you to write Unit Tests for your components, to test them separately. In the Rails context, that means:
test your models
test your controllers
test your views
test your helpers
test your routes
It is a very good tool not exactly rails-bound, it is also used to test other frameworks.
After you're done with RSpec, you should jump to cucumber. Cucumber (http://cukes.info/) is the most used tool (again, for the Rails environment) to write integration tests. You can then integrate capybara on cucumber.
After you're done with cucumber, you'll be done with having tested your application backend and (part of) its HTML output. That's when you should also test your javascript code. How to do that? First, you'll have to Unit test it. Jasmine (http://pivotal.github.com/jasmine/) is one of the tools you might use for the job.
Then you'll have to test its integration in your structure. How to do that? You'll come back to cucumber and integrate selenium (http://seleniumhq.org/) with your cucumber framework, and you'll be able to test your integration "live" in the browser, having access to your javascript magic and testing it on the spot.
So, after you're done with these steps, you'll have covered most of the necessary steps to have a well-integrated test environment. Are we done? Not really. You should also set a coverage tool (one available: https://github.com/colszowka/simplecov) to check if your code is being really well tested and no loose ends are left.
After you're done with these morose steps, you should also do one last thing, in case you are not developing it all alone and the team is big enough to make it still unmanageable by itself: you'll set a test server, which will do nothing other than run all the previous steps regularly and deliver notifications about its results.
So, all of this sets a good TDD environment for the interested developer. I only named the most used frameworks in the ruby/rails community for the different types of testing, but that doesn't mean there aren't other frameworks as or more suitable for your job. It still doesn't teach you how to test properly. For that there's more theory involved, and a lot of subdebates.
In case I forgot something, please write it in a comment below.
Besides that, you should approach how you test properly. Namely, are you going for the declarative or imperative approach?
Start simple and add more tools and techniques as you need them. There are many way to TDD an app because every app is different. One way to do that is to start with an end-to-end test with Rspec and Capybara (or Cucumber and Capybara) and then add more fine-grained tests as you need them.
You know you need more fine-grained tests when it takes more than a few minutes to make a Capybara test pass.
Also, if the domain of your application is non-trivial it might be more fruitful for you to start testing the domain first.
It depends! Try different approaches and see what works for you.
End-to-end development of real-world applications with TDD is an underdocumented activity indeed. It's true that you'll mostly find schoolbook examples, katas and theoretical articles out there. However, a few books take a more comprehensive and practical approach to TDD - GOOS for instance (highly recommended), and, to a lesser extent, Beck's Test Driven Development by Example, although they don't address RoR specifically.
The approach described in GOOS starts with writing end-to-end acceptance tests (integration tests, which may amount to RSpec tests in your case) but within that loop, you code as many TDD unit tests as you need to design your lower-level objects. When writing those you can basically start where you want -from the outer layers, the inner layers or just the parts of your application that are most convenient to you. As long as you mock out any dependency, they'll remain unit tests anyway.
I also have the same question when I started learning rails, there're so many tools or methods to make the test better but after spending to much time on that, I finally realized that you could simply forget the rule that you must do something or not, test something that you think it might have problem first, then somewhere else. Well ,it needs time.
that's just my point of view.

SpecFlow/BDD for Unit Tests?

Seems like the internet doesn't have a definitive answer, or set of principles to help me answer the question. So I turn to the great folk on SO to help me find answers or guiding thoughts :)
SpecFlow is very useful for BDD in .NET. But when we talk about BDD are we just talking integration/acceptance tests, or are we also talking unit tests - a total replacement for TDD?
I've only used it on small projects, but I find that even for my unit tests, SpecFlow improves code documentation and thinking in terms of language. Converseley, I can't see the full code for a test in one place - as the steps are fragmented.
Now to you..........
EDIT: I forgot to mention that I see RSpec in the RoR community which uses BDD-style syntax for unit testing.
I've recently started to use SpecFlow for my BDD testing, but also, I still use unit and integration tests.
Basically, I split the tests into seperate projects:
Specs
Integration
Unit
My unit tests are for testing a single method and do not perform any database calls, or external references whatsoever. I use integration tests for single method calls (maybe sometime two) which do interact with an external resources, such as a database, or web service, etc.
I use BDD to describe tests which mimick the business/domain requirements of the project. For example, I would have specs for the invoice generation feature of a project; or for working with a shopping basket. These tests follow the
As a user, I want, In order to
type of semantics.
My advise is to split your tests based on your needs. Avoid trying to perform unit testing using SpecFlow.
We have started using Specflow even for our unit tests.
The main reason (and benefit) for this is that we find that it forces you to write the tests from a behavior point of view, which in turn forces you to write in a more implementation agnostic way and this ultimately results in tests which are less brittle and more refactoring friendly.
Sure this can also be done with standard unit testing frameworks, but you aren't guided that way as easily as we have found we are using specflow and the gherkin syntax.
There is some overhead setting things up for specflow, but we find this is quickly repaid when you have quite a few tests (due to the significant step reusability that you can get with specflow) or you need refactor your implementation.
Plus you get nice readable specs that are easy for newcomers to the team to understand.
Given:
Unit tests are test of (small) "units of code"
The customer of most “units of codes” are other programmers.
Part of the reason for having a unit test is to provide an example of how to call the code.
Therefore:
Clearly unit tests should normally be written in the programming language that the users of the “unit of code” will be calling it with.
However:
Sometimes data tables are needed to setup the conditions a unit test runs in.
Most unit test frameworks are not good at using tables of data.
Therefore:
Specflow may be the best option for some unit test, but should not be your default choose.
I see it as an integration testing which mean it doesn't replace your unit test cases written as part of your TDD process. Someone will have different opinion about this. IMHO unit test case only test the methods/functions and all the dependencies should be mocked and injected. When in it comes to integration testing, you will be injecting real dependencies instead of mocked one. You could do the same integration testing with any of the unit testing frameworks, but the BDD provides you cleaner way of explaining the integration test use case in a Domain Specific Language which is a plain English(or any localized language).
Ta,
Rajeesh
I used specflow for BDD testing on two different good sized applications. Once we worked through the kinks of the sentence naming conventions, it worked out pretty good. BA's and QA's, and even interns could write BDD tests for the application.
However, I ALSO used it for unit tests. Heresy! I can hear some of you scream. However, there were VERY good reasons for it. The system was responsible for making many calculations or determinations based off a lot of different data. With lots of unit tests that require all this data to be input for test purposes, it makes it a LOT easier to manage that data used for the unit tests via the table format provided by specflow. Effectively mocking the data repository in table format, allowing the different components to be vigorously tested.
I don't know if I would do it in every case, but in the ones I used it for, it made laying out the volumes of data necessary for for performing the unit tests so much easier and clearer.
In the end we are trying to deliver to the customer exactly what the customer wants and as such I really don't see the need to write unit tests in addition to SpecFlow. After all, it exercises the same code base. I am fairly new to BDD/ATDD/TDD but other than being "complete" and strictly adhering to TDD I'm finding it unnecessary to write more unit tests.
Now I suppose if the team was dispersed and the developer was not able to run the entire application then separate unit tests would be necessary but where the developer(s) has access to the entire code base and is able to run the application, then why bother write more tests.

How to choose between different test types with SpecFlow, Cucumber or other BDD acceptance test framework?

I am looking at SpecFlow examples, and it's MVC sample contains several alternatives for testing:
Acceptance tests based on validating results generated by controllers;
Integration tests using MvcIntegrationTestFramework;
Automated acceptance tests using Selenium;
Manual acceptance tests when tester is prompted to manually validate results.
I must say I am quite impressed with how well SpecFlow examples are written (and I managed to run them within minutes after download, just had to configure a database and install Selenium Remote Control server). Looking at the test alternatives I can see that most of them complement each other rather than being an alternative. I can think of the following combinations of these tests:
Controllers are tested in TDD style rather than using SpecFlow (I believe Given/When/Then type of tests should be applied on higher, end-to-end level; they should provide good code coverage for respective components;
MvcIntegrationTestFramework is useful when running integration tests during development sessions, these tests are also part of daily builds;
Although Selenium-based tests are automated, they are slow and are mainly to be started during QA sessions, to quickly validate that there are no broken logic in pages and site workflow;
Manual acceptance tests when tester is prompted to confirm result validity are mainly to verify page look and feel.
If you use SpecFlow, Cucumber or other BDD acceptance test framework in you Web development, can you please share your practices regarding choosing between different test types.
Thanks in advance.
It's all behaviour.
Given a particular context, when an event occurs (within a particular scope), then some outcome should happen.
The scope can be a whole application, a part of a system or a single class. Even a function behaves this way, with inputs as context and the output as outcome (you can use BDD for functional language as well!)
I tend to use Unit frameworks (NUnit, JUnit, RSpec, etc.) at a class or integration level, because the audience is technical. Sometimes I document the Given / When / Then in comments.
At a scenario level, I try to find out who actually wants to help read or write the scenarios. Even business stakeholders can read text containing a few dots and brackets, so the main reason for having a natural language framework like MSpec or JBehave is if they want to write scenarios themselves, or show them to people who will really be put off by the dots and brackets.
After that, I look at how the framework will play with the build system, and how we'll give the ability to read or write as appropriate to the interested stakeholders.
Here's an example I wrote to show the kind of thing you can do with scenarios using simple DSLs. This is just written in NUnit.
Here's an example in the same codebase showing Given, When, Then in class-level example comments.
I abstract the steps behind, then I put screens or pages behind those, then in the screens and pages I call whatever automation framework I'm using - which could be Selenium, Watir, WebRat, Microsoft UI Automation, etc.
The example I provided is itself an automation tool, so the scenarios are demonstrating the behaviour of the automation tool through demonstrating the behaviour of a fake gui, just in case that gets confusing. Hope it helps anyway!
Since acceptance tests are a kind of functional tests, the general goal is to test your application with them end-to-end. On the other hand, you might need to consider efficiency (how much effort is to implement the test automation), maintainability, performance and reliability of the test automation. It is also important that the test automation can easily fit into the development process, so that it supports a kind of "test first" approach (to support outside-in development).
So this is a trade off, that can be different for each situation (that's why we provided the alternatives).
I'm pretty sure, that today the most widely fitting option is to test at the controller layer. (Maybe later as UI and UI automation frameworks will evolve, this will change.)

Resources