Using Fitnesse to test external data - fitnesse

We would like to use Fitnesse to test an externally produced data set. Specifically, the tests would contain invariants that must hold in the data, but every time the tests are run they would fetch the data from, let's say, a database and apply the checks to every row in the result set.
The tests would still be organised as wiki pages, but each one, when run, would be repeated for all applicable data rows. Should a particular row fail an assertion, we still want the tests to continue for the other rows, and then receive a summary and a list of the rows that failed each assertion.
I understand this is not exactly what Fitnesse is for, but we do have the skills in the team to write fixtures and tests, and we like the idea of having non-technical subject matter experts author some of the tests.
Is there a way of achieving the above in Fitnesse, or is it completely outside of its intended usage? If it is possible, I would appreciate any guidance on how to achieve it. I couldn't find anything insightful in the documentation (or on other websites).

Sounds like the Slim protocol is what you're looking for to write the fixtures.
The Query Table in particular.
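As a rough illustration of the idea, a Query Table fixture can return only the rows that break an invariant, so the page shows exactly which rows failed. The sketch below is in Ruby and assumes a Ruby Slim server is in use; the class name, the `amount` invariant and the hard-coded rows are all hypothetical stand-ins for a real database query. The one part that comes from Slim itself is the shape of the result: `query` returns a list of rows, each row being a list of [column name, value] pairs.

```ruby
# Hypothetical Query Table fixture. The rows are hard-coded here; a real
# fixture would load them from the database instead.
class RowsViolatingPositiveAmount
  # Stand-in for the externally produced data set (assumed shape).
  ROWS = [
    { id: 1, amount: 10 },
    { id: 2, amount: -5 },
    { id: 3, amount: 0 }
  ].freeze

  def initialize(rows = ROWS)
    @rows = rows
  end

  # Slim calls `query` and expects a list of rows, where each row is a
  # list of [column name, value] pairs.
  def query
    failing = @rows.reject { |row| row[:amount] > 0 }
    failing.map { |row| row.map { |column, value| [column.to_s, value.to_s] } }
  end
end
```

A wiki page that uses this fixture in a Query Table and lists no expected rows should then pass when the data is clean and report each returned row as surplus when it is not, which gives one page per invariant and a list of offending rows.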



I am all confused with TDD vs BDD :)
How do TDD and BDD differ on each of the points below?
1. Development: test case first, development follows next.
2. RestService (HTTP): don't make REST calls? If so,
a) do we return only hardcoded JSON using a mock object?
b) how do we handle REST call failures? Should we have a test case for that too?
Especially for item 2, I have googled many articles, but couldn't find a sample (code) approach for handling REST calls.
BDD and TDD are not comparable to each other, although they are both used in test first development.
BDD is more than just writing tests with an English-like syntax, e.g. Kiwi. BDD (also known as ATDD—Acceptance Test Driven Development) starts with developers, QA, and designers (e.g. business, and interaction designers), working together to develop a shared understanding of the proposed solution. It is common to use examples to illustrate the behavior, also known as Specification by Example.
I have found that a useful way to think of abstraction is distinguishing between what you do (abstract, high-level policy), and how you do it (concrete, low-level details). Every concrete detail exists to fulfill a higher-level policy. When you see something concrete, it is beneficial to identify the policy it is serving.
The specification by example can be used to create high-level acceptance tests, which test what the application does, i.e. its behavior.
Unit tests are used to test how the app implements a solution, i.e. test that the appropriate messages are sent to its collaborators/dependencies at the appropriate time.
The phases of the standard TDD cycle are Red, Green, Refactor. During the green phase, your goal is to get the test passing as quickly as possible, by hook or by crook—it is acceptable to write ugly, unorganized code. Once the test passes, you refactor the code to make it more readable/changeable.
Similarly, with a BDD/ATDD cycle, you have Red, Green, Refactor. During the green phase of BDD, just get the acceptance test to pass. All of the code you write can exist within the test itself. During the refactor phase of BDD, you extract test code into production code. You can use TDD to guide the extraction.
So, for a given BDD acceptance test, you might have multiple TDD tests.
Regarding how to test REST calls, let's go back to the premise of abstraction—distinguishing what we do from how we do it.
Calling a REST service is a concrete action. The policy it satisfies may be to provide a list of model objects.
Let's say the use case you are implementing is to invite a friend to lunch. Part of the use case responsibility is to obtain the list of friends from a server; it doesn't care how the server finds the friends.
Your BDD tests would handle getting the list of friends, picking a friend, and completing the invitation. Your BDD tests would not worry about actually making REST calls.
When you use TDD to implement the class that handles communication with the server, you could have tests that retrieve JSON from a remote data source (i.e. the server), and ensure the JSON is properly parsed into User model objects. You could also have tests to cover the data source responding with an error, etc.
At the point you actually make a REST call, in the implementation of a remote data source that uses REST to communicate with the backend server, I would classify that as an integration test, as you are testing the integration with a component you don't control, i.e. the actual backend server. The integration tests only need to confirm that the server returns JSON data in the format your app expects, or that errors are returned when appropriate.
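To make the split between policy and detail concrete, here is a minimal Ruby sketch. All names (`FriendsRepository`, `fetch_friends_json`, `FriendsUnavailable`, the two test doubles) are hypothetical and chosen only for illustration. The repository holds the policy "provide a list of friends"; the data source it is given decides how the JSON is obtained, so unit tests can pass in a canned or failing source and never touch the network.

```ruby
require 'json'

User = Struct.new(:id, :name)

# Raised when the data source cannot supply the friends list.
class FriendsUnavailable < StandardError; end

# Policy: provide a list of friends. How the JSON is fetched is delegated
# to the injected data source, which only needs a `fetch_friends_json` method.
class FriendsRepository
  def initialize(data_source)
    @data_source = data_source
  end

  def friends
    payload = JSON.parse(@data_source.fetch_friends_json)
    payload.map { |attrs| User.new(attrs.fetch('id'), attrs.fetch('name')) }
  rescue JSON::ParserError, IOError => e
    raise FriendsUnavailable, e.message
  end
end

# Test double that returns hardcoded JSON instead of calling the server.
class CannedDataSource
  def initialize(json)
    @json = json
  end

  def fetch_friends_json
    @json
  end
end

# Test double that simulates a failed REST call.
class FailingDataSource
  def fetch_friends_json
    raise IOError, 'server returned 500'
  end
end
```

A unit test would build the repository with `CannedDataSource` to check parsing and with `FailingDataSource` to check error handling. Only the real, REST-backed data source would need an integration test against the actual server.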
BDD is actually derived from TDD, so it's not surprising there's a little confusion! BDD is exactly like TDD (or ATDD if you're doing it for a whole system), but without the word "Test". It turns out that can be pretty powerful.
Particularly, it lets developers have conversations with non-technical business people about what the system should do. You can also use it to have conversations about what a class should do, or a module of code should do, even with a technical expert.
So in the example of your REST service, you can imagine that I'm a dev and you're an expert who knows what the REST service should do.
Me: What should it do?
You: It should let me read a record.
Me: Great! Can you give me an example of a record?
You: I have one here...
Me: Is there any context in which someone shouldn't be able to read the record?
You: Sure, if they don't have permissions.
Me: Okay, so I've done Read, let's do Update. Can you give me an example of a typical update?
You: Here you go.
Me: Fantastic, and you want it to respond just with success or fail. Is there any scenario in which it should fail?
You: Sure. The record shows when it was last updated. If someone else has already updated it in the meantime, yours should fail when you submit it.
So you see you can use BDD to explore all kinds of scenarios, including those around a REST service. The trick is to ask, "Can you give me an example?" Then you get a concrete example, which you can then automate if you want to. The conversations help us look for other examples and scenarios which we might have missed.
Don't use BDD tools to automate for a technical audience! BDD tools like Cucumber, JBehave etc. work with real English that's a lot harder to refactor than code. Use JUnit, NUnit etc. if you're just doing something like a REST service. You can put "Given, When, Then" in comments, or make a little DSL.
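As a sketch of "Given, When, Then" in comments, here is the stale-update scenario from the conversation above written as plain Ruby. The `RecordStore` class is hypothetical, and it uses a version counter as a simplified stand-in for the "last updated" timestamp. In a real project the check would live in a Minitest or RSpec test rather than a bare method.

```ruby
# Hypothetical in-memory record store with optimistic locking on a version number.
class RecordStore
  StaleUpdate = Class.new(StandardError)

  def initialize
    @records = {}
  end

  def create(id, body)
    @records[id] = { body: body, version: 1 }
  end

  def read(id)
    @records.fetch(id).dup
  end

  # Fails when the caller's copy is older than the stored one.
  def update(id, body, seen_version)
    current = @records.fetch(id)
    raise StaleUpdate, "record #{id} was updated by someone else" if seen_version != current[:version]
    @records[id] = { body: body, version: current[:version] + 1 }
  end
end

def stale_update_is_rejected?
  # Given a record that two users have both read
  store = RecordStore.new
  store.create(42, 'draft')
  mine = store.read(42)
  theirs = store.read(42)

  # When the other user updates it first
  store.update(42, 'their edit', theirs[:version])

  # Then my update, based on the old version, fails
  store.update(42, 'my edit', mine[:version])
  false
rescue RecordStore::StaleUpdate
  true
end
```

The comments carry the scenario wording, while the code stays ordinary Ruby that is easy to refactor.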
So now you can see that with your REST call failure, if I were coding it, I'd have an example like:
Me: So, this call failure... can you give me an example?
You: Sure, if you access a record that's been deleted it's going to fail.
Me: Give me a typical example of a record that might get deleted?
You: The one we were using before is good.
Me: Okay, is there a situation in which we shouldn't delete a record?
You: Yes, if it's already been published...
You can see that throughout, I'm not really using the word "test". Tests are a nice by-product in BDD. It's used more for exploration and specification of requirements. The conversations in BDD are the most important part of it.
The reason it's tricky to find examples of using BDD for REST is first because REST is deliberately simple and doesn't often have a lot of behaviour, and second because BDD's scenarios aren't generally phrased in terms of their implementation, focusing instead on the value of what the service or system provides ("read a record").
TDD and ATDD are exactly the same, if they're done well. It's just easier to have conversations about examples and scenarios than it is to have them about tests.

How to run a Cucumber Background step once for all Scenarios under the same feature?

In Cucumber, is it possible to run a Background step for the whole feature? So it doesn't get repeated every scenario?
I am running some tests on a search engine, and I need to pre-seed the search engine with test data. Since this data can be quite long to generate and process (I'm using Elasticsearch and I need to build the indices), I'd rather do this background only once, but only for all tests under the same feature.
Is it possible with Cucumber?
Note that I am using MongoDB, so I don't use transactions but truncation, and I believe I have DatabaseCleaner running automatically after each test, that I suppose I'll have to disable (maybe with an #mention?)
Yes I'm using Cucumber with Ruby steps for Rails
EDIT2 : concrete examples
I need to test that my search engine always returns relevant results (e.g. when searching for "buyers" it should return results with "buyer", "buying", "purchase", etc., which has to do with the ES configuration), and that other contextual information gets updated correctly (e.g. in the sidebar I have categories/filters with the number of hits in parentheses, and I must make sure those numbers get refreshed as the user plays with filters).
For this I pre-seed the search engine with a dozen results, and I run all those tests that are based on the same inputs. I often have "example" clauses that just do something slightly different, but based on the same seeding.
Supposing the search data is a meaningful part of the scenario, something that someone reading the feature should know about, I'd put it in a step rather than hide it in a hook. There is no built-in way of doing what you want to do, so you need to make the step idempotent yourself. The simplest way is to use a global.
In features/step_definitions/search_steps.rb:
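$search_data_initialized = false
Given /^there is a foo, a bar and a baz$/ do
  # Seed only on the first call; later scenarios reuse the data.
  unless $search_data_initialized
    # initialize the search data
    $search_data_initialized = true
  end
end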
In features/search.feature:
Feature: Search
  Background:
    Given there is a foo, a bar and a baz
  Scenario: User searches for "foo"
There are a number of approaches for doing this sort of thing:
Make the background task really fast.
Perhaps in your case you could put the search data outside of your application and then symlink it into the app in your background step? This is a preferred approach.
Use a unit test tool.
Consider whether you really get any benefit out of having scenarios to 'test' search. If you don't, use a tool that allows you greater control because your tests are written in a programming language.
Hack cucumber to work in a different way
I'm not going to go into this, because my answer is all about looking at the alternatives
For your particular example of testing search there is one more possibility
Don't test at all
Generally search engines are other people's code that we use. They have thousands of unit tests and tens of thousands of happy customers, so what value do your additional tests bring?

Can I reuse my integration test suite to profile a Rails app?

Most posts on Rails profiling recommend Ruby-Prof. To use Ruby-Prof I need to write at least one new test for each controller action, then manually compare the results to see what's taking the longest and might be a candidate for optimization.
This is good if I already know exactly what request I'm focusing on. It seems less good if I'm trying to identify the hot spots in the first place. Given that I already have a huge integration test suite covering all the app's functionality that I care about, it seems like what I really want to do is:
Run the entire test suite and capture the time spent in each controller action. (Or model method, or whatever level of granularity I want.)
Print two lists, of worst-case and average-case times in each controller action.
Sort each list and start investigating the longest-running controller actions, now using Ruby-prof or other profiling tools to drill down into the call stack. The worst-case times will identify request params that might be problematic (i.e., trigger slow code on the backend), without my having to think of them all when I write the performance test.
Is there some reason people don't use the integration test suite in this way, rather than basically duplicating it with a second performance test suite? I have not seen it suggested. Before I write code to do something like this (presumably with a before_action in ApplicationController), is there already a tool for this?
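The aggregation described in the steps above could be sketched roughly as follows. `ActionTimings` is a hypothetical helper, not an existing tool. It only collects durations and ranks them, and how the durations are captured is left open. In a Rails app one option, besides the before_action idea, would be to feed it from an `ActiveSupport::Notifications` subscription to the `process_action.action_controller` event while the integration suite runs.

```ruby
# Collects durations per controller action and reports worst-case and
# average times. Names here are hypothetical, not an existing tool.
class ActionTimings
  def initialize
    @samples = Hash.new { |hash, key| hash[key] = [] }
  end

  # Store one measured duration for the given action.
  def record(action, milliseconds)
    @samples[action] << milliseconds
  end

  # [[action, worst time], ...], slowest first.
  def worst_case
    ranked { |times| times.max }
  end

  # [[action, mean time], ...], slowest first.
  def average_case
    ranked { |times| times.sum / times.length.to_f }
  end

  private

  # Applies the given summary block to each action's samples and sorts
  # the results in descending order of the summarised time.
  def ranked
    @samples.map { |action, times| [action, yield(times)] }
            .sort_by { |_, time| -time }
  end
end
```

Printing `worst_case` and `average_case` at the end of the test run would give the two sorted lists, after which Ruby-Prof can be pointed at the slowest actions.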
I think that the automated tests will not tell you anything about performance. You need real data. For example, your tests probably won't use indexes, but if you create 10,000 records without an index you may find a performance issue.
"I need to write at least one new test for each controller action"
Why would you performance test each controller action?
In my limited experience, performance testing was done after deploying the app and tested very specific things. I tested a chunk of code that was slow or code that I thought might be slow.
Also, if you use online performance tools it is not necessary to change your code. The online tool runs against an instance of the app that has been deployed.

Input and test data for a SpecFlow scenario

I have started recently using SpecFlow and I have 2 basic questions I need to clarify, also to confirm I am on the right way:
1. As I understand it, all the input data (test parameters for the scenarios) must be provided by the tester, and the same goes for the test data (input data for the tables involved in the test scenarios).
2. Are there any existing tools for quickly generating test data (inserting it into the DB)? I am using Entity Framework as part of the data access layer. I was wondering about some tool that would read the data from a file, or perhaps a desktop application for entering values for the tables' fields (which could then also generate a file from which some other tool could read all the data and generate all the required objects etc).
I also had a look at Preparing data for a SpecFlow scenario - I was thinking if there is already a framework which would achieve insert\delete of test data to use alongside with SpecFlow.
I don't think you are on the right track. SpecFlow is a BDD tool, but in some ways it only covers part of the process. Have a read of and see if any of the scenarios sound familiar?
To move forwards I would recommend you start with to get a good idea of how it all began. Now let's consider your points:
The tester provides all the test data. Well, yes and no. The idea is that between yourself and the feature expert, you are able to have a conversation that provides all the examples that you need to develop your feature. If you don't involve yourself in that conversation, then yes, all the data will come from the other side, but the chances are it won't be of as high quality as when you are able to ask the right questions and guide the conversation so the data follows a structure that you can code tests to.
As an example here, when I first started with BDD I thought I could get the business experts to write the plain text scenario files with less input from the development, but in practice the documents tended to be less useful than when we were involved. Not because they couldn't write decent specifications, but actually because they couldn't refactor them to reuse bindings etc. We were still needed to add our skills into the process.
Why does data go into a database? A good test is isolated to the scope that it is testing. For a UI layer test this means that we don't have a database. For a business tier test we shouldn't be reliant on the database to get data either.
In practice a database is one of the most difficult things to include in your testing because once any part of the data changes you cause cascading test failures.
Instead I would recommend making your features smaller and providing the data for your test in the scenario or binding. This also makes having your conversation easier, because the fiftieth row of a test pack is not something either party is going to remember. ;-) I recommend instead trying to give your data identities, so "bob" might be an individual in a test you can discuss, and both sides understand what makes him an interesting example.
good luck :-)
Update: With regard to using a database during testing, my experience is that there are a lot of complexities that make it a difficult choice to work with. Consider these points,
How will you reset the state of your data between tests?
How will you reset the state if one / some tests fail?
If you are using branches or even just if two developers are making changes at the same time, how will you support multiple test datasets?
How will you handle two instances of the tests running at the same time (don't forget the build server)?
Have a look at this question SpecFlow Integration Testing with Database Patterns which includes some patterns that you can use.

How do you plan your Rails app?

I'm starting a Rails app for a customer and am considering either creating a mind map or jumping straight to a Cucumber specification.
How do you plan your Rails app?
As an additional question, say you also start with Cucumber, at which point would you write Unit tests? Before satisfying the specifications?
I've got a 6 step process.
1. I prefer to work out the model relationships and uses before doing anything. Generally I try to define models as units containing coherent chunks of information. Usually this starts by identifying the orthogonal resources my application will need (Users, Posts, etc). I then figure out what information each of those resources absolutely needs (attributes) and may potentially need (associations), and how that information will likely be operated on (methods). From there I define a set of rules to govern resource consistency (validations).
I usually iterate over my design a few times because the act of defining other models usually makes me rethink ones I've already done. Once I have a model design I like, I will start refactoring or specializing (subclassing) models to clarify the design.
2. I write the migration and make skeletons for my models. I usually won't write tests until I have a first draft of methods and validations implemented. It's not always obvious how to implement things until giving it some moderate thought.
3. Next comes the test suite. It doesn't matter what I use to write the tests, so long as I can be certain the backend is sane.
4. This is when I piece together the control flow. What happens on a successful request? On an unsuccessful request? Which controller actions will link to others? Usually there is a 1-1 mapping between controllers and models (not counting subclasses of models); every so often I'll encounter situations where I need to act on multiple model types, and for that I'll probably create a new controller. Depending on how complex my app is, I may model the flow as a state machine.
5. Next I create the views. I start by sketching out the UI, which is heavily influenced by my models' relationships and attributes. I abstract out the common parts, then write the views.
6. Lastly I polish the UI. I create the CSS, and start to replace links with remote calls, or even just JavaScript when appropriate.
I may interleave steps 2 and 3. I find it's very easy to write a test just after I write the code to be tested. Especially because I'm usually testing things in a console as I write, and half the test is written by pasting from the console.
I may also compartmentalize steps 4 and 5 for each model/controller. At any point I may go back and revise a previous decision, and propagate those changes through my steps.
I start with sketches of the user interface and then progress to HTML mockups. Once the UI design is finalised I can identify the RESTful resources in the application and their relationships.
I don't think writing only Cucumber features as specifications is a good idea. Writing test code without being able to see it pass leads to errors in the tests and increases the time you'll need to correct them later.
So I'd do the following :
Write a mind map, but keep it simple, with only the major ideas of the project.
Start writing tests and coding at the same time (write one test, make it pass, write another, ...).
That way you write your specifications while driving your application, keeping it clean while remaining agile and able to change some ideas in the middle of the project.
