I was trying to gain insight into my own problems when I stumbled upon this question. To my understanding, two different functionalities are being expressed and tested (note the two Givens and the two Whens). Is it right to do so?
There are two ways of writing scenarios (and for that matter class-level examples too).
One is to use a single example per scenario.
Another is to have one aspect of behavior per scenario.
In this scenario, the behavior in the cases of both inactive and active users provides something valuable. Without one or the other, the behavior is meaningless. So putting them in one scenario makes sense. This also provides a pragmatic benefit, in that it often takes time to initialise the context for a scenario.
A good reason to split them might be if a third behavior comes into play (for instance, you have suspended users as well as active and inactive ones).
If you have separate aspects of behavior, it's usually valuable to illustrate them with different examples. For instance:
Given Fred bought a fridge for $100
When Fred returns the fridge
Then he should be refunded $100
And the fridge should be returned to stock.
The two outcomes are quite clearly two different aspects of behavior and involve different stakeholders - the customer and the shop owner - so it would make more sense to split them up.
Given Fred bought a fridge for $100
When Fred returns the fridge
Then he should be refunded $100
Given Fred bought a fridge for $100
When Fred returns the fridge
Then the fridge should be returned to stock.
However, be pragmatic about it. If it's more readable or comprehensible one way than the other, then that should take priority over any hard-and-fast rules. I will say that it took me a while to learn how to do this effectively, so mostly it comes with experience.
I am trying to answer the questions below, given by the business (the business generates revenue from multiple apps through a customer-pay model). The business is interested in:
new users (trend with respect to previous months)
daily active users
Day 1 retention
I came up with the DM below:
Dimension: users, app, deviceid, useractions, plan, date
Fact: fact_activity(userid, appid, deviceid, actionid)
Actions could be: app installed, app launch, registered, completed purchase, postedcomments, playgame, etc.
The questions I have are:
Should the fact table contain action_type instead of actionid (to avoid a join with useractions)?
Given the definition of Day 1 retention (number of apps installed / app launches the next day), how do I avoid counting a single user multiple times when they use multiple devices?
Would it be advisable to have device details in the user dimension, or in a separate dimension?
If I need to measure average session duration, should I use another fact at session level or tweak the activity fact?
Your questions are really unanswerable without significantly more information about your business processes, data definitions, etc. In effect, you are asking someone to design a dimensional model for you before they can answer your questions, which is obviously not going to happen.
However, I can give you some very generic pointers that may help you:
Dimensions
A Dimension describes an entity, so if attributes can't be described as belonging to the same entity then they shouldn't be in the same dimension. In your case, I assume a Device and a User are not the same thing, and therefore they need to be separate dimensions.
Facts
You need to define your measures, i.e. precisely what things you are going to want to aggregate (count, sum, avg, etc.) and how they are defined/calculated.
For each measure, you also need to define its grain, i.e. the minimum set of dimensions that uniquely identifies it. Once you have the grain defined, measures that share the same grain can be held in the same fact table, and measures that don't can't.
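For example, here is a minimal Python sketch (the rows and column values are made up, based on your fact_activity columns) of how to sanity-check a proposed grain: count how many fact rows share each combination of the candidate dimension keys; any combination appearing more than once means the proposed grain does not hold.

from collections import Counter

# Hypothetical fact rows at the proposed grain (date, userid, appid, deviceid, actionid).
fact_activity = [
    ("2024-01-01", "u1", "app1", "d1", "app_launch"),
    ("2024-01-01", "u1", "app1", "d2", "app_launch"),
    ("2024-01-02", "u2", "app1", "d3", "registered"),
]

def grain_violations(rows, key_indexes):
    # Return the key combinations that appear more than once, i.e. the
    # proposed grain does not uniquely identify a fact row.
    counts = Counter(tuple(row[i] for i in key_indexes) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}

# Proposed grain: one row per date, user, app, device and action.
print(grain_violations(fact_activity, key_indexes=[0, 1, 2, 3, 4]))  # an empty dict means the grain holds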
Is there a preferred way of creating BDD scenarios in small agile teams and amongst the community? I'm using Courgette, and it gives an example at https://courgette-testing.com/bdd:
Scenario: Refunded items should be returned to stock
Given a customer previously bought a black sweater from me
And I have three black sweaters in stock.
When they return the black sweater for a refund
Then I should have four black sweaters in stock.
Does this sound like a good idea? Would this work well for communication in teams?
I've used their web steps bit, and am now doing the refactor bit to make it clear to the business.
Any links would help. Thanks
The conversations in BDD are more important than the tools. Rather than starting with the finely-grained specification in Courgette's example, try talking to the business first. Ask them for an example of the kind of behaviour they want.
When you write it down, start by just writing it the way they describe it. It's amazing how few people listen properly! After you've got the example from them, take a look at it. Can you see which bits are the contexts (Givens) and which are the outcomes (Thens)? Which is the step that's associated with triggering the behaviour you're interested in (Whens)?
Once you've worked that out, there are a couple more questions I like to ask:
Is there any other context which, for this same event, gives a different outcome?
Is there any other outcome that's important?
For instance, if I was implementing this behaviour for a big supermarket, I might come across an example like:
"Oh! No, don't add food back to stock. We don't know how it's been stored. We refund it if there's something wrong with it, but we bin it."
You can probably see how that might change your code!
Testers are really great at asking these questions and spotting missing scenarios! This leads us to the "Three Amigos" pattern. I like to include:
A business person, Product Owner, subject matter expert or person with the problem
A tester
A dev (or a pair of devs).
You can also include UI designers, technical writers, etc. - Matt Wynne says it's "Three Amigos where three is a number between 3 and 7".
I really like it when the developer writes the scenarios down, in any form that allows them to get to the "Given, When, Then". Sometimes I'll do it in the meeting; sometimes I do it later and show it or send it to my business person.
Courgette's example is something that typically happens when people don't have these conversations. If you start with the conversations, you're much more likely to get something that matches the above. Not only are those declarative steps easier for business to read and for the whole team to talk about, but they're also easier to maintain, as the detail of how they're achieved is hidden (usually in Step Definitions, and further in Page Objects).
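To make "hidden in Step Definitions and Page Objects" concrete, here is a rough sketch using Python's behave library; the step wording, the StockPage page object and context.browser are hypothetical, not taken from Courgette:

# steps/stock_steps.py -- a sketch; StockPage is a hypothetical page object
# that wraps all the UI detail (selectors, navigation, waits), and
# context.browser is assumed to be set up in environment.py.
from behave import given, then
from pages.stock_page import StockPage  # hypothetical module

@given('I have {count:d} black sweaters in stock')
def step_set_stock(context, count):
    context.stock_page = StockPage(context.browser)
    context.stock_page.set_stock('black sweater', count)

@then('I should have {count:d} black sweaters in stock')
def step_check_stock(context, count):
    assert context.stock_page.stock_level('black sweater') == count

The scenario stays declarative; if the screens change, only the page object needs to change.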
There are all kinds of useful posts for BDD newcomers on my blog if you want to know more!
Given a scenario that tests sending a message to a 3rd-party API, I can add multiple givens and outcomes to a single scenario, for each property of the message. This makes the scenario quite complex.
I can also break these out into separate scenarios. But they really are not different scenarios.
This is a scenario with multiple givens and outcomes:
Scenario 1: An order
Given an order
And that has order ID equal to 42
And that has affiliate reference equal to foo
When the conversion for the order is sent
Then the conversion has an ID equal to 42
And the conversion has an affiliate ID equal to foo
And here I have broken it up into multiple scenarios:
Scenario 1: An order with a specific order ID
Given an order that has order ID equal to 42
When the conversion for the order is sent
Then the conversion has an ID equal to 42
Scenario 2: An order with a specific affiliate reference
Given an order that has affiliate reference equal to foo
When the conversion for the order is sent
Then the conversion has an affiliate ID equal to foo
Try having a conversation with someone in the business about the order. Ask them for an example of the kind of order that has an affiliate reference.
If they naturally talk about an order with a certain ID and affiliate reference, and those two things come together, it's fine to put it in one scenario. You'll probably hear them talk about both things in the same clause, for instance:
Bus: So, when we send the order for conversion, it should have the same ID and
affiliate reference.
MvO: Can you give me an example of those? The ID and affiliate
reference?
Bus: Sure, an ID is a simple integer, so, 42, and the affiliate
reference is the name of our affiliate, so something like 'Foo'.
(By the way, use realistic affiliate names if you can - it makes it easier for the business to spot if you've missed something!)
When we convert this to Gherkin, keeping the language as natural as possible (I wrote a blog post on this), we get something like:
Given an order with ID 42 and affiliate reference "foo"
When we send the order for conversion
Then the conversion should have the same ID and affiliate reference.
If, however, there are some orders which don't have affiliate references, or retaining the affiliate reference is a completely separate capability and the business talk about it separately, you probably want two scenarios.
Note there are some other benefits to talking to your business representatives!
First, they'll probably phrase the "when" in the active voice (we send the order) rather than the passive (the order is sent) which makes it much easier to see who's doing what. This is especially important in scenarios with multiple roles, and helps us think about who or what triggers the outcome. (Here's a blog post about tenses and voices in BDD.)
Second, you get a chance to question them! "Are there any orders which don't have affiliate references? Do all orders have IDs like that, or do you have some old orders with old-style IDs floating around in the system?" And so forth. If you can't think of questions to ask easily, bring a tester with you. Testers are great at thinking of questions to ask. (I wrote a blog post on this, too.)
Third, you're more likely to carry the same language the business use into the code, so it's going to be easier to maintain, and you'll be able to have conversations about it more easily too.
If your business aren't actually interested in conversations around what the API does then don't use Gherkin-based tools for the API tests. You can maintain a little DSL in plain old XUnit much more easily than in English.
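As a rough illustration, a tiny in-code "DSL" in plain pytest-style Python might look like the sketch below; OrderApi and the field names are hypothetical stand-ins for your real client:

# test_conversion.py -- a sketch of a small test DSL instead of Gherkin.
from my_project.api import OrderApi  # hypothetical client

def an_order(order_id=42, affiliate_ref="foo"):
    # Test-data builder with sensible defaults.
    return {"id": order_id, "affiliate_reference": affiliate_ref}

def test_conversion_keeps_id_and_affiliate_reference():
    api = OrderApi()
    order = an_order(order_id=42, affiliate_ref="foo")
    conversion = api.send_for_conversion(order)
    assert conversion["id"] == order["id"]
    assert conversion["affiliate_id"] == order["affiliate_reference"]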
To cover your question more generically: yes, it's fine to have multiple givens and outcomes in a scenario. I generally reckon that once you've got more than seven steps, you want to be splitting it into separate scenarios.
Make sure you have conversations around the scenarios, though, because a lot of these problems go away when you do.
Take two different ways of stating the same behavior.
Option A:
Given a customer has 50 items in their shopping cart
When they check out
Then they will receive a 10% discount on their order
Option B:
Given a customer has a high volume of items in their shopping cart
When they check out
Then they will receive a high volume discount on their order
The former is far more specific. If someone has some question about exactly when a customer gets a high volume discount or how much to give them, reading this scenario makes it very clear. Serving the purposes of documenting the behavior, it's about as specific as it can be, although any change in those values will require changing the scenario.
The second is more generalized and doesn't have the clarity of the first. Automating it would require incorporating the values "50" and "10" in the step implementations. On the other hand, the scenario captures the core business need: a high volume customer gets a discount. If we later decide to use "40" and "15", the scenario doesn't have to change because the core business need hasn't really changed (though the step implementation would). Also, the term "high volume customer" communicates something about why we're giving them the discount.
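For instance, a sketch of Option B's step definitions using Python's behave library (the make_cart helper and the attribute names are hypothetical) shows where the numbers end up:

# steps/discount_steps.py -- a sketch: with Option B, the business numbers
# ("50 items", "10%") live only in the step definitions.
from behave import given, when, then

HIGH_VOLUME_THRESHOLD = 50   # assumed values, invisible to the scenario
HIGH_VOLUME_DISCOUNT = 0.10

@given('a customer has a high volume of items in their shopping cart')
def step_high_volume_cart(context):
    context.cart = make_cart(item_count=HIGH_VOLUME_THRESHOLD)  # hypothetical test helper

@when('they check out')
def step_check_out(context):
    context.order = context.cart.check_out()

@then('they will receive a high volume discount on their order')
def step_check_discount(context):
    assert context.order.discount_rate == HIGH_VOLUME_DISCOUNT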
So, which is better? Rather, under what circumstances should I favor the former or the latter?
I think I'll go for option A.
The thing is that BDD scenarios must serve as documentation of the system.
So if a non-technical person (a business person, a tester, or someone from the customer support team) wants to know how your discount system works, they will surely want to know what it means to have a high volume of items and what discount is applied.
And they would not want to have to dig into the plumbing code to get this information.
I think this information is important and cannot be hidden from the reader.
Another benefit is that it allows a non-developer (a tester, for example) to write new scenarios and check what happens if there is 1 item in the cart, or 100 items.
When you get too abstract about things, it gets harder to apply deliberate discovery.
So with a scenario like Option B, you lose the opportunity to ask yourself questions such as:
What happens if we have more than 50 items, say 100? Is there another discount available?
What happens if we have just 1 item? Presumably we should not apply a discount, or should the discount be based on the total price of the cart rather than the number of items in it? Shouldn't someone buying a single really expensive item benefit from a discount too?
Is 10% the only available type of discount? Do we have, for example, fixed-amount discounts? Do we have more complex discount strategies?
When the business variables are visible, you can play around with them and notice things you may have forgotten, or think about new and possibly interesting features.
As a general rule, I'd hide whatever does not matter to the reader of a scenario, and in this case the number of items and the applied discount value really do matter.
We run an affiliate program. Users who sign up can gain points when they successfully recruit other users. However, spammers are abusing this program, and automatically signing up large numbers of accounts. We want to prevent this from happening by closing down clearly machine-generated accounts. My idea for this is to write a program to identify machine-generated account names, or at least select a subset for manual inspection.
So far, we have found that there are two types of abnormal ids:
The first is that some ids look very similar to others, such as:
wss12345
wss12346
wss12347
test1
test2
...
The second is that some ids look randomly generated, without any pattern, such as:
MiDjiSxxiDekiE
NiMjKhJixLy
DAFDAB7643
...
For the first type, I use the Levenshtein (edit) distance. This method can find some of the ids of the kind illustrated in type 1. (I have done this, and it performs well.)
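For reference, a minimal pure-Python sketch of this idea (the edit-distance threshold of 2 is just for illustration):

def levenshtein(a, b):
    # Classic dynamic-programming edit distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

ids = ["wss12345", "wss12346", "wss12347", "test1", "test2", "alice_w"]

# Flag any id whose nearest neighbour is within edit distance 2 (assumed threshold).
suspicious = {a for a in ids for b in ids if a != b and levenshtein(a, b) <= 2}
print(suspicious)  # alice_w should not appear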
For the second type, I can calculate the probability of an id, like this:
id = "DAFDAB7643"
p(id) = p(D)*p(A|D)*p(F|A)*p(D|F)*...*p(3|4)
So I can use the probability to filter out the abnormal ids. (Just an idea; I haven't tried it out.)
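This is effectively a character-level bigram (Markov chain) model: estimate the transition probabilities from ids believed to be human-chosen, then flag ids whose average log-probability is unusually low. A minimal sketch, with made-up training ids and no tuned cut-off:

import math
from collections import defaultdict

def train_bigram_model(known_good_ids):
    # Estimate p(next_char | current_char) from human-chosen ids,
    # with add-one smoothing so unseen transitions are not impossible.
    counts = defaultdict(lambda: defaultdict(int))
    for uid in known_good_ids:
        uid = uid.lower()
        for a, b in zip(uid, uid[1:]):
            counts[a][b] += 1
    alphabet = {c for uid in known_good_ids for c in uid.lower()}
    model = {}
    for a in alphabet:
        total = sum(counts[a].values()) + len(alphabet)
        model[a] = {b: (counts[a][b] + 1) / total for b in alphabet}
    return model, alphabet

def avg_log_prob(uid, model, alphabet):
    # Average log p(b|a) over the id's character transitions.
    uid = uid.lower()
    floor = 1.0 / (len(alphabet) + 1)   # assumed fallback for unseen characters
    logp = sum(math.log(model.get(a, {}).get(b, floor)) for a, b in zip(uid, uid[1:]))
    return logp / max(len(uid) - 1, 1)

good = ["johnsmith", "mary_jones", "test1", "soccerfan", "bookworm88"]   # made-up training ids
model, alphabet = train_bigram_model(good)

for uid in ["MiDjiSxxiDekiE", "NiMjKhJixLy", "bookworm99"]:
    print(uid, round(avg_log_prob(uid, model, alphabet), 2))
# Ids scoring much lower than your normal users are candidates for review;
# the cut-off is something to tune against labelled examples.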
Can anyone give me other suggestions about this topic? How else could I approach this problem? Can you see flaws or omissions in my attempts?
1. Assuming that these new accounts refer back to the recruiter's ID, I'd look at the rate and/or sheer number of new accounts associated with a given recruiter.
2. Some analysis of IP addresses or similar may also indicate whether multiple users are coming from the same computer.
3. I'd use a dictionary of words and, in effect, do the reverse of detecting poor passwords: human user names should contain dictionary words or personal names, lack punctuation, not include repeated characters, be mostly lower case, etc.
4. Sort of going back to 1. above: if a recruiter has an anomalously tight cluster of IDs, using the features you've already identified, that would be a good flag. I think this is essentially #larsmans' comment directly under the question.
I'd be curious to know if re-purposing password checking algorithms (item 3) provides any benefit.
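To illustrate point 3, here is a sketch of what "the reverse of poor-password detection" could look like in Python; the word list and the thresholds are placeholders to be tuned on real data:

import re

# A stand-in word list; in practice use a real dictionary and name list.
COMMON_WORDS = {"test", "john", "mary", "smith", "soccer", "book", "worm", "fan"}

def looks_human(user_id):
    # Heuristic: human-chosen ids tend to contain dictionary words or names,
    # be mostly lower case, and avoid long digit runs and mixed-case noise.
    lowered = user_id.lower()
    has_word = any(w in lowered for w in COMMON_WORDS)
    mostly_lower = sum(c.islower() for c in user_id) >= 0.6 * len(user_id)   # assumed threshold
    long_digit_run = re.search(r"\d{4,}", user_id) is not None
    case_flips = sum(a.islower() != b.islower()
                     for a, b in zip(user_id, user_id[1:])
                     if a.isalpha() and b.isalpha())
    return has_word and mostly_lower and not long_digit_run and case_flips <= 3

for uid in ["bookworm88", "MiDjiSxxiDekiE", "DAFDAB7643", "wss12345"]:
    print(uid, looks_human(uid))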
You're not telling us what sort of site you are running, so this is a bit on the speculative side; but consider Stack Overflow as a prime example of successfully promoting good behavior through the use of a user reputation system, and weeding out many kinds of unwanted behaviors.
A quick, hackish fix might be to progressively deduct from the score as the number of dormant recruit accounts grows, but a more rewarding and compelling fix is to award higher reputation scores for actually contributing to the site's content. However, this depends on the type of site you have; a stock-market tips site, say, obviously works quite differently from a technical discussion forum.