External data source with SpecFlow

I find entering data in a SpecFlow feature file very painful, especially when the data is large and repetitive. Can we enter this data in an external data source such as a spreadsheet and then use that data source from the feature file?

It's theoretically possible, but probably so much effort that you wouldn't want to do it.
The problem is that the feature file is simply a human-readable form. When it is saved in Visual Studio, it is parsed and converted into the feature.cs file, and that is the file that is compiled and used for testing.
So your process would become:
edit spreadsheet
export to feature file
get SpecFlow's VS plugin to convert to feature.cs
run msbuild
run tests via NUnit or similar
I wouldn't do this. Instead I'd focus on making my tests better examples. It sounds like you are trying to exhaustively cover every possibility. Don't come up with an example for every possible case; instead, cover as much logic as possible with fewer tests.
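For anyone who still wants to try the "export to feature file" step, here is a minimal Python sketch of turning spreadsheet rows (saved as CSV) into a Gherkin Examples table that can be pasted under a Scenario Outline. The function name and the sample columns are made up for illustration, and it assumes a rectangular CSV whose cells contain no pipe characters.

```python
import csv
import io


def csv_to_examples_table(csv_text: str, indent: str = "    ") -> str:
    """Render CSV text as a Gherkin 'Examples:' table with aligned columns."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if not rows:
        raise ValueError("CSV input contains no rows")
    if any(len(row) != len(rows[0]) for row in rows):
        raise ValueError("every CSV row must have the same number of cells")
    # Column width = longest cell in that column, so the pipes line up.
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]
    lines = [f"{indent}Examples:"]
    for row in rows:
        cells = " | ".join(cell.ljust(width) for cell, width in zip(row, widths))
        lines.append(f"{indent}  | {cells} |")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical spreadsheet content: first row is the header.
    sample = "username,amount\nbob,10\nalice,250\n"
    print(csv_to_examples_table(sample))
```

The generated text still has to be saved into the feature file and regenerated into feature.cs, so the extra steps listed above remain.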

Related

Should I use a framework or a self-made script for machine learning workflow automation?

For a personal project I am trying to automate the workflow of my machine learning model, and I have some questions about how a professional would approach it.
At the moment I am doing the following tasks manually:
From the raw data, I extract the data that interests me into a directory using third-party software (to which I pass the extraction parameters as arguments).
Then I run another program, or in some cases one or more of my own Python scripts, to pre-process the data, which is stored in a new directory.
Finally, I feed the processed data to one of my models, which returns the labeled data that I store in a last directory.
[Figure: process diagram of the steps described above]
The steps (extract, pre-process and model) are always executed in the same order, but I change the scripts, software parameters or model according to my needs or the comparison I want to make.
All my scripts are stored in an organized script directory, and the third-party software is called on the command line from a Python script.
My goal is to have a script or program that runs the whole loop by itself. As input it would take the raw data (or the directory where it is stored) and the parameters needed to run the loop with the desired modules (and their parameters).
The number of module and parameter combinations is so large that I can't write a script for each one, which is why I want to build something very modular.
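To make "very modular" concrete, here is a minimal sketch of the kind of loop described: interchangeable steps registered by name and chained from a configuration list. Every step name and parameter is hypothetical, and the toy list operations stand in for the real extraction software, pre-processing scripts and model. It is not how Kedro or any other framework does it.

```python
from typing import Callable, Dict, List, Tuple

# Registry of interchangeable steps: name -> function(data, **params) -> data.
STEPS: Dict[str, Callable] = {}


def step(name: str):
    """Decorator that registers a function under a step name."""
    def register(func: Callable) -> Callable:
        STEPS[name] = func
        return func
    return register


@step("extract_even")  # hypothetical extraction module
def extract_even(data, **params):
    return [x for x in data if x % 2 == 0]


@step("scale")  # hypothetical pre-processing module
def scale(data, factor=1, **params):
    return [x * factor for x in data]


@step("threshold_model")  # hypothetical model that labels each value
def threshold_model(data, cutoff=0, **params):
    return [(x, x >= cutoff) for x in data]


def run_pipeline(data, config: List[Tuple[str, dict]]):
    """Run the configured steps in order, passing each output to the next."""
    for name, params in config:
        data = STEPS[name](data, **params)
    return data


if __name__ == "__main__":
    config = [
        ("extract_even", {}),
        ("scale", {"factor": 10}),
        ("threshold_model", {"cutoff": 30}),
    ]
    print(run_pipeline([1, 2, 3, 4], config))
```

Swapping a module or a parameter then means editing the configuration list rather than writing a new script.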
I can write my own script, but I would like a more professional approach, as if I had to implement it for a company.
My questions: in my case (customizable, interchangeable modules), would it be more appropriate to use a framework (e.g. Kedro or any other) or to build it myself (because my needs are too specific)? If a framework is appropriate, which one should I choose, and why?
I've been researching existing frameworks, but besides not being sure whether they fit my needs, there are so many that I'd like to spend my time on one that could also help me in future projects or professional work.
Thank you.

Multiple feature file launching with single browser instance in Specflow

When I run my test solution, a single browser is launched, but two feature files run in it simultaneously, which makes the test cases fail: one step is taken from one feature file and the next from the other.
Contrary to the comment left on your question, I think I may have enough context to answer you.
You describe feature files that are sharing steps and concerns about multiple browser instances. This tells me that your various step files might each contain a browser instance.
What you're likely looking to do instead is to use a SpecFlow Context -- SpecFlow provides its own ScenarioContext object you can use, or you can create your own context and inject it.
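To illustrate the idea only: SpecFlow itself is .NET, so the Python sketch below is an analogy and not SpecFlow's API. All class names are hypothetical, and FakeBrowser stands in for a real Selenium driver. The point is that each scenario creates one context that owns the browser, and every step class receives that context instead of creating its own browser.

```python
class FakeBrowser:
    """Stand-in for a real browser driver; counts how many were opened."""
    opened = 0

    def __init__(self):
        FakeBrowser.opened += 1
        self.visited = []

    def goto(self, url):
        self.visited.append(url)


class ScenarioContext:
    """Hypothetical context object: owns the single browser for one scenario."""

    def __init__(self):
        self.browser = FakeBrowser()


class LoginSteps:
    def __init__(self, context):
        self.context = context  # injected, not created here

    def open_login_page(self):
        self.context.browser.goto("/login")


class SearchSteps:
    def __init__(self, context):
        self.context = context  # same instance as in LoginSteps

    def open_search_page(self):
        self.context.browser.goto("/search")


def run_scenario():
    """Both step classes share one context, so only one browser is opened."""
    context = ScenarioContext()
    LoginSteps(context).open_login_page()
    SearchSteps(context).open_search_page()
    return context
```

In SpecFlow the equivalent wiring is done by its context injection, which the first link below explains.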
Some links that might help:
SpecFlow docs on sharing data between bindings, which explains about ScenarioContext and FeatureContext:
https://docs.specflow.org/projects/specflow/en/latest/Bindings/Sharing-Data-between-Bindings.html
Here's an article on using SpecFlow with Selenium and the Page Object Model: https://docs.specflow.org/projects/specflow/en/latest/ui-automation/Selenium-with-Page-Object-Pattern.html
The SpecFlow YouTube channel will likely be helpful as it's full of experts walking through these sorts of examples: https://www.youtube.com/c/SpecFlowBDD/videos
Here's the first video in a 5 part series on how to automate a web application with Selenium and SpecFlow: https://www.youtube.com/watch?v=y1dAogvWVh8
Based on your question, it's also possible that your issue could be that you want things to run in parallel, but have the problem of your tests being dependent on one another or running in a certain order. This will be a bit more complex to solve.
I strongly suggest you treat your tests so that they can be run in isolation. You may need to add separate data to a database, or operate your tests so that they're not touching the same thing. This takes more work, but is more than worth it, because it will enable better maintainability and reliability of your tests and also ensure they can run in parallel successfully.
I hope this helps!

How to use neo4j effectively for serious, repeatable analysis over time

I'm new to Neo4j and love the browser for exploratory work, but I'm unsure how best to use it for, for lack of a better term, real work. Consider a sample project involving:
Importing 4 different CSV files
Creating appropriate relationships between nodes
Doing a variety of complex queries to derive data that I'll export for statistical analysis using another program.
I need to be able to replicate the project in the future, as well as add new data, calculate different derived data, etc. I also need to be able to share the code so others can extend or verify it.
For non-relational data, I'd use something like R, Stata or SAS. While each allows interactive exploration like the Neo4j browser, I'd never use that for serious analysis. Instead, I'd save a file or files of commands that I could modify and rerun whenever I needed to.
Neo4j's browser doesn't seem to support any of this functionality. Unless I am missing something, it doesn't even allow one to save a "session" along the lines of an IPython/Jupyter notebook. I know that there is a neo4j-shell, but especially since they have dropped it from the standard desktop installation (and gotten rid of the console), I feel like I must be doing something wrong--or at least contrary to the designers' intent--if I can't do serious work in the browser. Clearly, lots of people are.
Can anyone point me in the right direction? How does one best develop an extensive, replicable project over time with neo4j? Thank you.
You can take your pick of several officially supported language drivers to integrate Neo4j into basically any other project structure, including Jupyter notebooks. I'm not sure what exactly you mean by "serious work", or where you got the idea that people did lots of it in the browser, but you are definitely able to save the results of a query from the browser in a variety of formats (pictures of the bubbles, result rows in a CSV, JSON response) if you prefer to work that way, or you can pipe data very efficiently into another language and manage it there. I don't see why they would re-create presentation and/or project management tools when there are already so many good ones out there.
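As a rough sketch of the "file of commands you can rerun" workflow using the official Python driver: keep the Cypher statements in a text file under version control and replay them from a script. The file name analysis.cypher, the connection URL and the credentials are placeholders, and splitting on ";" is a simplification that breaks if a string literal contains a semicolon.

```python
def split_statements(script_text: str) -> list:
    """Split a Cypher script on ';', dropping blank parts and // comment lines."""
    kept = [
        line for line in script_text.splitlines()
        if not line.strip().startswith("//")
    ]
    return [part.strip() for part in "\n".join(kept).split(";") if part.strip()]


def run_script(session, script_text: str) -> list:
    """Run each statement in order; return the rows of the last one as dicts."""
    rows = []
    for statement in split_statements(script_text):
        rows = [record.data() for record in session.run(statement)]
    return rows


def main():
    # Requires the official driver (pip install neo4j) and a running server.
    from neo4j import GraphDatabase

    # Placeholder URL and credentials: replace with your own.
    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
    try:
        with driver.session() as session, open("analysis.cypher") as script:
            for row in run_script(session, script.read()):
                print(row)
    finally:
        driver.close()
```

Calling main() replays the whole file, so the CSV imports, relationship creation and export queries can be shared and rerun like a Stata or R script.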

Input and test data for a SpecFlow scenario

I have recently started using SpecFlow and have two basic questions to clarify, and to confirm I am on the right track:
As I understand it, all the input data (test parameters for the scenarios) must be provided by the tester, and the same goes for the test data (input data for the tables involved in the test scenarios).
Are there any existing tools for quickly generating test data (inserting it into the DB)? I am using Entity Framework as part of the data access layer. I was wondering about a tool that would read the data from a file, or perhaps a desktop application for providing values for the tables' fields (which could then generate a file from which another tool could read all the data and generate the required objects, etc.).
I also had a look at Preparing data for a SpecFlow scenario. I was wondering whether there is already a framework that handles inserting and deleting test data for use alongside SpecFlow.
I don't think you are on the right track. SpecFlow is a BDD tool, but in some ways it only covers part of the process. Have a read of http://lizkeogh.com/2013/07/01/behavior-driven-development-shallow-and-deep/ and see if any of the scenarios sound familiar.
To move forwards I would recommend you start with http://dannorth.net/introducing-bdd/ to get a good idea of how it all began. Now let's consider your points:
The tester provides all the test data. Well, yes and no. The idea is that between yourself and the feature expert, you are able to have a conversation that provides all the examples you need to develop your feature. If you don't involve yourself in that conversation, then yes, all the data will come from the other side, but the chances are it won't be of such high quality as when you are able to ask the right questions and guide the conversation so the data follows a structure that you can code tests to.
As an example, when I first started with BDD I thought I could get the business experts to write the plain text scenario files with less input from the developers, but in practice the documents tended to be less useful than when we were involved. Not because they couldn't write decent specifications, but because they couldn't refactor them to reuse bindings etc. We were still needed to add our skills to the process.
Why does data go into a database? A good test is isolated to the scope that it is testing. For a UI layer test this means that we don't have a database. For a business tier test we shouldn't be reliant on the database to get data either.
In practice a database is one of the most difficult things to include in your testing because once any part of the data changes you cause cascading test failures.
Instead I would recommend making your features smaller and providing the data for your test in the scenario or binding. This also makes having your conversation easier, because the fiftieth row of a test pack is not something either party is going to remember. ;-) I recommend instead trying to give your data identities, so "bob" might be an individual in a test you can discuss, and both sides understand what makes him an interesting example.
Good luck :-)
Update: With regard to using a database during testing, my experience is that there are a lot of complexities that make it a difficult choice to work with. Consider these points:
How will you reset the state of your data between tests?
How will you reset the state if one / some tests fail?
If you are using branches or even just if two developers are making changes at the same time, how will you support multiple test datasets?
How will you handle two instances of the tests running at the same time (don't forget the build server)?
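If a database really has to be involved, one common way to answer the first two and the last of these questions is to give every test its own throwaway database. The sketch below uses Python's unittest with an in-memory SQLite database purely as an illustration; the table, the "bob" row and the test names are made up, and an Entity Framework project would need its own equivalent.

```python
import sqlite3
import unittest


def create_schema(conn):
    conn.execute("CREATE TABLE customers (name TEXT PRIMARY KEY, balance INTEGER)")


def balance_of(conn, name):
    row = conn.execute("SELECT balance FROM customers WHERE name = ?", (name,)).fetchone()
    return row[0]


class CustomerTests(unittest.TestCase):
    def setUp(self):
        # A fresh in-memory database per test: nothing leaks between tests,
        # even when an earlier test failed or another run is in progress.
        self.conn = sqlite3.connect(":memory:")
        create_schema(self.conn)
        self.conn.execute("INSERT INTO customers VALUES ('bob', 100)")

    def tearDown(self):
        self.conn.close()

    def test_withdrawal_reduces_balance(self):
        self.conn.execute("UPDATE customers SET balance = balance - 30 WHERE name = 'bob'")
        self.assertEqual(balance_of(self.conn, "bob"), 70)

    def test_bob_starts_with_seeded_balance(self):
        # Passes regardless of test order because setUp reseeds the data.
        self.assertEqual(balance_of(self.conn, "bob"), 100)
```

Each test names the one row it cares about, which also fits the advice above about giving data an identity.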
Have a look at this question, SpecFlow Integration Testing with Database Patterns, which includes some patterns that you can use.

Formatting organizing and filtering data from text files

I'm looking to go through a bunch of text files in a bunch of folders. I'd like to go through each file line by line and do some basic statistics, like grabbing timestamps and counting repeated values. Is there any tool or scripting solution that someone could recommend for doing this?
Another possibility is to have a script or tool that could just parse these files and add them to a database like SQLite or Access, for easy filtering.
So far I tried using AIR, but it looks like there might be too much data for it to process, and it hangs, but that could be because of some inefficient filtering.
I have used QuickMacros for things like this. It can do just about anything to a text file (some illegal in 7 states) as well as connect to databases and perform SQL tasks like creating and modifying tables.
I routinely used it to extract data, parse it, and then load it into another database. Especially useful with Scheduled Tasks.
Here's the website
I recommend Perl and CPAN
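The same job is also a short script in Python. The sketch below walks a folder tree, reads each .txt file line by line, counts repeated values and loads the rows into SQLite for filtering. It assumes every line of interest starts with a timestamp like "2024-01-31 12:00:05" followed by a value; the real format isn't given in the question, so the regular expression, table name and function names are placeholders to adapt.

```python
import os
import re
import sqlite3
from collections import Counter

# Assumption: lines look like "2024-01-31 12:00:05 some value".
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(.*)$")


def iter_lines(root):
    """Yield (path, line) for every .txt file under root, one line at a time."""
    for folder, _dirs, files in os.walk(root):
        for name in sorted(files):
            if name.endswith(".txt"):
                path = os.path.join(folder, name)
                with open(path, encoding="utf-8", errors="replace") as handle:
                    for line in handle:
                        yield path, line.rstrip("\n")


def load_into_sqlite(root, conn):
    """Store every timestamped line in a table; return a Counter of the values."""
    conn.execute("CREATE TABLE IF NOT EXISTS entries (path TEXT, stamp TEXT, value TEXT)")
    counts = Counter()
    for path, line in iter_lines(root):
        match = TIMESTAMP.match(line)
        if match:  # lines without a leading timestamp are skipped
            stamp, value = match.groups()
            conn.execute("INSERT INTO entries VALUES (?, ?, ?)", (path, stamp, value))
            counts[value] += 1
    conn.commit()
    return counts
```

Because files are read one line at a time, memory use stays flat however large the files are, and once the rows are in SQLite the filtering can be done with ordinary SQL queries.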
