KSQL stream-table join unit test - ksqldb

I am looking to test my KSQL scripts, which involve a stream-table join based on an input topic. What options are available here?

There is a feature request for building a testing framework for KSQL: https://github.com/confluentinc/ksql/issues/2103
You may want to chime in with your preference there since it is under active development at the moment.
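In the meantime it can help to keep the join under test in a standalone script that a future test harness (or a manual run against a dev cluster) can execute, together with sample input records for each topic and the expected output records. A minimal sketch of the kind of stream-table join the question describes, using hypothetical topic and column names:

-- Hypothetical topics and columns, shown only to illustrate the join under test.
-- Lookup table backed by a compacted topic, keyed by userid.
CREATE TABLE users (userid VARCHAR, region VARCHAR)
  WITH (KAFKA_TOPIC='users', VALUE_FORMAT='JSON', KEY='userid');

-- Event stream fed by the input topic.
CREATE STREAM pageviews (userid VARCHAR, pageid VARCHAR)
  WITH (KAFKA_TOPIC='pageviews', VALUE_FORMAT='JSON');

-- Stream-table join: each pageview is enriched with the user's region.
CREATE STREAM pageviews_enriched AS
  SELECT pv.userid, pv.pageid, u.region
  FROM pageviews pv
  LEFT JOIN users u ON pv.userid = u.userid;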

Related

How do I connect an iOS app to Google Cloud SQL?

I had been building my database using Cloud Firestore because this was the easiest to implement. However, the querying capabilities of Firestore are insufficient for what I want to build, mostly due to the fact it can't handle querying inequalities on multiple fields. I need a SQL database.
I have an instance of Google Cloud SQL set up. The integration is far harder than Firebase, where you just need to add a CocoaPods pod. From my research it looks like I need to set up a Cloud SQL proxy, although if there is a simpler way of connecting, I'd be glad to hear about it.
Essentially, I need a way for a client on iOS to read and write to a SQL database. Cloud SQL seemed like the best, most scalable option (though I'd be open to hearing about alternatives that are easy to implement).
You probably don't want to configure your application to rely on connecting directly to an SQL database. Firestore is a highly scalable database that can handle thousands of connections - MySQL and Postgres do not scale as cleanly.
Instead, you should consider constructing a simple front-end service that can be used to query the database and return formatted results. There are a variety of benefits to structuring it this way, including being able to further optimize or distribute your queries. Google App Engine and Google Cloud Functions can both be used to stand up such a service quickly, and both provide easy connection options to Cloud SQL.
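As a rough sketch of that approach (the project, instance, schema, and credential names below are placeholders, not values from your setup), an HTTPS Cloud Function written against the firebase-functions Node.js runtime can own the Cloud SQL connection and expose only the queries the iOS app needs:

// Placeholders throughout; assumes the 'firebase-functions' and 'mysql' npm packages
// and a Cloud SQL (MySQL) instance linked to the project.
import * as functions from 'firebase-functions';
import * as mysql from 'mysql';

const pool = mysql.createPool({
  // On Cloud Functions, Cloud SQL is reached through a unix socket rather than TCP.
  socketPath: '/cloudsql/my-project:us-central1:my-instance', // hypothetical connection name
  user: 'api_user',
  password: process.env.DB_PASSWORD,
  database: 'appdb',
});

// The iOS client calls this endpoint over HTTPS instead of talking to the database directly.
export const playersAbove = functions.https.onRequest((req, res) => {
  const minScore = Number(req.query.minScore) || 0;
  pool.query(
    'SELECT name, score FROM players WHERE score >= ? ORDER BY score DESC',
    [minScore],
    (err, rows) => {
      if (err) {
        res.status(500).send('query failed');
        return;
      }
      res.json(rows);
    },
  );
});

The same shape works on App Engine or Cloud Run; the point is that the connection details and the SQL stay server-side, and the iOS app only ever sees an HTTPS endpoint.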
I've found that querying with Firestore is best designed around your front-end needs. Using nested subcollections, the ref property, or document/collection ID relationships can get you most of what you need on the front end.
You could also use Firebase Functions, written in most of the major languages, to perform stateless transactions against Cloud SQL, Spanner or any other GCP database instance.
Alternatively, you could deploy container images to Google Container Registry and deploy them to Kubernetes Engine, Compute Engine or Cloud Run, each of which has its own trade-offs and advantages.
One advantage of using Firestore is that you can easily tie users to Authentication ({uid}), use rules to protect the backend, use custom claims for role-based permissions on the front end, and get access to real-time streams as observables with extremely low latency.

Neo4J end user interface

I need to share a Neo4J graph visualization with end users. They should be able to interact with the graph, and perform some very basic querying. For example:
- show me the relationships up to 3 hops away from node named 'Joe'
A first option would be to just give them the standard user interface (usually exposed at port 7474); however, this is too powerful, as they could run anything in Cypher.
Is there any way of restricting this interface (so that they cannot trigger expensive queries or even graph updates)? Or maybe other open source / community alternatives?
Thanks
If you are using the Enterprise Edition of Neo4j, you will have access to extensive authentication and authorization capabilities, including the ability to assign a reader role to specific user names.
If you do want to use the standard browser interface, you can apply some settings in the neo4j.conf file that may help you out:
dbms.transaction.timeout=10s
dbms.read_only=true
dbms.transaction.timeout will terminate queries exceeding the timeout, so that can prevent expensive queries.
dbms.read_only makes the entire db instance read-only.
You may also build a custom web UI that calls the REST endpoint (authentication goes in the request headers), or create an unmanaged extension:
https://neo4j.com/docs/java-reference/3.1/#server-unmanaged-extensions
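For the custom-UI route, the transactional Cypher endpoint lets you keep the query fixed on the server side while the end user only supplies a value, which addresses the "too powerful" concern. A rough sketch in TypeScript (Neo4j 3.x URL; the credentials, node properties and limits are placeholders):

// Assumes Node 18+ (global fetch) or a fetch polyfill; all names below are placeholders.
const NEO4J_URL = 'http://localhost:7474/db/data/transaction/commit';

async function neighborhood(name: string, maxHops: number = 3) {
  const response = await fetch(NEO4J_URL, {
    method: 'POST',
    headers: {
      // Basic auth goes in the headers, as noted above; use a read-only account.
      Authorization: 'Basic ' + Buffer.from('reader_user:reader_password').toString('base64'),
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      statements: [
        {
          // Fixed, parameterized Cypher: end users choose the name, never the query.
          statement: `MATCH p = (n {name: $name})-[*1..${maxHops}]-() RETURN p LIMIT 100`,
          parameters: { name },
        },
      ],
    }),
  });
  return response.json();
}

Calling neighborhood('Joe') then covers the example query from the question without ever exposing Cypher to the end user.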
I suggest chapter 8 of the excellent book Learning Neo4j, by Rik Van Bruggen. The book is available for download on the Neo4j web site.
One of the sections of that chapter covers some open source visualization libraries and visualization solutions.
EDIT 1:
Looking a bit further into chapter 8 of the Learning Neo4j book, I believe a promising tool for your use case is the paid solution Linkurio.us (you can run a demo on their site). It has native integration with Neo4j and other graph databases.
EDIT 2:
Alternatively, you can build your own visualization with a graph visualization library in JavaScript, for example. Here is a very useful answer from another Stack Overflow question that lists some more libraries that can help you.

Database Testing in Rails

I'm using Rails 4 and the testing framework with which it ships.
I'm using a relational database that needs to be rigorously tested for internal consistency. In this case, it's a sports statistics database with player stats that are updated nightly.
I need to run tests like the following when new data arrives each night:
- That the sum of player stats equals the team's.
- That the sum of team stats equals the league's.
- That the sum of wins and losses equals games played.
- etc.
For now I'm copying my development database over to testing and running these alongside my unit tests in the /test/models/ directory.
This is an awkward set-up, as my database testing code isn't made up of unit tests in the proper sense, and it doesn't rely on fixtures, which is what the Rails documentation suggests this folder be used for.
My question is: In Rails, what is the best practice for database testing like that which I describe above?
This isn't really testing in the classical sense, so it doesn't really make sense to include this with the unit tests that you have. There are at least 2 good solutions:
Compute things like team and league stats on the fly. It doesn't sound like that's what you have going, though.
When you are adding new data to the database, check for consistency then. If one record/value breaks the internal consistency, don't add it to the database.
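For the second option, those checks can live as ordinary ActiveRecord validations so that an inconsistent nightly import is rejected rather than saved. A rough sketch with made-up model and column names:

# Hypothetical models and columns, for illustration only; adapt to your real schema.
class TeamStat < ActiveRecord::Base
  has_many :player_stats

  validate :player_points_sum_to_team_points
  validate :wins_and_losses_equal_games_played

  private

  def player_points_sum_to_team_points
    if player_stats.sum(:points) != points
      errors.add(:points, "do not equal the sum of the players' points")
    end
  end

  def wins_and_losses_equal_games_played
    if wins + losses != games_played
      errors.add(:games_played, "does not equal wins + losses")
    end
  end
end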

Input and test data for a SpecFlow scenario

I have recently started using SpecFlow and I have 2 basic questions I need to clarify, also to confirm I am on the right track:
As I understand it, all the input data (test parameters for the scenarios) must be provided by the tester, and the same goes for the test data (input data for the tables involved in the test scenarios).
Are there any existing tools for quickly generating test data (and inserting it into the DB)? I am using Entity Framework as part of the data access layer. I was wondering about some tool that would read the data from a file, or perhaps a desktop application for providing values for the tables' fields (which could then generate a file from which some other tool could read all the data and create all the required objects, etc.).
I also had a look at Preparing data for a SpecFlow scenario - I was wondering whether there is already a framework for inserting/deleting test data that could be used alongside SpecFlow.
I don't think you are on the right track. SpecFlow is a BDD tool, but in some ways it only covers part of the process. Have a read of http://lizkeogh.com/2013/07/01/behavior-driven-development-shallow-and-deep/ and see if any of the scenarios sound familiar.
To move forwards I would recommend you start with http://dannorth.net/introducing-bdd/ to get a good idea of how it all began. Now let's consider your points:
The tester provides all the test data. Well, yes and no. The idea is that between yourself and the feature expert, you are able to have a conversation that provides all the examples you need to develop your feature. If you don't involve yourself in that conversation, then yes, all the data will come from the other side, but the chances are it won't be of such high quality as if you are able to ask the right questions and guide the conversation so the data follows a structure that you can code tests to.
As an example here, when I first started with BDD I thought I could get the business experts to write the plain-text scenario files with little input from the developers, but in practice the documents tended to be less useful than when we were involved. Not because they couldn't write decent specifications, but because they couldn't refactor them to reuse bindings etc. Our skills were still needed in the process.
Why does data go into a database? A good test is isolated to the scope that it is testing. For a UI-layer test this means that we don't have a database. For a business-tier test we shouldn't be reliant on the database to get data either.
In practice a database is one of the most difficult things to include in your testing, because once any part of the data changes you cause cascading test failures.
Instead I would recommend making your features smaller and providing the data for your test in the scenario or binding. This also makes having your conversation easier, because the fiftieth row of a test pack is not something either party is going to remember. ;-) I recommend instead trying to give your data identities, so "bob" might be an individual in a test you can discuss, and both sides understand what makes him an interesting example.
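For instance, a scenario can carry its own small, named data set instead of pointing at rows pre-loaded in a database (the feature, fields and numbers below are invented purely for illustration):

Feature: Account withdrawals

  Scenario: bob cannot overdraw his account
    Given the following account holders
      | Name | Balance |
      | bob  | 100     |
    When bob withdraws 150
    Then the withdrawal is refused
    And bob's balance is still 100

The SpecFlow bindings behind these steps can then create "bob" in memory (or insert and clean up a single row) rather than depending on a shared test database.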
good luck :-)
Update: With regard to using a database during testing, my experience is that there are a lot of complexities that make it a difficult choice to work with. Consider these points:
How will you reset the state of your data between tests?
How will you reset the state if one / some tests fail?
If you are using branches or even just if two developers are making changes at the same time, how will you support multiple test datasets?
How will you handle two instances of the tests running at the same time (don't forget the build server)?
Have a look at this question SpecFlow Integration Testing with Database Patterns which includes some patterns that you can use.
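One common pattern from that question is to wrap each scenario in a transaction that is never committed, so the database is left exactly as it was found. A hedged sketch of what such hooks might look like (class and variable names are invented; assumes the TechTalk.SpecFlow and System.Transactions references):

using System.Transactions;
using TechTalk.SpecFlow;

[Binding]
public class DatabaseTransactionHooks
{
    private TransactionScope _scope;

    [BeforeScenario]
    public void BeginTransaction()
    {
        _scope = new TransactionScope();
    }

    [AfterScenario]
    public void RollbackTransaction()
    {
        // Disposing without calling Complete() rolls everything back,
        // so each scenario leaves the database in its original state.
        _scope.Dispose();
    }
}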

Running MDX Queries on TFS

I would like to run MDX Queries on the TFS Warehouse Database.
I would like to query about the code churn, code coverage, ... and many other metrics.
Is there an easy way of creating those MDX queries? How can I achieve this?
I want to run those queries in a C# application.
Your help is much appreciated!
Josh,
SQL Server Management Studio has a built-in interface for creating MDX queries. It's fairly intuitive if you understand the MDX language. Note that you will be writing MDX queries against the Tfs_Analysis OLAP cube and not against the Tfs_Warehouse relational database.
In SQL Server Management Studio, go to Connect -> Analysis Services and enter the server\instance name of the SQL Server Analysis Services instance that you have connected to your TFS application tier. There is only one OLAP cube for TFS, Tfs_Analysis. Click "New Query" and you'll get a blank tab (just like with a SQL query) and an interface which lets you drag and drop measures and dimensions into the query window.
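As a shape-only example (the exact measure, dimension and cube names vary between TFS versions, so take them from the metadata pane rather than copying these), a code-churn-by-date query might look something like:

-- Illustrative names only; verify measures/dimensions in the SSMS metadata pane.
SELECT
  { [Measures].[Total Churn] } ON COLUMNS,
  NON EMPTY [Date].[Date].Members ON ROWS
FROM [Team System]  -- cube name as it appears under the Tfs_Analysis database; check yours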
That being said, I don't know if this is the best approach to getting the information that you want. I didn't find writing straight-up MDX queries to be all that useful (admittedly, I am not an MDX guru though). A better approach would be to use the SQL Server Reporting Services instance that you have associated with TFS and write reports against the TFS cube. You can use Microsoft's Report Builder application to write MDX expressions (they call these "calculated values") and then add those to a report.
This article pretty much explains everything you need to know to write reports against the TFS cube, except for how to write MDX:
http://msdn.microsoft.com/en-us/library/ff730837.aspx#bkmk_tfscube
On the topic of MDX queries and expressions... I recently worked with a consultant from Microsoft who was a developer on SSAS, and he recommended the following books if you need to learn MDX. I found a copy of the first one and it's quite informative.
http://search.barnesandnoble.com/Fast-Track-to-MDX/Mark-Whitehorn/e/9781852336813
http://www.amazon.com/gp/product/0471748080?ie=UTF8&tag=inabsqseanse2-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=0471748080
http://www.amazon.com/gp/product/1849681309/ref=as_li_tf_tl?ie=UTF8&tag=inabsqseanse2-20&linkCode=as2&camp=217153&creative=399701&creativeASIN=1849681309
One other, final option is to use Excel to connect to the TFS cube and use the "perspectives" which come out-of-the-box to get the data you're looking for. There's a "Build" perspective, a "Code Churn" perspective... This is about a million times easier but doesn't give you quite as much power over getting the data you are looking for.
Using Excel to connect to the TFS cube is documented here:
http://msdn.microsoft.com/en-us/library/ms244699(v=vs.100).aspx
So, in summary...
Connecting Excel to the TFS cube is easy, but gives you little flexibility
Writing reports against the TFS cube is more difficult, but gives you more power to get the data you want.
Pure MDX queries give you ultimate control over what you're pulling back, but they are rather difficult to understand and write.
