I would like to run MDX Queries on the TFS Warehouse Database.
I would like to query about the code churn, code coverage, ... and many other metrics.
Is there an easy way of creating those MDX queries? How can I achieve this?
I want to run those queries in a C# application.
Your help is much appreciated!
Josh,
SQL Server Management Studio has a built-in interface for creating MDX queries. It's fairly intuitive if you understand the MDX language. Note that you will be writing MDX queries against the Tfs_Analysis OLAP cube and not against the Tfs_Warehouse relational database.
In SQL Server Management Studio, go to Connect -> Analysis Services and enter the server\instance name of the SQL Server Analysis Services instance that is connected to your TFS application tier. There is only one OLAP database for TFS, Tfs_Analysis. Click "New Query" and you'll get a blank tab (just like with a SQL query) and an interface that lets you drag and drop measures and dimensions into the query window.
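As a hedged illustration, a simple code-churn query might look like the following. The cube name ([Team System]) and the measure and dimension names are taken from a default Tfs_Analysis deployment and may differ between TFS versions, so verify them in the SSMS metadata pane before running:

```sql
-- Code churn per month for one team project. Measure and dimension
-- names are assumptions -- check them against your own cube.
SELECT
    { [Measures].[Total Code Churn] } ON COLUMNS,
    NON EMPTY
    { [Date].[Month].Members } ON ROWS
FROM [Team System]
WHERE ( [Team Project].[Team Project Hierarchy].[YourProject] )
```

Dragging a measure or dimension from the metadata pane into the query window generates these bracketed names for you, which is the easiest way to get them right.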
That being said, I don't know if this is the best approach to getting the information you want. I didn't find writing straight-up MDX queries to be all that useful (admittedly, I am not an MDX guru). A better approach would be to use the SQL Server Reporting Services instance associated with TFS and write reports against the TFS cube. You can use Microsoft's Report Builder application to write MDX expressions (they call these "calculated values") and then add those to a report.
This article pretty much explains everything you need to know to write reports against the TFS cube, except for how to write MDX:
http://msdn.microsoft.com/en-us/library/ff730837.aspx#bkmk_tfscube
On the topic of MDX queries \ expressions... I recently worked with a consultant from Microsoft who was a developer on SSAS and he recommended the following books if you need to learn MDX. I found a copy of the first one and it's quite informative.
http://search.barnesandnoble.com/Fast-Track-to-MDX/Mark-Whitehorn/e/9781852336813
http://www.amazon.com/gp/product/0471748080?ie=UTF8&tag=inabsqseanse2-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=0471748080
http://www.amazon.com/gp/product/1849681309/ref=as_li_tf_tl?ie=UTF8&tag=inabsqseanse2-20&linkCode=as2&camp=217153&creative=399701&creativeASIN=1849681309
One final option is to use Excel to connect to the TFS cube and use the "perspectives" which come out of the box to get the data you're looking for. There's a "Build" perspective, a "Code Churn" perspective, and so on. This is about a million times easier, but doesn't give you quite as much control over the data you get back.
Using Excel to connect to the TFS cube is documented here:
http://msdn.microsoft.com/en-us/library/ms244699(v=vs.100).aspx
So, in summary...
Connecting Excel to the TFS cube is easy, but gives you little flexibility.
Writing reports against the TFS cube is more difficult, but gives you more power to get the data you want.
Pure MDX queries give you ultimate control over what you're pulling back, but they are rather difficult to understand and write.
Related
I need to add tag reporting capability to a collection of custom SSRS reports which query TFS_Warehouse (and in one case I had to query the operational store to gather test case steps). These reports all use a SQL Server datasource connected to my custom TFS_Warehouse_Extensions database.
If this sounds familiar, I asked this question yesterday and got a wonderful response... then I discovered we upgraded from 2013 to 2015 last week, and dbo.WorkItemsAre is gone.
I am using VS 2015 and am more of a database developer than a C# programmer (I know just enough to be dangerous). Is there any way I can get tags from TFS 2015 workitems into my SSRS reports?
Edit: the proposed duplicate answer is not exactly the same problem. Whether or not some work items views went missing is ancillary. My requirement is for a way to query TFS tags in SSRS. So far I consider this unanswered in either thread since no one has proposed a solution that meets the requirement.
@Cyndi, I'm a Program Manager with the reporting team. Currently, reporting on tags is not supported aside from queries in the query editor. We do have a new reporting solution we're working on, and reporting on tags will be supported. I don't have an exact date for the release yet, but see this blog post for some details. We'll have more announcements to make this summer.
One totally different way would be to use Excel for reporting.
You build a query in TFS, then connect Excel to TFS and use the query. The functionality is somewhat limited, but you can use Excel's features to make great reports.
One problem with Excel is that you can't (or shouldn't) add further fields to the table you got from TFS. When you update the data, Excel will lock up and nothing happens.
If you need to use custom fields for your reporting, you should create a second table, where all the data is copied to. This can be as well automated with VBA.
That data can be analysed directly in the table, with a Pivot Table, or visualised with a Pivot Chart.
With the use of the Pivot feature Excel is able to create powerful reports.
It's not a good idea to query the operational store. This may cause some problems for normal usage.
You should NEVER write reports directly against the WorkItem* tables in the collection database, since this is an operational store and is 100% unsupported and can cause performance problems for normal usage.
Source: How to query Work Items using SQL on the Relational Warehouse
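If you do need relational queries, the supported route is to hit the Tfs_Warehouse views instead of the collection database. A minimal sketch follows; the view and column names are from the default warehouse schema and may vary by TFS version, so treat them as assumptions and check your own warehouse first. Note also that tags themselves are not surfaced in the warehouse, which is why this only covers the standard work item fields:

```sql
-- Query the relational warehouse (supported) rather than the
-- collection database (unsupported). Names may differ per TFS version.
SELECT  System_Id,
        System_Title,
        System_State,
        System_WorkItemType
FROM    Tfs_Warehouse.dbo.CurrentWorkItemView
WHERE   System_State <> 'Closed';
```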
I need to build a query that returns results with a network topology more than 2 levels deep. For example, I want to get a result like this:
But if I build a "Work Items and Direct Links" query I get only 1 level of depth, and if I build a "Tree of Work Items" query I can select only the Parent/Child tree type and can't add my custom Successor/Predecessor tree type.
So my direct question: can I get more than one level of depth in a "Work Items and Direct Links" query, or change the tree type in a "Tree of Work Items" query? Or can I only get that result by integrating TFS with Project Server like this?
There are two ways to do what you are talking about and it depends if you want to create reporting or if you are talking about a work breakdown structure.
If you require a work breakdown structure, I would recommend changing your process so that you do not, or handling it purely at the PMO level. Enshrining dependencies in a tool at anything lower than the portfolio level suggests to me that you may be creating solutions to effects rather than getting to the root cause of a particular dysfunction that is enshrined in your culture. [preachMode=false] However...
#1 - Reporting
You can create a reporting services report that presents data from both the Data Warehouse and the Cube to create the desired view.
Create and Manage Reporting Services Reports for Visual Studio ALM
This will give you a read-only view of data and will lead to the least invasive dysfunctions.
#2 - MS Project
You can use MS Project to load both the Parent/Child and Network items and maintain the Gantt chart with dependencies that are stored in TFS.
I could not find a good link, but there is documentation on MSDN.
This will give you an interactive view of the data in the operational store, but will lead to the most invasive dysfunctions.
#3 - Project Server
If you are implementing something more like the Scaled Agile Framework then you may want to take advantage of the Earned Value management of Project Server integration as well as some of the Portfolio Management features.
Enable Data Flow Between Team Foundation Server and Microsoft Project Server
I hope that you can find something to suit, and that you get a chance to explore the ramifications of your current process on the ability of your teams and organisation to achieve any sort of agility in the new normal of the modern application lifecycle.
From within TFS 2008, is there a way to view the disk space taken up on the server by a project (or by all projects)? Or is this something that can only be done by looking directly at the underlying database?
I cannot think of a good way to easily do this on a per project basis (including Version Control, Work Item Tracking, Build and Project Portal data). The data is all stored in the various SQL Server databases but there is no separation at the project level showing you how much that data ends up costing you in disk space - you'd need to sum up the totals of the various databases on disk to give you that number.
The following blog post from Brian Harry might help you get close to the data you need for TFS 2008
TFS Statistics
The way that stuff is stored in TFS 2010 is very different, see Grant Holliday's blog post for a set of queries that work against the TFS 2010 database.
I am currently the single BI developer for a corporate data warehouse and cube. I use SQL Server 2008, SSAS, and SSIS as my basic toolkit. I use Visual Studio + BIDS and TFS for my IDE and source control. I am about to take on multiple projects with an offshore vendor, and I am worried about managing change. My major concern is managing merges and changes between me and the offshore team. Merging and managing changes to SQL and XML for just one person is bad enough, but with multiple developers it seems like a nightmare. Any thoughts on how best to structure development, knowing that sometimes there is no way to avoid multiple individuals making changes to the same file?
SSIS, SSAS, and SSRS files are not merge-friendly. They are stored in XML files that change drastically even with minor edits (such as changing a property), so merging becomes practically impossible.
So stop thinking about parallel development on a single file, and start thinking about how to ensure people never need to develop the same file in parallel. Start by disabling multiple checkout. You might even want to enable the option to get the latest version on checkout.
Then think about how people can work independently. This is more about the way you structure the work and the files:
Give people their own area to work in. One SSIS package is only developed by person X at any given moment in time.
Make smaller files, so the chance that two people need to work in the same file is small.
I have given feedback to the product team about the incompatibility of BIDS files with merging. It is a known issue, but it will be hard to tackle. They don't know when it will be possible to really do parallel development on these files. Until then, keep away from parallel development.
As Ewald Hofman mentioned, SSAS and SSIS are not merge-friendly.
One environment I worked in solved the problem as follows:
only use SSIS when you have to (a fuzzy-matching algorithm or something similar). Replace SSIS packages with SQL code wherever you can (see linked servers for data synchronization and the MERGE command for dimension/fact-table loading, for instance).
build your data warehouse structure as follows:
build 2 databases, one for the "raw source data" from the source systems and one (the "stage" database) for the dimension and fact views and tables
use procedures that can deploy the whole "stage" database
put the structure for the "stage" database into your Repository
build a C# application that builds your dimensions and cubes via the AMO API (I know, that's a tough job at the beginning, but it's worth it - think about what you gain, and look at the pros below)
add the stage database and the C# application to your Repository (TFS/Git etc.)
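For the MERGE command mentioned above, a minimal T-SQL sketch of loading a dimension table from a raw-source table might look like this (all table and column names are invented for the example):

```sql
-- Upsert a customer dimension from the raw-source database into the
-- stage database. Names are illustrative, not from any real schema.
MERGE INTO stage.DimCustomer AS tgt
USING raw_src.Customer AS src
      ON tgt.CustomerBusinessKey = src.CustomerId
WHEN MATCHED AND (tgt.Name <> src.Name OR tgt.City <> src.City) THEN
    UPDATE SET tgt.Name = src.Name,
               tgt.City = src.City
WHEN NOT MATCHED BY TARGET THEN
    INSERT (CustomerBusinessKey, Name, City)
    VALUES (src.CustomerId, src.Name, src.City);
```

Because this is plain text, it diffs and merges cleanly in source control, unlike an SSIS package that does the same work.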
Pros of that structure:
you have a merge-able structure you can put in your Repository
you are using the AMO API, which means:
you can automate the generation of new partitions
you can use procedures to automate and clone measure groups across different cubes (which I think is sometimes a big benefit!)
you could outsource your translations and import them easily (the cube designer is probably not the best translator)
Cons:
the vendor would probably not adopt that structure
you have to pay more (either because of higher skill requirements or for teaching the vendor your custom structure)
you probably need knowledge over a new language C# - if you don't already have
Conclusion:
there are possibilities to get a merge-friendly environment
you lose nice click-and-run tools such as BIDS, but you gain a highly automated process
outsourcing may be unprofitable because of the high degree of customization
As long as both teams are using BIDS and TFS, this should not be a problem.
Assuming your T-SQL code is checked into source control with a single file per object, merging T-SQL code is straightforward since it is text-based. I have found that VSTS database projects help with this.
Merging the XML-based source files of SSIS and MSAS can be cumbersome, as you indicate. To alleviate some of the pain, I find that keeping each package limited to a single data flow or logical unit of work helps reduce developer contention on packages. I then call these packages from one or more master packages. I also try to externalize all of my T-SQL source queries using sprocs, views, or UDFs so that the need to edit the package is further reduced. Using configuration files and variables also helps to a smaller extent.
MSSAS cubes are a little bit tougher. My best suggestion is to look into a third-party XML differencing tool. I have been able to successfully merge small changes using the standard text-based tools, but it can be a daunting task.
I am building out some reporting stuff for our website (a decent sized site that gets several million pageviews a day), and am wondering if there are any good free/open source data warehousing systems out there.
Specifically, I am looking for only something to store the data--I plan to build a custom front end/UI to it so that it shows the information we care about. However, I don't want to have to build a customized database for this, and while I'm pretty sure an SQL database would not work here, I'm not sure what to use exactly. Any pointers to helpful articles would also be appreciated.
Edit: I should mention--one DB I have looked at briefly was MongoDB. It seems like it might work, but their "Use Cases" specifically mention data warehousing as "Less Well Suited": http://www.mongodb.org/display/DOCS/Use+Cases . Also, it doesn't seem to be specifically targeted towards data warehousing.
http://www.hypertable.org/ might be what you are looking for, if (and I'm going by your description above here) you need something to store large amounts of logged data, i.e. a visitor log.
Hypertable is based on Google's BigTable project.
See http://code.google.com/p/hypertable/wiki/PerformanceTestAOLQueryLog for benchmarks.
You lose the relational capabilities of SQL-based DBs, but you gain a lot in performance. You could easily use Hypertable to store millions of rows per hour (hard drive space permitting).
Hope that helps.
I may not understand the problem correctly -- however, if you find some time to (re)visit Kimball's "The Data Warehouse Toolkit", you will find that all it takes for a basic DW is a plain-vanilla SQL database; in other words, you could build a decent DW with MySQL using MyISAM for the storage engine. The question is only in the desired granularity of information -- what you want to keep and for how long. If your reports are mostly periodic, and you implement report storage or a cache, then you don't need to store pre-calculated aggregations (no need for cubes). In other words, a Kimball star with cached reporting can provide decent performance in many cases.
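As a rough sketch of what that plain-vanilla star schema can look like for a pageview warehouse (all names here are illustrative, not from any real system):

```sql
-- Two dimension tables plus one fact table, using MyISAM as suggested.
CREATE TABLE dim_date (
    date_key    INT PRIMARY KEY,     -- e.g. 20240131
    full_date   DATE,
    month_name  VARCHAR(10),
    year_num    SMALLINT
) ENGINE=MyISAM;

CREATE TABLE dim_page (
    page_key    INT PRIMARY KEY,
    url         VARCHAR(255)
) ENGINE=MyISAM;

CREATE TABLE fact_pageview (
    date_key    INT,                 -- references dim_date
    page_key    INT,                 -- references dim_page
    view_count  INT
) ENGINE=MyISAM;

-- A typical report query: pageviews per month for one URL.
SELECT d.year_num, d.month_name, SUM(f.view_count) AS views
FROM fact_pageview f
JOIN dim_date d ON d.date_key = f.date_key
JOIN dim_page p ON p.page_key = f.page_key
WHERE p.url = '/home'
GROUP BY d.year_num, d.month_name;
```

With a report cache in front of queries like the last one, this covers a lot of periodic reporting without any cube at all.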
You could also look at the community edition of “Pentaho BI Suite” (open source) to get a quick start with ETL, analytics and reporting -- and experiment a bit to evaluate the performance before diving into custom development.
Although this may not be what you were expecting, it may be worth considering.
Pentaho Mondrian
Open source
Uses standard relational database
MDX (think pivot table)
ETL ( via Kettle )
I use this.
In addition to Mike's answer suggesting Hypertable, you may want to take a look at Apache's Hadoop project:
http://hadoop.apache.org/
They provide a number of tools which may be useful for your application, including HBase, another implementation of the BigTable concept. I'd imagine that for reporting, you might find their MapReduce implementation useful as well.
It all depends on the data and how you plan to access it. MonetDB is a column-oriented database engine from the most revolutionary team on database technologies. They just got VLDB's 10-year best paper award. The DB is open source and there are plenty of reviews online praising them.
Perhaps you should have a look at TPC and see which of their test problem datasets best matches your case, and work from there.
Also consider the need for concurrency; it adds a big overhead for any kind of approach and sometimes is not really required. For example, you can pre-digest some summary or index data and only have that protected for high concurrency. Profiling your data queries is the next step.
About SQL, I don't like it either but I don't think it's smart ruling out an engine just because of the front-end language.
I am facing a similar problem and am thinking of using plain MyISAM with http://www.jitterbit.com/ as the data access layer. Jitterbit (or another similar free tool) seems very nice for this sort of transformation.
Hope this helps a bit.
A lot of people just use Mysql or Postgres :)