is it possible to implement query rewrite & materialized view in generated OBIEE query - data-warehouse

I have read a paper about materialized view selection in datawarehouse , and I have been trying to implement it in OBIEE, but the generated query from OBIEE are start using WITH Clause / CTE / Subquery factoring.
This query are not rewrited to materialized view as I expected, and its a similiar issue in this oracle community forum , but query rewrite work only when I create materialize view in base table on where clause as discussed in the paper above , but our generated obiee query use an extremely complex subquery factoring, then this won't make a significant increase in query speed.
I just want to know is it possible to implement query rewrite & materialized view in generated OBIEE query in terms of the discussion in the paper above ? Thanks.

Short answer: No. Longer answer: No, because that's not how the product is built. It's meant to be a source-agnostic query generator and analytics platform, not a programming platform to let you develop queries on your own. You need to understand how the product works, hoe IT optimizes queries and chooses its sources. There are literally dozens of ways to influence how your query is getting written but they are all semantic/structural configuration options. You do not write code in OBI.

Related

Neo4j Cypher Query Builder

I have been trying to come across a query builder for Neo4j's query language Cypher, ideally using a fluent API. I did not find much, and decided to invest some time on building one myself.
The result so far is a fluent API query builder for the Cypher 1.9 spec.
I wanted to use StackOverflow to kick off a discussion and see what the thoughts are, before I release the code.
Here is a demo query that you would want to send off to Neo4j using Cypher.
Show me all people who John knows who know software engineers at Google (Google company code assumed to be 12345).
The relationship strength between John and the people who connect him to Google employees should be at least 3 (assuming a range from 1-5).
Return all of John's connections and the people they know at Google, including the relationships between those people.
Sort the results by name of John's connections in ascending order and then by relationship strength in descending order.
Using Fluent-Cypher:
Cypher
.on(Node.named("john").with(Index.named("PERSON_NAMES").match(Key.named("name").is("John"))))
.on(Node.named("google").with(Id.is(12345)))
.match(Connection.named("rel1").andType("KNOWS").between("john").and("middle"))
.match(Connection.named("rel2").andType("KNOWS").between("middle").and("googleEmployee"))
.match(Connection.withType("WORKS_AT").from("googleEmployee").to("google"))
.where(Are.allOfTheseTrue(Column.named("rel1.STRENGTH").isGreaterThanOrEqualTo(3)
.and(Column.named("googleEmployee.TITLE").isEqualTo("Software Engineer"))))
.returns(Columns.named("rel1", "middle", "rel2", "googleEmployee"))
.orderBy(Asc.column("middle.NAME"), Desc.column("rel1.STRENGTH"))
which yields the following query:
START john=node:PERSON_NAMES(name='John'),google=node(12345) MATCH john-[rel1:KNOWS]-middle,middle-[rel2:KNOWS]-googleEmployee,googleEmployee-[:WORKS_AT]->google WHERE ((rel1.STRENGTH >= '3' AND googleEmployee.TITLE = 'Software Engineer')) RETURN rel1,middle,rel2,googleEmployee ORDER BY middle.NAME ASC,rel1.STRENGTH DESC
I agree that you should build this with an eye towards Cypher 2.0. As of 2.0, it's very important that WHERE clauses are matched up with the correct START, (OPTIONAL) MATCH, and WITH clauses making the design of a fluent API a bit more challenging.
I like your first example where you just use the text to describe the query. The second option, to tell you the truth, doesn't look so much easier to me than constructing the Cypher query itself. The language is quite easy to use and is well documented. Adding another layer of abstraction would only increase complexity. However, if you find a way of translating this natural language request into a Cypher request, that'd be cool :)
Also, why not start working directly with Cypher 2.0?
Finally, check out this here: http://github.com/noduslabs/infranodus – I'm working on a similar problem but for adding the nodes into the database, not querying them. I chose to use #hashtags to make it easier for people to understand how their queries should be structured (as we already use them). So in your case it could become something like
#show-all #people who #John :knows who :know #software-engineers :at #Google.
#relationship-strength between #John and the #people who are #linked to #Google #software-engineers should be at least #3
#return #all of #John's #connections and the #people they :know at #Google, including the #relationships-between those #people.
#sort the #results #by-name of #John's #connections in #ascending order and then by #relationship-strength in #descending order.
(let's say the #hashtags refer to nodes, the #at refers to actions on them)
If you could pull something like this off, I think that'd be a much better and more useful simplification of the already easy-to-use Cypher.

oData - applying filters to SQL queries

I am relatively new to oData service and I am trying to explore if oData is feasible for my project.
From all the examples / demos that I have come across,every demo always loads up all data into the repository and then oData filters are applied over the data.
Is there a way to not load up all data (apply the filters to SQL from oData) from SQL which will obviously be highly inefficient for N number of requests coming in /second ?
So for example if I had a movies service :
localhost:4502/OdataService/movies(55)
The above example is actually just filtering for movie id 55 from an "entire" set of movies.Is there a way to make this filter happen at SQL level instead of bloating the memory first with all movies and then allowing oData to filter it?
Can anyone guide me in the right direction?
I found out after doing a small POC that Entity framework takes care of building dynamic query based on the request.

In Lucene/Solr what is the difference between Join and BlockJoin?

Join is described as pseudo-Join, because it's more equivalent to an SQL inner-query.
Whereas BlockJoin is described as more like a SQL join but requiring a sophisticated indexing schema, one that anticipates all the possible joins you'd want to make.
Could someone explain the difference between these features in terms of how to implement them at index time and query time. And what are the implications for performance?
I don't think blockjoinquery is a Solr function. I think its Lucene feature.
The solr join doesn't score documents in the from query and it doesn't return combined results. So its best used as a filter query. This will allow the main query.to score.
Block join on the other hand does use scoring and returns both results.( not 100% sure)
You can also use querytime join. This has serval scoring options. This is also a lucene feature but doesn't require special indexing blocks. I've used this in combination with a solr query parser plugin. The performance is a bit lower then blockjoin but it Works.
I have only used solr join and querytimejoin So I can't really say much about blockjoin.
As I understand, BlockJoin is for joining against nested/child documents within the same core. Join is for joining against a separate core.

Linq-to-SQL query - Need to filter by IDs returned by Full-Text Search sql functions - Hitting limit for Contains

My objective:
I have built a working controller action in MVC which takes user input for various filter criteria and, using PredicateBuilder (part of LinqKit - sorry, I'm not allowed enough links yet) builds the appropriate LINQ query to return rows from a "master" table in SQL with a couple hundred thousand records. My implementation of the predicates is totally inelegant, as I'm new to a lot of this, and under a very tight deadline, but it did make life easier. The page operates perfectly as-is.
To this, I need to add a Full-Text search filter. Understanding the way LINQ translates Contains to LIKE(%%), using the advice in Simon Blog: LINQ-to-SQL - Enabling Full-Text Searching, I've already prepared Table Functions in SQL to run Freetext queries on the relevant columns. I have 4 functions, to match the query against 4 separate tables.
My approach:
At the moment, I'm building the predicates (I'll spare you) for the initial IQueryable data object, running a LINQ command to return them, like so:
var MyData = DB.Master_Items.Where(outer);
Then, I'm attempting to further filter MyData on the Keys returned by my full-text search functions:
var FTS_Matches_Subtable_1 = (from tbl in DB.Subtable_1
join fts in DB.udf_Subtable_1_FTSearch(KeywordTerms)
on tbl.ID equals fts.ID
select tbl.ForeignKey);
... I have 4 of those sets of matches which I've tried to use to filter my original dataset in several ways with no success. For instance:
MyNewData = MyData.Where(d => FTS_Matches_Subtable_1.Contains(d.Key) ||
FTS_Matches_Subtable_2.Contains(d.Key) ||
FTS_Matches_Subtable_3.Contains(d.Key) ||
FTS_Matches_Subtable_4.Contains(d.Key));
I just get the error: The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Too many parameters were provided in this RPC request. The maximum is 2100.
I get that it's because I'm trying to pass a relatively large set of data into the Contains function and LINQ is converting each record into a separate parameter, exceeding the limit.
I just don't know how to get around it.
I found another post linq expression to return property value which seemed SO promising. I tried ifwdev's solution (2nd highest ranked answer): using LinqKit to build an extension that will break up the queries into manageable chunks. But I can't figure out how to implement it. Out of my depth right now maybe?
Is there another approach that I'm missing? Some simpler way to accomplish this that I've overlooked?
Sorry for the long post. But thank you for any help you can provide!
This is a perfect time to go back to raw ado.net.
Twisting things around just to use linq to sql is probably just as time consuming if you wrote the query and hydration by hand.

Is a full list returned first and then filtered when using linq to sql to filter data from a database or just the filtered list?

This is probably a very simple question that I am working through in an MVC project. Here's an example of what I am talking about.
I have an rdml file linked to a database with a table called Users that has 500,000 rows. But I only want to find the Users who were entered on 5/7/2010. So let's say I do this in my UserRepository:
from u in db.GetUsers() where u.CreatedDate = "5/7/2010" select u
(doing this from memory so don't kill me if my syntax is a little off, it's the concept I am looking for)
Does this statement first return all 500,000 rows and then filter it or does it only bring back the filtered list?
It filters in the database since your building your expression atop of an ITable returning a IQueryable<T> data source.
Linq to SQL translates your query into SQL before sending it to the database, so only the filtered list is returned.
When the query is executed it will create SQL to return the filtered set only.
One thing to be aware of is that if you do nothing with the results of that query nothing will be queried at all.
The query will be deferred until you enumerate the result set.
These folks are right and one recommendation I would have is to monitor the queries that LinqToSql is creating. LinqToSql is a great tool but it's not perfect. I've noticed a number of little inefficiencies by monitoring the queries that it creates and tweaking it a bit where needed.
The DataContext has a "Log" property that you can work with to view the queries created. I created a simple HttpModule that outputs the DataContext's Log (formatted for sweetness) to my output window. That way I can see the SQL it used and adjust if need be. It's been worth its weight in gold.
Side note - I don't mean to be negative about the SQL that LinqToSql creates as it's very good and efficient almost every time. Another good side effect of monitoring the queries is you can show your friends that are die-hard ADO.NET - Stored Proc people how efficient LinqToSql really is.

Resources