I am trying to build a query to get the data I need from the database. The database doesn't have proper relationships between its tables, but I can retrieve the data I need with a left join. The problem is that the documentation says leftOuterJoin is not supported (and I get a "not supported" exception if I try to force it). So I wonder whether there is another way to perform a left join via a query expression.
I am trying to connect to my organisation's SQL database using Power Query to create some reports. I need to delete/edit some tables and join multiple tables to come up with the desired report output...
I don't want the changes or edits I make in Excel Power Query to be reflected in the live database, only in Excel.
The short answer is no, any button you press in the Power Query Editor interface does not modify the source database. I must admit that I have not found any page in the Microsoft Docs on Power Query that states this clearly. The page What is Power Query? states that:
Power Query is a data transformation and data preparation engine. Power Query comes with a graphical interface for getting data from sources and a Power Query Editor for applying transformations.
Other pages contain similarly general and vague descriptions but let me reassure you that any data transformation you carry out by using the Power Query Editor interface will not modify your SQL database. All you see in Power Query is a view of the source database.
Seeing as you are connecting to a SQL database, it is likely that query folding is activated. This means that when you remove a column (or row), Power Query updates the SQL query used to extract the data from the database. That query is written as a single SELECT statement that can contain multiple clauses like GROUP BY and WHERE. Transformations that add data (e.g. Add Custom Column, Fill Down) are not included in the query; they are carried out only within the Power Query engine. You can read more about this in the docs.
How to edit a database with Power Query when native SQL queries are supported
That being said, you can actually edit a database from within Power Query if the database supports the use of native SQL queries, if you have write permission for the database, and if you edit and run one of the two M functions that let you write native SQL queries. Here is an example using the Sql.Database function:
Sql.Database("servername", "dbname", [Query = "DROP TABLE tablename"])
And here is an example using the Value.NativeQuery function:
Source = Sql.Databases("servername"){[Name="dbname"]}[Data],
#"Native Query" = Value.NativeQuery(Source, "DROP TABLE tablename")
Unless you have changed the default Query Options, these functions should raise a warning message requiring you to permit running the query.
This prevents you from modifying the database without confirmation, so any database modification cannot happen just by accident.
I verified this using Excel Microsoft 365 (Version 2108) on Windows 10 64-bit connected to a local SQL Server 2019 (15.x) database.
I am using Java with Reactor and R2DBC. I have two tables, A and B, and repositories defined for both.
How can I get data made up of A joined with B?
The only approach I have come up with is the following:
call databaseClient.select on A and then, in a loop, call select on B.
But I want a more efficient and reactive way. How do I do it?
TL;DR: Using SQL.
Spring Data's DatabaseClient is an improved and reactive variant for R2DBC of what JdbcTemplate is for JDBC. It encapsulates various execution modes, resource management, and exception translation. Its fluent API select/insert/update/delete methods are suitable for simple and flat queries. Everything that goes beyond the provided API is subject to SQL usage.
That being said, the method you're looking for is DatabaseClient.execute(…):
DatabaseClient client = …;
client.execute("SELECT person.age, address.street FROM person INNER JOIN address ON person.address = address.id");
The exact same goes for repository @Query methods.
Calling the database during result processing is a good way to lock up the entire result processing, as results are fetched stream-wise. Issuing a query while not all results are fetched yet can exhaust the prefetch buffer of 128 or 256 items and cause your result stream to get stuck. Additionally, you're creating an N+1 problem.
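To illustrate the N+1 point, here is a small sketch in Python with an in-memory SQLite database rather than R2DBC. The schema, the sample rows and the `run` helper are assumptions made up for the example; only the JOIN statement mirrors the one above. It counts how many statements each approach sends.

```python
import sqlite3

# Hypothetical schema mirroring the person/address query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE address (id INTEGER PRIMARY KEY, street TEXT);
    CREATE TABLE person  (id INTEGER PRIMARY KEY, age INTEGER, address INTEGER);
    INSERT INTO address VALUES (1, 'Main St'), (2, 'Oak Ave');
    INSERT INTO person  VALUES (1, 30, 1), (2, 41, 2), (3, 25, 1);
""")

counter = {"queries": 0}

def run(sql, params=()):
    """Execute one statement and count it as one round trip."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()

# N+1: one query for the persons, then one more query per person.
n_plus_one = []
for age, address_id in run("SELECT age, address FROM person ORDER BY id"):
    (street,) = run("SELECT street FROM address WHERE id = ?", (address_id,))[0]
    n_plus_one.append((age, street))
n_plus_one_queries = counter["queries"]

# Single JOIN: the database matches the rows in one round trip.
counter["queries"] = 0
joined = run(
    "SELECT person.age, address.street FROM person "
    "INNER JOIN address ON person.address = address.id ORDER BY person.id"
)
join_queries = counter["queries"]
```

Both approaches produce the same rows, but the loop issues one statement per person on top of the initial select, while the JOIN issues a single statement.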
I am trying to analyse how SQL queries are generated by Pentaho Mondrian. Let us assume there are no aggregate tables for now. I have noticed two types of behaviour when I try to fetch data from a data warehouse (star schema) using Pentaho.
Case 1: I apply various filters and try to get fact count corresponding to it which is the default measure in my case.
Case 2: I apply the same filters as mentioned in case 1 and try to get some other measure by explicitly putting it into the measures selection box.
Observation: In both cases, the SQL queries generated in the back-end join the fact table with multiple dimension tables, according to the filters applied and the columns and rows selected in Pentaho.
However, the join order differs between the two cases. In case 1, the fact table is placed at the left-most position of the join, whereas in case 2 it is placed somewhere between the dimension tables.
I have connected Pentaho to AWS Athena at the back-end through a JDBC connection to execute queries on data stored on S3. Since Athena has Presto at the back-end and Presto does not do automatic JOIN re-ordering, the queries in case 2 are failing.
(http://docs.qubole.com/en/latest/user-guide/presto/best-practices.html)
I noticed that Presto is performing hash joins here. For hash joins to be effective, the largest table should be placed on the left side of the join so that the smaller table is cached in memory while the join is performed. This is not happening in the second case: Presto tries to hash the fact table, which holds far more data than any of the dimension tables. This causes the query to fail whenever I explicitly add a measure (other than the default measure) and the data range is large (across a year, for example).
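To make the memory argument concrete, here is a minimal Python sketch of a build/probe hash join. It is not Presto's implementation; the function name `hash_join`, the table contents and the row counts are made up for illustration. The point is only that the table chosen as the build side is the one held in memory.

```python
from collections import defaultdict

def hash_join(probe_rows, build_rows, key):
    """Join two lists of dicts on `key`; only build_rows goes into the hash table."""
    table = defaultdict(list)
    for row in build_rows:             # build phase: memory grows with the build side
        table[row[key]].append(row)
    rows_hashed = sum(len(bucket) for bucket in table.values())

    joined = []
    for row in probe_rows:             # probe phase: rows are streamed one at a time
        for match in table.get(row[key], []):
            joined.append({**row, **match})
    return joined, rows_hashed

# Hypothetical star-schema data: many fact rows, few dimension rows.
dim_date = [{"date_id": d, "year": 2016} for d in range(3)]
fact = [{"fact_id": i, "date_id": i % 3} for i in range(1000)]

# Fact table on the probe side: only the small dimension is hashed.
good, good_hashed = hash_join(fact, dim_date, "date_id")

# Fact table on the build side: the whole fact table is hashed.
bad, bad_hashed = hash_join(dim_date, fact, "date_id")
```

Both orders return the same number of joined rows, but the second one has to keep every fact row in the hash table, which is the situation described for case 2.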
Can someone please give some insight into the logic behind Mondrian's query formation in both cases? Also, is there a way to make the fact table always remain in the left-most position of the joins in the SQL queries generated by Mondrian? Or is there any Presto property that could be set through Athena to change the join type from a hash join to some other type of join that could solve this problem?
Pentaho version - 6.1.0
Saiku version - 3.10
I'm setting up a custom query that uses a range of OR statements in conjunction with BETWEEN statements and a final GROUP BY id HAVING COUNT(*) >= #{tolerance}, not to mention INNER and LEFT join operations.
I would assume that this is not possible to set up using Active Record, so I used the Model.connection.select_all() command to fire the query. This works, but how do I now map all of the rows to that specific model?
Rails is pretty powerful especially if you are using Rails 3 & ARel. So I wouldn't be surprised if you actually could write your query using rails.
However, there will always be times when writing raw SQL is desired.
To do that, instead of Model.connection use Model.find_by_sql(QUERY_STRING).
This way the query results will get parsed for you automatically. Just make sure you only select "model.*".
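For reference, here is the shape of query described in the question, sketched in Python with an in-memory SQLite database instead of Rails. The tables, columns and sample values are hypothetical. The one design choice worth noting is that the tolerance is passed as a bound parameter rather than interpolated into the SQL string.

```python
import sqlite3

# Hypothetical schema: each item has several readings.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE readings (item_id INTEGER, value INTEGER);
    INSERT INTO items    VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO readings VALUES (1, 5), (1, 12), (1, 50),
                                (2, 6), (2, 30),
                                (3, 11), (3, 13);
""")

# OR'd BETWEEN ranges, then keep only ids with enough matching rows.
sql = """
    SELECT items.id, items.name
    FROM items
    INNER JOIN readings ON readings.item_id = items.id
    WHERE readings.value BETWEEN ? AND ?
       OR readings.value BETWEEN ? AND ?
    GROUP BY items.id
    HAVING COUNT(*) >= ?
    ORDER BY items.id
"""
tolerance = 2
matches = conn.execute(sql, (1, 10, 11, 20, tolerance)).fetchall()
```

With this sample data, items 1 and 3 each have two readings inside the ranges and are returned; item 2 has only one and is filtered out by the HAVING clause.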
This is probably a very simple question that I am working through in an MVC project. Here's an example of what I am talking about.
I have a dbml file linked to a database with a table called Users that has 500,000 rows. But I only want to find the Users who were entered on 5/7/2010. So let's say I do this in my UserRepository:
from u in db.GetUsers() where u.CreatedDate = "5/7/2010" select u
(doing this from memory so don't kill me if my syntax is a little off, it's the concept I am looking for)
Does this statement first return all 500,000 rows and then filter it or does it only bring back the filtered list?
It filters in the database, since you're building your expression on top of an ITable, which returns an IQueryable<T> data source.
Linq to SQL translates your query into SQL before sending it to the database, so only the filtered list is returned.
When the query is executed it will create SQL to return the filtered set only.
One thing to be aware of is that if you do nothing with the results of that query, nothing will be queried at all.
The query will be deferred until you enumerate the result set.
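The two behaviours described here (the filter ends up in the SQL, and nothing runs until you enumerate) can be imitated with a toy class. This is a Python analogy, not how LINQ to SQL is implemented; the `LazyQuery` class, the table layout and the sample dates are all made up for the example.

```python
import sqlite3

class LazyQuery:
    """Toy stand-in for a deferred query: builds SQL, runs only when iterated."""

    def __init__(self, conn, table, log):
        self.conn, self.table, self.log = conn, table, log
        self.conditions, self.params = [], []

    def where(self, column, value):
        # Only records the filter; nothing is sent to the database here.
        # `column` is interpolated into the SQL, so it must come from trusted code.
        self.conditions.append(f"{column} = ?")
        self.params.append(value)
        return self

    def __iter__(self):
        sql = f"SELECT * FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(self.conditions)
        self.log.append(sql)           # record what is actually executed
        return iter(self.conn.execute(sql, self.params).fetchall())

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, created_date TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "2010-05-07"), (2, "2010-05-08"), (3, "2010-05-07")])

executed = []
query = LazyQuery(conn, "users", executed).where("created_date", "2010-05-07")
ran_before_enumeration = len(executed)   # nothing has been executed yet
rows = list(query)                       # enumerating the query triggers the SQL
```

Building the query leaves `executed` empty; only `list(query)` sends a statement, and that statement already contains the WHERE clause, so the database returns just the matching rows.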
These folks are right and one recommendation I would have is to monitor the queries that LinqToSql is creating. LinqToSql is a great tool but it's not perfect. I've noticed a number of little inefficiencies by monitoring the queries that it creates and tweaking it a bit where needed.
The DataContext has a "Log" property that you can work with to view the queries created. I created a simple HttpModule that outputs the DataContext's Log (formatted for sweetness) to my output window. That way I can see the SQL it used and adjust if need be. It's been worth its weight in gold.
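The same monitoring idea exists in other data-access layers. As a rough analogy in Python (not LinqToSql), the standard sqlite3 module lets you register a trace callback that receives each statement as it runs; the table and sample row below are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, created_date TEXT)")
conn.execute("INSERT INTO users VALUES (1, '2010-05-07')")
conn.commit()

sql_log = []
conn.set_trace_callback(sql_log.append)   # called with each statement SQLite runs

rows = conn.execute(
    "SELECT id FROM users WHERE created_date = ?", ("2010-05-07",)
).fetchall()

conn.set_trace_callback(None)             # stop logging

for statement in sql_log:
    print(statement)                      # inspect the SQL that was actually sent
```

Registering the callback only after the setup statements keeps the log limited to the query you want to inspect.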
Side note - I don't mean to be negative about the SQL that LinqToSql creates as it's very good and efficient almost every time. Another good side effect of monitoring the queries is you can show your friends that are die-hard ADO.NET - Stored Proc people how efficient LinqToSql really is.