How to join two completely different cubes in MDX?

I want to join two completely different cubes in MDX (I am using MS SSRS 2008). I am really new to MDX and I have no idea how to do it. I want to join by SKU if possible. Can anybody please tell me how to do it?
MDX Query 1
SELECT NON EMPTY { [Measures].[Sales], [Measures].[Quantity] } ON COLUMNS,
NON EMPTY { ([Date YMD].[Day].ALLMEMBERS *
[Regions And Stores].[Store Name].[Store Name].ALLMEMBERS *
[Products].[Products].ALLMEMBERS *
[SKU].[SKU].ALLMEMBERS ) } DIMENSION PROPERTIES MEMBER_CAPTION, MEMBER_UNIQUE_NAME ON ROWS
FROM [Super] CELL PROPERTIES VALUE, BACK_COLOR, FORE_COLOR, FORMATTED_VALUE, FORMAT_STRING, FONT_NAME, FONT_SIZE, FONT_FLAGS
MDX Query 2
SELECT NON EMPTY { [Measures].[Quantity] } ON COLUMNS,
NON EMPTY { ([Store Name].[Store Name].ALLMEMBERS *
[Products].[Products].ALLMEMBERS *
[SKU].[SKU].ALLMEMBERS ) } DIMENSION PROPERTIES MEMBER_CAPTION, MEMBER_UNIQUE_NAME ON ROWS
FROM [Inventory Activity] CELL PROPERTIES VALUE, BACK_COLOR, FORE_COLOR, FORMATTED_VALUE, FORMAT_STRING, FONT_NAME, FONT_SIZE, FONT_FLAGS
Any help will be highly appreciated.
Thank you

In SSRS you need to load two Datasets in your report and join them in the tablix. For example:
Load Dataset1 and Dataset2 into your report, with the column ID which links Dataset1 to Dataset2. Then put a tablix in your report. Display Dataset1 in your tablix. Now add a new column to your tablix and add the following expression:
=Lookup(Fields!Dataset1ID.Value, Fields!Dataset2ID.Value, Fields!SalesAmount.Value, "Dataset2")
The expression works as follows:
- First argument is the foreign key column from Dataset1
- Second argument is the corresponding key column from Dataset2
- Third argument is the column from Dataset2 that you want to display in the tablix
- Fourth argument is the name of the dataset you want to join with (Dataset2)
Here is the reference for the Lookup() function: https://learn.microsoft.com/de-de/sql/reporting-services/report-design/report-builder-functions-lookup-function?view=sql-server-2017
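For readers who want to see what Lookup() does outside SSRS, here is a minimal sketch in Python. The rows and field names (SKU, Sales, Quantity) are made up for illustration and are not taken from the cubes in the question. The helper mimics how Lookup() returns the value from the first matching row of the second dataset, or nothing when no row matches.

```python
def lookup(source_key, dest_rows, dest_key, result_field):
    """Return result_field from the first row whose dest_key equals source_key."""
    for row in dest_rows:
        if row[dest_key] == source_key:
            return row[result_field]
    return None  # stands in for the Nothing that SSRS returns when no row matches


# Hypothetical rows standing in for the two report datasets.
dataset1 = [{"SKU": "A1", "Sales": 100.0}, {"SKU": "B2", "Sales": 40.0}]
dataset2 = [{"SKU": "A1", "Quantity": 7}, {"SKU": "C3", "Quantity": 2}]

# One output row per Dataset1 row, with an extra column pulled from Dataset2.
joined = [
    {**row, "InventoryQuantity": lookup(row["SKU"], dataset2, "SKU", "Quantity")}
    for row in dataset1
]
```

Every row of Dataset1 is kept, so the result behaves like a left join: SKU B2 has no partner in Dataset2 and gets an empty value.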

Welcome to MDX. I guess you are looking for the MDX equivalent of a SQL join. However, MDX doesn't support joins the way SQL does. One way to solve the issue is to retrieve the data via ADOMD into data cells and then join them in memory. I would also like to know the scenario that requires you to join results from two cubes.
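The in-memory join could look roughly like the sketch below. It is written in Python rather than against ADOMD, and the rows, field names and the choice of (store, SKU) as the join key are assumptions made for illustration. The idea is to index one result set by the key once and then probe that index for each row of the other result set.

```python
from collections import defaultdict

# Hypothetical rows, as they might come back from the two MDX queries.
sales_rows = [
    {"store": "North", "sku": "A1", "day": "2015-01-01", "sales": 100.0},
    {"store": "North", "sku": "A1", "day": "2015-01-02", "sales": 50.0},
    {"store": "South", "sku": "B2", "day": "2015-01-01", "sales": 40.0},
]
inventory_rows = [
    {"store": "North", "sku": "A1", "quantity": 7},
    {"store": "South", "sku": "C3", "quantity": 2},
]


def join_on_store_and_sku(left, right):
    """Inner-join two row lists on (store, sku) using a dictionary index."""
    index = defaultdict(list)
    for row in right:
        index[(row["store"], row["sku"])].append(row)

    joined = []
    for row in left:
        # .get() avoids creating empty entries for keys that have no match.
        for match in index.get((row["store"], row["sku"]), []):
            joined.append({**row, "inventory_quantity": match["quantity"]})
    return joined


result = join_on_store_and_sku(sales_rows, inventory_rows)
```

Because the first query is broken down by day and the second is not, each inventory quantity is repeated on every matching day row; whether that is acceptable depends on the report.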

Related

Joining two tables based on matching two columns

I'm trying to join two tables:
Table A has three columns: State, County, and Count (of Farmer's Markets in said county)
Table B has several columns: State, County, and several data columns (like food access score)
I'm trying to combine them in such a way as to put the Count for each State/County combination (since there are multiple counties with the same name) together with the State and County and data columns from Table B.
I've been banging my head on SAS, trying to get a join to cooperate. I read a few other questions on here, but I can't find where the mistake is in my code.
PROC SQL;
CREATE TABLE WORK.QUERY1
AS
SELECT FMDV4.State, FMDV4.County, FMDV4.Count, CFSDV1.GROC14,
CFSDV1.SUPERC14, CFSDV1.CONVS14, CFSDV1.SPECS14, CFSDV1.FOODINSEC_13_15,
CFSDV1.PCT_LACCESS_POP15, CFSDV1.DIRSALES_FARMS12, CFSDV1.FMRKT16,
CFSDV1.FOODHUB16, CFSDV1.CSA12, CFSDV1.POVRATE15, CFSDV1.PERPOV10
FROM FNLPRJT.CFSDV1 AS CFSDV1
INNER JOIN FNLPRJT.FMDV4 AS FMDV4
ON (( CFSDV1.State = FMDV4.State ) AND ( CFSDV1.County =
FMDV4.County ));
QUIT;
I also tried a few variants, like:
PROC SQL;
CREATE TABLE WORK.QUERY1
AS
SELECT FMDV4.State, FMDV4.County, FMDV4.Count, CFSDV1.GROC14,
CFSDV1.SUPERC14, CFSDV1.CONVS14, CFSDV1.SPECS14, CFSDV1.FOODINSEC_13_15,
CFSDV1.PCT_LACCESS_POP15, CFSDV1.DIRSALES_FARMS12, CFSDV1.FMRKT16,
CFSDV1.FOODHUB16, CFSDV1.CSA12, CFSDV1.POVRATE15, CFSDV1.PERPOV10
FROM FNLPRJT.CFSDV1 AS CFSDV1
INNER JOIN FNLPRJT.FMDV4 AS FMDV4
ON CFSDV1.State = FMDV4.State
WHERE CFSDV1.County = FMDV4.County;
QUIT;
I get a table of 0 rows with the columns as they should be (State, County, Count, etc.). I'm just missing the dang data! Can anyone please help me find my mistake?
Can you try
propcase(CFSDV1.State) = propcase(FMDV4.State)
and
propcase(CFSDV1.County) = propcase(FMDV4.County);
If this doesn't work try character functions like trim and compress to remove any blanks that might be present in the data.
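To illustrate why normalizing the keys can turn a zero-row join into a matching one, here is a small sketch using Python's built-in sqlite3 module as a stand-in for PROC SQL. The table contents are made up, MarketCount is a hypothetical name for the Count column, and UPPER(TRIM(...)) is used in place of propcase(), which SQLite does not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fmdv4 (State TEXT, County TEXT, MarketCount INTEGER);
    CREATE TABLE cfsdv1 (State TEXT, County TEXT, GROC14 INTEGER);
    -- Made-up rows: same county, but different case and a trailing blank.
    INSERT INTO fmdv4 VALUES ('ALABAMA', 'Autauga ', 3);
    INSERT INTO cfsdv1 VALUES ('Alabama', 'autauga', 5);
""")

# Join on the raw values: the keys differ byte for byte, so nothing matches.
exact = conn.execute("""
    SELECT COUNT(*) FROM cfsdv1 c JOIN fmdv4 f
      ON c.State = f.State AND c.County = f.County
""").fetchone()[0]

# Join on normalized values: case and surrounding blanks no longer matter.
normalized = conn.execute("""
    SELECT COUNT(*) FROM cfsdv1 c JOIN fmdv4 f
      ON UPPER(TRIM(c.State)) = UPPER(TRIM(f.State))
     AND UPPER(TRIM(c.County)) = UPPER(TRIM(f.County))
""").fetchone()[0]
```

Here exact is 0 and normalized is 1. Comparing a few raw key values from each table side by side is usually the quickest way to see which kind of mismatch is present.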

Left join table on multiple tables in SAS

I've got multiple master tables in the same format with the same variables. I now want to left join another variable but I can't combine the master tables due to limited storage on my computer. Is there a way that I can left join a variable onto multiple master tables within one PROC SQL? Maybe with the help of a macro?
The LEFT JOIN code looks like this for one join, but I'm looking for an alternative to copying and pasting this five times:
PROC SQL;
CREATE TABLE New AS
SELECT a.*, b.Value
FROM Old a LEFT JOIN Additional b
ON a.ID = b.ID;
QUIT;
You can't do it in one create table statement, as it only creates one table at a time. But you can do a few things, depending on what your actual limiting factor is (you mention a few).
If you simply want to avoid writing the same code five times, but otherwise don't care how it executes, then just write the code in a macro, as you reference.
%macro update_table(old=, new=);
PROC SQL;
CREATE TABLE &new. AS
SELECT a.*, b.Value
FROM &old. a LEFT JOIN Additional b
ON a.ID = b.ID;
QUIT;
%mend update_table;
%update_table(old=old1, new=new1)
%update_table(old=old2, new=new2)
%update_table(old=old3, new=new3)
Of course, if the names of the five tables are in a pattern, you can perhaps automate this further based on that pattern, but you don't give sufficient information to figure that out.
If, on the other hand, you need to do this more efficiently in terms of processing than running the SQL query five times, it can be done in a number of ways, depending on the specifics of your additional table and your specific limitations. It looks to me like you have a good use case for a format lookup here; see, for example, Jenine Eason's paper, Proc Format, a Speedy Alternative to Sort/Merge. If you're just merging on the ID, this is very easy.
data for_format;
set additional;
start = ID;
label = value;
fmtname='AdditionalF'; *or '$AdditionalF' if ID is character-valued;
output;
if _n_=1 then do; *creating an "other" option so it returns missing if not found;
hlo='o';
label = ' ';
output;
end;
run;
Once that dataset is loaded with PROC FORMAT (CNTLIN=for_format), you just have five data steps that use the PUT function to add the value. You could even simply format the ID variable with that format, and it would show that value in most PROCs (if this is something like a classifier that you don't truly need "in" the data).
You can do this in a single pass through the data in a data step, using a hash table to look up values.
data new1 new2 new3;
set old1(in=a) old2(in=b) old3(in=c);
format value best.;
if _n_=1 then do;
%create_hash(lk,id,value,"Additional");
end;
value = .;
rc = lk.find();
drop rc;
if a then
output new1;
else if b then
output new2;
else if c then
output new3;
run;
%create_hash() macro available here.
You could, alternatively, use Joe's format with the same Data Step syntax.

Ambiguous column error creating table in Aster Studio 6.0

I am new to databases and am posting a problem from work. I am creating a table in Aster Studio 6.0, but got an error about an ambiguous column. I ran the same query in Teradata SQL Assistant and did not get an error.
I have six tables with millions of rows named EDW.SWIFTIQ_TRANS_DTL, EDW.SWIFTIQ_STORE, EDW.SWIFTIQ_PROD, EDW.STORE_XREF, EDW.TDLNX_STR_OUTLT, and EDW.SURV_CWC.
EDW represents the original database, but the columns were labeled with aliases.
I did a trim() on the VARCHAR columns for saving spool space. For the error about TDLNX_RTL_OUTLT_NBR, I performed an INNER JOIN on similar columns from two different tables. Doing a preview in SQL Assistant, there was a temporary table with only one column called TDLNX_RTL_OUTLT_NBR.
Here’s the SQL query:
CREATE TABLE public.table_name
DISTRIBUTE BY HASH (SRC_SYS_PROD_ID) AS (
SELECT * FROM load_from_teradata(
ON public.load_from_teradata_dummy
TDPID('database_name')
USERNAME('user_name')
PASSWORD('ss')
QUERY ('SELECT e.TDLNX_RTL_OUTLT_NBR, e.OUTLT_ST_ADDR_TXT, e.STORE_OUTLT_ZIP_CD, d.TRANS_ID, d.TRANS_DT,
d.TRANS_TM, d.UNIT_QTY, d.SRC_SYS_STORE_ID, d.SRC_SYS_PROD_ID, d.SRC_SYS_NM, a.SRC_SYS_STORE_ID, a.SRC_SYS_NM, a.STORE_NM,
a.CITY_NM, a.ZIP_CD, a.ST_cd, p.SRC_SYS_PROD_ID, p.SRC_SYS_NM, p.UPC_CD, p.PROD_ID, f.SRC_SYS_STORE_ID, f.SRC_SYS_NM,
f.TDLNX_RTL_OUTLT_NBR, g.SURV_CWC_WSLR_CUST_PARTY_ID, g.AGE_CD, g.HIGH_END_ACCT_FLG, g.RACE_ETHNC_CD, g.OCCPN_CD
FROM EDW.SWIFTIQ_TRANS_DTL d
INNER JOIN EDW.SWIFTIQ_STORE a
ON trim( a.SRC_SYS_STORE_ID) = trim(d.SRC_SYS_STORE_ID)
INNER JOIN EDW.SWIFTIQ_PROD p
ON trim(p.SRC_SYS_PROD_ID) = trim(d.SRC_SYS_PROD_ID)
and p.SRC_SYS_NM = d.SRC_SYS_NM
INNER JOIN EDW.STORE_XREF f
ON trim(f.SRC_SYS_STORE_ID) = trim(a.SRC_SYS_STORE_ID)
INNER JOIN EDW.TDLNX_STR_OUTLT e
ON trim(e.TDLNX_RTL_OUTLT_NBR)= trim(f.TDLNX_RTL_OUTLT_NBR)
INNER JOIN EDW.SURV_CWC g
ON g.SURV_CWC_WSLR_CUST_PARTY_ID = e.WSLR_CUST_PARTY_ID
WHERE TRANS_DT between ''2015-01-01'' and ''2015-03-31''')
num_instances('4') ) );
ERROR: column reference 'TDLNX_RTL_OUTLT_NBR' is ambiguous.
EDIT: Forgot to include a description about the table aliases. a stands for EDW.SWIFTIQ_STORE, p for EDW.SWIFTIQ_PROD, f for EDW.STORE_XREF, e for EDW.TDLNX_STR_OUTLT, g for EDW.SURV_CWC, and d for EDW.SWIFTIQ_TRANS_DTL.
You will get the same error when you try CREATE TABLE AS SELECT in Teradata. Besides TDLNX_RTL_OUTLT_NBR (selected from both e and f), there are three more column names, SRC_SYS_NM, SRC_SYS_PROD_ID and SRC_SYS_STORE_ID, which are used multiple times (with different table aliases) within the SELECT.
Add column aliases to make those names unique, e.g. trans_SRC_SYS_NM instead of d.SRC_SYS_NM.
Additionally the TRIMs in the joins are a very bad idea. You will probably not save that much spool, but force the optimizer to redistribute all spools for join-preparation.
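The aliasing pattern can be sketched as follows, using Python's built-in sqlite3 module as a stand-in for Aster and Teradata. The two cut-down tables and the alias prefixes xref_ and outlt_ are hypothetical. SQLite may not raise the same error for duplicate names, so the sketch only shows how aliases give every output column of a CREATE TABLE AS SELECT a unique name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Cut-down, made-up versions of two of the joined tables.
    CREATE TABLE store_xref (SRC_SYS_STORE_ID TEXT, TDLNX_RTL_OUTLT_NBR TEXT);
    CREATE TABLE str_outlt (TDLNX_RTL_OUTLT_NBR TEXT, OUTLT_ST_ADDR_TXT TEXT);
    INSERT INTO store_xref VALUES ('S1', '0001');
    INSERT INTO str_outlt VALUES ('0001', '1 Main St');

    -- Both tables expose TDLNX_RTL_OUTLT_NBR, so each copy gets its own alias.
    CREATE TABLE joined AS
    SELECT f.SRC_SYS_STORE_ID,
           f.TDLNX_RTL_OUTLT_NBR AS xref_TDLNX_RTL_OUTLT_NBR,
           e.TDLNX_RTL_OUTLT_NBR AS outlt_TDLNX_RTL_OUTLT_NBR,
           e.OUTLT_ST_ADDR_TXT
    FROM store_xref f
    JOIN str_outlt e ON e.TDLNX_RTL_OUTLT_NBR = f.TDLNX_RTL_OUTLT_NBR;
""")

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(joined)")]
```

Since the join condition makes the two outlet numbers equal anyway, selecting only one of them is an equally valid fix.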

Alternative way of joining two datasets in SAS

I have two datasets DS1 and DS2. DS1 is 100,000rows x 40cols, DS2 is 20,000rows x 20cols. I actually need to pull COL1 from DS1 if some fields match DS2.
Since I am very-very new to SAS, I am trying to stick to SQL logic.
So basically I did (short version)
proc sql;
...
SELECT DS1.col1
FROM DS1 INNER JOIN DS2
on DS1.COL2=DS2.COL3
OR DS1.COL3=DS2.COL3
OR DS1.COL4=DS2.COL2
...
After an hour or so, it was still running, but I was getting emails from SAS that I am using 700gb or so. Is there a better and faster SAS-way of doing this operation?
I would use 3 separate queries and combine them with a UNION:
proc sql;
...
SELECT DS1.col1
FROM DS1 INNER JOIN DS2
on DS1.COL2=DS2.COL3
UNION
SELECT DS1.col1
FROM DS1 INNER JOIN DS2
On DS1.COL3=DS2.COL3
UNION
SELECT DS1.col1
FROM DS1 INNER JOIN DS2
ON DS1.COL4=DS2.COL2
...
You may have null or blank values in the columns you are joining on. Your query is probably matching all the null/blank columns together resulting in a very large result set.
I suggest adding additional clauses to exclude null results.
Also - if the same row happens to exist in both tables, then you should also prevent the row from joining to itself.
Either of these could effectively result in a cartesian product join (or something close to a cartesian product join).
EDIT: By the way, a good way of debugging this type of problem is to limit both datasets to a certain number of rows, say 100 in each, then run the query and check that the output is what you expect. You can do this using the PROC SQL options inobs=, outobs=, and loops=.
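The effect of blank join keys is easy to reproduce on a small scale. The sketch below uses Python's built-in sqlite3 module with made-up data: 100 rows per table, 90 of which have a blank key. Empty strings are used rather than NULL because standard SQL never treats NULL as equal to NULL, whereas blank character values in SAS do compare equal to each other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ds1 (col1 INTEGER, col2 TEXT);
    CREATE TABLE ds2 (col3 TEXT);
""")

# Made-up data: rows 0-89 carry a blank key, rows 90-99 carry a real one.
conn.executemany(
    "INSERT INTO ds1 VALUES (?, ?)",
    [(i, "" if i < 90 else f"k{i}") for i in range(100)],
)
conn.executemany(
    "INSERT INTO ds2 VALUES (?)",
    [("" if i < 90 else f"k{i}",) for i in range(100)],
)

# Every blank key in ds1 matches every blank key in ds2.
unfiltered = conn.execute(
    "SELECT COUNT(*) FROM ds1 JOIN ds2 ON ds1.col2 = ds2.col3"
).fetchone()[0]

# Excluding blank keys leaves only the genuine matches.
filtered = conn.execute(
    "SELECT COUNT(*) FROM ds1 JOIN ds2 ON ds1.col2 = ds2.col3 "
    "WHERE ds1.col2 <> ''"
).fetchone()[0]
```

With these numbers the unfiltered join returns 8110 rows (90 x 90 blank pairs plus 10 real matches) and the filtered join returns 10, which shows how quickly blank keys approach a cartesian product on larger tables.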
First sort the datasets that you are trying to merge using proc sort. Then merge the datasets based on id.
Here is how you can do it.
I have assumed your match field is ID.
proc sort data=DS1;
by ID;
proc sort data=DS2;
by ID;
data out;
merge DS1 DS2;
by ID;
run;
You can use proc sort for DS3 and DS4 and then include them in the merge statement if you need to join them as well.

How to get row Count of the sqlite3_stmt *statement? [duplicate]

I want to get the number of selected rows as well as the selected data. At present I have to use two SQL statements:
one is
select * from XXX where XXX;
the other is
select count(*) from XXX where XXX;
Can it be realised with a single sql string?
I've checked the source code of sqlite3, and I found the function of sqlite3_changes(). But the function is only useful when the database is changed (after insert, delete or update).
Can anyone help me with this problem? Thank you very much!
SQL can't mix single-row (counting) and multi-row results (selecting data from your tables). This is a common problem when returning huge amounts of data. Here are some tips on how to handle this:
Read the first N rows and tell the user "more than N rows available". Not very precise but often good enough. If you keep the cursor open, you can fetch more data when the user hits the bottom of the view (Google Reader does this)
Instead of selecting the data directly, first copy it into a temporary table. The INSERT statement will return the number of rows copied. Later, you can use the data in the temporary table to display the data. You can add a "row number" to this temporary table to make paging more simple.
Fetch the data in a background thread. This allows the user to use your application while the data grid or table fills with more data.
Try it this way:
select (select count() from XXX) as count, *
from XXX;
select (select COUNT(0)
from xxx t1
where t1.b <= t2.b
) as 'Row Number', b from xxx t2 ORDER BY b;
just try this.
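You could combine them into a single statement:
select count(*), * from XXX where XXX
or
select count(*) as MYCOUNT, * from XXX where XXX
Be aware that in SQLite an aggregate without GROUP BY collapses the result to a single row, so this returns the count together with the columns of one arbitrary row rather than every row.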
To get the number of unique titles, you need to pass the DISTINCT clause to the COUNT function as the following statement:
SELECT
COUNT(DISTINCT column_name)
FROM
'table_name';
Source: http://www.sqlitetutorial.net/sqlite-count-function/
For those who are still looking for another method, the most elegant way I found to get the total number of rows is to use a CTE.
This ensures that the count is only calculated once:
WITH cnt(total) as (SELECT COUNT(*) from xxx) select * from xxx,cnt
The only drawback is that if a WHERE clause is needed, it has to be applied in both the main query and the CTE query.
In the first comment, Alttag said that there is no issue with running two queries. I don't agree with that unless both are part of a single transaction. If not, the source table can be altered between the two queries by any INSERT or DELETE from another thread or process. In that case, the count value might be wrong.
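As a rough illustration of the CTE approach with a WHERE clause, here is a sketch using Python's built-in sqlite3 module. The table contents and the filter b >= :low are made up. The same named parameter feeds both the CTE and the main query so the two filters cannot drift apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xxx (b INTEGER)")
conn.executemany("INSERT INTO xxx VALUES (?)", [(i,) for i in range(10)])
conn.commit()

# The filter appears twice: once for the count, once for the data rows.
query = """
    WITH cnt(total) AS (SELECT COUNT(*) FROM xxx WHERE b >= :low)
    SELECT xxx.b, cnt.total
    FROM xxx, cnt
    WHERE xxx.b >= :low
    ORDER BY xxx.b
"""
rows = conn.execute(query, {"low": 7}).fetchall()
```

Each returned row carries the data plus the total, here (7, 3), (8, 3) and (9, 3). Because it is a single statement, the count and the rows come from the same snapshot of the table. One caveat is that when no row matches the filter, the result is empty and no count comes back at all.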
Once you already have the select * from XXX results, you can just find the array length in your program, right?
If you use sqlite3_get_table instead of prepare/step/finalize, you will get all the results at once in an array (a "result table"), including the number and names of the columns and the number of rows. Then you should free the result with sqlite3_free_table.
int rows_count = 0;
while (sqlite3_step(stmt) == SQLITE_ROW)
{
rows_count++;
}
// rows_count is now available for use
sqlite3_reset(stmt); // reset the statement so it can be stepped through again
while (sqlite3_step(stmt) == SQLITE_ROW)
{
// your code in the query result
}
