I have to calculate and present the number of teams through the years in an OLAP cube.
My team fact is structured this way:
TeamId DateFrom DateTo (FactTeams)
1 2012 2015
2 2012 2015
3 2012 2015
4 2015 2018
1 2018 2019
The cube must be able to answer, for example, how many teams were active in 2012 (3 teams).
I have prepared a helper fact table that contains every combination of team ID and active year.
TeamId DateRange (FactTeamDates)
1 2012
1 2013
1 2014
1 2015
1 2018
1 2019
2 2012
2 2013
2 2014
2 2015
...and so on ...
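As a rough illustration of how the helper table relates to FactTeams, here is a minimal Python sketch (not part of the cube itself) that expands each DateFrom/DateTo range into one row per year and then counts distinct teams per year. The function names `expand_team_years` and `teams_per_year` are made up for this example, and the ranges are assumed to be inclusive on both ends.

```python
from collections import defaultdict

# Rows of FactTeams as (TeamId, DateFrom, DateTo), copied from the table above.
fact_teams = [
    (1, 2012, 2015),
    (2, 2012, 2015),
    (3, 2012, 2015),
    (4, 2015, 2018),
    (1, 2018, 2019),
]

def expand_team_years(rows):
    """Expand each inclusive year range into unique (TeamId, Year) pairs."""
    return sorted({(team_id, year)
                   for team_id, start, end in rows
                   for year in range(start, end + 1)})

def teams_per_year(pairs):
    """Count distinct teams for every year that has at least one team."""
    teams = defaultdict(set)
    for team_id, year in pairs:
        teams[year].add(team_id)
    return {year: len(ids) for year, ids in sorted(teams.items())}

fact_team_dates = expand_team_years(fact_teams)  # plays the role of FactTeamDates
active = teams_per_year(fact_team_dates)
print(active[2012], active[2015])  # 3 4
```

The set-based expansion means a team that appears in two ranges touching the same year would still be counted once for that year.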
I have created two facts, FactTeams and FactTeamDates. I also have a standard date dimension. Here is my data source view:
https://www.dropbox.com/s/5d2gzumxv5fejdq/teams.jpg
FactTeams.TeamId is linked to FactTeamDates.TeamId and FactTeamDates.DateRange is linked to DimDates.DateKey.
I have a measure “Team Number” that is a distinct count on FactTeams.TeamId.
My desired MDX query output for measure Team Number on COLUMNS and Years ON ROWS is:
Team Number
Year 2012 3
Year 2013 3
Year 2014 3
Year 2015 4
My question: how should I organize my facts and set the dimension usage in my cube to get the desired output?
SQL query that produces the desired output:
SELECT
    d.CalendarYear
    ,COUNT(DISTINCT zt.TeamId)
FROM FactTeams zt
INNER JOIN FactTeamDates td ON zt.TeamId = td.TeamId
INNER JOIN DimDates d ON d.DateKey = td.DateRange
GROUP BY d.CalendarYear
ORDER BY 1
Note: I know I can create a data source view based on the SQL query above (with joins) and end up with one joined fact table, but I want the join to happen between my cube's dimensions and facts, that is, at the cube (OLAP) level only, not in SQL (the database or the cube's data source view).
Thanks in advance
You can create a distinct count measure and this should solve your problem
I have a dataset with Year, school_id, Performance range (let's call it P_range).
Years range from 2016 to 2019
Performance range from 0 to 5
Over 2000 unique school_ids
Basically, I need to know how each school_id performed over the years in terms of performance range.
Example:
YEAR  School_ID  P_Range
2019  1          4
2018  1          5
2017  1          3
2016  1          3
2019  2          1
I would like a fourth column that tells me, for each school and year, how its performance changed compared with the previous year. For School_ID 1:
2019-2018: this school's range decreased
2018-2017: this school's range increased
2017-2016: this school's range was maintained
2016: there is nothing to compare against, so this is a "First evaluation"
So at any given time, when we look at the results of 2019, we would see a summary of results compared with 2018, and so on.
We need to know, per year, how many schools "maintained", "decreased", "increased", or had a "First evaluation".
If it were in Excel, this would be the formula:
=IF(B2=B3;IF(C2>C3;"INCREASED";IF(C2=C3;"MAINTAINED";"DECREASED"));IF(B2<>B3;"1ST EVALUATION"))
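To make the intended logic concrete, here is a minimal Python sketch of the same comparison. It assumes each school is compared with its most recent earlier evaluation (which is the previous year in the sample data), and the function names `classify_changes` and `yearly_summary` are invented for this illustration.

```python
from collections import Counter

# (YEAR, School_ID, P_Range) rows copied from the example table.
rows = [
    (2019, 1, 4),
    (2018, 1, 5),
    (2017, 1, 3),
    (2016, 1, 3),
    (2019, 2, 1),
]

def classify_changes(records):
    """Label each (school_id, year) by comparing it with that school's previous evaluation."""
    labels = {}
    previous = {}  # school_id -> P_Range of the most recent earlier year seen
    # Sort by school, then by year ascending, so "previous" is always an earlier year.
    for year, school_id, p_range in sorted(records, key=lambda r: (r[1], r[0])):
        if school_id not in previous:
            label = "1ST EVALUATION"
        elif p_range > previous[school_id]:
            label = "INCREASED"
        elif p_range < previous[school_id]:
            label = "DECREASED"
        else:
            label = "MAINTAINED"
        labels[(school_id, year)] = label
        previous[school_id] = p_range
    return labels

def yearly_summary(labels):
    """Count how many schools fall under each label per year."""
    summary = {}
    for (school_id, year), label in labels.items():
        summary.setdefault(year, Counter())[label] += 1
    return summary

labels = classify_changes(rows)
summary = yearly_summary(labels)
print(labels[(1, 2019)])      # DECREASED
print(dict(summary[2019]))
```

Sorting inside the function means the input does not have to be pre-sorted the way the Excel formula requires.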
Hi everybody, I am trying to query an already formatted Google Sheets spreadsheet. I am able to filter some of the data (I used =query(x,select * where ... )). The output I get is the following:
               may  may  june  june  july  july  july
planned  name  1    0    1     1     2     3     1
Now I want to refer to all the numbers under may (or june or july) in order to do some operation. I can't just select the values by hand because I need to automate it.
How can I get all the columns containing a specific marker (in my case the name of the month)? If that is not possible, can you suggest a different way to do it? (I am not very experienced with Google Sheets or Excel.)
Since QUERY's where clause filters rows rather than columns, you'd transpose the range first, select the rows you want, and then transpose the result back if needed:
Input:
               may  may  june  june  july  july  july
planned  name  1    0    1     1     2     3     1
Formula (select columns with a value > 0):
=QUERY(TRANSPOSE(A27:I28),"Select * where Col2>0")
Output:
      planned name
may   1
june  1
june  1
july  2
july  3
july  1
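The transpose-then-filter idea can be sketched outside of Sheets as well. The Python snippet below is only an illustration of the technique, not Sheets syntax: it transposes the two rows, keeps the entries whose month cell matches a marker, and returns the values underneath. The helper names `transpose` and `values_for` are made up, and the blank cells above "planned" and "name" are assumed to be empty strings.

```python
# The two sheet rows; empty strings stand for the blank cells above "planned" and "name".
sheet = [
    ["", "", "may", "may", "june", "june", "july", "july", "july"],
    ["planned", "name", 1, 0, 1, 1, 2, 3, 1],
]

def transpose(rows):
    """Turn rows into columns, like TRANSPOSE in a spreadsheet."""
    return [list(col) for col in zip(*rows)]

def values_for(rows, marker):
    """Return the second-row values of every column whose first-row cell equals marker."""
    return [value for header, value in transpose(rows) if header == marker]

print(values_for(sheet, "july"))  # [2, 3, 1]
print(sum(values_for(sheet, "may")))  # 1
```

Once the values for one month are in a plain list, any aggregate (sum, count, average) can be applied to them.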
In QuickReport for Delphi, how can I use multiple data fields in the Expression property of a QRGroup?
For example I have a table like this one:
student departement entry year
---------------------------------
1 math 2016
2 math 2017
3 physique 2017
4 math 2016
5 physique 2018
I want to group my report by departement and entry year.
How can I do that in the QRGroup Expression property?
Thanks guys
I have two tables (A and B) that are related through the following four columns:
TECHNOLOGY - LAYER - YEAR - WORKWEEK
YEAR and WORKWEEK are calculated columns in both of the tables.
I have a column (WAFCOUNT) in table A that I want to insert into table B based off of those four related columns.
I've tried Insert->Columns, but it won't allow me to join the YEAR and WORKWEEK columns. I know this will work if I freeze them, but I'm trying not to do that so the tables don't become embedded.
It's my goal to keep this library item as dynamic as possible.
Here's a data sample for table A.
TECHNOLOGY LAYER YEAR WORKWEEK WAFCOUNT
XV-15 A 2016 1 23
XV-15 A 2016 2 14
XV-15 B 2016 2 49
XV-20 A 2016 1 7
XV-20 B 2016 1 19
Here's a data sample for table B.
TECHNOLOGY LAYER YEAR WORKWEEK
XV-20 A 2016 1
XV-20 B 2016 1
XV-15 A 2016 1
XV-15 A 2016 2
XV-15 B 2016 2
I created a 'Unique_ID' column in both tables by concatenating TECHNOLOGY, LAYER, YEAR, and WORKWEEK. Using Unique_ID as the join key, I added the 'WAFCOUNT' column to table B. Please let me know if this helps.
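The composite-key idea can be illustrated with a minimal Python sketch. This is not Spotfire code; it only shows the lookup logic, using the sample rows from the question. A tuple of the four columns is used as the key here instead of a concatenated string, which is an assumption of this example and avoids accidental collisions between values such as ("XV-1", "5A") and ("XV-15", "A").

```python
# Table A rows: (TECHNOLOGY, LAYER, YEAR, WORKWEEK, WAFCOUNT), from the sample above.
table_a = [
    ("XV-15", "A", 2016, 1, 23),
    ("XV-15", "A", 2016, 2, 14),
    ("XV-15", "B", 2016, 2, 49),
    ("XV-20", "A", 2016, 1, 7),
    ("XV-20", "B", 2016, 1, 19),
]

# Table B rows: (TECHNOLOGY, LAYER, YEAR, WORKWEEK), from the sample above.
table_b = [
    ("XV-20", "A", 2016, 1),
    ("XV-20", "B", 2016, 1),
    ("XV-15", "A", 2016, 1),
    ("XV-15", "A", 2016, 2),
    ("XV-15", "B", 2016, 2),
]

# Build the lookup keyed on the four related columns (the "Unique_ID").
wafcount_by_key = {row[:4]: row[4] for row in table_a}

# Append WAFCOUNT to every row of B; rows without a match get None.
table_b_with_count = [row + (wafcount_by_key.get(row),) for row in table_b]

for row in table_b_with_count:
    print(row)
```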
Use a transformation instead of a calculated column (Insert > Transformations > Calculate new column), and then try the join.
I am new to SAS and have this basic problem. I have a list of NYSE trading dates in table A as follows -
trading_date
1st March 2012
2nd March 2012
3rd March 2012
4th March 2012
5th March 2012
6th March 2012
I have another table B that has share price information as -
Date ID Ret Price
1st March 2012 1 … …
3rd March 2012 1 … …
4th March 2012 1 … …
5th March 2012 1 … …
6th March 2012 1 … …
1st March 2012 2 … …
3rd March 2012 2 … …
4th March 2012 2 … …
... has numeric data related to price and returns.
Now I need to join the NYSE Data table to the above table to get the following table -
Date ID Ret Price
1st March 2012 1 … …
2nd March 2012 1 0 0
3rd March 2012 1 … …
4th March 2012 1 … …
5th March 2012 1 … …
6th March 2012 1 … …
1st March 2012 2 … …
2nd March 2012 2 0 0
3rd March 2012 2 … …
4th March 2012 2 … …
i.e. a simple left join. The zeros will be filled with . in SAS to indicate missing values, but you get the idea. But if I use the following command -
proc sql;
    create table joined as
    select table_a.trading_date, table_b.*
    from table_a
    left outer join table_b
        on table_a.trading_date = table_b.date;
quit;
The join happens only for the first ID (i.e. ID=1), while for the rest of the IDs the data stays as it was. But I need to insert the trading dates for all IDs.
How can I get the final data without running a do-while loop over all IDs? I have 1000 IDs, and looping and joining 1000 times is not an option due to limited memory.
Joe is right, you also need to take ID into consideration, but with his solution you cannot get 2nd March 2012 because no one traded that day. You can do everything in a single SQL step (which will take a bit longer):
proc sql;
    create table final as
    select d.trading_date, d.ID, t.Price, t.Ret
    from
        (
            select trading_date, ID
            from table_a, (select distinct ID from table_b)
        ) d
    left join
        (
            select *
            from table_b
        ) t
    on t.Date=d.trading_date and t.ID=d.ID
    order by d.id, d.trading_date;
quit;
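For readers who want to see the logic of that query step by step, here is a minimal Python sketch of the same technique: build every (ID, trading date) combination first, then left-join the price rows onto it. It is not SAS code, and the Ret and Price numbers are made-up placeholders because the question elides the real values.

```python
from datetime import date
from itertools import product

# Table A: every NYSE trading date in the sample (1st to 6th March 2012).
trading_dates = [date(2012, 3, day) for day in range(1, 7)]

# Table B keyed by (Date, ID) -> (Ret, Price).
# The numeric values are hypothetical placeholders, not data from the question.
table_b = {
    (date(2012, 3, 1), 1): (0.01, 10.0),
    (date(2012, 3, 3), 1): (0.02, 10.2),
    (date(2012, 3, 4), 1): (0.00, 10.2),
    (date(2012, 3, 5), 1): (-0.01, 10.1),
    (date(2012, 3, 6), 1): (0.03, 10.4),
    (date(2012, 3, 1), 2): (0.05, 20.0),
    (date(2012, 3, 3), 2): (0.01, 20.2),
    (date(2012, 3, 4), 2): (-0.02, 19.8),
}

# Distinct IDs, like "select distinct ID from table_b".
ids = sorted({id_ for _, id_ in table_b})

# Cross join IDs with trading dates, then left join table_b on (date, ID).
# Missing combinations get None, which plays the role of SAS's "." here.
final = []
for id_, day in product(ids, trading_dates):
    ret, price = table_b.get((day, id_), (None, None))
    final.append((day, id_, ret, price))

print(len(final))  # 12
print(final[1])
```

The result is ordered by ID and then by date, matching the `order by` clause of the query.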
Your left join doesn't work since it doesn't take ID into account. SAS (or rather SQL) doesn't know that it should repeat by ID.
The easiest way to get the full combination is PROC FREQ with SPARSE, assuming someone has a trade on every valid trading day.
proc freq data=table_b noprint;
    /* table_b stores the trading day in the DATE column */
    tables id*date / sparse out=table_all(keep=id date);
run;
Then join that to the original table_b by id and date.
Alternately, you can use PROC MEANS, which can get your numerics (it can't get characters this way, unless you can use them as a class value).
Using table_b as created by Anton (With ret and price variables):
proc means data=table_b noprint completetypes nway;
    /* table_b stores the trading day in the DATE column */
    class id date;
    var ret price;
    output out=table_allmeans sum=;
run;
This will output missing for missing rows and values for present rows, and will have a _FREQ_ variable that allows you to differentiate whether a row is really present in the trading dataset or not.
I suppose there must be something off with the data because your query looks fine and worked on the testing data I generated along the lines you described:
data table_a;
    format trading_date date9.;
    do trading_date= "01MAR2012"d to "06MAR2012"d;
        output;
    end;
run;
data table_b;
    format date date9.;
    ret = 0;
    price = 0;
    do date= "01MAR2012"d to "06MAR2012"d;
        do ID = 1 to 4;
            if ranuni(123) < 0.3 then
                output;
        end;
    end;
run;
Below is what I get after running your query copied verbatim:
trading_date date ret price ID
01MAR2012 01MAR2012 0 0 3
02MAR2012 02MAR2012 0 0 2
03MAR2012 03MAR2012 0 0 1
03MAR2012 03MAR2012 0 0 2
04MAR2012 04MAR2012 0 0 2
05MAR2012 05MAR2012 0 0 3
06MAR2012 . . . .
It is worth checking the format of your dates: are they numeric? If they are character, are they formatted the same way? If they are numeric, are they dates or datetimes with some odd format applied?