How to join to different levels of a date dimension in a data warehouse - data-warehouse

I have a simple data warehouse with an existing data mart. This data mart includes a date dimension table. Because the date grain of the fact table is day, the date dimension table is also at the grain of day (i.e., one row per day). Because date is a hierarchical dimension, I have denormalized the hierarchy into the date dimension table. So, even though the grain of the date dimension table is day, it also includes attributes like week, month, and year.
I'm designing a new data mart whose fact table has a date grain of month, so I have to join from this new fact table to a month dimension. What is the best implementation of this month dimension? That is, should it be a view over the date dimension table, or should it be its own physical table?
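To make the view option concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`dim_date`, `dim_month`, `month_key`, `fact_monthly`) are assumptions made for illustration and do not come from the question. It only shows what "a view using the date dimension table" could look like, and is not a recommendation of one option over the other.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,  -- hypothetical yyyymmdd key, e.g. 20240131
    full_date TEXT NOT NULL,
    month_key INTEGER NOT NULL,     -- hypothetical yyyymm key, e.g. 202401
    month     INTEGER NOT NULL,
    year      INTEGER NOT NULL
);
-- Month dimension derived from the day-grain table: one row per month.
CREATE VIEW dim_month AS
SELECT DISTINCT month_key, month, year FROM dim_date;

-- Month-grain fact table that joins to the view.
CREATE TABLE fact_monthly (month_key INTEGER NOT NULL, amount REAL NOT NULL);
""")

# Populate one quarter of days in the day-grain dimension.
day = date(2024, 1, 1)
while day <= date(2024, 3, 31):
    conn.execute(
        "INSERT INTO dim_date VALUES (?, ?, ?, ?, ?)",
        (int(day.strftime("%Y%m%d")), day.isoformat(),
         day.year * 100 + day.month, day.month, day.year),
    )
    day += timedelta(days=1)

conn.executemany("INSERT INTO fact_monthly VALUES (?, ?)",
                 [(202401, 100.0), (202402, 150.0)])

month_rows = conn.execute("SELECT COUNT(*) FROM dim_month").fetchone()[0]
result = conn.execute("""
    SELECT m.year, m.month, f.amount
    FROM fact_monthly f
    JOIN dim_month m ON m.month_key = f.month_key
    ORDER BY m.month_key
""").fetchall()
print(month_rows, result)
```

Because the view is derived from the day-grain table, the month attributes stay consistent with the date dimension automatically. A physical month table would need its own load step to stay in sync.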

Related

Subset data with dates from lookup table in SAS

I have a dataset that I need to subset into dates that fall within daylight saving time and dates that do not, so that I can perform different adjustments on each group of data. The lookup table, DST_HOUR_SHIFT, contains dates from 2007-2020 (example below), and the main data table contains hourly meter data with a value for each hour from 2007-2020 (example below). Joins seem to be my Achilles heel and I do not know the most efficient way to separate the data. Any help would be greatly appreciated.
DST_HOUR_SHIFT:
DST_BEG     DST_END
03/11/2007  11/04/2007
03/09/2008  11/02/2008
03/08/2009  11/01/2009
INITIAL_SYSTEM_LOAD:
SHORT_DATE  HOUR  VALUE
01/01/2007  0     1225.00
01/01/2007  1     1170.00
01/01/2007  2     1124.00
01/01/2007  3     1101.00

Creating Google Sheets pivot tables with custom formula

I am creating a table for finance that will hold a database of trades: the open and close dates of each trade, the open and close prices, ticket (the stock name), change (a percentage calculated from the open and close prices), and days (calculated from the two dates).
I need to generate a new table for each month of the year in which I have date records. Google Sheets has pivot tables, and that is what I need. I need these columns: average win % per month, average loss % per month, average number of win days per month, and average number of loss days per month.
I did that in 2 separate tables, each with its own pivot table settings.
But I cannot create that in one table, because I do not know how to write a custom formula. So, I am looking for some help here.
I tried some things: I can filter and compute an average. But I do not know how to get an array of items from the pivot table data sorted by month. If I can get the pivot table data sorted by month, I can filter by positive/negative and find the average.
My sample: https://docs.google.com/spreadsheets/d/1TCLWZ7-oUSwM8DLODPpH6wwssgfYyo3BVlEpWj78kV4/edit?usp=sharing

How to Model Date Dimensions with Fact Tables of Different Grains

We have some use cases for our DW where we have fact tables at different grains - e.g., sales by store by day (fact 1) and sales budget targets by month (fact 2). They both involve Date as a grain, but in one case the grain is day and the other the grain is period.
Assuming we can't in the near term change the grain, what's the right way to model this?
A Date and a Month dimension, which will have conformed attributes?
1 Date dimension, with nulls or flags or something when it's representing a higher value (e.g., month)
Something else?
You only need one date dimension with one row per day. Just link to the last day of your period.
E.g. for a monthly aggregated fact just link to the last day of the month in your date dimension.
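A minimal sketch of this single-dimension approach, using Python's built-in `sqlite3` module. All table and column names (`dim_date`, `fact_sales`, `fact_budget`), the yyyymmdd key format, and the helper `last_day_key` are assumptions made for illustration. The monthly budget rows carry the key of the last day of their month, and both facts roll up through the same month attributes.

```python
import calendar
import sqlite3
from datetime import date, timedelta

def last_day_key(year, month):
    """Return the yyyymmdd key of the last day of the given month (hypothetical helper)."""
    return year * 10000 + month * 100 + calendar.monthrange(year, month)[1]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE fact_sales  (date_key INTEGER REFERENCES dim_date, amount REAL);  -- day grain
CREATE TABLE fact_budget (date_key INTEGER REFERENCES dim_date, amount REAL);  -- month grain
""")

# One row per day for January and February 2024.
day = date(2024, 1, 1)
while day <= date(2024, 2, 29):
    conn.execute("INSERT INTO dim_date VALUES (?, ?, ?)",
                 (int(day.strftime("%Y%m%d")), day.year, day.month))
    day += timedelta(days=1)

conn.executemany("INSERT INTO fact_sales VALUES (?, ?)",
                 [(20240105, 40.0), (20240120, 70.0), (20240210, 90.0)])
# Each monthly budget row links to the last day of its month.
conn.executemany("INSERT INTO fact_budget VALUES (?, ?)",
                 [(last_day_key(2024, 1), 100.0), (last_day_key(2024, 2), 120.0)])

comparison = conn.execute("""
    WITH sales AS (
        SELECT d.year, d.month, SUM(s.amount) AS actual
        FROM fact_sales s JOIN dim_date d ON d.date_key = s.date_key
        GROUP BY d.year, d.month
    ), budget AS (
        SELECT d.year, d.month, SUM(b.amount) AS target
        FROM fact_budget b JOIN dim_date d ON d.date_key = b.date_key
        GROUP BY d.year, d.month
    )
    SELECT s.year, s.month, s.actual, t.target
    FROM sales s JOIN budget t ON t.year = s.year AND t.month = s.month
    ORDER BY s.year, s.month
""").fetchall()
print(comparison)
```

Each fact is aggregated to month before the two are joined, so the different grains never meet row to row.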
Two different dimensions, one for Date and one for Month

Fact tables with different level Date Dimension Data as Date Dimension Key

I am a beginner in data warehousing. I have two fact tables, named sales and budget.
I can put days (the Date dimension key) in my sales fact, but the table I have for budget only has month-level detail, so I don't know what I should do. Would you please tell me what the best practices are in this case?
regards
Mana
In this scenario, I generally find it easiest to always store the month-level data on either the first or the last day of the month. This way, you can still aggregate up to month from date and compare sales and budget, and you will only store the budget value once a month as intended. This would also help if down the road you're asked to store the budget data at the day level.
If you don't want to use this approach, then you would want to snowflake out your date dimension and have a separate month dimension, and then your budget fact table can FK to this new dimension.
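The snowflaked alternative could be sketched as follows, using Python's built-in `sqlite3` module. The names (`dim_month`, `dim_date.month_key`, `fact_sales`, `fact_budget`) are assumptions made for illustration. The date dimension carries a foreign key to a separate month dimension, and the budget fact points at that month dimension directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_month (month_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
-- Snowflake: each day row points at its month row.
CREATE TABLE dim_date  (date_key INTEGER PRIMARY KEY,
                        month_key INTEGER NOT NULL REFERENCES dim_month);
CREATE TABLE fact_sales  (date_key  INTEGER REFERENCES dim_date,  amount REAL);
CREATE TABLE fact_budget (month_key INTEGER REFERENCES dim_month, amount REAL);

INSERT INTO dim_month VALUES (202401, 2024, 1), (202402, 2024, 2);
-- Only the days used below are loaded, to keep the sketch short.
INSERT INTO dim_date  VALUES (20240105, 202401), (20240120, 202401), (20240210, 202402);
INSERT INTO fact_sales  VALUES (20240105, 40.0), (20240120, 70.0), (20240210, 90.0);
INSERT INTO fact_budget VALUES (202401, 100.0), (202402, 120.0);
""")

# Sales roll up day -> month through dim_date; budget joins to dim_month directly.
# This relies on there being exactly one budget row per month.
rows = conn.execute("""
    SELECT m.month_key, b.amount AS target, SUM(s.amount) AS actual
    FROM dim_month m
    JOIN fact_budget b ON b.month_key = m.month_key
    JOIN dim_date d    ON d.month_key = m.month_key
    JOIN fact_sales s  ON s.date_key  = d.date_key
    GROUP BY m.month_key, b.amount
    ORDER BY m.month_key
""").fetchall()
print(rows)
```

If a month could have several budget rows, the budget would need to be aggregated before the join to avoid double counting sales.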

Data warehouse reporting questions

I've just begun diving into data warehousing and I have one question that I just can't seem to figure out.
I have a business which has ten stores, each with a certain number of employees. In my data warehouse I have a dimension representing the store. The employee dimension is an SCD, with columns for the start/end dates and the store at which the employee is working.
My fact table is based on suggestions the employees give (anonymously) to the store managers. This table contains the suggestion type (cleanliness, salary issue, etc), the date it was submitted (foreign keyed to a Time dimension table), and the store at which it was submitted.
What I want to do is create a report showing the ratio of the number of suggestions to the number of employees in a given year. Because the number of employees changes periodically I just can't do a simple query for the total number of employees.
Unfortunately I've searched the web quite a bit trying to find a solution but the majority of the examples are retail based sales, which is different from what I'm trying to do.
Any help would be appreciated. I do have the AdventureWorksDW installed on my machine so I can use that as a point of reference if anyone offers a suggestion using that.
Thanks in advance!
The slowly changing dimension should have a natural key that identifies the source of the row (otherwise, how would the load process know what to compare against to detect changes?). This key should be constant across all versions of a given dimension member. You can get a count of employees by computing a distinct count of the natural key.
Edit: If your transaction table (suggestion) has a date on it, a distinct count of employees grouped by a computed function of the suggestion date (e.g. datepart(yy, s.SuggestionDate)) and the business unit should do it. You don't need to worry about the date on the employee dimension, as the applicable row should join directly to the transaction table.
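The distinct count on the natural key could be sketched as follows, using Python's built-in `sqlite3` module. This sketch makes assumptions that are not in the answer. The table and column names are invented. Also, because the question describes the suggestions as anonymous (the fact table holds store and date but no employee key), the sketch does not join the fact table to the employee dimension. Instead it counts employees from the SCD rows whose start/end range overlaps the reporting year, which is a swapped-in technique and not the join the answer describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_employee (
    employee_key INTEGER PRIMARY KEY,  -- surrogate key, one per SCD version
    employee_id  TEXT NOT NULL,        -- natural key, constant across versions
    store_key    INTEGER NOT NULL,
    start_date   TEXT NOT NULL,
    end_date     TEXT                  -- NULL marks the current version
);
CREATE TABLE fact_suggestion (
    suggestion_date TEXT NOT NULL,
    store_key       INTEGER NOT NULL,
    suggestion_type TEXT NOT NULL
);
INSERT INTO dim_employee VALUES
    (1, 'E1', 1, '2022-01-01', '2023-06-30'),
    (2, 'E1', 1, '2023-07-01', NULL),          -- second version of the same person
    (3, 'E2', 1, '2023-03-01', NULL),
    (4, 'E3', 2, '2023-01-01', '2023-12-31');
INSERT INTO fact_suggestion VALUES
    ('2023-02-10', 1, 'cleanliness'), ('2023-05-02', 1, 'salary issue'),
    ('2023-08-19', 1, 'cleanliness'), ('2023-11-30', 1, 'cleanliness'),
    ('2023-01-15', 2, 'salary issue'), ('2023-04-04', 2, 'cleanliness'),
    ('2023-09-09', 2, 'cleanliness');
""")

ratios = conn.execute("""
    WITH emp AS (
        -- DISTINCT on the natural key so SCD versions are not double counted.
        SELECT store_key, COUNT(DISTINCT employee_id) AS employees
        FROM dim_employee
        WHERE start_date <= :year_end
          AND (end_date IS NULL OR end_date >= :year_start)
        GROUP BY store_key
    ), sug AS (
        SELECT store_key, COUNT(*) AS suggestions
        FROM fact_suggestion
        WHERE suggestion_date BETWEEN :year_start AND :year_end
        GROUP BY store_key
    )
    SELECT e.store_key, s.suggestions, e.employees,
           1.0 * s.suggestions / e.employees AS ratio
    FROM emp e JOIN sug s ON s.store_key = e.store_key
    ORDER BY e.store_key
""", {"year_start": "2023-01-01", "year_end": "2023-12-31"}).fetchall()
print(ratios)
```

Employee E1 has two SCD versions in 2023 but is counted once for store 1, which is the point of counting the natural key instead of the surrogate key.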
Add another fact table holding the number of employees in each store for each month; you could use the maximum number for the month. Then average the monthly values over the year and use the result as the "number of employees in a year".
Load your new fact table at the end of each month. The new table would look like:
fact table: EmployeeCount
KeyEmployeeCount  int  -- surrogate key
KeyDate           int  -- FK to date dimension, point to last day of a month
KeyStore          int  -- FK to store dimension
NumberOfEmployes  int  -- (max) number of employees for the month in a given store
If you need a finer resolution, use "per week" or even "per day". The main idea is to average the NumberOfEmployes measure for a given store over the year.
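A minimal sketch of this averaging, using Python's built-in `sqlite3` module. The `EmployeeCount` table and its columns follow the layout above (including the original spelling of `NumberOfEmployes`). The `DimDate` table, its `Year` column, the yyyymmdd key format, and the sample numbers are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimDate (KeyDate INTEGER PRIMARY KEY, Year INTEGER NOT NULL);
CREATE TABLE EmployeeCount (
    KeyEmployeeCount INTEGER PRIMARY KEY,  -- surrogate key
    KeyDate          INTEGER NOT NULL REFERENCES DimDate,  -- last day of the month
    KeyStore         INTEGER NOT NULL,     -- FK to store dimension (not modelled here)
    NumberOfEmployes INTEGER NOT NULL      -- (max) employees for the month in the store
);
-- Only the month-end rows used below are loaded into the date dimension.
INSERT INTO DimDate VALUES (20230131, 2023), (20230228, 2023), (20230331, 2023);
INSERT INTO EmployeeCount (KeyDate, KeyStore, NumberOfEmployes) VALUES
    (20230131, 1, 10), (20230228, 1, 12), (20230331, 1, 14),
    (20230131, 2, 5),  (20230228, 2, 5),  (20230331, 2, 8);
""")

# Average the monthly head counts per store and year.
yearly = conn.execute("""
    SELECT d.Year, e.KeyStore, AVG(e.NumberOfEmployes) AS AvgEmployees
    FROM EmployeeCount e
    JOIN DimDate d ON d.KeyDate = e.KeyDate
    GROUP BY d.Year, e.KeyStore
    ORDER BY e.KeyStore
""").fetchall()
print(yearly)
```

The resulting average can then serve as the denominator when dividing the yearly suggestion count for the same store.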

Resources