PowerBI DAX: Budget reallocation for future dates based on Spend actual - time-series

Could you please help me write a DAX measure in Power BI Desktop that would a) calculate the budget left for a set period (a quarter) by subtracting total spend from total budget and b) redistribute the resulting leftover budget across days based on a daily % skew (which is set up manually)?
Please see the screenshot below for an example of a calculation (yellow column) I need to have in DAX.
[Screenshot: dax_budget_leftovers_distribution]

Measure1 = Calculate(Sum(table1[spend actual]), All(table1))
Measure2 = Calculate(Sum(table2[budget]), All(table2))
Measure3 = [Measure2] - [Measure1]
Measure4 = Divide(Calculate(Sum(table3[BudgetReForecast%])), 100, blank()) * [Measure3]
Measure4 is what gets put in the table visual next to the table3 Date field; this assumes the BudgetReForecast% column also lives in table3.
This question is very vague, and I think your logic may not work when you apply this to a whole dataset of dates, but this is the general idea.
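The arithmetic behind these measures can be sanity-checked outside Power BI. Below is a minimal Python sketch, assuming the skew is stored as percentages per date; the function name `redistribute_leftover` and all sample figures are hypothetical and do not come from the question.

```python
from datetime import date


def redistribute_leftover(total_budget, total_spend, daily_skew_pct):
    """Spread the unspent budget across dates according to a percentage skew.

    daily_skew_pct maps each date to its share of the leftover, in percent.
    """
    leftover = total_budget - total_spend  # same idea as Measure3
    # Same idea as Measure4, evaluated per date: (percent / 100) * leftover
    return {day: leftover * pct / 100 for day, pct in daily_skew_pct.items()}


# Hypothetical figures: 1000 budgeted, 400 spent, leftover split 50/30/20.
skew = {date(2024, 3, 29): 50, date(2024, 3, 30): 30, date(2024, 3, 31): 20}
allocation = redistribute_leftover(1000, 400, skew)
```

If the percentages add up to 100, the daily allocations add up to the full leftover, which is a quick check to run against the Power BI result.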

Related

Complex Google Sheets Puzzle - Help appreciated - Link to sheet included

I have a finance sheet that tracks the following in different columns:
(A) Amount Already Built Into Budget for the Year [Purple]
(B) Amount Spent Year-to-Date [Red]
(C,E,G,I) Q1-Q4 Reimbursement amounts [Green]
(D,F,H,J) Q1-Q4 Hidden columns to be used to help create this function and to tick on and off based on reimbursement amount input [Gold]
(K) Reimbursement Remaining [Blue]
The amount already built into the budget needs to be divided by 4 and to show up on each quarter as reimbursed. That amount will be entered into each quarter by default with a code that divides Column A by 4. The user will replace that value each quarter by adding the column K value to the value for that quarter.
Each quarter, the user should be able to add the value in column K to the appropriate quarter and end up with zero balance in column K.
The Amount Spent Column will update monthly and include:
Expenses built into the budget to-date
Additional expenses to-date
The goal of this sheet is to allow someone to input how much was actually reimbursed in the Q1-Q4 Reimbursement amounts [Green] and to provide a tool for that person to know how much needs to be reimbursed at any given time in the Reimbursement Remaining [Blue] column.
Column K needs to still be able to function if all expenses appear in the actuals Q4--meaning, Column K will need to be zero for Q1-Q3 and only show a balance if the sum of the actuals recorded exceeds what was built into the budget.
Wow that was hard to write out.
What is a formula that could go in Column K to make this work?
I hope this makes sense to someone!
-Alfred
try:
=if(C2+E2+G2+I2=A2+B2,0,
if(AND(D2+F2+H2+J2=0,D2=0),B2-C2,
if(AND(D2+F2+H2+J2=1,D2=1),B2-C2,
if(AND(D2+F2+H2+J2=1,F2=1),B2-C2-E2,
if(AND(D2+F2+H2+J2=1,H2=1),B2-C2-E2-G2,
if(AND(D2+F2+H2+J2=1,J2=1),B2-C2-E2-G2-I2, 0)
)))))
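To make the branching easier to follow, here is the same nested IF written out as a Python function. It is only a line-by-line mirror of the formula above, with arguments named after the sheet's column letters; the function name is hypothetical.

```python
def reimbursement_remaining(a, b, c, d, e, f, g, h, i, j):
    """Mirror of the nested-IF formula; arguments follow the column letters A..J."""
    flags = d + f + h + j  # how many of the hidden quarter flags are ticked
    if c + e + g + i == a + b:
        return 0
    if flags == 0 and d == 0:
        return b - c
    if flags == 1 and d == 1:
        return b - c
    if flags == 1 and f == 1:
        return b - c - e
    if flags == 1 and h == 1:
        return b - c - e - g
    if flags == 1 and j == 1:
        return b - c - e - g - i
    return 0  # more than one flag ticked, or any other combination
```

Written this way it is clear that the formula returns 0 whenever more than one hidden flag is set at the same time.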

How do I calculate average tons per hour by driver in Google Sheets?

https://docs.google.com/spreadsheets/d/1SiUfqrJNHPAYjibeNBdzWQEcuzka5srf7mSHAv_bn5k/edit?usp=sharing
What would a formula look like to calculate the average tons per hour by driver in this example spreadsheet? It needs to correct for long gaps, or even days, between loads.
We're being charged on an hourly basis for freight so I'd like to figure out which drivers are the most efficient.
It's been tricky because the only concrete source of information we have is the scale tickets. So if they only do a single load in a day or go several hours between loads then the data would be skewed if you use a simple metric like time elapsed.
Also, I'll need the time elapsed between rows (not just the difference between Time In and Time Out) unless that time is > 1.5 hours. So something like:
=(TIMEVALUE(E3)-TIMEVALUE(D2))*24
...With some added logic to not include anything over 1.5 hours.
If a pivot table would be better than a lengthy formula, that's fine with me.
Here's an example for some added context: Driver Cody goes to Farm Nic to receive a load of hay, then comes back to the weigh station (Ticket, Time In, Gross are then determined), dumps the load, comes back to weigh again empty (Tare, Net, and Time Out are determined here), and goes back to Farm Nic until all the hay is harvested. Then it's on to Farm Zach and Farm Williams to repeat the process. There are several Drivers going at a time, which can be seen if the spreadsheet is sorted by Ticket. My goal is to figure out how many Tons each driver delivers per hour. The time elapsed would include the time between Tickets, because Time In and Time Out just show the time elapsed between coming in with a load of hay and leaving to go back to the field. To get a true measure of tons delivered per hour, you'd need to include the time between tickets, but also remove any instance where that time is greater than 1.5 hours. That will account for circumstances where the Driver isn't working and we aren't being billed, such as during equipment breakdowns.
I'm not much of a formulas guy, so I hope this suffices for your needs.
First I added a column to your sheet to calculate how many hours each row takes. To do that I used the TIMEVALUE function:
=(TIMEVALUE(E2)-TIMEVALUE(D2))*24
Now you just need to get each driver's total hours and total tons and compute the quotient total_tons / total_hours. There may be other functions that would do the job; I used QUERY:
=QUERY(Sheet1!A:M, "select C, sum(I), sum(M), sum(I) / sum(M) group by C", 1)
I think it is a pretty straightforward query: it groups all the data by column C (the driver's name) and then sums column I (tons) and column M (hours).
With the following result:
The format may be a little off, but you can change it as much as you want. You can copy or play with the sheet.
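The grouping that the QUERY performs can be sketched in Python as follows. This is only an illustration of the "sum tons, sum hours, divide" idea; the function name and the sample tickets are hypothetical.

```python
from collections import defaultdict


def tons_per_hour(rows):
    """rows: iterable of (driver, tons, hours) tuples, one per scale ticket."""
    tons = defaultdict(float)
    hours = defaultdict(float)
    for driver, t, h in rows:
        tons[driver] += t
        hours[driver] += h
    # Divide the totals per driver, like sum(I) / sum(M) grouped by C.
    return {d: tons[d] / hours[d] for d in tons if hours[d] > 0}


# Hypothetical tickets, not taken from the shared sheet.
tickets = [("Cody", 10.0, 0.5), ("Cody", 12.0, 0.5), ("Nic", 9.0, 0.75)]
rates = tons_per_hour(tickets)
```

Dividing the totals (rather than averaging per-ticket rates) weights each ticket by the time it took, which matches what the QUERY does.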
EDIT
After you changed your requirements, I changed my formula to calculate the hours worked:
=IF(
  AND(
    C3=C2,
    A3=A2,
    IFERROR((TIMEVALUE(E3)-TIMEVALUE(D2))*24 <= 1.5, TRUE)
  ),
  (TIMEVALUE(E3)-TIMEVALUE(D2))*24,
  (TIMEVALUE(E2)-TIMEVALUE(D2))*24
)
Let me explain. Before, the formula was much simpler, but now that we need to check multiple rows it becomes lengthier.
First, with the IF and AND functions we check whether the next row has:
The same day (A3=A2)
The same Driver (C3=C2)
At most an hour and a half of difference ((TIMEVALUE(E3)-TIMEVALUE(D2))*24 <= 1.5)
Also, because the last row throws an error when TIMEVALUE is applied to an empty cell, I had to add the IFERROR.
After that, the TRUE branch (same day, same driver, at most 1.5 hours of difference) calculates from the current Time In (D2) to the next row's value in column E (E3), as in the formula above:
(TIMEVALUE(E3)-TIMEVALUE(D2))*24
In the FALSE branch we do the same as before:
(TIMEVALUE(E2)-TIMEVALUE(D2))*24
The QUERY function stays the same. And the results have decreased drastically:
If you have any doubts you can go ahead and see the sheet
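For readers who prefer code to nested spreadsheet functions, the per-row logic can be sketched in Python. This follows the formula as written (it measures from the current row's column D to the next row's column E) and assumes column D is Time In, column E is Time Out, and rows are sorted so that a driver's tickets are adjacent. The function names and sample rows are hypothetical.

```python
from datetime import datetime


def hours_between(start, end):
    """Hours from start to end, both 'HH:MM' strings on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 3600


def billable_hours(rows, max_gap=1.5):
    """rows: list of (day, driver, time_in, time_out) tuples.

    Returns one hours figure per row, mirroring the IF/AND formula.
    """
    result = []
    for idx, (day, driver, time_in, time_out) in enumerate(rows):
        hours = hours_between(time_in, time_out)  # FALSE branch: this row only
        if idx + 1 < len(rows):
            next_day, next_driver, _, next_out = rows[idx + 1]
            span = hours_between(time_in, next_out)
            # TRUE branch: same day, same driver, span within the cap
            if next_day == day and next_driver == driver and span <= max_gap:
                hours = span
        result.append(hours)
    return result


# Hypothetical tickets for one driver on one day.
tickets = [
    ("2020-07-01", "Cody", "08:00", "08:15"),
    ("2020-07-01", "Cody", "09:00", "09:10"),
    ("2020-07-01", "Cody", "13:00", "13:20"),
]
hours = billable_hours(tickets)
```

In this sample the first row is stretched to cover the gap before the second ticket, while the long break before the third ticket is ignored and only the on-scale time counts.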

Calculate weekly average given only year and week number

This is a school assignment, though unfortunately I'm either overthinking the question or this is significantly easier than I think.
Starting off here is a link to my spreadsheet: https://docs.google.com/spreadsheets/d/1jDFzitEGi319i6hUjqjJDF8nYZ8qm09-ieMGHk2T7AA/edit?usp=sharing
I am trying to calculate a weekly average from [Point Spread], though columns A and B only offer a year and a week number. What would be the most efficient way to tackle this?
I guess you're supposed to calculate the average of Point Spread for each distinct combination of year and week. So for week 1 of 1998, you would calculate the average of the Point Spread over the first 16 rows.
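The idea of averaging per distinct (year, week) pair can be sketched in Python like this. The function name and the sample values are hypothetical and are not taken from the linked spreadsheet.

```python
from collections import defaultdict


def weekly_average(rows):
    """rows: iterable of (year, week, point_spread). Returns {(year, week): mean}."""
    groups = defaultdict(list)
    for year, week, spread in rows:
        groups[(year, week)].append(spread)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


# Hypothetical data: two games in week 1 of 1998, one game in week 2.
games = [(1998, 1, 3.0), (1998, 1, 7.0), (1998, 2, 4.5)]
averages = weekly_average(games)
```

In a spreadsheet the same grouping could be done with a pivot table on year and week, or with an averaging function that filters on both columns.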

Lookup and calculation with multiple criteria also based on a cell value

I'm trying to create a spreadsheet that will allow me to quickly calculate the amount of time my trains were delayed on a daily basis.
I need a formula that will check for all trains on a particular route after a planned departure time (written in a cell), check these trains' actual arrival times and then display the earliest possible time I could have arrived at my destination.
[Image: spreadsheet screenshot]
For example, in G4 I would like a formula that looks for all trains that depart after 7:49 (A4) and also match both of its "From" and "To" (C4 & D4). It would then need to check these trains' corresponding "actual arrival times" in column F and show the earliest possible train. So for row 4 this would be 9:36.
Any help would be really appreciated as I have been messing around with this for over a day and have gotten nowhere!
A link to the example is here - https://docs.google.com/spreadsheets/d/1eE8t4-_hKB6o5j3W57EHgKzsF9p1usm7nojerjmrDwY/edit#gid=0
Thanks
Oli
Not sure about 9:36. Do you mean 9:39?
It's a little difficult to do this, but I think what you are looking for is a multi-conditional array lookup. I have put below what I think you are trying to achieve.
If A2:A8 is greater than A4, C2:C8 = C4 and D2:D8 = D4, what is the lowest value in F2:F8
Is this correct?
If so then I came up with this formula:
=ArrayFormula(MIN(IF((A2:A8>A4),IF((C2:C8=C4),IF((D2:D8=D4),F2:F8)))))
If you get 0.402 or something, format the cell to time.
Otherwise, could you break it down for us a bit more?
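The filtering logic behind the array formula can be sketched in Python as follows. The function name, station names and times are hypothetical. One difference to note: when nothing matches, this sketch returns None rather than a numeric result.

```python
from datetime import time


def earliest_arrival(trains, planned_departure, origin, destination):
    """trains: iterable of (departure, origin, destination, actual_arrival)."""
    arrivals = [
        arrival
        for departure, frm, to, arrival in trains
        # Keep trains that leave after the planned time on the same route.
        if departure > planned_departure and frm == origin and to == destination
    ]
    return min(arrivals) if arrivals else None


# Hypothetical timetable, not the data from the linked sheet.
timetable = [
    (time(7, 40), "A", "B", time(9, 20)),
    (time(7, 55), "A", "B", time(9, 45)),
    (time(8, 10), "A", "B", time(9, 39)),
    (time(8, 0), "A", "C", time(9, 0)),
]
best = earliest_arrival(timetable, time(7, 49), "A", "B")
```

Here the 7:40 train is excluded because it leaves before the planned departure, and the train to "C" is excluded by the route check.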

Time and date dimension in data warehouse

I'm building a data warehouse. Each fact has its timestamp. I need to create reports by day, month and quarter, but by hour too. Looking at the examples, I see that dates tend to be saved in dimension tables.
(source: etl-tools.info)
But I think that makes no sense for time. The dimension table would grow and grow. On the other hand, a JOIN with a date dimension table is more efficient than using date/time functions in SQL.
What are your opinions/solutions?
(I'm using Infobright)
Kimball recommends having separate time- and date dimensions:
design-tip-51-latest-thinking-on-time-dimension-tables
In previous Toolkit books, we have recommended building such a dimension with the minutes or seconds component of time as an offset from midnight of each day, but we have come to realize that the resulting end user applications became too difficult, especially when trying to compute time spans. Also, unlike the calendar day dimension, there are very few descriptive attributes for the specific minute or second within a day. If the enterprise has well defined attributes for time slices within a day, such as shift names, or advertising time slots, an additional time-of-day dimension can be added to the design where this dimension is defined as the number of minutes (or even seconds) past midnight. Thus this time-of-day dimension would either have 1440 records if the grain were minutes or 86,400 records if the grain were seconds.
My guess is that it depends on your reporting requirement.
If you need something like
WHERE "Hour" = 10
meaning every day between 10:00:00 and 10:59:59, then I would use the time dimension, because it is faster than
WHERE date_part('hour', TimeStamp) = 10
because the date_part() function will be evaluated for every row.
You should still keep the TimeStamp in the fact table in order to aggregate over boundaries of days, like in:
WHERE TimeStamp between '2010-03-22 23:30' and '2010-03-23 11:15'
which gets awkward when using dimension fields.
Usually, time dimension has a minute resolution, so 1440 rows.
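A minute-grain time dimension is small enough to generate in a few lines. The sketch below builds the 1440 rows in Python. The column names (`time_key`, `hour`, `minute`, `label`) are assumptions made for illustration, not a prescribed schema.

```python
def build_time_dimension():
    """One row per minute of the day: 1440 rows keyed by minutes past midnight."""
    return [
        {
            "time_key": minute,  # minutes past midnight, 0..1439
            "hour": minute // 60,
            "minute": minute % 60,
            "label": f"{minute // 60:02d}:{minute % 60:02d}",
        }
        for minute in range(24 * 60)
    ]


time_dim = build_time_dimension()
# Rows that a filter such as "Hour" = 10 would match.
ten_oclock = [row for row in time_dim if row["hour"] == 10]
```

Because the table never grows beyond one day's worth of minutes, extra attributes such as shift names can be added as further columns without affecting the fact table.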
Time should be a dimension on data warehouses, since you will frequently want to aggregate about it. You could use the snowflake-Schema to reduce the overhead. In general, as I pointed out in my comment, hours seem like an unusually high resolution. If you insist on them, making the hour of the day a separate dimension might help, but I cannot tell you if this is good design.
I would recommend having separate dimensions for date and time. The date dimension would have one record for each date within an identified valid range of dates, for example 01/01/1980 to 12/31/2025.
The separate time dimension would have 86400 records, one per second, each identified by its time key.
In the fact records where you need both date and time, add both keys, referencing these conformed dimensions.
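As an illustration of carrying both keys on a fact row, the sketch below splits a timestamp into a date key and a second-grain time key. The key formats (a YYYYMMDD integer and seconds past midnight) and the function name are assumptions for this example, not something the answers above specify.

```python
from datetime import datetime


def dimension_keys(ts):
    """Split a timestamp into (date_key, time_key).

    date_key: integer in YYYYMMDD form (assumed convention).
    time_key: seconds past midnight, 0..86399.
    """
    date_key = ts.year * 10000 + ts.month * 100 + ts.day
    time_key = ts.hour * 3600 + ts.minute * 60 + ts.second
    return date_key, time_key


keys = dimension_keys(datetime(2010, 3, 22, 23, 30, 0))
```

With this layout, the fact table joins to the date dimension on the first key and to the time dimension on the second, while the original timestamp can still be kept for range queries that cross day boundaries.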
