Should I expand the yearly data or sum the monthly data - machine-learning

The model's goal is to predict variable A from a collection of variables, B. However, A is only available yearly (20 rows), while B is monthly (200+ rows). Should I expand A into a monthly array of 200+ rows, using each year's value for every month of that year, or should I aggregate B into a yearly array of 20 rows, using the sum of the months in each year as that year's value?
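As a minimal sketch of the two options in the question, the Python below builds both training tables from toy data. The names a_by_year and b_by_month and all the values are hypothetical, made up for illustration. It only shows the mechanics, not which option is better.

```python
from collections import defaultdict

# Hypothetical toy data: yearly target A and a single monthly feature B.
a_by_year = {2019: 10.0, 2020: 12.0}
b_by_month = {
    (2019, 1): 1.0, (2019, 2): 2.0, (2019, 3): 3.0,
    (2020, 1): 4.0, (2020, 2): 5.0,
}

def expand_yearly(a_by_year, b_by_month):
    """Option 1: repeat each year's A value for every month of that year."""
    return [
        (b, a_by_year[year])
        for (year, _month), b in sorted(b_by_month.items())
    ]

def aggregate_monthly(a_by_year, b_by_month):
    """Option 2: sum B within each year so it lines up with A's yearly rows."""
    totals = defaultdict(float)
    for (year, _month), b in b_by_month.items():
        totals[year] += b
    return [(totals[year], a_by_year[year]) for year in sorted(totals)]

monthly_rows = expand_yearly(a_by_year, b_by_month)      # one (B, A) pair per month
yearly_rows = aggregate_monthly(a_by_year, b_by_month)   # one (sum of B, A) pair per year
```

Note that expanding A repeats the same target value within each year, so the extra monthly rows are not independent observations of A.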

Related

Google Sheets: select top n daily values over entire year

I have a Google Sheets raw data set of 5 columns covering a 2-year period:
Col A: year (2017...2018)
Col B: month (1...12)
Col C: day of month (1...31)
Col D: hour of day (0...23)
Col E: hourly electricity consumption (0.00-9999.99)
What I'd like to extract is the top 3 hours of highest electricity consumption for each and every day (i.e. 3 points * 365 days/year * 2 years = 2190 rows).
I know how to either:
Get the top 3 consumption hours within a single day:
=QUERY(A1:E23, "select A,B,C,D,E order by E desc limit 3") for the first day
=QUERY(A24:E47, "select A,B,C,D,E order by E desc limit 3") for the second day, etc.
or get the highest (single) consumption hour of each and every day, without knowing during which hour it occurs:
=QUERY(A1:E, "select A,B,C,max(E) group by A,B,C")
How do I combine the two, so I still capture all info (all columns)?
I made some sample data on a tab called Data on this sheet.
Then I put this formula in cell B1 on the tab called MK.Help Top 3.
=FILTER(Data!A:E,COUNTIFS(Data!A:A,Data!A:A,Data!B:B,Data!B:B,Data!C:C,Data!C:C,Data!E:E,">="&Data!E:E)<=3)
Countifs() is useful for creating a sort of ranking.
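To make the ranking idea concrete, here is a rough Python sketch of the same counting logic: for each row, count the rows from the same day whose consumption is at least as high, and keep the row if that count is 3 or less. The function name top_n_per_day and the sample rows are hypothetical.

```python
def top_n_per_day(rows, n=3):
    """Keep rows whose consumption ranks in the top n within their (year, month, day)."""
    kept = []
    for year, month, day, hour, kwh in rows:
        # Count same-day rows with consumption >= this row's (the COUNTIFS step).
        rank = sum(
            1
            for y, m, d, _h, k in rows
            if (y, m, d) == (year, month, day) and k >= kwh
        )
        if rank <= n:
            kept.append((year, month, day, hour, kwh))
    return kept

# Hypothetical sample: (year, month, day, hour, consumption)
rows = [
    (2017, 1, 1, 0, 5.0), (2017, 1, 1, 1, 9.0), (2017, 1, 1, 2, 7.0),
    (2017, 1, 1, 3, 8.0), (2017, 1, 1, 4, 1.0),
    (2017, 1, 2, 0, 2.0), (2017, 1, 2, 1, 3.0),
]
top = top_n_per_day(rows)
```

Because the rank is a count of values greater than or equal to the current one, tied consumption values share a rank, so a day with ties may return a different number of rows than 3.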

Google Spreadsheet - SUMIF Date Range

I'm trying to create a spreadsheet, in which I want to sum the table if records are older than 1 year.
I have inventory in 1 sheet with purchase date and other stuff, and in 2nd sheet, I want to sum the inventory which is older than 1 year (inspection date is a separate column in 2nd sheet)
Sum column B for dates older than one year:
=SUMIF(A2:A, "<"&DATE(YEAR(TODAY())-1, MONTH(TODAY()), DAY(TODAY())), B2:B)
Sum values between one year ago and today:
=SUMIFS(B2:B,
A2:A, ">"&DATE(YEAR(TODAY())-1, MONTH(TODAY()), DAY(TODAY())),
A2:A, "<="&TODAY())
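For comparison, the two sums described above (older than one year, and between one year ago and today) could be sketched in Python as follows. The record layout of (purchase date, amount) pairs and the function names are assumptions for illustration. The Feb 29 fallback to Mar 1 mirrors how DATE rolls an invalid day over into the next month.

```python
from datetime import date

def one_year_before(day):
    """Same month and day in the previous year; Feb 29 rolls over to Mar 1."""
    try:
        return day.replace(year=day.year - 1)
    except ValueError:
        return date(day.year - 1, 3, 1)

def sum_older_than_a_year(records, today):
    """Sum amounts whose date is strictly before the one-year cutoff."""
    cutoff = one_year_before(today)
    return sum(amount for purchased, amount in records if purchased < cutoff)

def sum_within_last_year(records, today):
    """Sum amounts dated after the cutoff and up to (and including) today."""
    cutoff = one_year_before(today)
    return sum(amount for purchased, amount in records if cutoff < purchased <= today)

# Hypothetical inventory: (purchase date, amount)
records = [
    (date(2023, 1, 10), 100),
    (date(2023, 12, 1), 30),
    (date(2024, 6, 15), 20),
    (date(2024, 7, 1), 5),
]
today = date(2024, 6, 15)
older_total = sum_older_than_a_year(records, today)
recent_total = sum_within_last_year(records, today)
```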

I have a list of customers, each customer has invoice dates, I am trying to average the dates between customer orders

I am importing the data from QB with the headers Customer and Date. I am calculating the days between purchases and am trying to take the average of those gaps. How do I take the average for each customer separately?
The way I would do it is to add another column to calculate which customer the value relates to, then work out the average.
To do this, insert a column between "Days Btwn Orders" and "Average Days Btwn Orders". We'll call this "Corresponding Customer". The rest assumes "Date" is column B, "Days Btwn Orders" is column C and "Corresponding Customer" is column D.
Put this formula into that column, row 2: =IF(A2<>"",A2,D1)
Then, put this into row 2 of "Average Days Btwn Orders": =IF(AND(A2<>"",C2=""),averageifs(C:C,D:D,D2),"")
Then drag both formulas down as far as you need. There's probably a way to do this in an arrayformula so it doesn't need the extra column.
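The same grouping can be sketched outside the spreadsheet. The Python below assumes the data is a flat list of (customer, date) pairs, which is an assumption and not the QB export layout, and averages the gaps between consecutive orders for each customer. The function name and sample data are hypothetical.

```python
from collections import defaultdict
from datetime import date

def average_gap_days(orders):
    """Average number of days between consecutive orders, per customer."""
    dates_by_customer = defaultdict(list)
    for customer, order_date in orders:
        dates_by_customer[customer].append(order_date)

    averages = {}
    for customer, dates in dates_by_customer.items():
        dates.sort()
        # Gap between each order and the next one for the same customer.
        gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
        if gaps:  # customers with a single order have no gap to average
            averages[customer] = sum(gaps) / len(gaps)
    return averages

# Hypothetical orders
orders = [
    ("Acme", date(2024, 1, 1)),
    ("Acme", date(2024, 1, 11)),
    ("Acme", date(2024, 1, 31)),
    ("Bolt", date(2024, 2, 5)),
]
averages = average_gap_days(orders)
```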

Min Custom formula for monthly expenses spreadsheet

I have a sheet with this table as an example:
What I want is a formula that displays the lowest value from today until the last day.
Example: If today is the first day, the lowest value (column "Current") until the end of the table is $50 (day 2).
But if today is day 3, for example, I want the cell to display the lowest value from day 3 until the last day, in this case, it would show value $450 at day 4, ignoring all the previous values before day 3.
Is this possible?
D4:
=MIN(C4:$C$10)
Drag fill down
I found on a forum how to do this with an array (matrix) formula:
{=MIN(IF(days>=DAY(TODAY());values))}
Where days and values are named ranges.
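The array formula's logic (keep only the values whose day is today or later, then take the minimum) can be sketched in Python. The sample values are hypothetical, chosen only to match the $50 and $450 figures in the example from the question.

```python
def min_from_day(days, values, today):
    """Lowest value among rows whose day number is >= today, or None if none remain."""
    remaining = [value for day, value in zip(days, values) if day >= today]
    return min(remaining) if remaining else None

# Hypothetical "Current" column for days 1..5
days = [1, 2, 3, 4, 5]
values = [100, 50, 500, 450, 600]

lowest_from_day_1 = min_from_day(days, values, today=1)
lowest_from_day_3 = min_from_day(days, values, today=3)
```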

Google Sheets: Compare this month's sales to last month's sales over same number of days

I'm sure this is doable and I'm just not finding the solution in the documentation, so big thanks in advance for your help. I want to calculate sales growth month over month.
For example, I'm posting this question on 10/22/2014. Calculating sales thus far for this month is easy, but I also need to know what sales were for the first 22 days of LAST month.
I already have a column containing the values for each day this month, and another column containing the values for each day last month. All I need is a way to sum the values for the first 22 days of last month.
Column AH = A list of the dates for last month: 9/1/2014, 9/2/2014...
Column AI = A helper column containing only the DAY of the month of the value in Column AH: 1,2,3,4...
$AJ35 = The day of today's date =DAY(TODAY())
Column AN = The numbers I want to (conditionally) sum
Why won't this formula work?
=SUMIF(AI1:AI34,"<=$AJ35",AN1:AN34)
It calculates a sum of 0.
If I take out the comparison ("<=$AJ35") and manually insert a number, it works fine:
=SUMIF(AI1:AI34,"<=22",AN1:AN34) returns a value of 362, as expected.
That is because you have put the cell reference inside a string (surrounded by double quotes),
which means that Google Sheets tries to compare against the literal string "$AJ35".
You have to concatenate the operator and the cell reference, like so:
=SUMIF(AI1:AI34,"<=" & $AJ35,AN1:AN34)
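The difference between the quoted and the concatenated criterion can be illustrated with a short Python sketch. The helper names and sample numbers are hypothetical. The point is only that the criterion has to be built from the cell's value, not from the cell's name.

```python
def build_criterion(operator, cutoff):
    """Join the operator with the cutoff's value, as "<=" & $AJ35 does."""
    return operator + str(cutoff)

def sum_first_days(day_numbers, amounts, cutoff_day):
    """Sum the amounts whose day of the month is <= cutoff_day."""
    return sum(
        amount for day, amount in zip(day_numbers, amounts) if day <= cutoff_day
    )

day_numbers = [1, 2, 3, 4]   # hypothetical helper column (day of month)
amounts = [10, 20, 30, 40]   # hypothetical values to sum

quoted = "<=$AJ35"                         # the cell name stays literal text
concatenated = build_criterion("<=", 3)    # the value is substituted: "<=3"
partial_total = sum_first_days(day_numbers, amounts, 3)
```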
