Predict future values for a whole row - machine-learning

I am working on a time-series model where I have a date-time column and five other columns. My question is: can I predict future values for all the columns in a row based only on the previous row's values and the current date-time? In other words, can I do this without splitting the data into input columns and a single output column?
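One possible way to frame this, as a rough sketch rather than a recommended model: treat the previous row's five values plus features derived from the current date-time as inputs, and all five values of the current row as a multi-column target. The helper name `make_supervised`, the `(timestamp, values)` row layout, and the choice of hour and weekday as time features are assumptions made for illustration.

```python
from datetime import datetime

def make_supervised(rows):
    """Turn time-ordered (timestamp, values) rows into (X, Y) training pairs.

    X[i] = previous row's values + features of the current timestamp
    Y[i] = current row's values (every column is a target)
    """
    X, Y = [], []
    for (_, prev_vals), (ts, vals) in zip(rows, rows[1:]):
        time_feats = [ts.hour, ts.weekday()]  # assumed date-time features
        X.append(list(prev_vals) + time_feats)
        Y.append(list(vals))
    return X, Y

# Hypothetical hourly data with five value columns.
rows = [
    (datetime(2022, 10, 1, 0), [1.0, 2.0, 3.0, 4.0, 5.0]),
    (datetime(2022, 10, 1, 1), [1.1, 2.1, 3.1, 4.1, 5.1]),
    (datetime(2022, 10, 1, 2), [1.2, 2.2, 3.2, 4.2, 5.2]),
]
X, Y = make_supervised(rows)
```

Any regressor that accepts a multi-column target could then be fit on `X` and `Y`.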

Related

Arrayformula for Index Match multiple columns with date values

I am collecting data on intervals and want to display a summary of that data on a single line in a new sheet.
https://docs.google.com/spreadsheets/d/1EOV4-VwVfwWvhwQ24qkQbCRGxUp74oe0dLWrbj0wiNE/edit#gid=566541214
Grade Data Sheet
Each batch of data comes in by date with a Name, Course and Grade like below
Raw data is like this for a large number of Name / Course / Grade:
Date        Name  Course  Grade
10/1/2022   Joe   Math    65-D
10/15/2022  Joe   Math    58-F
10/30/2022  Joe   Math    50-F
Summary Sheet
A single line for each unique Name-Course pair, where I'm attempting to look up a grade for each date column.
(note: I'm trying to extract the dates in the columns dynamically as the Grade Data sheet expands)
So I've successfully extracted the Dates to create new columns, and I am trying to create an index-match that grabs the column date and creates an array of DATE-NAME-COURSE and matches DATE-NAME-COURSE on the Grade Data sheet to return the grade for that student on the Date. The formula works for the first row, but when it fills down it returns the value of the first match.
I can't quite figure out how to reference the single date cell in the array while dynamically filling down. I'm not sure if I need a different approach, but hopefully this makes sense.
=arrayformula(if(len($A2:$A),(index(GradeData!$D2:$D,match(TEXT(C$1,"yyyy-mm-dd")&$A2:$A&$B2:$B,TEXT(GradeData!$A2:$A,"yyyy-mm-dd")&GradeData!$B2:$B&GradeData!$C2:$C,0))),""))
The goal is to have Grade Data populate automatically, and the Summary page to add a column for each new date and fill data down for each student.
Thanks in advance, I have shared the actual sheet above so you can see the data
I've attempted several different ways but can't quite get the dynamic array matches with the date formatting to work.
Thank you!
As far as I can see, the issue in your sheet is that the dates in the header do not match the dates in your source data. I've added a new header line:
=LAMBDA(dates,FILTER(dates,regexmatch(dates,"/")))(Transpose(Unique(arrayformula(left(GradeDataVlookup!A4:A,10)))))
Then wrap the lookup in IFERROR to suppress the errors for values without matches:
=arrayformula(if(len($A$2:$A),iferror(vlookup(TEXT(C$1,"MM/DD/YYYY")&$A2:$A&$B2:$B,GradeDataVlookup!$A:$E,5,FALSE),""),""))
PS: with MAP or MAKEARRAY you could summarize all the table in just one formula
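The composite-key lookup used in the formulas above (date, name and course joined into one key) can be sketched outside Sheets as follows. This is only an illustration in Python: the function name `build_summary` and the fourth sample row are assumptions, and a missing combination yields an empty string in the same spirit as the IFERROR wrapper.

```python
def build_summary(grade_rows):
    """Pivot (date, name, course, grade) rows into one line per name/course."""
    # Index every grade by the composite key, like DATE&NAME&COURSE in the formula.
    by_key = {(d, n, c): g for d, n, c, g in grade_rows}
    # Unique dates become the header columns, in first-seen order.
    dates = list(dict.fromkeys(d for d, _, _, _ in grade_rows))
    pairs = list(dict.fromkeys((n, c) for _, n, c, _ in grade_rows))
    # A missing key yields "" instead of an error, mirroring IFERROR(..., "").
    table = [[n, c] + [by_key.get((d, n, c), "") for d in dates] for n, c in pairs]
    return dates, table

grade_rows = [
    ("10/1/2022", "Joe", "Math", "65-D"),
    ("10/15/2022", "Joe", "Math", "58-F"),
    ("10/30/2022", "Joe", "Math", "50-F"),
    ("10/15/2022", "Ann", "Art", "91-A"),  # hypothetical extra student
]
dates, table = build_summary(grade_rows)
```

Because the key is built from the same date strings on both sides, the header dates and the source dates have to be formatted identically, which is the mismatch described above.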

How to expand date ranges and unpivot data to create a lookup table

I have this sheet that includes a pivoted table with a range of months in columns, years in the header row, and some values in the main table. I want to expand the monthly dates along with the year, id, and values so that I can more easily use it as a lookup table to find valid user ids that have a greater value than the lookup value for a given year. I provided a sample sheet with limited rows as an example. The row highlighted in yellow (row 69) makes user 2 valid within year 2022. I would then conditionally highlight the valid user.
I can get the date ranges expanded but I have trouble keeping the other data matched up with the appropriate corresponding rows. I think it would be a fairly straightforward task with apps script but I am a very inexperienced user in that regard.
Here is my sample sheet: https://docs.google.com/spreadsheets/d/1bIji78wYu32O70C2xZLn8dntyYogs5ivuNvCFj5-Z0s/edit?usp=sharing

Creating Google Sheets pivot tables with custom formula

I am creating a table for finance. It will hold a database of trades: the open and close dates of each trade, the open and close prices, the ticket (the stock name), the change (a percentage calculated from the open and close prices), and the days (calculated from the two dates), as in the picture:
I need to generate a new table for each month of the year (for which I have date records). Google Sheets has pivot tables, and that is what I need. I need these columns: average win % per month, average loss % per month, average number of win days per month, average number of loss days per month.
I did that in 2 separate tables:
[Images: first table and its settings; second table and its settings]
But I cannot create that in one table, because I do not know how to write a custom formula. So I am looking for some help here.
I tried some things: I can filter and compute an average. But I do not know how to get an array of items from the pivot table data sorted by month. If I could get the pivot table data sorted by month, I could filter by positive/negative and find the average.
My sample: https://docs.google.com/spreadsheets/d/1TCLWZ7-oUSwM8DLODPpH6wwssgfYyo3BVlEpWj78kV4/edit?usp=sharing
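The monthly aggregation described above can be sketched outside Sheets like this. It is an illustration in Python, not a pivot-table formula: the function name `monthly_stats`, the `(close_date, change_pct, days)` row layout, grouping by the close date's month, and counting a 0% change as a loss are all assumptions.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

def monthly_stats(trades):
    """Group trades by close month; average change % and days for wins and losses."""
    groups = defaultdict(lambda: {"win": [], "loss": []})
    for close_date, change_pct, days in trades:
        kind = "win" if change_pct > 0 else "loss"  # assumption: 0% counts as a loss
        groups[(close_date.year, close_date.month)][kind].append((change_pct, days))
    result = {}
    for month, g in sorted(groups.items()):
        row = {}
        for kind in ("win", "loss"):
            items = g[kind]
            # None marks a month with no trades of that kind.
            row[f"avg_{kind}_pct"] = mean([p for p, _ in items]) if items else None
            row[f"avg_{kind}_days"] = mean([d for _, d in items]) if items else None
        result[month] = row
    return result

# Hypothetical trades: (close date, change %, days held).
trades = [
    (date(2022, 1, 10), 4.0, 3),
    (date(2022, 1, 20), 2.0, 5),
    (date(2022, 1, 25), -3.0, 2),
    (date(2022, 2, 5), -1.0, 4),
]
stats = monthly_stats(trades)
```

Each month ends up with all four averages in one row, which is the single-table layout the question asks for.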

Google Sheets pulling out specific data from multiple rows and columns to put into a logistic regression function

I have a spreadsheet of multiple years of student annual writing assessment scores.
Each row is the data for one test (Test Year, Student ID, Test score with subsections, etc.).
I need to fill in each student’s data into a logistic regression model with the following variables:
SUMPRODUCT FUNCTION where I need the selected data to appear:
Spreadsheet and corresponding cells needed in logistic regression function
B Constant Y3 -16.266 [Generate a number ‘1’ to balance the sumproduct function.]
B T1AvgScore 0.911 [Student’s first year test average score] I need a function to put the data here
B T3AvgScore 2.399 [Student’s third year test average score] I need a function to put the data here
B T3SF2 0.434 [Student’s third year subsection ‘Sentence Fluency (SF)’ score] I need a function to put the data here
B T3Conv2 0.251 [Student’s third year subsection ‘Conventions (Conv)’ score] I need a function to put the data here
y* = ln(p/(1-p)) [Calculated from the above sumproduct function]
p = exp(y*)/(exp(y*)+1) [calculation for the prediction percent]
Thanks in advance for any assistance!
Well I'm not clear if I'm answering what you're looking for, but I have the formulas that pull the Average Score values from the AWA sheet for a given student number. See the tab I added to your sheet, Example-GK.
The query formula is simply:
=query(AWA,"select F where A = "&E$15&" and B = '"&D19&"' ",0)
where E15 holds the specified StudentID (a numeric value, so no single quotes are used), and D19 holds the specific year.
I also added the ability to select the StudentID number from a dropdown list, in E15 on that sheet. Or the StudentName could be used for the selection criteria, instead of the StudentID, if that was available and easier for you to use. For now, the StudentName is ignored, since it wasn't available in the data.
Let me know if this is what you're looking for. One issue is there might be more years of data for some students. There are other ways of listing the years, which might help you. I'll see if I can add that function.
Update Sept 9, 2020:
If I've understood your comments correctly, and for each model there is a set of constants that apply to all students (see below for the Model 3 constants), then I may have a generic set of formulae that calculate the probabilities for each student, using all three models, provided there is sufficient data for that student.
See my updated Example-GK in your sheet.
And let me know if I still haven't understood how your final probabilities are calculated from the individual student data values.
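The probability calculation described in the question (a sumproduct of coefficients and scores, then p = exp(y*)/(exp(y*)+1)) can be sketched as below. The coefficients are the ones quoted in the question; the function name `predict_probability` and the student's scores are hypothetical, chosen only to show the arithmetic.

```python
import math

# Coefficients quoted in the question.
COEFFS = {
    "Constant": -16.266,
    "T1AvgScore": 0.911,
    "T3AvgScore": 2.399,
    "T3SF2": 0.434,
    "T3Conv2": 0.251,
}

def predict_probability(scores):
    """Return p = exp(y*) / (exp(y*) + 1), where y* = sum(coefficient * value)."""
    values = dict(scores, Constant=1)  # the constant term is multiplied by 1
    y_star = sum(COEFFS[name] * values[name] for name in COEFFS)
    return math.exp(y_star) / (math.exp(y_star) + 1)

# Hypothetical scores for one student, for illustration only.
student = {"T1AvgScore": 3.0, "T3AvgScore": 4.0, "T3SF2": 4, "T3Conv2": 4}
p = predict_probability(student)
```

The `Constant=1` entry plays the same role as the generated '1' that balances the SUMPRODUCT in the sheet.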

Match Rows By ID and return a value from the row with the closest date before a specified date

I'm using Google Sheets and have dataset 1 pictured below, which includes ID, Date, Value. This dataset has a number of rows with some duplicate IDs but different dates against them.
Dataset 1
I then have dataset 2 with ID, Date, Empty Column. I want to populate the empty column with the value from dataset 1 that matches the row ID, pulled from the row with the closest date before the date specified in dataset 2. (I hope I've explained that well enough.) I've attached a couple of images for reference. Any help would be really appreciated on this one!
Dataset 2
For clarity and maintenance, I am doing this in 2 steps. In theory it should be doable in one, as described at sql with dates. I have also referred to dates with google query, and looking there one may find simplifications. On the Dataset2 sheet, I added a column D, which may be hidden later if you like, and I named the first sheet Dataset1. In D2, I placed the following formula, which I then dragged down.
=iferror(index(query(Dataset1!$A$2:$C$11,"select MAX(B) where A='"&A2&"' AND B<Date'"&TEXT(DATEVALUE(C2),"yyyy-mm-dd")&"'"),2),"")
The iferror guards against the case where nothing is found, as is the case for ID 2 in your example. The index with 2 simply picks out the query result as opposed to the header "Max." Inside the query, you can see it looking at your original data and finding the largest date where the IDs match and the Dataset1 date is earlier than the Dataset2 date for this line.
Now once you have the date you need (I am assuming there is just one entry corresponding to that date, otherwise you need to handle that), you can query again in B2 (and drag that down as well) with
=iferror(query(Dataset1!$A$2:$C$11,"select C where A='"&A2&"' AND B=Date'"&TEXT(DATEVALUE(D2),"yyyy-mm-dd")&"'"))
Again the iferror is for the same reason (to avoid a bad date format message for the empty one), and now we pick out the value for the item matching the ID and the date we calculated.
That is what your goal was.
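The two query steps above (find the latest matching date before the cutoff, then fetch the value on that date) can be sketched in a single pass outside Sheets. This is a Python illustration under assumptions: the function name `value_before` and the sample rows are hypothetical, and the comparison is strictly "before", matching the `<` in the query.

```python
from datetime import date

def value_before(dataset1, row_id, cutoff):
    """Return the value from the row with matching ID and the latest date before cutoff."""
    candidates = [(d, v) for i, d, v in dataset1 if i == row_id and d < cutoff]
    if not candidates:
        return ""  # mirrors the iferror fallback when nothing is found
    # max on the date picks the closest earlier date, like "select MAX(B)".
    return max(candidates, key=lambda dv: dv[0])[1]

# Hypothetical rows: (ID, date, value).
dataset1 = [
    (1, date(2020, 1, 5), "a"),
    (1, date(2020, 2, 5), "b"),
    (1, date(2020, 3, 5), "c"),
    (2, date(2020, 6, 1), "d"),
]
result = value_before(dataset1, 1, date(2020, 2, 20))
```

As in the formulas, this assumes at most one row per ID and date; with duplicates, `max` simply returns the first of the tied rows.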
