Reshaping DataFrame without changing index - time-series

I have a DataFrame object in the long format as shown:
I want to reshape this into a wide format around the Name column, which contains the names of stocks. I want to keep the date as the index and keep only the Open value of each stock.
I tried the pivot function from pandas, but it does not accept Date as the index, even when I pass df.index, and it gives me a multi-index DataFrame.
So I want time-series data where the date is the index and the columns are the Open values, one column per stock name.
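A minimal sketch of one way to get this layout, under stated assumptions: the sample frame is invented for illustration (the tickers AAPL and MSFT, the prices and the Close column are not from the question), and Date is assumed to already be the index. When pivot is called without an index argument it keeps the existing index, and passing values="Open" restricts the result to a single level of columns; leaving values out is one likely cause of the multi-index result.

```python
import pandas as pd

# Hypothetical long-format data; tickers and prices are made up for illustration.
df = pd.DataFrame(
    {
        "Name": ["AAPL", "MSFT", "AAPL", "MSFT"],
        "Open": [150.0, 250.0, 152.0, 251.5],
        "Close": [151.0, 249.0, 153.5, 252.0],
    },
    index=pd.DatetimeIndex(
        ["2022-01-03", "2022-01-03", "2022-01-04", "2022-01-04"], name="Date"
    ),
)

# No index argument: pivot reuses the existing Date index.
# values="Open" keeps only Open, so the columns are a flat index of stock names.
wide = df.pivot(columns="Name", values="Open")
print(wide)
```

If Date is an ordinary column rather than the index, df.pivot(index="Date", columns="Name", values="Open") should give the same shape.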

Related

Arrayformula for Index Match multiple columns with date values

I am collecting data on intervals and want to display a summary of that data on a single line in a new sheet.
https://docs.google.com/spreadsheets/d/1EOV4-VwVfwWvhwQ24qkQbCRGxUp74oe0dLWrbj0wiNE/edit#gid=566541214
Grade Data Sheet
Each batch of data comes in by date with a Name, Course and Grade like below
Raw data is like this for a large number of Name / Course / Grade:
Date        Name  Course  Grade
10/1/2022   Joe   Math    65-D
10/15/2022  Joe   Math    58-F
10/30/2022  Joe   Math    50-F
Summary Sheet
One line for each unique Name-Course pair, where I'm attempting to look up a grade for each date column.
(note: I'm trying to extract the dates in the columns dynamically as the Grade Data sheet expands)
So I've successfully extracted the Dates to create new columns, and I am trying to create an index-match that grabs the column date and creates an array of DATE-NAME-COURSE and matches DATE-NAME-COURSE on the Grade Data sheet to return the grade for that student on the Date. The formula works for the first row, but when it fills down it returns the value of the first match.
I can't quite figure out how to reference the single date cell in the array while dynamically filling down. Not sure if I need a different approach, but hopefully this makes sense.
=arrayformula(if(len($A2:$A),(index(GradeData!$D2:$D,match(TEXT(C$1,"yyyy-mm-dd")&$A2:$A&$B2:$B,TEXT(GradeData!$A2:$A,"yyyy-mm-dd")&GradeData!$B2:$B&GradeData!$C2:$C,0))),""))
The goal is to have Grade Data populate automatically, and the Summary page to add a column for each new date and fill data down for each student.
Thanks in advance, I have shared the actual sheet above so you can see the data
I've attempted several different ways but can't quite get the dynamic array matches with the date formatting to work.
Thank you!
As far as I can see, the issue in your sheet is that the dates in the header do not match the dates in your source data. I've added a new header line:
=LAMBDA(dates,FILTER(dates,regexmatch(dates,"/")))(Transpose(Unique(arrayformula(left(GradeDataVlookup!A4:A,10)))))
Then I wrapped the lookup in IFERROR to suppress the errors for values that have no match:
=arrayformula(if(len($A$2:$A),iferror(vlookup(TEXT(C$1,"MM/DD/YYYY")&$A2:$A&$B2:$B,GradeDataVlookup!$A:$E,5,FALSE),""),""))
PS: with MAP or MAKEARRAY you could build the whole table in just one formula.
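The matching logic behind these formulas can be sketched outside Sheets in Python: build a key from date, name and course, then fill one summary row per Name-Course pair with one entry per date. This is only an illustration using the three sample rows from the question; the function names parse and summarize are hypothetical, and the dates are parsed so that header and source dates compare as dates rather than as differently formatted strings.

```python
from datetime import datetime

# Sample rows from the question: (date, name, course, grade).
grade_data = [
    ("10/1/2022", "Joe", "Math", "65-D"),
    ("10/15/2022", "Joe", "Math", "58-F"),
    ("10/30/2022", "Joe", "Math", "50-F"),
]

def parse(d):
    # Normalise the date text so comparisons are made on dates, not strings.
    return datetime.strptime(d, "%m/%d/%Y").date()

def summarize(rows):
    """Return (sorted dates, {(name, course): [grade or "" per date]})."""
    lookup = {(parse(d), n, c): g for d, n, c, g in rows}
    dates = sorted({key[0] for key in lookup})
    pairs = sorted({(n, c) for _, n, c in lookup})
    table = {
        # "" plays the role of IFERROR for a pair with no grade on that date.
        (n, c): [lookup.get((d, n, c), "") for d in dates]
        for n, c in pairs
    }
    return dates, table

dates, table = summarize(grade_data)
print(dates)
print(table)
```

Because the key is the full (date, name, course) tuple, each row of the summary is matched on its own name and course rather than on the first match found.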

To find XIRR for different investments using google sheet

I am currently trying to calculate the XIRR of a huge portfolio containing non-periodic cashflows. The database contains a lot of transactions and I want to calculate the XIRR for each firm.
This image contains the format, and the last column contains the TICKER names of firms. I want to calculate the XIRR for these firms. The database on the left contains all the data for the ticker names.
Please find the sample sheet here:
https://docs.google.com/spreadsheets/d/1LnTHOuw5FROyZ8tNo1Zl270RhTDX1gfB2m7jtEU9F_k/edit?usp=sharing
On your sheet you will find a new tab called MK.Help.
This is how you find XIRR for an investment like what you have:
=XIRR({FILTER(D:D*E:E,A:A=H5);-I5},{FILTER(B:B,A:A=H5);TODAY()})
The key is that you need to append the CURRENT HOLDING and today's date to the end of the arrays of cashflows and dates. The idea is to imagine that you liquidated the position right NOW.
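The same idea can be sketched in Python for readers who want to check the numbers outside Sheets. This is a rough illustration, not the Sheets implementation: it assumes an actual/365 day count, solves for the rate by simple bisection, and uses made-up purchases and a made-up current value. The names xnpv and xirr here are local helper functions defined in the block.

```python
from datetime import date

def xnpv(rate, cashflows, dates):
    """Net present value of dated cashflows, discounted on an actual/365 basis."""
    t0 = dates[0]
    return sum(cf / (1.0 + rate) ** ((d - t0).days / 365.0)
               for cf, d in zip(cashflows, dates))

def xirr(cashflows, dates, lo=-0.99, hi=10.0, tol=1e-9):
    """Find the rate where xnpv is zero by bisection (needs a sign change in [lo, hi])."""
    f_lo, f_hi = xnpv(lo, cashflows, dates), xnpv(hi, cashflows, dates)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change in the search interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = xnpv(mid, cashflows, dates)
        if f_lo * f_mid <= 0:
            hi = mid                 # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid    # root lies in [mid, hi]
    return (lo + hi) / 2.0

# Hypothetical position: two purchases (positive, like D:D*E:E in the formula),
# then the current holding value appended with the opposite sign.
purchases = [1000.0, 500.0]
purchase_dates = [date(2022, 1, 1), date(2022, 7, 1)]
current_value = 1700.0
valuation_date = date(2023, 1, 1)  # stands in for TODAY()

rate = xirr(purchases + [-current_value], purchase_dates + [valuation_date])
print(round(rate, 4))
```

Without the final liquidation entry every cashflow has the same sign, so no rate can bring the net present value to zero; that is why the current holding has to be appended.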

How to apply arrayformula to a series of columns

I'm trying to make a spreadsheet to track membership for an organization.
Basically my design is an input sheet with columns of names associated with expiration dates, then another sheet that collects all the unique names and all of their associated expiration dates, and then one last sheet that filters the names into only those with expiration dates in the future.
I am able to collect all the unique names into one column using an arrayformula, but I am stuck trying to do a lookup operation of some kind that, for each name, will look for the name in each column and, if it appears, add the associated expiration date to its list (and otherwise add a blank cell, which I can then filter out).
Is there a way to use vlookup or anything else in an arrayformula to do a series of operations for all columns in a range? Also, I want to use arrayformula because I want the formula to be infinite so the spreadsheet can keep growing. I've tried using
=ARRAYFORMULA(IF(ISERROR(VLOOKUP(A1:A,Sheet1!A2:200,1,FALSE)),,Sheet1!A1:1))
But vlookup, and anything else I tried like match, interprets Sheet1!A2:200 as a single range and performs a lookup only in the first column and does not do a separate lookup in each column.
For example, I might have this input on Sheet1
And want this result on another sheet
I suspect that what you would really like, within the bounds of what is reasonably practical, calls for a script. The following is an array formula, though it would be cumbersome to extend and does require copying down (from B1):
=split(if(ISERROR(match(A1,Sheet1!A:A,0)),"",Sheet1!A$1)&"|"&if(ISERROR(match(A1,Sheet1!B:B,0)),"",Sheet1!B$1)&"|"&if(ISERROR(match(A1,Sheet1!C:C,0)),"",Sheet1!C$1),"|")
Assumes a unique list of names in ColumnA, such as created by:
=unique(QUERY({Sheet1!A2:A6;Sheet1!B2:B6;Sheet1!C2:C6},"where Col1 is not NULL"))
in A1.
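Since a script is suggested as the more practical route, here is a rough Python sketch of the per-column lookup that the formula performs. It assumes each column of Sheet1 is headed by an expiration date with member names beneath it; the names, dates and the function expirations_by_name are made up for illustration.

```python
# Hypothetical input mirroring Sheet1: each column is headed by an expiration
# date and lists the member names beneath it (names and dates are made up).
sheet1 = {
    "2023-01-31": ["Alice", "Bob"],
    "2023-06-30": ["Bob", "Carol"],
    "2024-01-31": ["Alice"],
}

def expirations_by_name(columns):
    """Map each unique name to one entry per column: the header if listed, else ""."""
    names = []
    for members in columns.values():
        for name in members:
            if name and name not in names:  # skip blanks, keep first-seen order
                names.append(name)
    return {
        name: [header if name in members else ""
               for header, members in columns.items()]
        for name in names
    }

result = expirations_by_name(sheet1)
for name, row in result.items():
    print(name, row)
```

Unlike the copied-down formula, the loop covers however many columns the input has, so adding a new expiration date needs no change to the code.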

How to get last value of column in multiple sheets and add them together

I'm currently trying to get the last value of Column "D" in several sheets, then add all the values together, then calculate a percentage based on a value from a main sheet cell.
I can get =VALUE(D:D) to work and =VALUE(Animations!D15), but not a combination of both which is what I need (since the size of the column will continue to grow).
It would be best if it was the last numerical value in column D, and not account for blank spaces or strings.
Thanks!
To find the last populated number in a column use Index with an approximate Match to 1E+99.
=index(sheet2!d:d, match(1e99, sheet2!d:d))
The above retrieves the last number in column D on Sheet2.
Google Sheets will not process an array of worksheet names through INDIRECT the way Excel will, but a 'helper' column will take care of that. If you want to hard-code a series of worksheet names into a sum of INDEX/MATCH formulas, then INDIRECT isn't even necessary.
In the accompanying linked worksheet, I've used this method to retrieve the last number from columns with numbers, text and errors. I've thrown in the 'last number' cell address as well.
Linked spreadsheet
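For comparison, the "last number in a column, summed across sheets" logic can be sketched in Python. The sheet contents and the main sheet value below are made up (only the sheet name Animations comes from the question), and the percentage is computed as the total divided by the main sheet value, which is one reading of the question.

```python
def last_number(column):
    """Return the last int/float in a column, skipping blanks, text and error strings."""
    for cell in reversed(column):
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(cell, (int, float)) and not isinstance(cell, bool):
            return cell
    return None  # no number found

# Hypothetical sheets; only "Animations" is a name taken from the question.
sheets = {
    "Animations": [3, 7, "", "n/a", 12, ""],
    "Models": [5, "#DIV/0!", 8],
    "Textures": ["header", 4, None],
}
main_sheet_value = 60  # hypothetical value from the main sheet cell

# A column with no number at all contributes 0 to the total.
total = sum(last_number(col) or 0 for col in sheets.values())
percentage = total / main_sheet_value * 100
print(total, percentage)
```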

How do I populate a cell based on multiple criteria in Google Sheets?

So here is the situation. I have one spreadsheet in Google sheets that has a column for the names of TV stations. I have a second column that lists airing times for ads. This is the format the date and times are in.
14-12-22 08:06:05
I have a second sheet that has the same column for TV station names. I also have a column that has a time range in the format
09:00-16:00
Then there is a third column for Rate.
What I am trying to do is add a Rate column to the first spreadsheet and populate it by matching up the TV station name and the time range on the second sheet. My first thought was a VLOOKUP, but I'm trying to match two conditions, and the second one is a bit tricky since I am comparing an exact time against a time range.
Any ideas?
As it is permitted to parse the time intervals, I would recommend doing so (say, with something like =SPLIT(A1,"-")), since the results might then be arranged into a compact matrix such as the one shown in the image in columns F:J. The differences in the rates for different stations at different times are then readily apparent.
I have left the above in the same sheet as one with a representation of your other data since I (am lazy and) don't know the relevant sheet names anyway - but prefix the relevant sheet name (and !) to the column references in the formula that are later in the alphabet than C:
=vlookup(A2,F:J,match(C2,$G$1:$J$1,1)+1,0)
With extraction of the time element (into ColumnC) of your data (from ColumnA) the formula attempts to find the time from C in the first row, but accepts an inexact comparison by defaulting to the next lower value where there is no exact match. Once found, the MATCH() function returns the position of the match relative to the start of the range searched.
This is then used in a VLOOKUP() function to determine how far across to return the result of a search for the exact A column value in ColumnF.
Details of the syntax of the functions may be found via Help > Function list.
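The two-step lookup described above (an approximate match on the time, then an exact match on the station) can be sketched in Python as follows. The rate matrix, station names and band start times are made up, rate_for is a hypothetical helper, and the timestamp is assumed to be in day-month-year order, which the question does not state.

```python
from bisect import bisect_right
from datetime import datetime, time

# Hypothetical rate matrix: one row per station, one column per band start time.
band_starts = [time(0, 0), time(9, 0), time(16, 0), time(20, 0)]
rates = {
    "Station A": [50, 120, 200, 90],
    "Station B": [40, 100, 180, 70],
}

def rate_for(station, airing):
    """Look up the rate for a station at an airing timestamp like '14-12-22 08:06:05'."""
    # Assumes day-month-year order; only the time part is used for the lookup.
    t = datetime.strptime(airing, "%d-%m-%y %H:%M:%S").time()
    # Index of the last band start that is <= t, like MATCH(..., 1).
    band = bisect_right(band_starts, t) - 1
    # An unknown station raises KeyError, much as VLOOKUP would return #N/A.
    return rates[station][band]

print(rate_for("Station A", "14-12-22 08:06:05"))
```

As with MATCH in approximate mode, this relies on the band start times being sorted in ascending order.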
