Finding the likelihood of a string across multiple columns? - google-sheets

I'm working with film production rental data and am interested in generating an ideal rental package based on the rental histories of 4 similar customers.
I have separated their rentals into 4 tables and would like to sort a new list of "Items" based on their likelihood of being rented again...which I'm assuming would be based on how often an "Item" intersects all 4 rental histories? Even finding this percentage alone would greatly help.
Having no prior statistics experience, I'm at a loss as far as best practices are concerned and any insight at all would be greatly appreciated. The example below has 4 rental histories with # of times rented. I've generated a unique list of items in column M.
Example spreadsheet

You can try QUERY:
=QUERY({A3:B;D3:E;G3:H;J3:K},
"SELECT Col1, SUM(Col2), COUNT(Col1), COUNT(Col1)/4*100
WHERE Col1 IS NOT NULL
GROUP BY Col1
ORDER BY COUNT(Col1) DESC
LABEL Col1 'Item',
SUM(Col2) 'Sum Of Rentals',
COUNT(Col1) 'Count in Lists',
COUNT(Col1)/4*100 'Percentage in Lists'")
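To sanity-check what the query computes, here is a minimal Python sketch of the same grouping logic. The item names and counts are made up for illustration, and it assumes each item appears at most once per rental history, as it does in the four tables.

```python
from collections import Counter

# Hypothetical rental histories: one dict per customer, item -> times rented.
histories = [
    {"Camera": 3, "Tripod": 1, "Lens": 2},
    {"Camera": 1, "Lens": 4},
    {"Camera": 2, "Tripod": 2},
    {"Camera": 5, "Monitor": 1},
]

total_rentals = Counter()  # plays the role of SUM(Col2)
list_count = Counter()     # plays the role of COUNT(Col1)
for history in histories:
    for item, times in history.items():
        total_rentals[item] += times
        list_count[item] += 1

# One row per item: (item, total rentals, lists containing it, percentage of lists),
# sorted by the number of lists in descending order.
ranking = sorted(
    (
        (item, total_rentals[item], list_count[item],
         list_count[item] / len(histories) * 100)
        for item in list_count
    ),
    key=lambda row: row[2],
    reverse=True,
)

for row in ranking:
    print(row)
```

The percentage here is simply the share of the 4 histories that contain the item, which is a rough popularity measure rather than a true probability.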

Related

Google Sheets - Query Table with output corresponding to specific cell criteria

I am trying to create a query based on a date range, that will display output based on the values in another column.
Here is the sample dataset I'm working with.
I would like the # Allotted (Column F) to be queried into 2 separate columns, depending on whether the Cost = 0. If the Cost = 0, I want the # Allotted to be listed under column "Free Trial" - otherwise, it should be listed under "Purchased."
I tried to create 2 separate queries for the "Purchased" and "Free Trial" columns but I can't figure out how to tell it to list the output based on a key value, such as Customer.
You can see my attempt in the sheet attached as well as what I'd like the output to look like. I highlighted the columns I'm having trouble with.
Thank you for your help!
Try:
={query(
{query({A:J},"select Col1,Col3,Col9,Col6,Col7 where Col7 =0 ",1);
query({A:J},"select Col1,Col3,Col6,Col9,Col7 where Col10 > date '"&text(A2,"yyyy-mm-dd")&"' and Col10 < date '"&text(B2,"yyyy-mm-dd")&"' and Col7>0 ",1)}
,"where Col2 is not null order by Col3,Col4 label Col1 'Customer',Col2 'Type', Col3 'Purchased', Col4 'Free Trial', Col5 'Cost'",1)}
I figured it out! Ended up using Filter instead of Query so I could filter based on the criteria selected.
I updated the solution to the sheet linked in the question.
For column "Purchased":
=iferror(FILTER(F4:F14,A4:A14=G19,J4:J14>=A$2,J4:J14<=B$2,G4:G14<>0),"")
For column "Free Trial":
=iferror(FILTER(F4:F14,A4:A14=G19,J4:J14>=A$2,J4:J14<=B$2,G4:G14=0),"")
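For readers who want to see the FILTER conditions spelled out, here is a rough Python sketch of the same logic. The rows, customer names, and the helper `allotted` are hypothetical; the fields mirror columns A (customer), F (# allotted), G (cost), and J (date) from the formulas above.

```python
from datetime import date

# Hypothetical rows: (customer, allotted, cost, purchase_date)
rows = [
    ("Acme", 10, 0, date(2020, 1, 5)),
    ("Acme", 25, 99, date(2020, 1, 20)),
    ("Beta", 5, 0, date(2020, 2, 1)),
    ("Acme", 40, 50, date(2020, 3, 15)),
]

def allotted(rows, customer, start, end, free_trial):
    """Return the allotted values for one customer within [start, end].

    free_trial=True keeps rows with cost == 0, False keeps rows with cost != 0.
    """
    return [
        n
        for (cust, n, cost, when) in rows
        if cust == customer
        and start <= when <= end
        and (cost == 0) == free_trial
    ]

start, end = date(2020, 1, 1), date(2020, 2, 28)
purchased = allotted(rows, "Acme", start, end, free_trial=False)
free = allotted(rows, "Acme", start, end, free_trial=True)
print(purchased, free)
```

An empty list here corresponds to the case where FILTER finds no matches and iferror returns "".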

How to combine all the columns into one and count them respectively in Google Sheets?

Data
Hi everyone,
I have 3 columns of data for the ages of people in a city and 3 columns with the count for each age. I want to combine all 6 columns into 2 columns, so that there is only one column of ages and one column of counts. I tried to use the QUERY function in Google Sheets but I'm not sure how to use it. Please give me some advice on this, or let me know if another method can achieve the same result. Thank you.
Try
=query({G8:H; J9:K; M9:N}, "Select Col1, sum(Col2) where Col1 is not null group by Col1 label sum(Col2) 'Count of people'", 1)
and see if that works?
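If it helps to see what the stacked query is doing, this is a minimal Python sketch of the same idea: stack the three (age, count) tables and sum the counts per age. The numbers are invented for illustration.

```python
from collections import defaultdict

# Hypothetical (age, count) tables, one per pair of columns.
tables = [
    [(20, 3), (21, 5)],
    [(21, 2), (30, 1)],
    [(20, 4), (30, 6)],
]

# Sum the counts for each age across all tables (the "group by Col1" step).
totals = defaultdict(int)
for table in tables:
    for age, count in table:
        totals[age] += count

# One (age, total count) pair per distinct age, in ascending age order.
combined = sorted(totals.items())
print(combined)
```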

Pivot table with double rows or going from wide to long in Google Sheets

How can I go from wide to long in Google Sheets based on two different columns or create a pivot table where I specify two different columns from the original matrix as rows?
Please see example for intended effect:
You can do it all in one go like this:
=ARRAYFORMULA(QUERY({
A1:B, TEXT(C1:C, "hh:mm");
A2:A, D2:D, TEXT(C2:C, "hh:mm")},
"select Col2,max(Col1)
where Col2 is not null
group by Col2
pivot Col3", 1))
Here's a fairly simple approach. I don't think you can use a pivot table because the values have to be summaries of a numeric value, or at least counts, not a string value.
To get the times:
=transpose(C2:C)
To get the lecturers (fairly big assumption that there are no lecturers that work only as assistants but this can be changed later):
=unique(B:B)
If there are additional lecturers working only as assistants:
=unique({B:B;D2:D})
Then to get the topic corresponding to a particular lecturer or assistant:
=ArrayFormula(IFERROR(vlookup(filter(F2:F,F2:F<>"")&filter(G1:1,G1:1<>""),{B2:B&C2:C,A2:A},2,false))&
IFERROR(vlookup(filter(F2:F,F2:F<>"")&filter(G1:1,G1:1<>""),{D2:D&C2:C,A2:A},2,false)))
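The lookup approach above can be sketched in Python as follows. This is only an illustration under assumptions: the column layout (A topic, B lecturer, C time, D assistant) is inferred from the formulas, the sample sessions are made up, and each person is assumed to have at most one session per time slot.

```python
# Hypothetical sessions: (topic, lecturer, time, assistant)
sessions = [
    ("Algebra", "Ann", "09:00", "Bob"),
    ("Biology", "Bob", "10:00", "Cid"),
    ("Chemistry", "Ann", "11:00", "Cid"),
]

# Column headers of the grid: the session times, like transpose(C2:C).
times = [time for _, _, time, _ in sessions]

# Row headers: every person who lectures or assists, in order of first appearance.
people = []
for _, lecturer, _, assistant in sessions:
    for person in (lecturer, assistant):
        if person not in people:
            people.append(person)

# Key each topic by (person, time), the same idea as the B&C and D&C lookup keys.
lookup = {}
for topic, lecturer, time, assistant in sessions:
    lookup[(lecturer, time)] = topic
    lookup[(assistant, time)] = topic

# One row per person, one cell per time; empty string where there is no session.
grid = {p: [lookup.get((p, t), "") for t in times] for p in people}
for person, row in grid.items():
    print(person, row)
```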

Merging two data sets in order to add default values for missing data

I'm trying to merge two datasets in order to insert default rows for missing data. The use case is that I have a list of dates and attendance numbers for training sessions on those dates, but if I have no records at all for a training session then it's missing from the list.
In my sheet at the moment I have a two column set of dates and attendance numbers, and in another sheet I have worked out all the Wednesdays and Fridays (training days) between the start and end dates of all the sessions we have data for.
Is there a way to merge the two datasets together so that the zero attendance for each session is the base set and then I merge in the rows for which I have data? I've tried using the QUERY function, but if I specify two datasets using {Sheet1!A1:A,Sheet2!B1:B} I get array errors.
The attendance information is currently gathered with a query like this:
=QUERY({Records!A2:B}, "SELECT Col1, COUNT(Col2) WHERE (Col1 IS NOT NULL) GROUP BY Col1 ORDER BY Col1 ASC LABEL Col1 'Session Date', COUNT(Col2) 'Skaters'")
where the Records sheet holds just dates and names.
If I update it to read from two datasets:
=QUERY({Records!A2:B, Scratch!B2:B}, "SELECT Col1, COUNT(Col2) WHERE (Col1 IS NOT NULL) GROUP BY Col1 ORDER BY Col1 ASC LABEL Col1 'Session Date', COUNT(Col2) 'Skaters'")
then I get a REF error: "Function ARRAY_ROW parameter 2 has mismatched row size. Expected: 982. Actual: 999." That seems fair, as it creates a misaligned dataset rather than merging on the date column.
I'm probably treating the spreadsheet a bit too much like a database, and while I would be more comfortable dropping into the script editor to resolve this I'm trying to learn a few spreadsheet techniques.
Data
Records looks like this:
| 2018-05-04 | Bob |
| 2018-05-04 | Fred |
| 2018-05-12 | Bob |
So no-one took attendance on the 9th, and so the stats are skewed as Bob gets a misleading 100% attendance record.
I do not understand the details of what you are trying to do, but since it seems to involve combining one list of just dates and at least two lists of dates and names, I offer the following example:
The formula is:
=ArrayFormula(query({Sheet1!B1:C20;Sheet2!E1:F20;Sheet3!I1:J20},"select * where Col2 is not NULL order by Col1 "))
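As a rough illustration of the "zero attendance as the base set" idea from the question, here is a small Python sketch. The three records are the ones shown in the question; the `training_days` list is a hypothetical stand-in for the calculated session dates.

```python
from collections import Counter
from datetime import date

# Records from the question: (session date, skater name)
records = [
    (date(2018, 5, 4), "Bob"),
    (date(2018, 5, 4), "Fred"),
    (date(2018, 5, 12), "Bob"),
]

# Hypothetical list of all session dates, including one with no records.
training_days = [date(2018, 5, 4), date(2018, 5, 9), date(2018, 5, 12)]

# Count skaters per date that actually has records.
attendance = Counter(day for day, _ in records)

# Start from every known date and default to 0 where no one was recorded.
all_days = sorted(set(training_days) | set(attendance))
merged = [(day, attendance.get(day, 0)) for day in all_days]

for day, skaters in merged:
    print(day.isoformat(), skaters)
```

The key point is that the merge is keyed on the date, rather than placing the two lists side by side, which is what caused the mismatched row size error.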

Merge multiple tables

I have lots of sheets describing different kinds of expenses and gains for my small company, and I can't find an easy way to merge my tables as in this example I made:
I want the last table to be filled automatically with the rows of the other tables when I update them, so I can foresee the expenses and gains over time (by automatically ordering the green table by date ascending).
So far the only temporary solution I have found is to copy references to the rows of the other tables (yellow and blue) into the merged table (green) in advance.
Pivot tables cannot gather rows from several tables like this.
Use this Query formula in cell I2:
=QUERY({A2:C; E2:G}, "select * where Col1 is not null", 1)
To also order them by Date, add the order by:
=QUERY({A2:C; E2:G}, "select * where Col1 is not null order by Col1", 1)
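In case it is useful to see the same merge outside of Sheets, this is a minimal Python sketch of stacking two tables, dropping empty rows, and ordering by date. The table contents and the three-field layout (date, label, amount) are assumptions for illustration.

```python
# Hypothetical tables: (date as ISO string, label, amount); None marks an empty row.
expenses = [
    ("2021-03-10", "Rent", -800),
    ("2021-01-15", "Insurance", -120),
    (None, None, None),
]
gains = [
    ("2021-02-01", "Invoice 1", 1500),
    ("2021-03-20", "Invoice 2", 900),
]

# Stack both tables, keep rows that have a date ("where Col1 is not null"),
# then sort by the date column ("order by Col1"). ISO strings sort chronologically.
merged = sorted(
    (row for row in expenses + gains if row[0] is not None),
    key=lambda row: row[0],
)

for row in merged:
    print(row)
```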
