Structuring a query between multiple tabs to join values by name - google-sheets

I'm trying to write a SQL query in Google Sheets to get data for "matching" results from two different tabs, but I'm running into some trouble.
This is a sheet that's basically an automated scoring engine for instructors who take a two-part test (written and practical). After the results are entered, I'd like to use some SQL to take the results from the two tabs and collate them into a final score.
Link to the sheet in question.
There's a "Practical Scores" tab (which takes all the data from the associated Google Form), and a "Written Scores" tab. I'd like to get the name of the instructors who match in both those tabs, and give the associated score for them, but I'm mostly having trouble with writing the correct SQL.
Most of what I'm trying to do is working fine. I'm able to pull the final practical scores via the following SQL:
=query(PracticalScores!A2:E, "select A, count(E),SUM(E)/3 group by A")
I can also pull the written scores as follows:
=query('Written Scores'!B2:C,"select B,C")
But I want the intersection of the two as well, and that's where I'm running into problems.
=query(A8:E, "select A,C,D where A = E")
will simply return the rows where the names match up, and I want the instances where the names match up, regardless of whether the rows do.
That is, I want all the rows where the names match from tab 1 to tab 2 and not just the few rows that happen to line up perfectly.
If I'm not explaining this well, please let me know and I can provide additional information. Any assistance would be very greatly appreciated!

Since the query function does not support joins, this can't all be done in one query. Instead, a lookup formula can do the join:
=arrayformula(vlookup(name column, table, # of column to extract, False))
For example, suppose I have a table
+---+-------+---+
| | A | B |
+---+-------+---+
| 2 | Jim | 3 |
| 3 | Sarah | 4 |
| 4 | Bob | 5 |
+---+-------+---+
to which I want to add another column, taking it from
+---+-------+---+
| | E | F |
+---+-------+---+
| 2 | Sarah | 9 |
| 3 | Bob | 8 |
| 4 | Jim | 7 |
+---+-------+---+
The basic idea is to put in cell C2 the formula
=arrayformula(vlookup(A2:A, E2:F, 2, false))
which will look up every name from the first table (column A) in column E, and return the matching value from column F. Result:
+---+-------+---+---+
| | A | B | C |
+---+-------+---+---+
| 2 | Jim | 3 | 7 |
| 3 | Sarah | 4 | 9 |
| 4 | Bob | 5 | 8 |
+---+-------+---+---+
In practice, one should filter out empty lookup values to improve performance:
=arrayformula(vlookup(filter(A2:A, len(A2:A)), E2:F, 2, false))
If the second table contains some names not present in the first, they will not be returned by the above formula. In this case it is better to prepare a full list of names, for example with
=sort(unique({Sheet1!A2:A; Sheet2!A2:A}))
which collects the names from the A columns of the two sheets, eliminating duplicates and sorting. Then look up those names using vlookup as above.
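The lookup-based join above can also be modelled outside Sheets. The following is a minimal Python sketch of the same logic, not Sheets code: it reuses the sample tables from this answer, and the variable names (first, second, lookup, joined, all_names) are made up for illustration. A missing name yields None here, standing in for the #N/A that vlookup would return.

```python
# Sample data from the answer: columns A:B and columns E:F.
first = [("Jim", 3), ("Sarah", 4), ("Bob", 5)]
second = [("Sarah", 9), ("Bob", 8), ("Jim", 7)]

# An exact-match vlookup returns the first matching row,
# so keep only the first value seen for each name.
lookup = {}
for name, value in second:
    lookup.setdefault(name, value)

# Equivalent of arrayformula(vlookup(A2:A, E2:F, 2, false)) appended as column C.
joined = [(name, b, lookup.get(name)) for name, b in first]

# Equivalent of sort(unique({...; ...})): every name from both tables, once, sorted.
all_names = sorted({name for name, _ in first} | {name for name, _ in second})

print(joined)
print(all_names)
```

The row order of the two tables does not matter, which is the property the question asks for.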


Count unique cells and display them in column

I am building a list of gigs I attended and I want to count how many times I've seen each band.
I know about UNIQUE, but because I keep each band in a separate column it just copies each row.
Given the table (or screenshot of real data):
| Date | Venue | Bands |
|----------|--------|--------|--------|--------|--------|--------|
| 02.02.17 | Venue1 | Band A | Band B | Band C | Band D | Band E |
| 02.07.17 | Venue3 | Band D | Band C | | | |
The output I want:
| Band | Attended |
| | (times) |
|--------|----------|
| Band A | 1 |
| Band B | 1 |
| Band C | 2 |
| Band D | 2 |
| Band E | 1 |
I can change structure if needed.
What happens after using UNIQUE: https://i.stack.imgur.com/qmszk.png
Thanks in advance.
Step 1. Get a list of all unique bands in one column, one per row:
=ArrayFormula(UNIQUE(TRANSPOSE(SPLIT(CONCATENATE(Gigs!D2:Z&CHAR(9)); CHAR(9)))))
Step 2. Place this formula in the next column and drag it down:
=SUM(COUNTIF(Gigs!D:Z; E2))
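The two steps above (flatten the band columns into one list, then count each name) can be sketched in Python as a rough model of what the formulas compute. This is not Sheets code; the gigs list simply copies the two sample rows from the question, and the names gigs and counts are illustrative.

```python
from collections import Counter

# Band columns of the two sample gigs; empty strings are blank cells.
gigs = [
    ["Band A", "Band B", "Band C", "Band D", "Band E"],
    ["Band D", "Band C", "", "", ""],
]

# Flatten all band cells into one stream, skip blanks, and count each name.
counts = Counter(band for row in gigs for band in row if band)

for band in sorted(counts):
    print(band, counts[band])
```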
Transform your data into a simple table format to make data analysis easier.
A simple table uses the first row for column headers (a.k.a. fields) and has one and only one column for each entity, for example a single column for band names.
The above could be done in a single formula, but it would be complex and hard to debug, so it's better to start with simple formulas and, once you are certain that everything works, think about combining them into one formula or writing a script.

Google Sheets: How to eliminate duplicates in some columns and show only the most recent data in others?

I have a spreadsheet of books, with one row for every time a book was checked out (this is a small classroom library). Here are the columns:
BookTitle | Author | DateCheckedOut | CheckedOutBy | Status
=========================================================================
The BFG | Dahl, Roald | 6/1/2016 | Suzy | Out
The BFG | Dahl, Roald | 4/5/2016 | Johnny | Returned
The BFG | Dahl, Roald | 12/4/2015 | Wendy | Returned
Charlotte's Web | White, E.B. | | | Added
Wonder | Palacio, R.J. | 5/29/2016 | Joey | Returned
Wonder | Palacio, R.J. | 3/21/2016 | Mary | Returned
I want to query it to get only the row with the highest date value for each book and then display all columns of that row except CheckedOutBy.
I wanted to get a list of unique book title / author combinations and then join it with the original table the way I would in DB2, but it seems that joins like that are not possible in Google Sheets. I tried grouping and the max function, but when I get those things to work I either haven't been able to eliminate earlier dates or haven't been able to display columns that aren't being used in the aggregate function. My Google Sheets querying skills are not up to par :/
Is there a simple way to do this that I'm missing? I would appreciate any tips.
Here's a copy of that sample data from above in a Google Sheet:
https://docs.google.com/spreadsheets/d/1J384S0fsc8tgxVMehPb_uyRNc5-6cQx-xKN-q8K8Gds/edit?usp=sharing
I created a new sheet and entered in cell A1
=ArrayFormula(iferror(vlookup(unique(Sheet1!A2:A), sort(Sheet1!A2:E, 3, 0), {1, 2, 3, 5}, 0)))
See if that works for you.
BREAKDOWN:
The general idea behind the formula is to make use of the fact that VLOOKUP only returns the first match. We want that 'first match' to be the latest date per book.
So first we sort the table so that the latest dates are on top.
We 'lookup' the unique book titles in that sorted table and we return the columns {1, 2, 3, 5}.
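The breakdown above can be modelled in Python as a minimal sketch of the same sort-then-first-match idea, using the sample rows from the question. This is not Sheets code, the variable names are illustrative, and placing blank dates last in the sort is an assumption of this sketch (it does not change the result for this data).

```python
from datetime import date

# (BookTitle, Author, DateCheckedOut, CheckedOutBy, Status); None is a blank date.
rows = [
    ("The BFG", "Dahl, Roald", date(2016, 6, 1), "Suzy", "Out"),
    ("The BFG", "Dahl, Roald", date(2016, 4, 5), "Johnny", "Returned"),
    ("The BFG", "Dahl, Roald", date(2015, 12, 4), "Wendy", "Returned"),
    ("Charlotte's Web", "White, E.B.", None, "", "Added"),
    ("Wonder", "Palacio, R.J.", date(2016, 5, 29), "Joey", "Returned"),
    ("Wonder", "Palacio, R.J.", date(2016, 3, 21), "Mary", "Returned"),
]

# Step 1: sort so the latest dates come first (blank dates last: an assumption).
by_date = sorted(rows, key=lambda r: r[2] or date.min, reverse=True)

# Step 2: keep the first match per title, dropping CheckedOutBy (columns 1, 2, 3, 5).
latest = {}
for title, author, checked_out, _checked_out_by, status in by_date:
    latest.setdefault(title, (title, author, checked_out, status))

# Step 3: output one row per unique title, in the original order of the titles.
titles = list(dict.fromkeys(r[0] for r in rows))
result = [latest[t] for t in titles]

for row in result:
    print(row)
```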
Links:
sort() function
vlookup() function

Count Correct Answers

I'm trying to create a spreadsheet to automatically grade test answers.
Column A has question numbers (from paper sheet), column B has the correct answers, and columns C,D,E... have student answers.
# | Answer | Student A | Student B | Student C | Student D
-------------------------------------------------------------
1 | A | A | A | C | A
2 | B | B | B | B | B
3 | C | C | C | B | C
I'd like to add a row above the headers that shows the number of correct answers for each student, but I can't seem to get the variables right. I'm using the formula =$B2 for conditional formatting, and that works fine. I've tried
ACOUNT(FILTER()), SUMIF, COUNTIF
I think I want something to the effect of
=SUM(IF(B2:B152=C2:C152,1,0))
A SUMPRODUCT function¹ should be sufficient.
=sumproduct(--(C3:C5=$B3:$B5))
Lock the column references to the answers so the formula can be filled right.
¹ The documentation link is for Microsoft Excel, but the syntax is identical.
This single arrayFormula should work:
=mmult(SPLIT(rept("1|",COUNTA(B3:B)),"|"),ArrayFormula(--(C3:F5=B3:B5)))
Paste it in C1; it returns one count per student column.
To keep the formula working when new rows are added:
=mmult(SPLIT(rept("1|",COUNTA(B3:B)),"|"),
ArrayFormula(--(OFFSET(B3,,,COUNTA(B3:B))=OFFSET(C3:F3,,,COUNTA(B3:B)))))
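The mmult approach multiplies a row of ones by a 0/1 matrix of "answer matches key" flags, which sums each student column. Below is a minimal Python sketch of that arithmetic, using the sample answers from the question; it is a model of the idea rather than Sheets code, and the names answers, students, matrix, ones and totals are illustrative.

```python
# Answer key (column B) and student answers (columns C:F) from the question.
answers = ["A", "B", "C"]
students = {
    "Student A": ["A", "B", "C"],
    "Student B": ["A", "B", "C"],
    "Student C": ["C", "B", "B"],
    "Student D": ["A", "B", "C"],
}
names = list(students)

# 0/1 matrix, one row per question and one column per student,
# mirroring the --(C3:F5=B3:B5) part of the formula.
matrix = [
    [int(students[name][i] == key) for name in names]
    for i, key in enumerate(answers)
]

# Row vector of ones, mirroring SPLIT(REPT("1|", COUNTA(B3:B)), "|").
ones = [1] * len(answers)

# Row vector times matrix: each entry is the number of correct answers for one student.
totals = [
    sum(o * row[j] for o, row in zip(ones, matrix))
    for j in range(len(names))
]

print(dict(zip(names, totals)))
```

Each entry of totals equals what the sumproduct formula gives for that single student column.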

Get multiple values, selected according to another column, into a single cell

I am trying to find a way to get multiple values from an array to display in one cell
For example I have the two columns as below
| a | 1 |
| b | 2 |
| c | 1 |
| d | 3 |
| e | 2 |
So if the parameter is 2 the cell would display "be"
I want all the values from the first column where the second column is 1.
I have tried to do this with dget, but that only returns a single value. Is there a way to do this with formulas, or does it require a JavaScript solution?
You can do it by using filter to return only the letters next to "2", and then join to join them in one cell.
=join("", filter(A1:A, B1:B = 2))
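The filter-then-join idea can be sketched in Python as follows. This is a rough model of the formula's logic on the sample data from the question, not Sheets code, and join_matching is a hypothetical helper name.

```python
# The two sample columns from the question.
rows = [("a", 1), ("b", 2), ("c", 1), ("d", 3), ("e", 2)]


def join_matching(rows, wanted):
    """Keep the letters whose number equals `wanted`, then join them into one string."""
    return "".join(letter for letter, number in rows if number == wanted)


print(join_matching(rows, 2))
print(join_matching(rows, 1))
```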

Search for a particular cell in Google Spreadsheets and return the row number

For instance, I have a bunch of categories, one per row, and each category has one or more data items, each in its own column. Given a string, I want to find which category it belongs to.
A | B | C | D
1 CARS | Civic | |
2 TRUCKS | F-150 | F-650 | F-750
3 PLANES | 747 | F/A-18 |
Given 747, I want to know that it is from row 3 or that it is a plane or that F- is a truck.
I've tried using several functions, including vlookup, filter, match, etc, but couldn't get them to work.
Is it possible to do this without scripts?
Assuming the data is in columns A to E (this could be extended), the search term is in F1, the search term must match the start of a cell's text, and all applicable matches should be returned, try:
=IF(LEN(F1),FILTER(A:A,COUNTIF(IF(REGEXMATCH(B:E&"","^"&F1),ROW(A:A)),ROW(A:A))),)
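The logic of that formula (return every category that has at least one item starting with the search term, and nothing when the term is empty) can be modelled in Python. This is a minimal sketch on the sample data from the question, not Sheets code; categories_for is a hypothetical helper name. It also treats the search term as a literal prefix, whereas the formula passes it to REGEXMATCH as a regular expression.

```python
# Category (column A) mapped to its items (columns B onward), from the question.
table = {
    "CARS": ["Civic"],
    "TRUCKS": ["F-150", "F-650", "F-750"],
    "PLANES": ["747", "F/A-18"],
}


def categories_for(prefix):
    """Return every category with at least one item that starts with `prefix`."""
    if not prefix:
        # Mirrors the IF(LEN(F1), ..., ) guard: an empty search term returns nothing.
        return []
    return [
        category
        for category, items in table.items()
        if any(str(item).startswith(prefix) for item in items)
    ]


print(categories_for("747"))
print(categories_for("F-"))
print(categories_for("F"))
```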
