GS Query to split a column (Last, First) into 2 columns - google-sheets

I have a 'Raw Invoice' tab which is an excel file copy/pasted directly over. I'm then trying to format the data in the 'Invoices' tab in that column order using a query. I need to be able to break out the Student Name into two separate columns, hopefully within the query itself. Preferably it would then change it so column C is Last Name and column D is First name and the rest of columns shift over one.
I don't know if there's a way to perform a SPLIT function within the query. Right now I'm using a clunky method by doing a VLOOKUP on the student ID to get the names from another tab (not included in the Sample GS cuz it's an importrange from a work file), but it then creates two separate queries. Ideally I can somehow split column C within one query, but am getting lost by nesting queries and arrays together. I might be able to use REGEXEXTRACT, but again get lost in where to put it in the query or whether that's overkill.
QUERY('Raw Invoice'!$A:$I,"Select F,B,A, 'Bobs Diner',D,G,I where C is not null label 'Bobs Diner' 'Company' Format F 'M/DD/YYYY' ",1)

A SPLIT inside the QUERY string is not possible, but you can split the column first and query the resulting array:
=ARRAYFORMULA(QUERY({IFERROR(SPLIT('Raw Invoice'!A:A, ", ")), 'Raw Invoice'!B:I},
"select Col7,Col3,Col1,Col2,'Bobs Diner',Col5,Col8,Col10
where Col3 is not null
label 'Bobs Diner' 'Company'
format Col7 'M/DD/YYYY'", 1))
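To make the reshaping easier to follow, here is a minimal Python sketch of the same idea: split each "Last, First" value into two columns, skip rows without a name, and keep the remaining columns. The sample rows and the `split_name` helper are hypothetical and are not taken from the asker's sheet.

```python
def split_name(full_name):
    """Split 'Last, First' on the first comma and trim both halves."""
    last, _, first = full_name.partition(",")
    return last.strip(), first.strip()


# Hypothetical rows: (student name, student ID).
raw_rows = [
    ("Smith, Anna", "1001"),
    ("Van Buren, John", "1002"),
    ("", "1003"),  # blank name, dropped like "where ... is not null"
]

# Emit Last and First as separate columns, followed by the other column.
invoices = [
    (*split_name(name), student_id)
    for name, student_id in raw_rows
    if name
]

for row in invoices:
    print(row)
```

One thing worth checking in the formula itself: SPLIT treats every character of the delimiter as a separate delimiter by default, so ", " also splits on plain spaces and a last name such as "Van Buren" would spill into an extra column. Passing FALSE as the third argument of SPLIT makes it split only on the full ", " sequence.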

What I suggest is to add this to the 'Raw Invoice' tab (in J3, so that the results fill columns J and K):
=arrayformula(split(A3:A,","))
and then
=QUERY('Raw Invoice'!$A:$K,"Select F,B,J,K, 'Bobs Diner',D,G,I where C is not null label 'Bobs Diner' 'Company' Format F 'M/DD/YYYY' ",1)

Related

Google Sheets, splitting cell values within a Query?

(Related to this question)
I want to split the values in each cell, each of which is either blank or contains one or more comma-separated tags. Can I do this from within the QUERY? Or, how would I copy the column to a scratch column that is longer because the cell values are split into one or more columnar values?
This formula works nicely to show tags and counts, but treats each cell as a single text value:
=QUERY(Notes!D1:D, "Select D, count(D)
where D matches '^(?!(?:Labels|Tags)$).+'
group by D order by count(D) DESC label count(D) ''")
I also have this formula, which returns an array of non-blank, comma-separated values in a range:
=ArrayFormula(SPLIT(filter(Notes!D1:D, not(isblank(Notes!D1:D))), ","))
But this also has the problem that it splits values across columns (instead of rows), so I can't use the results as a simple range.
I have tried wrapping occurrences of D, the data column, with the ArrayFormula. Each time I get a #VALUE! error from QUERY.
From what I understand you're trying to do, you may find it useful to FLATTEN your range so that everything ends up in one column:
=FLATTEN(ArrayFormula(SPLIT(filter(Notes!D1:D, not(isblank(Notes!D1:D))), ",")))
If needed, you can add TRIM too so you don't have undesired spaces:
=FLATTEN(ArrayFormula(TRIM(SPLIT(filter(Notes!D1:D, not(isblank(Notes!D1:D))), ","))))
I don't know what your purpose is after that, but you can also wrap this in a QUERY to count the tags as you described in your post. Since the input is now a computed array rather than a sheet range, refer to its column as Col1:
=QUERY(FLATTEN(ArrayFormula(TRIM(SPLIT(filter(Notes!D1:D, not(isblank(Notes!D1:D))), ",")))),"Select Col1,COUNT(Col1) group by Col1 order by count(Col1) DESC label count(Col1) ''",)
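As a rough illustration of what the FILTER, SPLIT, TRIM, FLATTEN and QUERY chain computes, the same pipeline can be sketched in Python. The cell values below are made up for the example; only the order of the steps follows the formula.

```python
from collections import Counter

# Hypothetical contents of Notes!D1:D; blank cells are empty strings.
cells = ["work, urgent", "", "home", "urgent , work", "urgent"]

# FILTER drops the blanks, SPLIT breaks each cell on commas, TRIM strips
# the spaces, and FLATTEN puts every tag into a single column (list).
tags = [tag.strip() for cell in cells if cell for tag in cell.split(",")]

# The QUERY step: count per tag, most frequent first.
counts = Counter(tags).most_common()

for tag, n in counts:
    print(tag, n)
```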

vlookup get latest match of duplicates

I'm trying to display a filtered version of sheet1 data in sheet2:
Sheet1:
Sheet2:
I used vlookup on sheet2 columns C,D,E to display sheet1 columns B,D,A respectively.
i.e., Sheet 2 - Column C
=vlookup(A2,Sheet1!A3:D,3)
but I'm not sure how to make it work with duplicates and to only get the latest one.
I tried using vlookup on query result but it didn't work out (because I was referencing a reference?)
=sortn(query(Sheet1!A2:D6,"select * where A is not null order by B,A desc"),99^99, 2, 2, true)
How can I apply vlookup to get the latest match of duplicate rows? If it's not possible, how can I go about this (if possible, without having to add extra sheets)
If I wanted to use Vlookup and Query to do it, I would end up with something like this:
=ArrayFormula(vlookup(query(B2:B,"select min(B) where B is not null group by B label min(B) ''"),
query(A2:D,"select B,A,C+D,C,D where A is not null order by B,A"),{2,1,3,4,5}))
So the first query gets the unique values of column B, and the second query gets the original data plus a total (C+D), sorted in ascending order of family name and then timestamp. With its fourth argument omitted, VLOOKUP does an approximate match on sorted data, so it finds the last matching row for each family name, which in this case is the latest one. You could also sort on timestamp descending and use the exact-match form of VLOOKUP to find the first occurrence, which is probably a little slower:
=ArrayFormula(vlookup(query(B2:B,"select min(B) where B is not null group by B label min(B) ''"),
query(A2:D,"select B,A,C+D,C,D where A is not null order by B,A desc"),{2,1,3,4,5},0))
In reality I would probably use sort and sortn as I think you started to do in your question:
=sortn(sort(filter({A2:B,C2:C+D2:D,C2:D},A2:A<>""),2,1,1,0),999,2,2,1)
This time it's sorted on family name ascending then timestamp descending, then sortn removes duplicates.
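The sort-then-deduplicate idea can be sketched in Python as follows. The rows are hypothetical (timestamp, family name, amount) tuples rather than the asker's data, and the dictionary stands in for SORTN's "keep the first row per key" behaviour.

```python
# Hypothetical rows: (timestamp, family name, amount).
rows = [
    ("2021-01-01", "Smith", 10),
    ("2021-01-03", "Jones", 5),
    ("2021-01-05", "Smith", 20),
    ("2021-01-02", "Jones", 7),
]

# Sort by timestamp descending, then stable-sort by name ascending,
# so the newest row of each name comes first within its group.
newest_first = sorted(rows, key=lambda r: r[0], reverse=True)
by_name = sorted(newest_first, key=lambda r: r[1])

# Keep only the first row seen for each name.
latest = {}
for row in by_name:
    latest.setdefault(row[1], row)

result = list(latest.values())
for row in result:
    print(row)
```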
Alternatively, use this in row 2:
=INDEX(IFNA(VLOOKUP(A2:A, SORT(Sheet1!A:D, ROW(Sheet1!A:D), 0), 3, 0)))

How can I separate a column into multiple columns based on values?

I have searched on a lot of pages but I cannot find a solution to my problem except in reverse order. I have simplified what I do, but I have a query that comes looking for information in my data sheet. Here there are 3 columns, the date, the amount and the source.
I would like, with a query function, to be able to make different columns which counts the information of column C based on the values of its cells per month, like this
I'm okay with the start of the formula
=QUERY(A2:C,"select month(A)+1, sum(B), count(C) where A is not null group by month(A)+1")
But as soon as I try something a little different by putting 2 queries together in an arrayformula, the row counts obviously don't match, as some months have a count of 0 for some sources.
Do you have a solution for what I'm trying to do? Thank you in advance :)
Solution:
It's not possible in Google Query Language to have a single query statement that has one result grouped by one column and another result grouped by another.
The first two columns can be like this:
=QUERY(A2:C,"select month(A)+1, sum(B) where A is not null group by month(A)+1 label month(A)+1 'Month', sum(B) 'Amount'")
To create the column labels for the remaining columns, put this in the first row (I1 in my example):
=TRANSPOSE(UNIQUE(C2:C))
Then from cell I2, enter this:
=COUNTIFS(arrayformula(month($A$2:$A)),$G2,$C$2:$C,I$1)
Then drag horizontally and vertically to apply to the entire table.
Results:
try:
=INDEX({
QUERY({MONTH(A2:A), B2:C},
"select Col1,sum(Col2) where Col2 is not null group by Col1 label Col1'month',sum(Col2)'amount'"),
QUERY({MONTH(A2:A), B2:C, C2:C},
"select count(Col3) where Col2 is not null group by Col1 pivot Col4")})
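For readers who want to see the shape of the result, here is a small Python sketch of the same table: one row per month with the summed amount, followed by one count column per source. The sample rows and source names are invented for the example. Missing month and source combinations are filled with 0 here, which keeps every row the same width.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (date, amount, source).
rows = [
    (date(2021, 1, 5), 100, "Web"),
    (date(2021, 1, 20), 50, "Shop"),
    (date(2021, 2, 2), 30, "Web"),
    (date(2021, 2, 9), 70, "Web"),
]

# Column headers for the pivoted part, one per distinct source.
sources = sorted({src for _, _, src in rows})

totals = defaultdict(int)
counts = defaultdict(lambda: dict.fromkeys(sources, 0))
for day, amount, src in rows:
    totals[day.month] += amount      # sum(B) grouped by month
    counts[day.month][src] += 1      # count(C) pivoted by source

# One row per month: month, total amount, then a count per source.
table = [
    [month, totals[month]] + [counts[month][s] for s in sources]
    for month in sorted(totals)
]

print(["month", "amount"] + sources)
for row in table:
    print(row)
```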

Google Sheet Filter formula based on range of cells

I've got some data
Basically a list of Items
and another sheet that contains a list of orders
(Some items can appear in multiple orders, which is why I can't use a vlookup for this)
My problem is that I want to get all the Order IDs of all items in a dynamic list (in my example there are only 3 items, but that can grow).
I'm trying to use the filter formula and have got this so far:
=filter('Orders'!AC1:AD,'Orders'!K:K=A4)
which works fine at retrieving all the Order ID's for the item number in cell A4.
But I want the Order ID's for all the Items in column A.
I tried
=filter('Orders'!AC1:AD,'Orders'!K:K=A2:A)
But that doesn't work. I'm guessing I need to do some kind of array formula maybe.
But I can't figure it out.
You can use the QUERY function, which has an SQL-like syntax for manipulating your data.
=query({Orders!$A:$B},"select Col2 where Col1 matches '"&textjoin("|",true,unique(Summary!$A4:$A))&"' ",1)
or this if you need to sort the result:
=query({Orders!$A:$B},"select Col2 where Col1 matches '"&textjoin("|",true,unique(Summary!$A4:$A))&"' order by Col2 ",1)
The first argument is the range that you want to query. Which in this case is inserted with the array notation {Orders!$A:$B}.
The next argument is a string containing an SQL-like statement, which in this case says "Select column 2 when column 1 matches Item A or Item C or Item D".
The "Item A or Item C or Item D" part is built with another formula, TEXTJOIN. It takes the range to join and uses | as the delimiter, which is the OR operator in a regular expression.
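The way the pattern is built and applied can be sketched in Python. The item names and order IDs below are hypothetical. `re.fullmatch` is used because `matches` in the query language compares the pattern against the whole cell value, so "Item AB" is not selected by "Item A".

```python
import re

# Hypothetical Summary!A4:A, including a duplicate and a blank cell.
items = ["Item A", "Item C", "Item D", "Item A", ""]

# Hypothetical Orders rows: (item, order ID).
orders = [
    ("Item A", "ORD-1"),
    ("Item B", "ORD-2"),
    ("Item C", "ORD-3"),
    ("Item A", "ORD-4"),
    ("Item AB", "ORD-5"),
]

# UNIQUE + TEXTJOIN("|", TRUE, ...): drop blanks and duplicates, join with |.
unique_items = list(dict.fromkeys(i for i in items if i))
pattern = "|".join(unique_items)

# "select Col2 where Col1 matches '<pattern>'" with whole-value matching.
matching = [order_id for item, order_id in orders if re.fullmatch(pattern, item)]

print(pattern)
print(matching)
```

Because the joined text is used as a regular expression, item names containing characters such as ( or + would be interpreted as regex syntax, both in this sketch and in the formula.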

Google Sheets Combine a column with duplicates and update total sum in another column

This might be something fairly simple but struggling to find a way to do it.
In Column B, I have a list of foods required.
In Column C, I have the amount needed.
In Column D, I have g (for grams) ml (for mills) etc.
I would like to combine the duplicates in Column B and update the totals from Column C, with the g or ml in Column D beside it.
The list I have has been created by using an array formula based on dropdowns in another sheet.
I have seen people using UNIQUE formula in 1 column (this works) and then a SUMIF formula in another column and then a JOIN formula in another... I tried this but the SUMIF is always returning 0.
Would someone please be able to advise on how I can do this?
TIA :D
It's hard to be sure exactly what you need without seeing the data. But based on my understanding of solely what you've posted, this QUERY formula should generate a condensed mini-report:
=QUERY({B2:D},"Select Col1, SUM(Col2), Col3 WHERE Col1 Is Not Null GROUP BY Col1, Col3 LABEL SUM(Col2) ''")
In plain English, this means "Keep the columns from the range B2:D in their original order, but sum the second column's data according to matches in both the first and third columns. Only return results for the raw data where the first column is not blank. Replace the default 'sum' header on the second column with nothing; I don't need it."
This formula assumes that every ingredient will always be attached to the same measurement (e.g., 'salt' in Col B is always paired with 'mg' in Col D, etc.). If this is not the case, you will wind up with ingredients being listed as many times as there are different measures in Col D.
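The grouping behaviour described above, including the caveat about mixed units, can be sketched in Python. The food rows are invented for illustration. Each distinct (food, unit) pair becomes its own output row, which is why an ingredient entered with two different units would appear twice.

```python
from collections import defaultdict

# Hypothetical rows from B2:D: (food, amount, unit).
rows = [
    ("flour", 200, "g"),
    ("milk", 100, "ml"),
    ("flour", 300, "g"),
    ("", 0, ""),  # blank row, skipped like "WHERE Col1 Is Not Null"
    ("milk", 50, "ml"),
]

# GROUP BY Col1, Col3 with SUM(Col2).
totals = defaultdict(int)
for food, amount, unit in rows:
    if food:
        totals[(food, unit)] += amount

# Back to the original column order: food, total, unit.
report = [(food, total, unit) for (food, unit), total in sorted(totals.items())]

for line in report:
    print(line)
```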
