GoogleSheets QUERY REGEXREPLACE - google-sheets

I was able to format a column with zipcode in my query, the problem is that I want to display several other columns, but I don't know how to reference at this point.
I could only display 1 columns, but I want many more
=QUERY(ARRAYFORMULA(REGEXREPLACE(P43:P44;"-";""));"Select Col1")
I also tried with the substitute:
=ARRAYFORMULA(QUERY(A43:AD44;SUBSTITUTE(E43:E44;"-";"");"Select *"))

Given the following example table:
In case you would want to query after removing the - characters in the ZIPCODE column, you could do it as follows:
=QUERY({A:A, ARRAYFORMULA(REGEXREPLACE(B:B, "-", "")), C:D},"Select * WHERE Col4>23")
The idea is that using the curly bracket notation you join the multiple ranges you want to query from. Using it, you can apply the REGEXREPLACE to the whole B Column, and afterwards query from the resulting range.
This is the obtained result with the previous query:

Related

Unnest two columns in google sheet

I have a table like this one here (basically it's data from a google form with multiple choice answers in column A and B and non-muliple choice data in column C) I need a separate row for each multiple choice answer.
Column A
Column B
Email
A,B
XX,YY
1#gmail.com
A,C
FF,DD
2#gmail.com
I tried to un-nest the first column and keep the remaining columns like this
enter image description here
I tried several approaches I found with flatten and split with array formulas but I don't know where to start really.
Any help or hint would be much appreciated!
You can use the split function on the column A and after that, use the index function. Considering the table, you can use:
=index(split(A2,","),1,1)
The split function separate the text using the delimiter indicated, returning an array with 1 line and 2 columns; the index function will return the first line and the first column from this array. To return the second element from the column A, just change to
=index(split(A2,","),1,2)
I think there's no easy solution for this. You're asking for as many combinations of elements as multiple-choice elections have been made. Any function in Google Sheets has its potentials and limitations about how many elements it can express. One very useful formula here is REDUCE. With REDUCE and sequences of elements separated by commas counted with COUNTA, you can stablish this formula:
=QUERY(REDUCE({"Col A","Col B","Email"},SEQUENCE(COUNTA(A2:A)),LAMBDA(z,c,{z;LAMBDA(ax,bx,
REDUCE({"","",""},SEQUENCE(ax),LAMBDA(w,a,
{w;
REDUCE({"","",""},SEQUENCE(bx),LAMBDA(y,b,
{y;INDEX(SPLIT(INDEX(A2:A,c),","),,a),INDEX(SPLIT(INDEX(B2:B,c),","),,b),INDEX(C2:C,c)}
))})))
(COUNTA(SPLIT(INDEX(A2:A,c),",")),COUNTA(SPLIT(INDEX(B2:B,c),",")))})),
"Where Col1 is not null",1)
Since I had to use a "initial value" in every REDUCE, I then used QUERY to filter the empty values:

Array Formula messing with Query Table

When I add an array formula to a column which is used to generate a query table, the query table doesn't sort the data as expected. When I remove the array formula it displays correctly.
The document is here: https://docs.google.com/spreadsheets/d/1r3bpNFy9k1h8anZJfefk6KrGYSy7mW2izxKQZb9mWoU/edit?usp=sharing
An example of the error:
If I add an array formula to 'Book Rating'!J:J, the results of the query at 'Book League'!K1 (and E13 and H13) no longer order the books in the desired Desc order. When I remove the array, they order correctly. This type of problem is repeated throughout the sheet for all of the respective League tabs - e.g. at 'Chefs League'!A1.
Can someone help me understand why these Query tables are being messed up by the Array formulas?
The issue is happening because of the nature of QUERYs, in that each column of a QUERY can only return one type of data (e.g., text or numbers, but not both). In the case where multiple data types exist in one column, QUERY will return the most populous type for the column. In your case, you've inserted "-" in place of null, and that is text. I'm guessing that your array formula filled the entire column of empty cells after your data set with that hyphen, making text the most populous type for the column. Therefore, all of your percentages were being converted to text. And in descending order of text, for instance, 9.25% (as a string) is "higher" than 25%, because the former begins with "9" and the latter begins with "2."
One way to resolve the issue would be to remove the "-" from your 'Book Rating'!J2 array formula and replace it with IFERROR(1/0), which will leave those cells null instead of filled with a hyphen. This will leave numbers as the most populous type for the column and your QUERY will work as expected.
Using E13 as an example, here was your original formula:
=Query('Book Rating'!$A$1:$K,"Select A,J where A<>'' Order by J Desc Limit 10")
If you want to leave that hyphen running in the array formula, here are some ways to leave the 'Book Rating'!J2 array formula as I suspect you had it, instead changing your QUERY formula:
1.) Pre-FILTER the 'Book Rating' data before performing the QUERY:
=Query(FILTER('Book Rating'!$A:$K,'Book Rating'!J:J<>"-"),"Select Col1,Col10 Where Col1 <> '' Order by Col10 Desc Limit 10",1)
2.) Use SORTN and FILTER together instead of QUERY, since FILTER can handle multiple data types in the same column:
=ArrayFormula({"Books","6 Stars";SORTN(FILTER({'Book Rating'!A2:A,'Book Rating'!J2:J},ISNUMBER('Book Rating'!J2:J)),10,0,2,0)})

How to import multiple column values with one condition?

I am having problems in getting the values. I need to get the values of July 10, 2020 to July 25, 2020 under column TL "June Troy". I have tried to do query with importrange and filter with importrange. But I cannot get it right. Please help.
If I understand your question, the following query should work for you:
=QUERY(ARRAYFORMULA(TO_TEXT({importrange("https://docs.google.com/spreadsheets/d/1CQkhI5dZoIUfoKF1aQ8lm1Y8rmOOZapaoYBJw8BJTSE/edit?usp=sharing","Attendance!A1:BC99")})),
"select Col7,Col8,Col9,Col10,Col11,Col12,Col13,Col14,Col15,Col16,Col17,Col18,Col19,Col20,Col21,Col22,Col23,Col24,Col25 where Col7 = 'June Troy' ",1)
Note that since your test data has June Troy on every row, this ends up selecting every row.
More importantly, your "value" columns have mixed data types, both numeric and string values, and QUERY ignores the minority data types and returns blanks for those values. So I included the TO_TEXT function to convert individual cells to text before passing them to the QUERY. And to make the TO_TEXT act on every cell in the range, it is wrapped in an ARRAYFORMULA.
Let us know if this works for you.
UPDATED: To correct formula. Sorry about that.
This is an answer to your second question, which should perhaps be a separate from the first part, since it is a different issue. But yes, you should be able to connect two IMPORTRANGE queries. Consider this formula:
={
QUERY(ARRAYFORMULA(TO_TEXT({importrange("https://docs.google.com/spreadsheets/d/1CQkhI5dZoIUfoKF1aQ8lm1Y8rmOOZapaoYBJw8BJTSE/edit?usp=sharing","Attendance!A1:BC99")})),
"select Col7,Col8,Col9,Col10,Col11,Col12,Col13,Col14,Col15
where Col7 = 'June Troy' ",1),
QUERY(ARRAYFORMULA(TO_TEXT({importrange("https://docs.google.com/spreadsheets/d/1CQkhI5dZoIUfoKF1aQ8lm1Y8rmOOZapaoYBJw8BJTSE/edit?usp=sharing","Attendance!A1:BC99")})),
"select Col16,Col17,Col18,Col19,Col20,Col21,Col22,Col23,Col24,Col25
where Col7 = 'June Troy' ",1)
}
Basically, you would have two very similar queries. In my example, I point them both at the same sheet, but you can point to a different link, for one of the queries.
They are wrapped in braces, "{...}", to form a new array. And most importantly, the first query has a comma, ",", after it, to force the result of the second query to be in adjacent columns, on the same rows. If you separate the two queries with a semi-colon, ";", the result of the second query would be added as rows underneath the first query, not in columns beside it.
HOWEVER, I think this causes an error if the two queries don't both return the same number of rows. So that will depend on your data. But since you are getting related columns, I'm assuming they should return the same number of rows. If not, share the data from your two sample sheets, and what the desired outcome should look like.

How to use substitute function with query function in google spreadsheet

I am trying to use substitute function inside a query function but not able to find the correct syntax to do that. My use case is as follows.
I have two columns Name and Salary. Values in these columns have comas ',' in them. I want to import these two columns to a new spreadsheet but replace comas in "Salary" column with empty string and retain comas in "Name" column. I also want to apply value function to "Salary" column after removing comas to do number formatting.
I tried with the following code but it is replacing comas in both the columns. I want a code which can apply the substitute function only to a subset of columns.
Code:
=arrayformula(SUBSTITUTE(QUERY(IMPORTRANGE(Address,"Sheet1!A2:B5"),"Select *"),",",""))
Result:
Converted v/s Expected Result
Note :
I have almost 10 columns to import and comas should be removed from 3 of them.
Based on your suggestions, I was able to achieve the objective by treating columns separately. Below is the code.
=QUERY({IMPORTRANGE(Address,"Sheet1!A3:A5"),arrayformula(VALUE(SUBSTITUTE(IMPORTRANGE(Address,"Sheet1!B3:B5"),",","")))},"Select * where Col2 is not null")
Basically, two IMPORTRANGE functions side by side for each column.
The same query on the actual data with 10 columns will look like this.
=QUERY({IMPORTRANGE(Address,"Sheet1!A3:C"),arrayformula(VALUE(SUBSTITUTE(IMPORTRANGE(Address,"Sheet1!D3:H"),",",""))),IMPORTRANGE(Address,"Sheet1!I3:J")},"Select * where Col2 is not null")
I used 3 IMPORTRANGE functions so that I can format the columns D to H by removing comas and changing them to number.
My suggestion is to use 2 formulas and more space in your sheets.
Formula #1: get the data and replace commas:
=arrayformula(SUBSTITUTE(IMPORTRANGE(Address,"Sheet1!A2:B5"),",",""))
Formula #2: to convert text into numbers:
=arrayformula (range_of_text_to_convert * 1)
Notes:
using 2 formulas will need extra space, but will speed up formulas (importrange is slow)
the second formula uses math operation (*1) which is the same as value formula.
Try this. I treats the columns separately.
=arrayformula(QUERY({Sheet1!A2:A5,SUBSTITUTE(Sheet1!B2:B5,",","")},"Select *"))
Thanks to Ed Nelson, I was able to figure out this:
=arrayformula(QUERY({'Accepted Connections'!A:R,SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE('Accepted Connections'!A:R,"AIF®",""),"APA",""),"APMA®",""),"ASA",""),"C(k)P®",""),"C(k)PS",""),"CAIA",""),"CBA",""),"CBI",""),"CCIM",""),"","")},"Select *"))
That removed all the text I didn't need in specific columns.

How to definitely use column names in Google Sheet Query

query function doesn't let you use column names; you have instead to use letters if you refer to a cell range or ColN if you refer to an array.
This is very annoying, most of all when you alter the queried table adding, deleting or exchanging columns.
I would like to use column names, like in a standard SQL query.
You can actually get around this by splitting the Query formula and using other formula's to automatically get the desired column names from a list.
For example if you have a table in range A1:E15 with headers "H1, H2, H3, H4, H5", and you'd like to only get columns H3 & H5:
Store the desired headers (H3 & H5) in another table/range as a list - lets say this range is G1:G2
Use MATCH formula along with TextJoin formula to generate an concatenated string like Col3, Col5
=TextJoin(", ",TRUE,ArrayFormula(IFERROR("Col"&MATCH(G1:G6,$A$1:$E$1,0),"")))
Lets say this was in cell H1
You can refer to this cell in your Query formula like below
=QUERY({A1:E20},"SELECT "&H1&" WHERE Col2='w'")
You can see it in action in below screenshot:
One solution could be recurring to some custom function created by a script, but when you have a not so small table you surely will incur in some error due to the exceeding computation time.
The most efficient solution (using only native functions) I found is as follows.
Suppose you are working on a sheet range, your column names are in row 1 and you want to refer to the column "salary"; you can obtain the column letter by
substitute(address(1,match("salary",A1:1,0),4),"1","")
Instead, if you are querying arrays, it is simpler; the string you need is
"Col"&match("salary",A1:1,0)
The final query could be not so elegant, but the efficiency is guaranteed:
query(
employeessheet!A:E,
"select "&substitute(address(1,match("salary",employeessheet!A1:1,0),4),"1","")&" where ...",
1)

Resources