Split & Transpose challenges - google-sheets

I am struggling to make my sheet work.
I have 5 columns: column 1 = permit number, column 2 = individuals' names, column 3 = locations, column 4 = duration start date, and column 5 = duration end date.
I would like to transpose this onto a new sheet where column A holds the permit numbers, column B the individuals listed under each other, column C the locations listed under each other, and columns D and E the start and end dates.
The data comes in from a Google Form, and the same person can be on the same permit. Does that make sense?
For example:
Snip enclosed here
https://docs.google.com/spreadsheets/d/1x3HrfTDndaLPCxBZeons_xkL6hWM9E72b_RYT0oYeV0/edit?usp=sharing

try:
=ARRAYFORMULA(SUBSTITUTE(TRIM(SPLIT(QUERY(FLATTEN(IF(
IFERROR(SPLIT(B2:B, ","))="",,"♦"&TO_TEXT(A2:A)&"♣"&
SPLIT(B2:B, ",")&"♣"&C2:C&"♣"&TO_TEXT(D2:D)&"♣"&TO_TEXT(E2:E))),
"where Col1 is not null", 0), "♣")), "♦", ))
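The formula is dense, so here is the same reshaping idea as a minimal Python sketch. It is an illustration only, not a replacement for the formula. It assumes each form row is (permit, comma-separated names, location, start, end); the function name and the sample rows are hypothetical and not taken from the linked sheet.

```python
def split_rows(rows):
    """Emit one output row per individual named in a comma-separated cell."""
    out = []
    for permit, names, location, start, end in rows:
        for name in names.split(","):
            name = name.strip()  # plays the role of TRIM in the formula
            if name:             # skip empty pieces, like the IF(...="",,...) guard
                out.append((permit, name, location, start, end))
    return out


# Hypothetical sample data
sample = [
    ("P-001", "Ann, Bob", "Site A", "2022-01-01", "2022-01-05"),
    ("P-002", "Cara", "Site B", "2022-02-01", "2022-02-03"),
]
result = split_rows(sample)
```

Each name gets its own row, and the permit, location and dates are repeated next to it.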

Related

Count of items in column A not in column B (Google sheets)

I have two columns of numbers in Google sheets. I’m trying to find a formula to give me a count of items in column A that are not in column B. The numbers in the columns are descending and unique in each column but can be duplicated across columns. The columns can also have different amounts of items in them.
Column A has 5 4 3 1
Column B has 4 2 1
The answer in this case would be 2, as the numbers 5 and 3 in column A are not in column B.
I’ve tried using sum, if and countif but can’t come up with a solution. Also not sure if this would be an array formula or not.
Without a helper column you can use reduce and lambda functions.
=reduce(0, A:A, LAMBDA(prev, a_value, prev + if(not(a_value = ""), iferror(min(0, match(a_value, B:B, false)), 1), 0)))
Fill column C with this formula:
=iferror(min(0,match(A1,B:B,false)),1)
If A1 is in column B, the match function returns a value (a position), which the min function then reduces to 0. Otherwise match returns an error, and iferror returns 1.
Then you just need to sum column C.
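Both formulas score each non-blank value in column A as 1 when it is missing from column B and as 0 otherwise, then add up the scores. A minimal Python sketch of that logic, using the numbers from the question (the function name is hypothetical):

```python
def count_missing(col_a, col_b):
    """Count the non-blank values of col_a that do not occur in col_b."""
    b_values = set(col_b)
    return sum(1 for a in col_a if a != "" and a not in b_values)


count = count_missing([5, 4, 3, 1], [4, 2, 1])
```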

Count Values under Columns with Same Name Google Sheets

I have a Google sheet which has columns with the same name and there are different values under each column. I want to count the same value that appear under the same column name.
1  2  3  1  2
R  B  C  R  D
D  C  R  B  D
For example, I would like to get the number "R" that appear under column "1", so I would expect a count of 2 for "R" appearing under columns 1.
Here is a link to Google Sheet with actual data.
I have tried countif and countifs in Google Sheets, but can't figure out how to get the count right based on column name.
Try this formula. It outputs an array showing how many times each letter appears under each column name:
=LAMBDA(NUMBERS,LETTERS,
LAMBDA(UNUM,ULET,
{
{"",TRANSPOSE(UNUM)};
{ULET,
MAKEARRAY(COUNTA(ULET),COUNTA(UNUM),LAMBDA(ROW,COL,
COUNTIF(FILTER(LETTERS,NUMBERS=INDEX(UNUM,COL)),INDEX(ULET,ROW))
))
}
}
)(UNIQUE(FLATTEN(NUMBERS)),UNIQUE(FLATTEN(LETTERS)))
)($A$1:$AE$1,$A$2:$AE$18)
Assume that your sample data range is A1:AE18.
1. Apply UNIQUE() and FLATTEN() to A1:AE1 to get the unique column names.
2. Apply UNIQUE() and FLATTEN() to A2:AE18 to get the unique data values.
3. Use LAMBDA() to name the data ranges and the outputs of steps 1 and 2:
NUMBERS (=A1:AE1),
LETTERS (=A2:AE18),
UNUM (=UNIQUE(FLATTEN(NUMBERS))),
ULET (=UNIQUE(FLATTEN(LETTERS))).
4. Build the output with array literals {}, in which
the first row is a blank cell followed by TRANSPOSE(UNUM), and
the first column is that blank cell followed by ULET.
5. Inside that frame, use MAKEARRAY() to create the results.
6. MAKEARRAY() builds an array from a given number of rows and columns; here we use
COUNTA(ULET) as the number of rows and
COUNTA(UNUM) as the number of columns.
7. MAKEARRAY() also needs a LAMBDA() that says what to compute for each cell of the new array; each cell is addressed by its ROW and COL index.
8. Because the array's dimensions come from ULET and UNUM, each cell's ROW and COL index equals the index of the matching value in ULET and UNUM. We can then use those indexes with COUNTIF() and FILTER() to count how often each letter appears under each column name.
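The counting that MAKEARRAY() does cell by cell can also be sketched in Python: pair every value with the header above it and tally the pairs. This is a minimal sketch using the small table from the question rather than the real A1:AE18 range; the function name is hypothetical.

```python
from collections import Counter


def cross_count(headers, rows):
    """Count how often each value appears under each (possibly repeated) header."""
    counts = Counter()
    for row in rows:
        for header, value in zip(headers, row):
            if value != "":  # ignore blank cells
                counts[(value, header)] += 1
    return counts


headers = ["1", "2", "3", "1", "2"]
rows = [
    ["R", "B", "C", "R", "D"],
    ["D", "C", "R", "B", "D"],
]
counts = cross_count(headers, rows)
```

Looking up counts[("R", "1")] gives the number of times "R" appears under any column named "1", which is the count the question asks for.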
You can try this:
= ARRAYFORMULA(
query(
query(
SPLIT(TRANSPOSE(SPLIT(
QUERY(
TRANSPOSE(
QUERY(
TRANSPOSE(
IF(Original!A2:AE18<>"",
"😊"&Original!A1:AE1&"♥"&Original!A2:AE18, )
),,999^99)
),,999^99),
"😊")),
"♥"),
"Select Col1,Col2,count(Col2) group by Col1,Col2"),
"Select max(Col2),Col3 group by Col2,Col3 pivot Col1")
)
Note: inspired by player0's useful answers.
Output:
We can read from the table that R appears 40 times under the column named '1', 24 times under the column named '2', and so on.
The following is a more compact approach than the previous answers:
=arrayformula(query(split(flatten(A1:AE1&"|"&A2:AE18),"|"),"select Col2,count(Col2) group by Col2 pivot Col1"))
N.B. I'm assuming the order of the grouped values in each column is irrelevant, so the QUERY default of lexicographical ordering is fine.

QUERY Function: How to count with multiple counting criterias

[Goal]
I'm trying to count the number of tickets per employee in one column that has a status with either "Finished," "Finished (Scope)," or "Routed (Sales)" for a specific week. In another column I also want to count the number of tickets for a specific week without criteria. The data that I'm pulling from to count the tickets has the following column names.
Column A: Date,
Column B: Ticket ID,
Column E: Employee,
Column H: Finished Week
Column K: Week
In the formula, you'll notice that it refers to cell H1, which contains the current week, computed with this formula: =TODAY()-MOD(TODAY()-2,7)-1
[Current Formula]
=QUERY('Data'!$A$3:$J,"Select E,
COUNT(B) where D matches 'Finished|Finished \(Scope\)|Routed \(Sales\)'
AND H = "&H1&" GROUP BY E LABEL COUNT(B) 'Total Finished Tickets'",0)
[What it should look like]
I've created a sample spreadsheet that you can refer to.
Link: https://docs.google.com/spreadsheets/d/1MQLgt_SSbUIKv1rEwx-Y21hooxNOOgcUm_j1rFehHdg/edit?usp=sharing
[Issue]
I was able to create a table that counts the number of tickets per employee with the status as "Finished" OR "Finished (Scope)" OR "Routed (Sales)." Which is the "Current Result" table (Link: https://docs.google.com/spreadsheets/d/1MQLgt_SSbUIKv1rEwx-Y21hooxNOOgcUm_j1rFehHdg/edit#gid=0).
However, when I tried to add another count criterion, it gave me errors, and I don't understand how to make this work properly. I want it to look like the table titled "Ideal Result" in the shared link. Can someone please help?
You can use the pivot clause to get a breakdown by the Status column like this:
=query(
Data!A3:J,
"select E, count(E)
where H = " & E4 & "
group by E
pivot D
label E 'Employee' ",
0
)
The downside is that the grand total must then be calculated separately, but that can be done with a simple sum() formula.
Alternatively, get the totals first, and then do a lookup to get the number of finished tickets, like this:
=query(
Data!A2:J,
"select E, count(D)
where H = " & E4 & "
group by E
label E 'Employee', count(D) 'Total new tickets' ",
0
)
=arrayformula(
iferror(
vlookup(
E12:E,
query(
Data!A2:J,
"select E, count(D)
where H = " & E4 & "
and (D = 'Finished' or D = 'Finished (Scope)')
group by E
label count(D) 'Finished tickets' ",
1
),
2,
false
)
)
)
Note that this serves just to illustrate how to aggregate the data into a report. Your question leaves it unclear as to which status values should be counted for each type of aggregation. No rows with status Routed (Sales) appear in the data, and I cannot see how the expected results you show could be derived from the data.
See your sample spreadsheet.
H1, which the cell contains the current week which is this formula
=TODAY()-MOD(TODAY()-2,7)-1
You may want to try the weeknum() function.
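For reference, the week formula quoted above can be read as "the most recent Sunday strictly before today". A minimal Python sketch of that reading, assuming Sheets date serials in which serial 2 (1 January 1900) is a Monday; the function name is hypothetical:

```python
from datetime import date, timedelta


def week_start_sunday(d):
    """Return the Sunday strictly before d, like TODAY()-MOD(TODAY()-2,7)-1."""
    # date.weekday() is 0 on Monday, which matches MOD(serial - 2, 7)
    return d - timedelta(days=d.weekday() + 1)


wednesday = week_start_sunday(date(2024, 1, 3))
sunday = week_start_sunday(date(2024, 1, 7))
```

Note that on a Sunday this steps back a full week, so both calls above land on the same Sunday.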
To get two independent counts, you can't use a where clause, because that would exclude rows from both counts. Instead, you can rely on the fact that Query does not count empty cells, with something like this:
=ArrayFormula(query({if(regexmatch(D3:D,"Finished$|Finished \(Scope\)$|Routed \(Sales\)$"),true,),E3:E,if(K3:K>=H1,true,)},"select Col2,count(Col3),count(Col1) where Col2 is not null group by Col2 label count(Col1) 'Finished', count(Col3) 'New'",1))
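The trick is to flag each row independently for each count and then tally the flags per employee, so that failing one test does not remove the row from the other count. A minimal Python sketch of that idea, assuming rows of (status, employee, week) with weeks given as ISO date strings; the function name and the sample rows are hypothetical:

```python
import re
from collections import defaultdict

# Same pattern as in the formula; search() allows a partial match, like REGEXMATCH
FINISHED = re.compile(r"Finished$|Finished \(Scope\)$|Routed \(Sales\)$")


def ticket_counts(rows, week_start):
    """Return {employee: [new_count, finished_count]} with independent tests."""
    totals = defaultdict(lambda: [0, 0])
    for status, employee, week in rows:
        if not employee:
            continue  # mirrors "where Col2 is not null"
        entry = totals[employee]  # the employee is listed even with zero counts
        if week >= week_start:
            entry[0] += 1
        if FINISHED.search(status):
            entry[1] += 1
    return dict(totals)


# Hypothetical sample rows
rows = [
    ("Finished", "Ann", "2023-01-08"),
    ("Open", "Ann", "2023-01-08"),
    ("Finished (Scope)", "Bob", "2023-01-01"),
    ("Routed (Sales)", "Bob", "2023-01-08"),
]
totals = ticket_counts(rows, "2023-01-08")
```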

How can I adjust this query to add conditionals from other columns? Formula / sample sheet included

https://docs.google.com/spreadsheets/d/1TjkR3TEg_eSei-25zUm8yRimftQ6ocRKQNEfrN-9Ogc/edit?usp=sharing
^ Sample sheet with my current formula, sample data, and description of the problem/current situation.
The current formula calculates the average of the last 10 appearances (going from the bottom of the sheet upwards) of the values in columns D or E when "New York" (cell K1) is in columns B or C.
If New York appears in column B then it uses the value in column D, and if New York appears in column C it uses the value in column E.
The improvement I want to make is that it only uses the values (within those last 10 appearances of "New York" / cell K1) based on conditionals of columns G/F. In this case, let's say >10 as the conditional.
When "New York" is in columns B/C, for the last 10 appearances, it should bring the value in D into the equation if the value in F is >10 (and New York is in column B), and it should bring E into the equation if the value in G >10 (and New York is in column C).
Any ideas?
range construct:
={A:A, B:B, D:D, G:G;
A:A, C:C, E:E, F:F}
or shorter:
={A:B, D:D, G:G;
A:A, C:C, E:F}
use:
=AVERAGE(QUERY(SORT({A:B, D:D, G:G; A:A, C:C, E:F}, 1, ),
"select Col3
where Col4 > 10
and Col2 = '"&K1&"'
limit 10"))
I haven't calculated the average here, so that you can see the data records the query is pulling and confirm them. But I think my formula works.
=query(
query(
{query(A1:G,"select A,B,D,G where B='"&K1&"' ",0);
query(A1:G,"select A,C,E,F where C='"&K1&"' ",0)},
"select * order by Col1 desc limit 10",0),
"select * where Col4 > 10",0)
To get the average, change the last line of the formula to:
"select avg(Col3) where Col4 > 10",0)
Note: my understanding is that you want to filter the ten latest records with New York, and then filter those ten records to just those which have a value > 10 in the right column. This is different than the ten latest records that are New York AND have a value > 10 in the right column. But either solution can be provided.
I've stacked two queries together, to make the correct columns align vertically. So the first inner query gets columns A, B, D and G, checking for New York (i.e. equal to K1) in B. Then the second query stacks columns A, C, E and F underneath, checking for New York in C.
An outer query then sorts them in descending order by the date column, Col1 (column A). By setting a limit of ten, we get the ten latest records.
A final query is used to select the records with Col4>10. By changing this query to just return the avg(Col3), you should have your desired result.
It should be easy to modify this formula to get what you need.
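The order of the steps matters here: take the latest ten matching records first, then apply the > 10 test, then average. A minimal Python sketch of that order, assuming records shaped like columns A to G and using small integers in place of dates; the function name and the sample data are hypothetical:

```python
def average_latest(records, city, limit=10, threshold=10):
    """records: (date, city_b, city_c, d, e, f, g) tuples, mirroring columns A-G."""
    stacked = []
    for day, city_b, city_c, d, e, f, g in records:
        if city_b == city:
            stacked.append((day, d, g))  # columns A, D, G when the city is in B
        if city_c == city:
            stacked.append((day, e, f))  # columns A, E, F when the city is in C
    # Latest records first, then cut to the limit BEFORE applying the threshold
    latest = sorted(stacked, key=lambda r: r[0], reverse=True)[:limit]
    kept = [value for _, value, check in latest if check > threshold]
    return sum(kept) / len(kept) if kept else None


# Hypothetical sample: integers stand in for dates
records = [
    (1, "New York", "Boston", 5, 6, 11, 12),
    (2, "Boston", "New York", 7, 8, 20, 3),
    (3, "New York", "Chicago", 9, 1, 0, 4),
]
latest_two = average_latest(records, "New York", limit=2)
all_three = average_latest(records, "New York", limit=10)
```

With limit=2 only the two latest records are tested, so an older record that passes the threshold is not pulled in to replace one that fails.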
Note also I believe that you missed a couple of records to be blue - G21 and F28? And E21 should be green also?
Update
When using the final version of the formula, to extract the Average, you can add the LABEL parameter to the QUERY statement to rename, or remove, the header label for that average. So in my example, the SELECT statement would become:
"select avg(Col3) where Col4 > 10 label avg(Col3) '' ",0)
or
"select avg(Col3) where Col4 > 10 label avg(Col3) 'New Label Name Here' ",0)
Update #2
I have provided a sample sheet, which has the enhancements you requested. The formula that calculates the result, the average, is in J3. The formula looks to a variable cell, I3, for the city name. I3 uses data validation, from a list in K2:L, to present the drop down list of city names to pick from.
The selection criteria are located in J6 and J7. If you had standard values you wanted to pick from here, maybe between 10 and 20, they could also be presented with a drop down list. But otherwise, just type in the desired limit values.
As an enhancement, I used conditional formatting to color the active cells in the data. Note that all matching rows will get colored, not just the latest ten. But the formula calculating the average should just be using the ten latest, THEN applying the criteria, before calculating the average. Test this carefully to be sure it is doing what you expect.
Note that the correct placement of the single and double quotes is very important when referencing criteria cells with the SELECT ... WHERE ... statements. Comparison to text values requires single quotes, whereas comparison to numeric values excludes the single quotes.
Valid QUERY Select statements for a numeric comparison:
"select * where A >= " & $B$5 & " limit 5 "
Valid QUERY Select statements for a text/string comparison:
"select * where A >= 'New York' limit 5 "
"select * where A >= '" & $B$5 & "' limit 5 "
<<== Do not have any spaces between the single and double quotes!
Invalid QUERY Select statements for a text/string comparison:
"select * where A >= "New York" limit 5 "
"select * where A >= ' " & $B$5 & " ' limit 5 "
<<== Do not have any spaces between the single and double quotes! This one is valid syntax, but matches " New York ", not "New York".
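The same quoting rule applies when a query string is assembled in code: numbers go in bare, text goes in single quotes with no padding spaces. A minimal Python sketch of that rule; the helper name is hypothetical and the query shape is copied from the examples above:

```python
def where_clause(column, value):
    """Build a QUERY-style select, quoting text and leaving numbers bare."""
    if isinstance(value, (int, float)):
        return f"select * where {column} >= {value} limit 5"
    # No spaces between the single quotes and the value itself
    return f"select * where {column} >= '{value}' limit 5"


numeric_query = where_clause("A", 10)
text_query = where_clause("A", "New York")
```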

IF statement in query

I have a table with two columns A and B the first is a tag and the second is an amount. I am trying to write a query with two columns, one summing up negative values while the other summing up positive ones.
Coming from SQL, I tried the following
=QUERY(A1:B100,
"SELECT A, SUM( B * IF(B>0, 0, 1) ),
SUM( B * IF(B<0, 0, 1) ) GROUP BY A ")
But it seems that the IF function is not supported in a query. I know I can create two intermediate columns in my sheet (one for positive value and one for negative ones), but I was wondering if it's possible to achieve what I want with a query or somehow without intermediate columns.
If you must use the query function, assuming your Tag Data is in Column A, and your Values in Column B:
=arrayformula(query({A1:A100,if(B1:B100>0,B1:B100,),if(B1:B100<0,B1:B100,)},"Select Col1, sum(Col2), sum(Col3) where Col1 <>'' group by Col1 label Col1 'Tag', sum(Col2) 'Positive', sum(Col3) 'Negative'"))
Here's the example output: https://docs.google.com/spreadsheets/d/1DW5CyPCC71CopW48uKy6basn-WP4hMfh7kuuJXT-C4o/edit#gid=1606239479
=arrayformula(query({a1:a100,if(b1:b100>0,b1:b100,),if(b1:b100<0,b1:b100,)},"Select Col1,sum(Col2),sum(Col3) group by Col1"))
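Both formulas work by splitting column B into a positives-only column and a negatives-only column before QUERY groups and sums them. A minimal Python sketch of the same grouping, assuming rows of (tag, amount); the function name and the sample data are hypothetical:

```python
from collections import defaultdict


def sums_by_tag(rows):
    """Return {tag: [sum_of_positives, sum_of_negatives]}."""
    totals = defaultdict(lambda: [0, 0])
    for tag, amount in rows:
        if tag == "":
            continue  # mirrors "where Col1 <> ''"
        entry = totals[tag]
        if amount > 0:
            entry[0] += amount
        elif amount < 0:
            entry[1] += amount
    return dict(totals)


# Hypothetical sample data
sums = sums_by_tag([("food", -5), ("food", 10), ("rent", -100), ("food", -2)])
```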
Please see this sheet for an example of using the FILTER function which is probably better than your query function for this use case:
https://docs.google.com/spreadsheets/d/1DW5CyPCC71CopW48uKy6basn-WP4hMfh7kuuJXT-C4o/edit?usp=sharing
I didn't know what you meant by tag, but I just created a list of random words, 10 negative, and 10 positive
With Tags in Column A and Numbers in Column B. Then in Column D I put this for the "positive" filter:
=filter($A$2:$A,$B$2:$B>0)
And for the Positive Sum:
=sum(filter($B$3:$B,$B$3:$B>0))
And in Column E for the Negative filter:
=filter($A$2:$A,$B$2:$B<0)
And for the Negative Sum:
=sum(filter($B$3:$B,$B$3:$B<0))
EDIT: I added another sheet in the workbook that shows you how to list the sum next to each tag in a filtered list of the tags:
On this sheet, I created examples of how to list the total sums of each particular tag: https://docs.google.com/spreadsheets/d/1DW5CyPCC71CopW48uKy6basn-WP4hMfh7kuuJXT-C4o/edit#gid=1784614303
This formula will look at the list of tags/values in Columns A & B, and then match and sum all tags that are in the cell to the left in Column D:
=sum(filter($B$3:$B,$B$3:$B>0,$A$3:$A=D3))

Resources