I have a Google Sheet in which several columns share the same name, with different values under each column. I want to count how many times a given value appears under columns with the same name.
1  2  3  1  2
R  B  C  R  D
D  C  R  B  D
For example, I would like to get the number of times "R" appears under the columns named "1", so here I would expect a count of 2 for "R".
Here is a link to Google Sheet with actual data.
I have tried countif and countifs in Google Sheets, but can't figure out how to get the count right based on column name.
Try this formula. It outputs an array showing how many times each letter appears under each column name:
=LAMBDA(NUMBERS,LETTERS,
LAMBDA(UNUM,ULET,
{
{"",TRANSPOSE(UNUM)};
{ULET,
MAKEARRAY(COUNTA(ULET),COUNTA(UNUM),LAMBDA(ROW,COL,
COUNTIF(FILTER(LETTERS,NUMBERS=INDEX(UNUM,COL)),INDEX(ULET,ROW))
))
}
}
)(UNIQUE(FLATTEN(NUMBERS)),UNIQUE(FLATTEN(LETTERS)))
)($A$1:$AE$1,$A$2:$AE$18)
Assume that your sample data range is A1:AE18.
1. Apply UNIQUE() and FLATTEN() to A1:AE1 to get the unique column names.
2. Apply UNIQUE() and FLATTEN() to A2:AE18 to get the unique data values.
3. Use LAMBDA() to name the data ranges and the outputs of steps 1 and 2:
NUMBERS (=A1:AE1),
LETTERS (=A2:AE18),
UNUM (=UNIQUE(FLATTEN(NUMBERS))),
ULET (=UNIQUE(FLATTEN(LETTERS))).
4. Build the output array with {}, in which...
the 1st row is a blank followed by TRANSPOSE(UNUM),
the 1st column is a blank followed by ULET.
5. Inside that frame, use MAKEARRAY() to produce the results.
6. MAKEARRAY() builds an array from a given number of rows and columns, for which we use...
COUNTA(ULET) as the number of rows and
COUNTA(UNUM) as the number of columns.
7. Inside MAKEARRAY() you also need a LAMBDA() that says what to compute for each cell of the new array; each cell is addressed by its ROW and COL index.
8. In our case the row and column counts of the new array come from ULET and UNUM. Therefore the index of each cell in the new array equals the index of the corresponding value in ULET and UNUM. We can then use those indexes with COUNTIF() and FILTER() to count how many times each letter appears under each column name.
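For readers who find the nested LAMBDA hard to follow, here is a rough Python sketch of the same counting logic. It is not part of the answer: the function name count_by_column_name is made up for illustration, and the sample grid is the small table from the question.

```python
def count_by_column_name(headers, rows):
    """Return {value: {header: count}} for a grid whose header names repeat."""
    # UNUM / ULET: unique header names and unique values, in first-seen order
    unique_headers = list(dict.fromkeys(headers))
    unique_values = list(dict.fromkeys(v for row in rows for v in row if v != ""))
    table = {}
    for value in unique_values:        # one output row per distinct value
        table[value] = {}
        for name in unique_headers:    # one output column per distinct header
            # FILTER: keep only the cells whose column header equals `name`
            cells = [row[i] for row in rows
                     for i, h in enumerate(headers) if h == name]
            # COUNTIF: count how often `value` occurs among those cells
            table[value][name] = cells.count(value)
    return table

headers = ["1", "2", "3", "1", "2"]
rows = [["R", "B", "C", "R", "D"],
        ["D", "C", "R", "B", "D"]]
result = count_by_column_name(headers, rows)
print(result["R"]["1"])  # 2, the count expected in the question
```

The two nested loops play the role of MAKEARRAY's ROW and COL indexes.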
You can try this:
= ARRAYFORMULA(
query(
query(
SPLIT(TRANSPOSE(SPLIT(
QUERY(
TRANSPOSE(
QUERY(
TRANSPOSE(
IF(Original!A2:AE18<>"",
"😊"&Original!A1:AE1&"♥"&Original!A2:AE18, )
),,999^99)
),,999^99),
"😊")),
"♥"),
"Select Col1,Col2,count(Col2) group by Col1,Col2"),
"Select max(Col2),Col3 group by Col2,Col3 pivot Col1")
)
Note: inspired by player0's useful answers.
Output:
We can read from the table that R appears 40 times under the columns named '1', 24 times under the columns named '2', etc.
The following is a more compact approach than the previous answers:
=arrayformula(query(split(flatten(A1:AE1&"|"&A2:AE18),"|"),"select Col2,count(Col2) group by Col2 pivot Col1"))
N.B. I'm assuming the order of the grouped values in each column is irrelevant, so the QUERY default of lexicographical ordering is fine.
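The flatten-then-pivot idea may be easier to see outside Sheets. Below is a minimal Python sketch, assuming blank cells are skipped; pivot_counts is a hypothetical name and the sample grid is the small table from the question.

```python
from collections import Counter

def pivot_counts(headers, rows):
    # FLATTEN(header & "|" & value): one (header, value) pair per data cell
    pairs = [(h, v) for row in rows for h, v in zip(headers, row) if v != ""]
    counts = Counter(pairs)                  # count(...) group by value, header
    names = sorted({h for h, _ in pairs})    # pivot columns, sorted like QUERY's default
    values = sorted({v for _, v in pairs})   # output rows, sorted
    return {v: {h: counts[(h, v)] for h in names} for v in values}

grid_headers = ["1", "2", "3", "1", "2"]
grid_rows = [["R", "B", "C", "R", "D"],
             ["D", "C", "R", "B", "D"]]
pivot = pivot_counts(grid_headers, grid_rows)
print(pivot["R"])
```

Because every cell is paired with its own header before counting, repeated header names simply fall into the same pivot column.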
I have two columns in a Google Sheet that have values in them. Column A has all possible values (including duplicates) and Column B has some of the values in A (also including duplicates).
I want to find out which values in Column A do not appear in Column B, taking into account the number of occurrences of each value.
The MATCH function works well; however, I want three instances of HQR123 to appear in the Pending column, as there are 4 occurrences of this value in Column A versus only 1 occurrence in Column B. If another instance of HQR123 is entered in Column B, then only two instances should appear in the Pending column.
Is this possible?
Thanks and regards,
Shalin.
You could try something like
=sort(
index(
filter(A2:A,
isna(
match(A2:A&countifs(A2:A, A2:A, row(A2:A), "<="&row(A2:A)),
B2:B&countifs(B2:B, B2:B, row(B2:B), "<="&row(B2:B))
, 0)
))))
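The trick in this formula is to tag every value with its running occurrence number, so the 2nd HQR123 in column A can only be matched by a 2nd HQR123 in column B. A small Python sketch of that idea follows; the function names and every ID other than HQR123 are invented for the example.

```python
from collections import defaultdict

def occurrence_keys(values):
    """Pair each value with its running occurrence number (1st, 2nd, ...)."""
    seen = defaultdict(int)
    keys = []
    for v in values:
        seen[v] += 1                 # same role as COUNTIFS(..., ROW(), "<="&ROW())
        keys.append((v, seen[v]))
    return keys

def pending(all_values, done_values):
    done = set(occurrence_keys(done_values))
    # keep A entries whose (value, occurrence) key has no match in B, then sort
    return sorted(v for v, n in occurrence_keys(all_values) if (v, n) not in done)

col_a = ["HQR123", "ABC001", "HQR123", "HQR123", "XYZ999", "HQR123"]
col_b = ["HQR123", "XYZ999"]
print(pending(col_a, col_b))  # ['ABC001', 'HQR123', 'HQR123', 'HQR123']
```

Tuples are used for the keys here; the formula concatenates value and count into a single string instead.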
For each email ID, I want to get the latest 10 records by timestamp. How do I get the results with ARRAYFORMULA? Using the QUERY function is not important as long as I can achieve this with ARRAYFORMULA. Here is the sample data:
https://docs.google.com/spreadsheets/d/1YAHA02VM-5MXzVKhkxu_eODPKObpoz441mGX8lOFu5M/edit?usp=sharing
Try this on another sheet, row 1:
=arrayformula(query({query({Sheet1!$A:$C},"order by Col1 desc,Col2",1),{"Dupe position";countifs(query({Sheet1!$A2:$C},"select Col2 order by Col1 desc,Col2",0),query({Sheet1!$A2:$C},"select Col2 order by Col1 desc,Col2",0),row(Sheet1!$A2:$C),"<="&row(Sheet1!$A2:$C))}},"select Col1,Col2,Col3 where Col1 is not null and Col4 <= 10 order by Col1",1))
You can adjust the number of records found by adjusting Col4 <= 10, and also the final sort by altering order by Col1 at the end of the formula.
Explanation
This gets the data from Sheet1, sorts it by date desc then email asc:
query({Sheet1!$A:$C},"order by Col1 desc,Col2",1)
Then, to the side of this data, a COUNTIFS() gives each occurrence of an email in the list above a running number (since the list is sorted descending, 1 represents the most recent instance).
countifs(<EmailColumnData>,<EmailColumnData>,row(<EmailColumn>),"<="&row(<EmailColumn>))
In place of <EmailColumnData> in the COUNTIFS() is:
query({Sheet1!$A2:$C},"select Col2 order by Col1 desc,Col2",0)
In place of <EmailColumn> above, we only want the row number so we don't need the actual data. We can use:
Sheet1!$A2:$C
Various {} work as arrays to bring the data together.
E.g., {a,b,c;d,e,f} would result in three columns, with a, b, c in row 1 and d, e, f in row 2. A comma (,) starts a new column; a semicolon (;) starts a new row.
A final query around everything gets the 3 columns we need, where the count number in col 4 is <=10, then sorts the output by Col1 (date asc).
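As a cross-check of the steps above, here is a hedged Python sketch of the same pipeline: sort newest first, number each email's occurrences, keep the first N, then re-sort. The name latest_per_email, the integer timestamps and the example.com addresses are assumptions made for the illustration.

```python
from collections import defaultdict

def latest_per_email(records, limit=10):
    """records: list of (timestamp, email, score) tuples."""
    # "order by Col1 desc, Col2": newest first, ties broken by email ascending
    ordered = sorted(records, key=lambda r: r[1])
    ordered.sort(key=lambda r: r[0], reverse=True)  # stable, so email order survives ties
    seen = defaultdict(int)
    kept = []
    for record in ordered:
        seen[record[1]] += 1           # running count: 1 = most recent for this email
        if seen[record[1]] <= limit:   # the "Col4 <= 10" condition
            kept.append(record)
    return sorted(kept, key=lambda r: r[0])  # final "order by Col1"

sample = [(1, "a@example.com", 5), (2, "a@example.com", 6),
          (3, "a@example.com", 7), (1, "b@example.com", 1)]
print(latest_per_email(sample, limit=2))
```

With limit=2 the oldest a@example.com record is dropped, while the single b@example.com record is kept.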
On second thoughts, maybe this is a bit cheeky, but this might do it (taken from the conditional rank idea):
=ArrayFormula(filter(A2:C,countifs(A2:A,">="&A2:A,B2:B,B2:B)<=10,A2:A<>""))
EDIT
The above assumes (because the data is time-stamped) that duplicates shouldn't occur. If they do and the data is pre-sorted, you can use the row number as a proxy for the timestamp, as suggested by @Aresvik.
Alternatively, you could count separately
(a) only rows with a later timestamp
plus
(b) rows with the same time stamp but with earlier (or identical) row number
=ArrayFormula(filter(A2:C,countifs(A2:A,">"&A2:A,B2:B,B2:B)+countifs(A2:A,"="&A2:A,B2:B,B2:B,row(A2:A),"<="&row(A2:A))<=10,A2:A<>""))
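The two-part count, (a) plus (b), can be sketched in Python as follows. This is only an illustration of the ranking rule, with list position standing in for the sheet row number; keep_latest and the sample rows are hypothetical.

```python
def keep_latest(rows, limit=10):
    """rows: list of (timestamp, email, score); list order stands in for row order."""
    kept = []
    for i, (ts, email, _score) in enumerate(rows):
        # (a) rows for the same email with a strictly later timestamp
        later = sum(1 for t, e, _s in rows if e == email and t > ts)
        # (b) rows for the same email with the same timestamp, at or above this row
        tied = sum(1 for j, (t, e, _s) in enumerate(rows)
                   if e == email and t == ts and j <= i)
        if later + tied <= limit:      # rank within the email group
            kept.append(rows[i])
    return kept

tied_rows = [(5, "a", 1), (5, "a", 2), (5, "a", 3), (4, "a", 4), (9, "b", 5)]
print(keep_latest(tied_rows, limit=2))  # [(5, 'a', 1), (5, 'a', 2), (9, 'b', 5)]
```

Without part (b), all three rows sharing timestamp 5 would tie and the result would not be capped at the limit.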
I have added a new sheet ("Erik Help") with the following formula in A1:
=ArrayFormula({"Submitted Time","Email","Score";SORT(SPLIT(FLATTEN(QUERY(SORT(TRANSPOSE(SPLIT(TRANSPOSE(QUERY(IF(Sheet1!B2:B=TRANSPOSE(UNIQUE(FILTER(Sheet1!B2:B,Sheet1!B2:B<>""))),Sheet1!A2:A&"|"&Sheet1!B2:B&"|"&Sheet1!C2:C,),,COUNTA(Sheet1!A2:A)))," ",0,1)),SEQUENCE(MAX(COUNTIF(Sheet1!B2:B,Sheet1!B2:B))),0),"LIMIT 10")),"|",1,0),1,0)})
The number of records is set after LIMIT.
The order is set by the final two numbers: 1,0 (meaning "sort by column 1 in reverse order," which, as currently set, is sorting in reverse order by date/time).
In the attached google sheet.
Google Sheet - https://docs.google.com/spreadsheets/d/1KxqaI-GYWur0Knt_bShI0GURucqU_cawu_sCoG_8Xlc/edit?usp=sharing
I need to execute the following two operations on input.
If a user enters 1 in column K for an ID, append as many rows as that ID already has.
For example, if I enter 1 in column K for ID = ID_3, which has 5 rows, I want to append five rows below the last instance of ID_3. In the appended rows all values should be the same as for ID_3, except column C, which should now be ID_3_Append, and columns K and O, which should be blank.
If someone enters 1 in column M, we need to check which ID it belongs to, look for that ID in columns E, F, and G, and highlight in red the rows where that ID appears in columns E, F, or G.
If someone enters 1 in change type, I need to update columns B to I in the Test_1 sheet from Put_list, but keep the Mark Status unchanged for any ID common to the earlier Test_1 and Put_list (if it has a Mark Status). We also need to highlight the dependent rows accordingly.
Once we update column B to I we need to change the Name from 'Call' to 'Put'.
use:
=ARRAYFORMULA(QUERY({QUERY({Test_1!A:O;
IFERROR(IF(IFNA(VLOOKUP(Test_1!C2:C, SORT({Test_1!C2:C, Test_1!K2:K}, 2, 0), 2, 0))=1,
{Test_1!A2:A, Test_1!B2:B, Test_1!C2:C&"_Append", Test_1!D2:J, Test_1!X2:X*0,
Test_1!L2:L, Test_1!X2:Y*0, Test_1!O2:O},
{"","","","","","","","","","","","","","",""}),
{"","","","","","","","","","","","","","",""})},
"where Col1 is not null and not Col3 matches '"&
TEXTJOIN("|", 1, "×", UNIQUE(IF(IFNA(VLOOKUP(Test_1!C2:C,
SORT({Test_1!C2:C, Test_1!N2:N}, 2, 0), 2, 0))=1, Test_1!C2:C, )))&"' order by Col3", 1);
IF(LEN(TEXTJOIN("|", 1,
UNIQUE(IF(IFNA(VLOOKUP(Test_1!C2:C, SORT({Test_1!C2:C, Test_1!N2:N}, 2, 0), 2, 0))=1, Test_1!C2:C, ))))>0,
QUERY({IF(Put_list!A2:A="",,IFERROR(Put_list!A2:A*1, "Put")), Put_list!A2:H,
IFNA(VLOOKUP(Put_list!B2:B&"♦"&COUNTIFS(Put_list!B2:B, Put_list!B2:B, ROW(Put_list!B2:B), "<="&ROW(Put_list!B2:B)),
{Test_1!C2:C&"♦"&COUNTIFS(Test_1!C2:C, Test_1!C2:C, ROW(Test_1!C2:C), "<="&ROW(Test_1!C2:C)), Test_1!J2:O}, {2,3,4,5,6,7}, 0))},
"where Col3 matches '"&TEXTJOIN("|", 1,
UNIQUE(IF(IFNA(VLOOKUP(Test_1!C2:C, SORT({Test_1!C2:C, Test_1!N2:N}, 2, 0), 2, 0))=1, Test_1!C2:C, )),
UNIQUE(IF(IFNA(VLOOKUP(Test_1!C2:C, SORT({Test_1!C2:C, Test_1!N2:N}, 2, 0), 2, 0))=1, Test_1!C2:C&".+", )))&"'"),
{"","","","","","","","","","","","","","",""})}, "where Col1 is not null order by Col3", 1))
transcript:
we start with an array of the {C, K} columns, which we sort by the K column, so if the K column contains 1 it is moved up
this is convenient for VLOOKUP because it always returns the 1st matching value, which is exactly what we need when our array is sorted
so we VLOOKUP the C values against our sorted array and return column K for all rows with the same ID; 2 stands for the 2nd column of the sorted array and 0 stands for "exact match"
if no match is found VLOOKUP outputs an #N/A error, so we wrap it in IFNA; then, if no match is found, it outputs empty rows instead
then we put this into an IF statement: if our VLOOKUP outputs 1, we output our columns (again in {array} form)
if our VLOOKUP does not output 1, we output an array of 15 empty cells in a row {"","","", .....}; this is because we use arrays {} and all ranges in an array need to be the same size
our table has 15 columns, so if some error happened we would get the error in one single cell against our 15 columns
this way we avoid the "array_literal error", having {15 columns; 15 cells in a row, which is also 15 columns}
; (semicolon) puts these two arrays/ranges under each other, while , (comma) puts them next to each other
so if the VLOOKUP results in 1, we assemble our array from ranges: the A and B columns stay the same, then we append the phrase "_Append" to the C column with &
the D:J columns stay the same, then we force a column of zeros by multiplying an arbitrary empty column (X) by 0, etc.
then again we use {"","","", ....} within IFERROR to deal with any possible error at this point
next we put everything described above under our whole range Test_1!A:O and wrap it in a QUERY, where we filter out all rows where Col1 is empty; the 2nd condition of our query states
that Col3 of our array can't match our regex pattern, which we assemble with the TEXTJOIN formula
| stands for "or" in regex. 1 in TEXTJOIN means "ignore empty cells". then we add × (a unique symbol) in case TEXTJOIN would otherwise output an empty cell on its own, resulting in an error somewhere
again we perform the same VLOOKUP, the same IFNA and the same IF, but now we output only the C column and we are interested only in UNIQUE values
then within the query we sort (order by Col3) our table, and the 1 at the end stands for "header rows"
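The core of the first half of the formula (flag an ID if any of its rows has 1 in K, copy its rows with "_Append", then sort by ID) can be sketched in Python. This is a simplified illustration, not a translation of the whole formula: the dict keys "id" and "k" are hypothetical stand-ins for columns C and K, and the Put_list part is left out.

```python
def append_flagged(rows):
    """rows: list of dicts with 'id' and 'k' keys (stand-ins for columns C and K)."""
    # sorted VLOOKUP trick: an ID is flagged if ANY of its rows has k == 1
    flagged = {r["id"] for r in rows if r["k"] == 1}
    copies = [
        {**r, "id": r["id"] + "_Append", "k": 0}   # same values, new ID, K forced to 0
        for r in rows if r["id"] in flagged
    ]
    # "order by Col3": sorting by ID places each copied block after its originals
    return sorted(rows + copies, key=lambda r: r["id"])

table = [{"id": "ID_2", "k": ""}, {"id": "ID_3", "k": 1},
         {"id": "ID_3", "k": ""}, {"id": "ID_4", "k": ""}]
print([r["id"] for r in append_flagged(table)])
```

Here ID_3 has two rows and one of them carries the 1, so both rows are copied.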
This may answer the second part of your question.
Select the range of rows you want the red conditional formatting on, for example A2:A12 in your sample sheet.
Select Format - Conditional Formatting, verify the range, and apply a custom formula, as follows:
=OR(
IF($E2<>"NA",IFERROR(MATCH($E2,FILTER($C$1:$C$12,$M$1:$M$12=1),0)),0),
IF($F2<>"NA",IFERROR(MATCH($F2,FILTER($C$1:$C$12,$M$1:$M$12=1),0)),0),
IF($G2<>"NA",IFERROR(MATCH($G2,FILTER($C$1:$C$12,$M$1:$M$12=1),0)),0))
The FILTER pulls which IDs have a "1" in the update column, ColM.
MATCH looks to see if the ID in the Dependency column is in that filtered list.
The IFERROR sets the result to zero if no match.
And the initial IF filters out the NA values in the Dependency column.
This logic is replicated for each Dependency column, so three IFs.
And the whole thing is wrapped in an OR function, so if any of the Dependency columns have a matching ID, the whole row is flagged for red.
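The same rule, written as a short Python sketch for comparison. The dict keys "id", "update" and "deps" are invented stand-ins for column C, column M and the three Dependency columns E, F and G.

```python
def rows_to_highlight(rows):
    """Return one True/False per row: should the row be flagged red?"""
    # FILTER: IDs that have a 1 in the update column
    flagged = {r["id"] for r in rows if r["update"] == 1}
    return [
        # OR over the dependency columns: skip "NA", then check membership (MATCH)
        any(d != "NA" and d in flagged for d in r["deps"])
        for r in rows
    ]

sheet = [
    {"id": "ID_1", "update": 1,  "deps": ["NA", "NA", "NA"]},
    {"id": "ID_2", "update": "", "deps": ["ID_1", "NA", "NA"]},
    {"id": "ID_3", "update": "", "deps": ["ID_2", "NA", "NA"]},
]
print(rows_to_highlight(sheet))  # [False, True, False]
```

Only the row that depends on ID_1 is flagged; ID_1 itself is not, because the rule looks at the dependency columns rather than the row's own ID.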
I've applied this in a tab added to your sheet, Test_1-GK.
Let me know if this helps.
I'm afraid I didn't understand the first part of your question. I got confused with this bit:
I want to append that five rows below where the last instance of ID_3 and in appended rows all the values should be same as ID_3 except column C which will be now be ID_3_Append, column K and O which should be blank for appended rows.
I have a table with two columns, A and B; the first is a tag and the second is an amount. I am trying to write a query with two result columns, one summing the negative values and the other summing the positive ones.
Coming from SQL, I tried the following
=QUERY(A1:B100,
"SELECT A, SUM( B * IF(B>0, 0, 1) ),
SUM( B * IF(B<0, 0, 1) ) GROUP BY A ")
But it seems that the IF function is not supported in a query. I know I can create two intermediate columns in my sheet (one for positive value and one for negative ones), but I was wondering if it's possible to achieve what I want with a query or somehow without intermediate columns.
If you must use the query function, assuming your Tag Data is in Column A, and your Values in Column B:
=arrayformula(query({A1:A100,if(B1:B100>0,B1:B100,),if(B1:B100<0,B1:B100,)},"Select Col1, sum(Col2), sum(Col3) where Col1 <>'' group by Col1 label Col1 'Tag', sum(Col2) 'Positive', sum(Col3) 'Negative'"))
Here's the example output: https://docs.google.com/spreadsheets/d/1DW5CyPCC71CopW48uKy6basn-WP4hMfh7kuuJXT-C4o/edit#gid=1606239479
=arrayformula(query({a1:a100,if(b1:b100>0,b1:b100,),if(b1:b100<0,b1:b100,)},"Select Col1,sum(Col2),sum(Col3) group by Col1"))
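Both QUERY formulas above work the same way: they split the amount into a positive-only and a negative-only column before grouping. A minimal Python sketch of that idea, with a made-up function name and made-up sample data:

```python
from collections import defaultdict

def split_sums(rows):
    """rows: list of (tag, amount). Returns {tag: [positive_sum, negative_sum]}."""
    totals = defaultdict(lambda: [0, 0])
    for tag, amount in rows:
        entry = totals[tag]        # group by tag
        if amount > 0:
            entry[0] += amount     # IF(B>0, B, ) feeds the positive column
        elif amount < 0:
            entry[1] += amount     # IF(B<0, B, ) feeds the negative column
    return dict(totals)

ledger = [("food", -5), ("food", 10), ("rent", -100), ("food", -2)]
print(split_sums(ledger))  # {'food': [10, -7], 'rent': [0, -100]}
```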
Please see this sheet for an example of using the FILTER function which is probably better than your query function for this use case:
https://docs.google.com/spreadsheets/d/1DW5CyPCC71CopW48uKy6basn-WP4hMfh7kuuJXT-C4o/edit?usp=sharing
I didn't know what you meant by tag, but I just created a list of random words, 10 negative, and 10 positive
With Tags in Column A and Numbers in Column B. Then in Column D I put this for the "positive" filter:
=filter($A$2:$A,$B$2:$B>0)
And for the Positive Sum:
=sum(filter($B$3:$B,$B$3:$B>0))
And in Column E for the Negative filter:
=filter($A$2:$A,$B$2:$B<0)
And for the Negative Sum:
=sum(filter($B$3:$B,$B$3:$B<0))
EDIT: I added another sheet in the workbook that shows you how to list the sum next to each tag in a filtered list of the tags:
On this sheet, I created examples of how to list the total sums of each particular tag: https://docs.google.com/spreadsheets/d/1DW5CyPCC71CopW48uKy6basn-WP4hMfh7kuuJXT-C4o/edit#gid=1784614303
This formula will look at the list of tags/values in Columns A & B, and then match and sum all tags that are in the cell to the left in Column D:
=sum(filter($B$3:$B,$B$3:$B>0,$A$3:$A=D3))
I am trying to get the sum of all values in column B for rows that contain a certain value in column A. Pretty simple problem, I guess.
Here is my query:
=QUERY('Sheet1'!$A$16:D, "Select sum(D) Where C contains '"&C5&"' ", -1)
But what that gives me is the actual word "sum" in all the fields where C contains the value.
So I get a lot of "sum" in almost all my rows.
Did the "sum" statement change for queries in google spreadsheets?
It looks like you are using more than one query formula: apparently there is a column of query formulas, each referring to a cell such as C5. In this case there is no room for the column label "sum" that the formula wants to insert, because the output must fit in a single cell. Solution: change the column label to an empty string with label sum(D) ''.
=QUERY('Sheet1'!$A$16:D, "Select sum(D) Where C contains '"&C5&"' label sum(D) ''", -1)