I have a Google spreadsheet where I am trying to count the number of rows where a certain value is present in at least one column. The number of columns with data varies by row.
Take the following sheet as an example:
https://docs.google.com/spreadsheets/d/1yUnYBsmjKIOF_PubYQ6G41fmIPcw_7hzkiQ6qMIoN64/edit?usp=sharing
Each row represents a task, and the data for who worked on the project is added by adding additional columns.
I would like to count how many tasks each person has worked on at least once. (If Person A worked on Task A multiple times, it would only count as 1.)
I've tried using formulas such as COUNTIFS or COUNTUNIQUEIFS, but am being thrown off by the fact that the number of columns can vary.
Any ideas of how I can accomplish this?
See if this helps
=countif(ArrayFormula(mmult(N(Sheet1!E2:100=A1), transpose(column(Sheet1!E2:2)^0))), ">0")
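To make the idea behind the MMULT approach easier to follow, here is a minimal Python sketch of the same logic: count the matches in each row, then count the rows where the name appears at least once. The sample rows and the helper name count_rows_with are made up for illustration and are not taken from the linked sheet.

```python
# Hypothetical data: each inner list is one task row; None marks an empty cell.
rows = [
    ["Chris", "John", "Chris"],
    ["Ryan", None, None],
    ["John", "Chris", None],
]

def count_rows_with(name, rows):
    # Per-row match count (the role MMULT plays in the formula).
    per_row = [sum(1 for cell in row if cell == name) for row in rows]
    # A row counts once, however many times the name appears in it.
    return sum(1 for n in per_row if n > 0)

print(count_rows_with("Chris", rows))
```

Chris appears twice in the first row, but that row still contributes only one task to his total.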
See the added sheet "Erik Help," cell A1, for the following array formula:
=ArrayFormula(QUERY({{"Employee";UNIQUE(QUERY({Sheet1!E2:E;Sheet1!H2:H;Sheet1!K2:K;Sheet1!N2:N;Sheet1!Q2:Q;Sheet1!T2:T;Sheet1!W2:W;Sheet1!Z2:Z;Sheet1!AC2:AC;Sheet1!AF2:AF},"Select * Where Col1 Is Not Null",0))},{"Projects";COUNTIF({Sheet1!E2:E&Sheet1!H2:H&Sheet1!K2:K&Sheet1!N2:N&Sheet1!Q2:Q&Sheet1!T2:T&Sheet1!W2:W&Sheet1!Z2:Z&Sheet1!AC2:AC&Sheet1!AF2:AF},"*"&UNIQUE(QUERY({Sheet1!E2:E;Sheet1!H2:H;Sheet1!K2:K;Sheet1!N2:N;Sheet1!Q2:Q;Sheet1!T2:T;Sheet1!W2:W;Sheet1!Z2:Z;Sheet1!AC2:AC;Sheet1!AF2:AF},"Select * Where Col1 Is Not Null",0))&"*")}},"Select * Order By Col2 Desc"))
I will break it down somewhat here, and then encourage you to further dissect it if deeper learning is required.
Obviously, it's an array formula. In this case, that means that one formula is producing the entire report.
The outermost QUERY is just putting the results inside the double curly brackets {{ }} in order by project count: QUERY( {{...}} ,"Select * Order By Col2 Desc")
You see those double curly brackets. But really, it's an outer set of curly brackets containing two more sets of curly brackets: { {...},{...} } The inner two arrays create the first column and second column of the report, respectively. The comma means to place them side by side. You will notice that the first element of each of those inner arrays produces the header for the respective columns (i.e., "Employee" and "Projects").
The semicolon following each of these headers means to place what follows underneath rather than side by side. You will notice a lot of those semicolons in the first inner array. Because you said that there would never be any more than 10 people working on any one project, we can predetermine all columns that might hold names and virtually "stack" them with those semicolons, forming one long virtual column. Of course, many of those columns will be empty, because many projects won't have a full ten names of people attributed to them. So this virtual column of names is wrapped in its own QUERY that will weed out nulls:
QUERY({Sheet1!E2:E;Sheet1!H2:H;Sheet1!K2:K;Sheet1!N2:N;Sheet1!Q2:Q;Sheet1!T2:T;Sheet1!W2:W;Sheet1!Z2:Z;Sheet1!AC2:AC;Sheet1!AF2:AF},"Select * Where Col1 Is Not Null",0)
To this, I applied UNIQUE, which provides the first-column list of unique names (rather than every time a name appears):
UNIQUE(QUERY({Sheet1!E2:E;Sheet1!H2:H;Sheet1!K2:K;Sheet1!N2:N;Sheet1!Q2:Q;Sheet1!T2:T;Sheet1!W2:W;Sheet1!Z2:Z;Sheet1!AC2:AC;Sheet1!AF2:AF},"Select * Where Col1 Is Not Null",0))
So the complete first inner virtual array (which forms the complete first column of the final report) looks like this:
{"Employee";UNIQUE(QUERY({Sheet1!E2:E;Sheet1!H2:H;Sheet1!K2:K;Sheet1!N2:N;Sheet1!Q2:Q;Sheet1!T2:T;Sheet1!W2:W;Sheet1!Z2:Z;Sheet1!AC2:AC;Sheet1!AF2:AF},"Select * Where Col1 Is Not Null",0))}
The second inner virtual array uses the ampersand to join all names assigned to each project into one long string. So, for instance, if Chris, John and Ryan all worked on a project (and some of them multiple times), that row's (unseen) concatenation might look like this: ChrisChrisJohnChrisRyanChris.
We run a COUNTIF on the virtual array made up of such concatenations, and the condition you'll see is made up largely of the entire UNIQUE clause from the first inner virtual array (which is the shortlist of all names possible). You will notice that this is wrapped front and back by asterisks like this: "*"&UNIQUE(...)&"*" Asterisks are wildcards for any number of characters. So essentially this will search those long concatenated strings for the appearance of each name anywhere; and as soon as a name is found, COUNTIF registers it as TRUE... once (not each time it appears in the string).
So that second inner virtual array looks like this in isolation:
{"Projects";COUNTIF({Sheet1!E2:E&Sheet1!H2:H&Sheet1!K2:K&Sheet1!N2:N&Sheet1!Q2:Q&Sheet1!T2:T&Sheet1!W2:W&Sheet1!Z2:Z&Sheet1!AC2:AC&Sheet1!AF2:AF},"*"&UNIQUE(QUERY({Sheet1!E2:E;Sheet1!H2:H;Sheet1!K2:K;Sheet1!N2:N;Sheet1!Q2:Q;Sheet1!T2:T;Sheet1!W2:W;Sheet1!Z2:Z;Sheet1!AC2:AC;Sheet1!AF2:AF},"Select * Where Col1 Is Not Null",0))&"*")}
Without that outermost QUERY I mentioned up front here, you'd still get accurate results; they'd just be displayed in whatever order the UNIQUE names list happened to appear in the projects. I felt it would make more sense to order them from whoever had the most projects to the least.
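For anyone who finds the nested arrays hard to read, the report logic can be sketched in Python as follows. This is only an illustration with made-up rows and a hypothetical helper name, and it swaps the wildcard search for exact name matching: a wildcard such as *Chris* would also match a longer name like Christine, which exact matching avoids.

```python
from collections import Counter

# Hypothetical data: one list per project; "" marks an unused name column.
rows = [
    ["Chris", "Chris", "John", "Chris", "Ryan", "Chris"],
    ["John", "", ""],
    ["Ryan", "John", ""],
]

def projects_per_person(rows):
    counts = Counter()
    for row in rows:
        # A set collapses repeats, so each person counts once per project.
        counts.update({cell for cell in row if cell})
    # Most projects first, like "Order By Col2 Desc"; ties broken by name.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

for employee, projects in projects_per_person(rows):
    print(employee, projects)
```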
I have a table like this one (basically it's data from a Google Form with multiple-choice answers in columns A and B and non-multiple-choice data in column C). I need a separate row for each multiple-choice answer.
Column A | Column B | Email
A,B      | XX,YY    | 1#gmail.com
A,C      | FF,DD    | 2#gmail.com
I tried to un-nest the first column and keep the remaining columns.
I tried several approaches I found with flatten and split with array formulas but I don't know where to start really.
Any help or hint would be much appreciated!
You can use the split function on the column A and after that, use the index function. Considering the table, you can use:
=index(split(A2,","),1,1)
The split function separates the text using the indicated delimiter, returning an array with 1 row and 2 columns; the index function returns the first row and the first column of this array. To return the second element from column A, just change it to
=index(split(A2,","),1,2)
I think there's no easy solution for this. You're asking for as many combinations of elements as multiple-choice selections have been made. Any function in Google Sheets has its potential and its limitations regarding how many elements it can express. One very useful function here is REDUCE. With REDUCE, and with the comma-separated elements counted using COUNTA, you can establish this formula:
=QUERY(REDUCE({"Col A","Col B","Email"},SEQUENCE(COUNTA(A2:A)),LAMBDA(z,c,{z;LAMBDA(ax,bx,
REDUCE({"","",""},SEQUENCE(ax),LAMBDA(w,a,
{w;
REDUCE({"","",""},SEQUENCE(bx),LAMBDA(y,b,
{y;INDEX(SPLIT(INDEX(A2:A,c),","),,a),INDEX(SPLIT(INDEX(B2:B,c),","),,b),INDEX(C2:C,c)}
))})))
(COUNTA(SPLIT(INDEX(A2:A,c),",")),COUNTA(SPLIT(INDEX(B2:B,c),",")))})),
"Where Col1 is not null",1)
Since I had to use an "initial value" in every REDUCE, I then used QUERY to filter out the empty values.
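What the nested REDUCE calls build is, in effect, every pairing of a Column A choice with a Column B choice for each response row. A short Python sketch of that cross product, using the sample rows from the question (the function name unnest is just an illustrative choice):

```python
from itertools import product

# The two sample rows from the question: (Column A, Column B, Email).
rows = [
    ("A,B", "XX,YY", "1#gmail.com"),
    ("A,C", "FF,DD", "2#gmail.com"),
]

def unnest(rows):
    out = []
    for col_a, col_b, email in rows:
        # Pair every Column A choice with every Column B choice.
        for a, b in product(col_a.split(","), col_b.split(",")):
            out.append((a, b, email))
    return out

for line in unnest(rows):
    print(line)
```

No placeholder rows are needed here, which is why there is no counterpart to the final QUERY filter.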
A spreadsheet contains multiple rows and columns with names (in varying order), the same name can appear in multiple places, but not necessarily in the same column or row.
Looking to list all names and count the number of times each name appears (no duplicates).
Tried the UNIQUE in combination with COUNTIF, but I can't seem to make them work together. :(
I'm sure there's some way of nesting formulas to tabulate the results, but I just can't wrap my head around it.
You can select your whole range in a query like this (change A2:F with your desired range)
=QUERY(FLATTEN(A2:F),"SELECT Col1,COUNT(Col1) where Col1 is not null group by Col1")
See this answer on how to stack unique counts when values are in multiple columns.
For example if your data is in A1:D10:
=UNIQUE({A1:A10;B1:B10;C1:C10;D1:D10})
Will return a (vertical) list of all unique values. Then use countif in a new column on the whole range (rows, columns) with condition on each of the unique values.
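Both suggestions boil down to the same two steps: flatten the grid into one list, then count each distinct value. A minimal Python sketch of that logic, with an invented grid and a hypothetical helper name:

```python
from collections import Counter

# Hypothetical grid of names; "" marks an empty cell.
grid = [
    ["Ann", "Bob", ""],
    ["Bob", "Cy", "Ann"],
    ["Ann", "", ""],
]

def name_counts(grid):
    # Flatten rows and columns into one list, skipping blanks (like FLATTEN + "is not null").
    flat = [cell for row in grid for cell in row if cell != ""]
    # Count each distinct name (like UNIQUE + COUNTIF, or GROUP BY + COUNT).
    return sorted(Counter(flat).items())

print(name_counts(grid))
```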
I have a Google Form that collects a bunch of data from dropdown questions on a Sheet with each question going to one column (as normal). On separate sheets, I want to be able to count how many times each option is selected.
Here is an example of what the response sheet might look like. A, B, and C are all questions.
I would then have separate sheets for 'Person?', 'Place?', and 'Thing?'. The 'Person?' sheet would look something like this:
I want to be able to add in the count of each time the option appears for that question. In the example, notice that 'Napoleon' is in both Col A and Col C. If I just count the number of times 'Napoleon' appears, I will get '2' even though he only appears once in the "Person?" responses.
I originally used a QUERY function like =QUERY('Input Data'!1:1000, "select count(A) where A contains '"&$A2&"'",0). BUT, I need it to be dynamic. So the "Person?" question may not always be Col A. I want the Query (or whatever formula) to search the headers and only return the count of that option for that question even if the column location changes.
Okay, I figured it out! In case someone else is curious, I used this formula:
=QUERY({'Input Data'!A1:L}, "SELECT COUNT(Col"&MATCH("Person?", 'Input Data'!1:1,0)&") WHERE Col"&MATCH("Person?", 'Input Data'!1:1,0)&" CONTAINS '"&$A2&"' label COUNT(Col"&MATCH("Person?", 'Input Data'!1:1,0)&") ''",0)
Lee, I sent you a PM about your most recent post, but in the process, I came across this one. There is no need for multiple formulas or manual entry references. One formula can produce the entire report with headers, listing and counts:
=IFERROR(QUERY(FILTER(FILTER(A:L,A:A<>""),A1:L1="Person?"),"Select Col1, COUNT(Col1) GROUP BY Col1 ORDER BY Col1 LABEL COUNT(Col1) 'Count'",1),"No Matches")
Just fill in the header you're looking for between the quotes where Person? is now.
The double FILTERs mean "Start with only rows where Col A is not null and Row 1 reads 'Person?'"
Then QUERY simply returns the unique names in the left column and their counts in the right column. Because the QUERY had a final parameter of 1, any existing header will be kept (in this case, the one you were searching for); and the created column will receive a header (i.e., LABEL) of Count.
IFERROR will give a friendly error message if no matches are found (in which case check that what you entered for the search in the formula exactly matches a column header in the range).
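The same idea (find the column by its header rather than by its position, then group and count) can be sketched in Python like this. The table contents and the function name are assumptions for illustration, and the sketch simply skips blank answers.

```python
from collections import Counter

# Hypothetical response table: the first row holds the question headers.
table = [
    ["Person?", "Place?", "Thing?"],
    ["Napoleon", "Paris", "Hat"],
    ["Cleopatra", "Cairo", "Napoleon"],
    ["Napoleon", "Rome", "Coin"],
]

def count_for_header(table, header):
    headers = table[0]
    if header not in headers:
        return "No Matches"  # mirrors the IFERROR fallback
    col = headers.index(header)  # locate the column by name, not position
    # Keep rows whose first column is filled in, and skip blank answers.
    values = [row[col] for row in table[1:] if row[0] != "" and row[col] != ""]
    return sorted(Counter(values).items())  # alphabetical, like ORDER BY Col1

print(count_for_header(table, "Person?"))
```

Because the column is looked up by name, moving "Person?" to a different column does not change the result.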
Here I have a sheet in which we can find the sum for different categories using the QUERY function by selecting from a drop-down list. But I can select only one month at a time. Can I find the amounts for January and February at the same time, by adding another column for another month or in some other way? I want to find the sales of two or three months at a time.
Please help
https://docs.google.com/spreadsheets/d/1jdtrtdNQBsxiZt8FjvbaE9omCBs8x8vRgp0r2bc1_7c/edit#gid=0
There's no way you can make a drop down list with multiple choice in Google Sheets.
But there are some alternatives.
List of tick boxes (here as list of months)
Manual input of multiple values separated by comma or something else.
I give both:
The selected months are joined into a list separated by | so that it can be used as a regex inside the 'matches' clause of the query.
This generates the list of months:
=join("|",query({A2:B7;C2:D7},"select Col2 where Col1 = true "))
The cell with manual input works in a similar way:
=substitute(substitute(F3,", ",","),",","|")
It takes the contents of the cell, removes the space that follows each comma, and replaces each comma with the separator |. The match is case sensitive, and I don't know how to get around this: (?i) does not work within query.
All together, the combined formula is:
=query(ORDERS!A1:R14,"select A, B, C , D where
A matches '"&join("|",query({A2:B7;C2:D7},"select Col2 where Col1 = true "))&"' and
B matches '"&substitute(substitute(F3,", ",","),",","|")&"'",0)
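As a rough illustration of what the combined formula does, here is a Python sketch that builds the two |-separated patterns and applies them as whole-value regex matches, which is how the query 'matches' clause behaves. The order data, tick-box list and function names are invented for the example.

```python
import re

# Hypothetical tick-box list: (ticked?, month label).
checkboxes = [(True, "January"), (False, "February"), (True, "March")]

# Hypothetical orders: (month, category, amount).
orders = [
    ("January", "Books", 10),
    ("February", "Toys", 5),
    ("March", "Toys", 7),
    ("March", "Games", 3),
]

def month_pattern(checkboxes):
    # Join the ticked labels with | (like JOIN over the inner QUERY).
    return "|".join(label for ticked, label in checkboxes if ticked)

def manual_pattern(text):
    # Drop the space after each comma, then turn commas into |.
    return text.replace(", ", ",").replace(",", "|")

def select(orders, month_re, category_re):
    # fullmatch: the pattern must match the whole cell value.
    return [o for o in orders
            if re.fullmatch(month_re, o[0]) and re.fullmatch(category_re, o[1])]

print(select(orders, month_pattern(checkboxes), manual_pattern("Books, Toys")))
```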
Here is my solution:
https://docs.google.com/spreadsheets/d/1fQ5_VdxZ-t4MqPbLqzb8q-saqp5Jqz3hVXeWHX_Lls4/copy
I copied your spreadsheet to do my testing. Here's what you can do.
Add another row of the same exact selection found on your "A" row.
Change your formula to this: ={query(ORDERS!A1:R,"Select * where A contains '"&$A2&"' and B contains '"&$B2&"'",1);query(ORDERS!A1:R,"Select * where A contains '"&$A3&"' and B contains '"&$B3&"'",0)}
What this does is run an array of two formulas (in this case, two queries) over the same set of data.
I have a data set that looks like this: starting on A1 with "1"
1 a
2 b
3 c
4 d
Column A is an arrayformula =arrayformula(row(b1:b))
Column B is manual input
I want to query the data and find the row of an item by matching on column B, so I have a formula like this:
=query(A1:B,"select A where B like '%c%'")
this should give me "3"
My question:
Is there a way to pull the numbers 1-4 into the query itself, with something like arrayformula row(b1:b)? I don't want to waste an extra column on column A.
So basically I want just the manual input, and when I query, it gives me the row number.
No script code please.
I've tried a few things and it didn't work.
Looking for a solution that starts with
=query()
You can also use a formula to pull in more than one row in the dataset which matches the condition, if this is important to you:
=arrayformula(filter(row(B:B); B:B="c"))
And you can have wildcard type operators, under certain circumstances (you are going to match text or items that can look like text (so numbers can be treated as text - but boolean will need more steps); that the dataset is not huge), using regular expressions. e.g.
=arrayformula(filter(row(B:B); regexmatch(B:B, "(c|d)")))
You could also use standard spreadsheet wildcard operators, e.g.
=arrayformula(filter(row(B:B); countif(B:B, "*c*")))
Explanation: the filter is true when countif is greater than zero, i.e. when it sees a value containing the letter c, because spreadsheets treat a value greater than zero as boolean true. For each row where there is a countif match there is a filter match, so that row number is displayed. The regexmatch version works the same way, returning true when either c or d matches.
Personally, I wanted to learn regex a bit, so I would go towards the regexmatch option. But that is your choice.
You can also, of course, create the match outside of the cell. This makes it easy to create a list of matches that you want to satisfy elsewhere on the sheet. So you could have a column of words or parts of words, from Z2 downwards, and then join them together in cell Z1 for example like this
="("&join("|",filter(Z2:Z50,len(Z2:Z50)))&")"
Then your filter function would look like this:
=arrayformula(filter(row(B:B), regexmatch(B:B, Z1)))
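In case the filter-plus-regex idea is easier to see outside a spreadsheet, here is a small Python sketch of the same steps: build an alternation pattern from a list of terms, then return the 1-based row numbers whose value contains a match. The sample values come from the question; the helper names are made up.

```python
import re

# Column B from the question, rows 1 to 4.
column_b = ["a", "b", "c", "d"]

def build_pattern(terms):
    # Join the non-empty terms into "(x|y|...)", like the JOIN/FILTER formula.
    return "(" + "|".join(t for t in terms if t) + ")"

def matching_rows(values, pattern):
    # re.search finds the pattern anywhere in the value (partial match).
    # enumerate(start=1) yields spreadsheet-style 1-based row numbers.
    return [i for i, v in enumerate(values, start=1) if re.search(pattern, v)]

pattern = build_pattern(["c", "", "d"])
print(pattern, matching_rows(column_b, pattern))
```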
If you want to use like operator in the query function, you can try something like this:
=arrayformula(query(if({1,0}, B:B,row(B:B)),"select Col2 where Col1 like '%c%' "))
You can also use the regular expressions in the query function, for example:
=arrayformula(query(if({1,0}, B:B,row(B:B)),"select Col2 where Col1 matches '(.*c.*|.*d.*)' "))
I'm not entirely clear on the question, but as I understand it, you want to be able to enter a formula, and have it return the row number of the matched item in a range? I'm not sure where array formulas come in.
If I've understood your question correctly, this should do the trick:
=MATCH("C",B1:B,0)
In your example, this returns 3.
Please forgive me if I've misunderstood your question.
Note: If there are multiple matches, this will return the row number for the first instance of your search.
=QUERY({A1:A,ARRAYFORMULA(ROW(A1:A))},"SELECT Col2 WHERE Col1 LIKE '%c%'")