https://docs.google.com/spreadsheets/d/1AQUNPb4d3EqeSZb9SwfzemR7MfSIkMFkDAZ8rPLA3rc/edit?usp=sharing
I have 3 columns:
City
Date start
End date
I want 3 tables:
Pivot table of cities with the people who entered during the year (Done)
=query(QUERY({$A$2:$C$10};
"select Col1, count(Col1)
where year(Col2)=2018 or year(Col2)=2019 or year(Col2)=2020
group by Col1
pivot year(Col2)");
"select * order by Col4 desc, Col3 desc, Col2 desc label Col1 'Start'";1)
Pivot table of cities with the people who left during the year (Done)
=query(QUERY({$A$2:$C$10};
"select Col1, count(Col1)
where (year(Col3)=2018 or year(Col3)=2019 or year(Col3)=2020)
group by Col1
pivot year(Col3)");
"select * order by Col4 desc, Col3 desc, Col2 desc label Col1 'End'";1)
Pivot table of cities with the people who stayed during the year (Fail)
=query(QUERY({$A$2:$C$10};
"select Col1, count(Col1)
where
(2018>=YEAR(Col2) and 2018<=YEAR(Col3) or
(2019>=YEAR(Col2) and 2019<=YEAR(Col3) or
(2020>=YEAR(Col2) and 2020<=YEAR(Col3)
group by Col1
pivot year(Col2)");
"select * order by Col4 desc, Col3 desc, Col2 desc label Col1 'Between'";1)
For the last one, I am having trouble.
I guess my WHERE condition is wrong and my pivot is not working either.
I know pivot year(Col2) can't work for the last one: if a row has a start date in 2015 and an end date in 2020, I want it to be counted for 2018, 2019 and 2020, but a pivot on the start year will never show those columns.
Any ideas?
Thanks for your time.
use:
=ARRAYFORMULA(QUERY(QUERY(UNIQUE(SPLIT(FLATTEN(IF(DAYS(C2:C10; B2:B10)>=
SEQUENCE(1; 5000; ); ROW(A2:A10)&"×"&A2:A10&"×"&TEXT(B2:B10+SEQUENCE(1; 5000; );
"yyyy-1-1"); )); "×"));
"select Col2,count(Col2)
where year(Col3) matches '2018|2019|2020'
group by Col2
pivot year(Col3)
label Col2'Between'");
"order by Col4 desc, Col3 desc, Col2 desc"))
update:
=ARRAYFORMULA(QUERY(QUERY(SPLIT(FLATTEN(IF(DATEDIF(B2:B; C2:C; "Y")>=
SEQUENCE(1; MAX(DATEDIF(B2:B; C2:C; "Y")); ); ROW(A2:A)&"×"&A2:A&"×"&
YEAR(B2:B)+SEQUENCE(1; MAX(DATEDIF(B2:B; C2:C; "Y")); )&"-1-1"; )); "×");
"select Col2,count(Col2)
where year(Col3) matches '2018|2019|2020'
group by Col2
pivot year(Col3)
label Col2'Between'");
"order by Col4 desc, Col3 desc, Col2 desc"))
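For readers who want to check the expected numbers outside of Sheets, the counting rule the question asks for (a row counts for every year between its start year and its end year, inclusive) can be sketched in Python. This is only an illustration of the intended logic, not a translation of the formulas above; the sample rows and the names `ROWS`, `YEARS` and `count_present` are hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical sample data: (city, start date, end date)
ROWS = [
    ("Paris", date(2015, 3, 1), date(2020, 6, 30)),
    ("Paris", date(2019, 1, 15), date(2019, 12, 1)),
    ("Lyon", date(2018, 5, 1), date(2019, 2, 1)),
]
YEARS = (2018, 2019, 2020)


def count_present(rows, years):
    """Count rows per (city, year) where start year <= year <= end year."""
    counts = Counter()
    for city, start, end in rows:
        for year in years:
            if start.year <= year <= end.year:
                counts[(city, year)] += 1
    return counts


present = count_present(ROWS, YEARS)
for city in sorted({row[0] for row in ROWS}):
    print(city, [present[(city, year)] for year in YEARS])
```

A row that starts in 2015 and ends in 2020 contributes to all three year columns, which is exactly what a pivot on the start year alone cannot do.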
I have a column where each row is a sentence. For example:
COLUMN1
R1: -Do you think they'll come, sir?
R2: -Oh they'll come, they'll come all right.
R3: Here. Stamp those and mail them.
R4: It's ringing.
R5: Would you walk Myron the other way?
From this range, I want to extract a list of unique words (COLUMN2), and a count of how often they appeared in the range (COLUMN3).
The trick is to remove punctuation marks such as commas, periods, etc.
So the desired result for the above would be:
COLUMN2 COLUMN3
Do 1
you 2
think 1
they'll 3
come 3
sir 1
Oh 1
all 1
right 1
Here 1
Stamp 1
those 1
and 1
mail 1
them 1
It's 1
ringing 1
Would 1
walk 1
Myron 1
the 1
other 1
way 1
I tried parsing each row with the SPLIT function, separating each word into their own cells, but I'm stuck removing the punctuation, and building the list of unique words (which I know will involve the UNIQUE function). The count I'm guessing will also involve the COUNTUNIQUE function.
Any guidance will be appreciated!
You could try something like
=query(ArrayFormula(transpose(split(query(regexreplace(A1:A5, "[^A-Za-z\s/']" ,""),,50000)," "))), "Select Col1, Count(Col1) where Col1 <>'' group by Col1 label Count(Col1)''")
Change range to suit.
If you want to exclude a list of words (e.g. in the range J1:J20), you can try
=ArrayFormula(query(transpose(split(query(regexreplace(A1:A5, "[^A-Za-z\s/']" ,""),,50000)," ")), "Select Col1, Count(Col1) where not UPPER(Col1) matches '\b"&textjoin("|", 1, UPPER(J1:J20))&"\b' group by Col1 order by Count(Col1) desc label Count(Col1)''"))
Alternatively, you can also add the list of exclusions to the regex pattern...
=query(ArrayFormula(transpose(split(query(regexreplace(A1:A5, "[^A-Za-z\s/']|\b((?i)the|oh|or|and)\b" ,""),,50000)," "))), "Select Col1, Count(Col1) where Col1 <>'' group by Col1 order by Count(Col1) desc label Count(Col1)''")
UPDATED:
=ArrayFormula(substitute(query(transpose(split(query(regexreplace(substitute(C11:C, char(39), "_"), "[^A-Za-z\s_]" ,""),,50000)," ")), "Select Col1, Count(Col1) where not UPPER(Col1) matches '\b"&textjoin("|", 1, UPPER(substitute(G11:G,char(39),"_")))&"\b' group by Col1 order by Count(Col1) desc label Count(Col1)''", 0), "_", char(39)))
or, using a different approach
=query(filter(regexreplace(transpose(split(query(regexreplace(C11:C, "[^A-Za-z\s'-]" ,""),,50000)," ")), "^-",), isna(match(upper(regexreplace(transpose(split(query(regexreplace(C11:C, "[^A-Za-z\s'-]" ,""),,50000)," ")), "^-",)), upper(filter(G11:G, len(G11:G))),0))), "Select Col1, count(Col1) group by Col1 order by count(Col1) desc label count(Col1)''", 0)
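The steps these formulas chain together (strip punctuation but keep apostrophes, split on whitespace, optionally drop excluded words, then count) can be sketched in Python to sanity-check the expected result. This is a rough sketch under assumptions: the character class is simplified compared with the formulas, the exclusion list is compared case-insensitively, and the names `SENTENCES` and `word_counts` are hypothetical.

```python
import re
from collections import Counter

# The sample sentences from the question, without the R1: to R5: row labels
SENTENCES = [
    "-Do you think they'll come, sir?",
    "-Oh they'll come, they'll come all right.",
    "Here. Stamp those and mail them.",
    "It's ringing.",
    "Would you walk Myron the other way?",
]


def word_counts(sentences, exclude=()):
    """Count words after removing everything except letters, whitespace and apostrophes."""
    excluded = {word.upper() for word in exclude}
    counts = Counter()
    for sentence in sentences:
        cleaned = re.sub(r"[^A-Za-z\s']", "", sentence)
        for word in cleaned.split():
            if word.upper() not in excluded:
                counts[word] += 1
    return counts


counts = word_counts(SENTENCES)
filtered = word_counts(SENTENCES, exclude=["the", "oh", "and"])
print(counts.most_common(3))
```

Like the first formula, the count is case-sensitive, so "Do" and "do" would be treated as different words unless the text is lowercased first.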
You can use MID, REGEXREPLACE, QUERY, SPLIT, etc., like this:
=query(
  transpose(
    split(
      regexreplace(textjoin(" ", true, filter(mid(A11:A, 4, len(A11:A)), A11:A<>"")), "[>,.?/!-]", " "),
      " ", true, true
    )
  ),
  "Select Col1, Count(Col1) group by Col1 label Col1 'Column2', Count(Col1) 'Column3' "
)
Or, if the cells do not have the R1: to R5: prefix, use this:
=query(
  transpose(
    split(
      regexreplace(textjoin(" ", true, filter(A11:A, A11:A<>"")), "[>,.?/!-]", " "),
      " ", true, true
    )
  ),
  "Select Col1, Count(Col1) group by Col1 label Col1 'Column2', Count(Col1) 'Column3' "
)
try:
=ARRAYFORMULA(QUERY(TRANSPOSE(SPLIT(REGEXREPLACE(
TEXTJOIN(" ", 1, LOWER(A:A)), "\.|\,|\?", ), " ")),
"select Col1,count(Col1)
group by Col1
order by count(Col1) desc
label count(Col1)''", 0))
or:
=ARRAYFORMULA(QUERY(TRANSPOSE(SPLIT(REGEXREPLACE(
QUERY(LOWER(A:A),,999^99), "[^a-z0-9а-я ]", ), " ")),
"select Col1,count(Col1)
group by Col1
order by count(Col1) desc
label count(Col1)''", 0))
UPDATE:
=ARRAYFORMULA(QUERY(TRANSPOSE(SPLIT(REGEXREPLACE(
QUERY(LOWER(A:A),,999^99), "[^a-z0-9 ]", ), " ")),
"select Col1,count(Col1)
where not Col1 matches 'the|and|i|you|its'
group by Col1
order by count(Col1) desc
label count(Col1)''", 0))
In Google Sheets, it's possible to sort by an "inner value", for example:
Is it also possible to do this in the "Filter" (more accurately, this would be a "having" clause), or is this option only available for sorting?
Here's a copy of an example sheet where I have the data sorted, and I'd also like to do a similar filter: https://docs.google.com/spreadsheets/d/1nfdh93lFcTHQQYB79mXGtz9huvJhvSfqTAw8GMMuEFU/edit#gid=0
=QUERY(QUERY(A:D,
"select A,sum(C) where C >= 2 group by A pivot B"),
"where Col2 is not null")
To mimic the Grand Totals from the pivot table:
=ARRAYFORMULA({{QUERY(QUERY(A:D,
"select A,sum(C) where C >= 2 group by A pivot B"),
"where Col2 is not null"),{"Grand Total";
MMULT(QUERY(QUERY(QUERY(A:D,
"select sum(C) where C >= 2 group by A pivot B"),
"where Col1 is not null"), "offset 1", 0),
ROW(INDIRECT("A1:A"&COUNTUNIQUE(B2:B)))^0)}};{"Grand Total",
TRANSPOSE(MMULT(transpose({QUERY(QUERY(QUERY(A:D,
"select sum(C) where C >= 2 group by A pivot B"),
"where Col1 is not null"), "offset 1", 0),
MMULT(QUERY(QUERY(QUERY(A:D,
"select sum(C) where C >= 2 group by A pivot B"),
"where Col1 is not null"), "offset 1", 0),
ROW(INDIRECT("A1:A"&COUNTUNIQUE(B2:B)))^0)}),
ROW(INDIRECT("A1:A"&COUNTUNIQUE(B2:B)))^0))}})
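As a cross-check on what these formulas compute, here is a small Python sketch of the same idea: filter rows on the value first (C >= 2), sum into a row-by-column pivot, then derive the grand totals. The data is made up, since the linked sheet's contents are not reproduced here, and the names `ROWS`, `pivot_sum` and `grand_totals` are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (A = row label, B = pivot column, C = value)
ROWS = [
    ("north", "x", 1),
    ("north", "y", 3),
    ("south", "x", 2),
    ("south", "y", 5),
    ("east", "x", 1),
]


def pivot_sum(rows, min_value):
    """Sum C per (A, B), keeping only rows where C >= min_value."""
    table = defaultdict(lambda: defaultdict(int))
    for a, b, c in rows:
        if c >= min_value:
            table[a][b] += c
    return {a: dict(cols) for a, cols in table.items()}


def grand_totals(table):
    """Return (row totals, column totals) for a pivot table."""
    row_totals = {a: sum(cols.values()) for a, cols in table.items()}
    col_totals = defaultdict(int)
    for cols in table.values():
        for b, value in cols.items():
            col_totals[b] += value
    return row_totals, dict(col_totals)


pivot = pivot_sum(ROWS, 2)
row_totals, col_totals = grand_totals(pivot)
print(pivot, row_totals, col_totals)
```

In this sketch a row label whose values are all filtered out ("east") simply never appears in the pivot, which is the effect the filter is meant to have.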
I have a data set (11,11,14,14,10).
My goal is to return all of the most frequently occurring numbers. I used the =mode() function,
but it returns only 11 instead of both 11 and 14.
Any ideas/thoughts on that?
With layout as shown,
=query(A2:A6,"select count(A), A group by A order by count(A) desc label count(A) 'frequency'")
should return a listing of all frequencies in descending order.
=QUERY(QUERY(A1:A, "select count(A), A
group by A
order by count(A) desc"),
"select Col2 where Col1 > 1", 0)
=TRANSPOSE(QUERY(QUERY(QUERY(TRANSPOSE(E1:1), "select *"),
"select count(Col1), Col1
group by Col1
order by count(Col1) desc"),
"select Col2 where Col1 > 1", 0))
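To see the difference between "every value tied for the highest count" and "every value that appears more than once" (which is what the `Col1 > 1` filter above selects), here is a short Python sketch using the question's data. The variable names are hypothetical; `statistics.multimode` is part of the standard library from Python 3.8.

```python
from collections import Counter
from statistics import multimode

DATA = [11, 11, 14, 14, 10]

# Every value tied for the highest frequency, in order of first appearance
modes = multimode(DATA)

# Every value that occurs more than once (mirrors the "where Col1 > 1" filter)
frequency = Counter(DATA)
repeated = [value for value, count in frequency.items() if count > 1]

print(modes, repeated)
```

For this data set the two lists are the same, but they would differ if, say, one number appeared three times and another twice.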
I have a spreadsheet with one column containing a bunch of names (with duplicates) and a testing column whose value is either 'ok', 'not - ok' or '' (if not started). I want to create a formula that gets all the unique names and then counts how many 'not - ok' and '' rows correspond to each name, e.g.
Column A Column B
Bob ok
John not - ok
Rob
Bob not - ok
John ok
Joe ok
John
And the desired output would be
Column C Column D
Bob 1
John 2
Rob 1
Joe 0
I was able to get the unique names with
=UNIQUE(A2:A10) but I am not sure how to generate column D.
This query counts, per name, the rows whose column B value ends in 'ok' (the pattern '.*ok' matches both 'ok' and 'not - ok'):
=query(A2:B, "select A, count(B) where B matches '.*ok' group by A", 0)
If you would like custom headers in this query, use this formula:
=query(A2:B,
"select A, count(B) where B matches '.*ok' group by A label A 'name1', count(B) 'name2'",
0)
Maybe try something like this:
=query(A2:B, "select A, count(B) where A <> '' and B <> 'ok' group by A", 0)
or
=query(A2:B, "select A, count(B) where A <> '' and (B = '' or B = 'not - ok') group by A", 0)
You can also do it with a pivot table. Put the names in Rows and use the count of the column B elements as the value. Then add a filter on the "ok" status.
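The desired output from the question can be reproduced with a few lines of Python, which may help when checking whichever formula or pivot you end up using. This is a sketch of the counting rule only (every status other than 'ok' counts, including blanks, and names with no such rows still show 0); the names `ROWS` and `count_not_ok` are hypothetical.

```python
# Sample data from the question: (name, status); '' means not started
ROWS = [
    ("Bob", "ok"),
    ("John", "not - ok"),
    ("Rob", ""),
    ("Bob", "not - ok"),
    ("John", "ok"),
    ("Joe", "ok"),
    ("John", ""),
]


def count_not_ok(rows):
    """Per name, count rows whose status is anything other than 'ok'."""
    counts = {}
    for name, status in rows:
        counts.setdefault(name, 0)  # keep names that only have 'ok' rows
        if status != "ok":
            counts[name] += 1
    return counts


result = count_not_ok(ROWS)
print(result)
```

Initialising every name to 0 before counting is what keeps Joe in the output, a detail that a plain filtered count tends to lose.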