Sort/Order query result based on calculated fields - google-sheets

I have a list of transactions in the Transactions tab, and in the Summary tab I would like to summarize the performance by ticker. I am using QUERY to group the data and aggregate functions to calculate %-Win and %-Lost (see the link at the bottom for the sample spreadsheet):
Transaction tab:
=query({Transactions!B:B,Transactions!C:F},
"select Col1, count(Col2),sum(Col4),
(count(Col2)/(count(Col2)+count(Col3))), count(Col3),
sum(Col5),
(count(Col3)/(count(Col3)+count(Col2))) where Col1 is not NULL
and
(Col2 is not NULL or Col3 is not Null)
group by Col1
label count(Col2) 'Win', sum(Col4) '$-Win',
(count(Col2)/(count(Col2)+count(Col3))) '%-Win',
count(Col3) 'Lost', sum(Col5) '$-Lost',
(count(Col3)/(count(Col3)+count(Col2))) '%-Lost'",1)
Sample of Summary tab:
However, I was not able to obtain the following per ticker from the query: Total Transactions, Net Gains, and Exp. Value (Expected Value). I used ARRAYFORMULA for those instead, and it works. The problem is that I cannot sort the result by Expected Value or by Net Gains (FUBO should be first). I was able to calculate the percentages by combining aggregate functions, but I could not do the same for these additional calculations directly in the query.
I tried the query clause order by sum(Col3)+sum(Col5) (Net Gains), but it doesn't work: it only returns a value when there are both Win and Lost transactions.
Using Data->Sort Range doesn't provide the expected result either, because the data comes from two different sources: the query and the result of ARRAYFORMULA.
I guess I would need either to obtain all the required calculated fields directly from the query and then order by them, or to find a way to sort the combined query and ARRAYFORMULA results. The order by clause works well for aggregate functions that appear in the select list, but not when the sort key is a formula built from calculated columns.
Here you can find a sample file from my real situation:
https://docs.google.com/spreadsheets/d/1xrDSWGJVIsWD6fvAOdMOZkw2rEY9lGPZRb_Ww_nC7YQ/edit?usp=sharing
Note: A possible solution would be to combine all the results into one SORT statement, but I am not able to make it work:
=sort({
query({Transactions!B2:B,Transactions!C2:F}, "select Col1, count(Col2),sum(Col4), (count(Col2)/(count(Col2)+count(Col3))), count(Col3), sum(Col5), (count(Col3)/(count(Col3)+count(Col2))) where Col1 is not NULL and (Col2 is not NULL or Col3 is not Null) group by Col1 label count(Col2) '', sum(Col4) '', (count(Col2)/(count(Col2)+count(Col3))) '', count(Col3) '', sum(Col5) '', (count(Col3)/(count(Col3)+count(Col2))) ''",0),
ARRAYFORMULA(if(not(ISBLANK(A2:A)), B2:B+E2:E,)),
ARRAYFORMULA(if(not(ISBLANK(A2:A)), C2:C+F2:F,)),
ARRAYFORMULA(if(not(ISBLANK(A2:A)), (C2:C)*(D2:D) + (F2:F)*(G2:G),))
},10, FALSE)
Likewise, avoiding ARRAYFORMULA by using nested QUERY statements doesn't work:
=sort({
query({Transactions!B2:B,Transactions!C2:F}, "select Col1, count(Col2),sum(Col4), (count(Col2)/(count(Col2)+count(Col3))), count(Col3), sum(Col5), (count(Col3)/(count(Col3)+count(Col2))) where Col1 is not NULL and (Col2 is not NULL or Col3 is not Null) group by Col1 label count(Col2) '', sum(Col4) '', (count(Col2)/(count(Col2)+count(Col3))) '', count(Col3) '', sum(Col5) '', (count(Col3)/(count(Col3)+count(Col2))) ''",0),
query(query({Transactions!B2:B,Transactions!C2:F}, "select Col1, count(Col2),sum(Col4), (count(Col2)/(count(Col2)+count(Col3))), count(Col3), sum(Col5), (count(Col3)/(count(Col3)+count(Col2))) where Col1 is not NULL and (Col2 is not NULL or Col3 is not Null) group by Col1 label count(Col2) '', sum(Col4) '', (count(Col2)/(count(Col2)+count(Col3))) '', count(Col3) '', sum(Col5) '', (count(Col3)/(count(Col3)+count(Col2))) ''",0),"select Col2+Col5 label Col2+Col5 ''",0),
query(query({Transactions!B2:B,Transactions!C2:F}, "select Col1, count(Col2),sum(Col4), (count(Col2)/(count(Col2)+count(Col3))), count(Col3), sum(Col5), (count(Col3)/(count(Col3)+count(Col2))) where Col1 is not NULL and (Col2 is not NULL or Col3 is not Null) group by Col1 label count(Col2) '', sum(Col4) '', (count(Col2)/(count(Col2)+count(Col3))) '', count(Col3) '', sum(Col5) '', (count(Col3)/(count(Col3)+count(Col2))) ''",0), "select Col3+Col6 label Col3+Col6 ''",0),
query(query({Transactions!B2:B,Transactions!C2:F}, "select Col1, count(Col2),sum(Col4), (count(Col2)/(count(Col2)+count(Col3))), count(Col3), sum(Col5), (count(Col3)/(count(Col3)+count(Col2))) where Col1 is not NULL and (Col2 is not NULL or Col3 is not Null) group by Col1 label count(Col2) '', sum(Col4) '', (count(Col2)/(count(Col2)+count(Col3))) '', count(Col3) '', sum(Col5) '', (count(Col3)/(count(Col3)+count(Col2))) ''",0), "select Col3*Col4+Col6*Col7 label Col3*Col4+Col6*Col7 ''",0)
},10, FALSE)
This doesn't return all the values for Net Gains and Exp. Value.
As you can see, it only provides Net Gains and Exp. Value where there are both Win and Lost values on the same row.

You should fill the blanks with 0.
=SORT(QUERY(query(ArrayFormula({Transactions!B:B,
IF(Transactions!C:F="",0, Transactions!C:F)}),
"select Col1, sum(Col2),sum(Col4),
(sum(Col2)/(sum(Col2)+sum(Col3))),
sum(Col3), sum(Col5), (sum(Col3)/(sum(Col3)+sum(Col2)))
where Col1 is not NULL and NOT (Col2 = 0 and Col3 = 0) group by Col1",1),
"select Col1, Col2, Col3, Col4, Col5, Col6, Col7,Col2+Col5,
Col3+Col6,Col3*Col4+Col6*Col7
label Col2 'Win',Col3 '$-Win', Col4 '%-Win', Col5 'Lost',
Col6 '$-Lost', Col7 '%-Lost', Col2+Col5 'Total Transactions',
Col3+Col6 'Net Gains',Col3*Col4+Col6*Col7 'Exp. Value'",1),
10,FALSE)
Notes:
The condition NOT (Col2 = 0 and Col3 = 0) excludes transactions that were not sold, i.e. Win = 0 and Lost = 0.
The expression IF(Transactions!C:F="",0, Transactions!C:F) replaces empty values with 0 so that the aggregate functions work as expected.
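To make the logic of this answer easier to follow outside Sheets, here is a minimal Python sketch of the same two steps: replace blanks with 0 before aggregating, then derive the extra columns and sort on one of them. The sample rows, the column layout (win flag, lost flag, win amount, lost amount) and the names ROWS and summarize are assumptions made for illustration; they are not taken from the linked spreadsheet.

```python
from collections import defaultdict

# Hypothetical rows: (ticker, win, lost, win_amount, lost_amount).
# None stands for a blank cell; this layout is an assumption, not the real sheet.
ROWS = [
    ("FUBO", 1, None, 120.0, None),
    ("FUBO", 1, None, 80.0, None),
    ("AAPL", 1, None, 50.0, None),
    ("AAPL", None, 1, None, -30.0),
    ("TSLA", None, 1, None, -40.0),
    ("NIO", None, None, None, None),  # still open: neither a win nor a loss
]


def summarize(rows):
    """Group by ticker, derive the calculated fields, sort by expected value."""
    totals = defaultdict(lambda: [0, 0, 0.0, 0.0])  # win, lost, $-win, $-lost
    for ticker, *values in rows:
        # Step 1: replace blanks with 0 so that later arithmetic never hits a null.
        win, lost, win_amt, lost_amt = (v if v is not None else 0 for v in values)
        if win == 0 and lost == 0:
            continue  # mirrors: NOT (Col2 = 0 and Col3 = 0)
        group = totals[ticker]
        group[0] += win
        group[1] += lost
        group[2] += win_amt
        group[3] += lost_amt

    summary = []
    for ticker, (win, lost, win_amt, lost_amt) in totals.items():
        total = win + lost
        pct_win, pct_lost = win / total, lost / total
        summary.append({
            "ticker": ticker,
            "total": total,
            "net_gains": win_amt + lost_amt,
            "exp_value": win_amt * pct_win + lost_amt * pct_lost,
        })
    # Step 2: sort on the calculated column, descending (like SORT(..., 10, FALSE)).
    return sorted(summary, key=lambda row: row["exp_value"], reverse=True)


if __name__ == "__main__":
    for row in summarize(ROWS):
        print(row)
```

With these made-up rows, FUBO sorts first because it has the highest expected value, and a ticker with only losses still gets a Net Gains and Exp. Value figure instead of a blank.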

Related

An Arrayformula to Split the given data as shown

Please tell me the Arrayformula at C1 which converts the column A to column C,D & E as shown.
Google Sheet Link
Extra information:
Column A is actually not raw data, it is also an Arrayformula:
=ARRAYFORMULA(VLOOKUP($A:$A, TRIM(SUBSTITUTE(SPLIT(FLATTEN(QUERY(QUERY( {Sheet1!$B:$B&"^"&Sheet1!$C:$C&"^"&Sheet1!$D:$D&"#", Sheet1!$A:$A,Sheet1!$A:$A&"×"}, "select max(Col1) where Col2 is not null group by Col1 pivot Col3",1),,9^9)), "×"), "#", CHAR(10))), 2, 0))
You can check the "Sheet1" & "Extra Information" sheets to understand it.
"Sheet1" Sheet:
"Extra Information" Sheet:
This ARRAYFORMULA at B1 is what I have achieved so far; I am not able to split its output by ^ into columns as shown in the first image.
Try in C2:
=ARRAYFORMULA(IFERROR(REGEXREPLACE(REGEXREPLACE({
VLOOKUP(A2:A, TRIM(SPLIT(FLATTEN(QUERY(QUERY({IF(Sheet1!A2:D="",,{Sheet1!A2:A&"♦", Sheet1!B2:D&"♥"}), ROW(Sheet1!A2:A)},
"select max(Col2) where Col2 is not null group by Col5 pivot Col1"),,9^9)), "♦")), 2, ),
VLOOKUP(A2:A, TRIM(SPLIT(FLATTEN(QUERY(QUERY({IF(Sheet1!A2:D="",,{Sheet1!A2:A&"♦", Sheet1!B2:D&"♥"}), ROW(Sheet1!A2:A)},
"select max(Col3) where Col2 is not null group by Col5 pivot Col1"),,9^9)), "♦")), 2, ),
VLOOKUP(A2:A, TRIM(SPLIT(FLATTEN(QUERY(QUERY({IF(Sheet1!A2:D="",,{Sheet1!A2:A&"♦", Sheet1!B2:C&"♥", TEXT(Sheet1!D2:D, "dd/mm/e")&"♥"}), ROW(Sheet1!A2:A)},
"select max(Col4) where Col2 is not null group by Col5 pivot Col1"),,9^9)), "♦")), 2, )}, "♥$", ), "♥ ", CHAR(10))))

Google Sheet Query Pivot - count if a date is between a range of dates

https://docs.google.com/spreadsheets/d/1AQUNPb4d3EqeSZb9SwfzemR7MfSIkMFkDAZ8rPLA3rc/edit?usp=sharing
I have 3 columns:
City
Date start
End date
I want 3 tables:
Pivot table by city of people who entered during the year (done)
=query(QUERY({$A$2:$C$10};
"select Col1, count(Col1)
where year(Col2)=2018 or year(Col2)=2019 or year(Col2)=2020
group by Col1
pivot year(Col2)");
"select * order by Col4 desc, Col3 desc, Col2 desc label Col1 'Start'";1)
Pivot table by city of people who left during the year (done)
=query(QUERY({$A$2:$C$10};
"select Col1, count(Col1)
where (year(Col3)=2018 or year(Col3)=2019 or year(Col3)=2020)
group by Col1
pivot year(Col3)");
"select * order by Col4 desc, Col3 desc, Col2 desc label Col1 'End'";1)
Pivot table by city of people who stayed during the year (failing)
=query(QUERY({$A$2:$C$10};
"select Col1, count(Col1)
where
(2018>=YEAR(Col2) and 2018<=YEAR(Col3) or
(2019>=YEAR(Col2) and 2019<=YEAR(Col3) or
(2020>=YEAR(Col2) and 2020<=YEAR(Col3)
group by Col1
pivot year(Col2)");
"select * order by Col4 desc, Col3 desc, Col2 desc label Col1 'Between'";1)
I am having trouble with the last one.
I guess my where condition is not right, and my pivot is not working either.
I know pivot year(Col2) can't work for the last one: if a row has a start date in 2015 and an end date in 2020, I want it to be counted for 2018, 2019 and 2020, but pivoting on the start year only ever puts it under 2015.
Any ideas?
Thanks for your time.
Use:
=ARRAYFORMULA(QUERY(QUERY(UNIQUE(SPLIT(FLATTEN(IF(DAYS(C2:C10; B2:B10)>=
SEQUENCE(1; 5000; ); ROW(A2:A10)&"×"&A2:A10&"×"&TEXT(B2:B10+SEQUENCE(1; 5000; );
"yyyy-1-1"); )); "×"));
"select Col2,count(Col2)
where year(Col3) matches '2018|2019|2020'
group by Col2
pivot year(Col3)
label Col2'Between'");
"order by Col4 desc, Col3 desc, Col2 desc"))
Update:
=ARRAYFORMULA(QUERY(QUERY(SPLIT(FLATTEN(IF(DATEDIF(B2:B; C2:C; "Y")>=
SEQUENCE(1; MAX(DATEDIF(B2:B; C2:C; "Y")); ); ROW(A2:A)&"×"&A2:A&"×"&
YEAR(B2:B)+SEQUENCE(1; MAX(DATEDIF(B2:B; C2:C; "Y")); )&"-1-1"; )); "×");
"select Col2,count(Col2)
where year(Col3) matches '2018|2019|2020'
group by Col2
pivot year(Col3)
label Col2'Between'");
"order by Col4 desc, Col3 desc, Col2 desc"))

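The idea behind the formulas in this thread is to expand each row into every year its date range covers, and only then group and pivot. A minimal Python sketch of that expansion follows; the sample rows and the names ROWS, YEARS and count_between are hypothetical, and the sketch counts every calendar year a range touches, which is an assumption about the intended meaning of "stay during the year".

```python
from collections import Counter
from datetime import date

# Hypothetical rows: (city, start_date, end_date); not the asker's real data.
ROWS = [
    ("Paris", date(2015, 3, 1), date(2020, 6, 30)),
    ("Paris", date(2019, 1, 15), date(2019, 12, 1)),
    ("Lyon", date(2018, 5, 1), date(2019, 2, 1)),
    ("Lyon", date(2021, 1, 1), date(2022, 1, 1)),
]

YEARS = (2018, 2019, 2020)


def count_between(rows, years=YEARS):
    """Count, per city and year, the rows whose date range touches that year."""
    counts = Counter()
    for city, start, end in rows:
        # Expand the row into every calendar year between start and end.
        for year in range(start.year, end.year + 1):
            if year in years:
                counts[(city, year)] += 1
    return counts


if __name__ == "__main__":
    for (city, year), n in sorted(count_between(ROWS).items()):
        print(city, year, n)
```

A row that starts in 2015 and ends in 2020 is counted once under each of 2018, 2019 and 2020, which is what pivoting on the start year alone cannot do.
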
Extracting and counting unique word frequency from a range

I have a column where each row is a sentence. For example:
COLUMN1
R1: -Do you think they'll come, sir?
R2: -Oh they'll come, they'll come all right.
R3: Here. Stamp those and mail them.
R4: It's ringing.
R5: Would you walk Myron the other way?
From this range, I want to extract a list of unique words (COLUMN2), and a count of how often they appeared in the range (COLUMN3).
The trick is to remove punctuation marks such as commas and periods.
So the desired result for the above would be:
COLUMN2 COLUMN3
Do 1
you 2
think 1
they'll 3
come 3
sir 1
Oh 1
all 1
right 1
Here 1
Stamp 1
those 1
and 1
mail 1
them 1
It's 1
ringing 1
Would 1
walk 1
Myron 1
the 1
other 1
way 1
I tried parsing each row with the SPLIT function, separating each word into its own cell, but I'm stuck on removing the punctuation and building the list of unique words (which I know will involve the UNIQUE function). I'm guessing the count will also involve the COUNTUNIQUE function.
Any guidance will be appreciated!
You could try something like
=query(ArrayFormula(transpose(split(query(regexreplace(A1:A5, "[^A-Za-z\s/']" ,""),,50000)," "))), "Select Col1, Count(Col1) where Col1 <>'' group by Col1 label Count(Col1)''")
Change range to suit.
If you want to exclude a list of words (e.g. in the range J1:J20), you can try
=ArrayFormula(query(transpose(split(query(regexreplace(A1:A5, "[^A-Za-z\s/']" ,""),,50000)," ")), "Select Col1, Count(Col1) where not UPPER(Col1) matches '\b"&textjoin("|", 1, UPPER(J1:J20))&"\b' group by Col1 order by Count(Col1) desc label Count(Col1)''"))
Alternatively, you can also add the list of exclusions to the regex pattern...
=query(ArrayFormula(transpose(split(query(regexreplace(A1:A5, "[^A-Za-z\s/']|\b((?i)the|oh|or|and)\b" ,""),,50000)," "))), "Select Col1, Count(Col1) where Col1 <>'' group by Col1 order by Count(Col1) desc label Count(Col1)''")
UPDATED:
=ArrayFormula(substitute(query(transpose(split(query(regexreplace(substitute(C11:C, char(39), "_"), "[^A-Za-z\s_]" ,""),,50000)," ")), "Select Col1, Count(Col1) where not UPPER(Col1) matches '\b"&textjoin("|", 1, UPPER(substitute(G11:G,char(39),"_")))&"\b' group by Col1 order by Count(Col1) desc label Count(Col1)''", 0), "_", char(39)))
or, using a different approach
=query(filter(regexreplace(transpose(split(query(regexreplace(C11:C, "[^A-Za-z\s'-]" ,""),,50000)," ")), "^-",), isna(match(upper(regexreplace(transpose(split(query(regexreplace(C11:C, "[^A-Za-z\s'-]" ,""),,50000)," ")), "^-",)), upper(filter(G11:G, len(G11:G))),0))), "Select Col1, count(Col1) group by Col1 order by count(Col1) desc label count(Col1)''", 0)
You can use MID, REGEXREPLACE, QUERY, SPLIT, etc., like this:
=query(
transpose(split(
regexreplace(textjoin(" ", true, filter(mid(A11:A, 4, len(A11:A)), A11:A<>"")), "[>,.?/!-]", " "),
" ", true, true)),
"Select Col1, Count(Col1) group by Col1 label Col1 'Column2', Count(Col1) 'Column3'")
Or, if the rows do not have the R1: to R5: prefix, use this:
=query(
transpose(split(
regexreplace(textjoin(" ", true, filter(A11:A, A11:A<>"")), "[>,.?/!-]", " "),
" ", true, true)),
"Select Col1, Count(Col1) group by Col1 label Col1 'Column2', Count(Col1) 'Column3'")
try:
=ARRAYFORMULA(QUERY(TRANSPOSE(SPLIT(REGEXREPLACE(
TEXTJOIN(" ", 1, LOWER(A:A)), "\.|\,|\?", ), " ")),
"select Col1,count(Col1)
group by Col1
order by count(Col1) desc
label count(Col1)''", 0))
or:
=ARRAYFORMULA(QUERY(TRANSPOSE(SPLIT(REGEXREPLACE(
QUERY(LOWER(A:A),,999^99), "[^a-z0-9а-я ]", ), " ")),
"select Col1,count(Col1)
group by Col1
order by count(Col1) desc
label count(Col1)''", 0))
UPDATE:
=ARRAYFORMULA(QUERY(TRANSPOSE(SPLIT(REGEXREPLACE(
QUERY(LOWER(A:A),,999^99), "[^a-z0-9 ]", ), " ")),
"select Col1,count(Col1)
where not Col1 matches 'the|and|i|you|its'
group by Col1
order by count(Col1) desc
label count(Col1)''", 0))
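The answers above share one pipeline: strip punctuation, split on whitespace, then group and count. As a rough cross-check outside Sheets, here is a minimal Python sketch of that pipeline run on the five sample sentences from the question. The function name word_counts is hypothetical, and the pattern is similar to, not identical with, the [^A-Za-z\s/'] pattern used in the first answer.

```python
import re
from collections import Counter

# The five sample rows from the question, without the R1: to R5: prefixes.
SENTENCES = [
    "-Do you think they'll come, sir?",
    "-Oh they'll come, they'll come all right.",
    "Here. Stamp those and mail them.",
    "It's ringing.",
    "Would you walk Myron the other way?",
]


def word_counts(sentences):
    """Count words, keeping apostrophes and dropping all other punctuation."""
    counts = Counter()
    for sentence in sentences:
        # Keep only letters, whitespace and apostrophes.
        cleaned = re.sub(r"[^A-Za-z\s']", "", sentence)
        # Split on whitespace and add each word to the tally.
        counts.update(cleaned.split())
    return counts


if __name__ == "__main__":
    for word, n in word_counts(SENTENCES).most_common():
        print(word, n)
```

The count is case-sensitive here, so "Do" and "do" would be separate entries; lowercasing first, as the LOWER-based formulas do, merges them.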

How to calculate ALL modes in a Google Sheets?

I have a data set (11, 11, 14, 14, 10).
My goal is to return all of the most frequently occurring numbers. I used the =MODE() function,
but it returns only 11 instead of both 11 and 14.
Any ideas/thoughts on that?
With layout as shown,
=query(A2:A6,"select count(A), A group by A order by count(A) desc label count(A) 'frequency'")
should return a listing of all frequencies in descending order.
To return only the values that occur more than once:
=QUERY(QUERY(A1:A, "select count(A), A
group by A
order by count(A) desc"),
"select Col2 where Col1 > 1", 0)
The same for data laid out in a row (here E1:1):
=TRANSPOSE(QUERY(QUERY(QUERY(TRANSPOSE(E1:1), "select *"),
"select count(Col1), Col1
group by Col1
order by count(Col1) desc"),
"select Col2 where Col1 > 1", 0))

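For comparison, here is a minimal Python sketch of what "all modes" means: every value tied for the highest frequency. The function name all_modes is hypothetical. Note that it is stricter than the where Col1 > 1 filter in the formulas above, which keeps every value that occurs more than once, whether or not it is tied for the top count.

```python
from collections import Counter


def all_modes(values):
    """Return every value tied for the highest frequency, in ascending order."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    # Keep only the values whose count equals the highest count.
    return sorted(value for value, n in counts.items() if n == highest)


if __name__ == "__main__":
    print(all_modes([11, 11, 14, 14, 10]))
```

For the data set in the question this yields both 11 and 14.
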
Rails raw query values along with fields name

This is my query:
@pg = ActiveRecord::Base.connection
result = @pg.execute("select sum(col1) AS col1, sum(col2) AS col2 from messages")
Now,
result.values gives me [[val1, val2]]
result.fields gives me [col1, col2]
Is there a way to get a result similar to this?
{col1 => val1, col2 => val2}
I looked into many solutions, but no luck :(
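Try this:
@pg = ActiveRecord::Base.connection
# .first returns the first result row; with the PostgreSQL adapter it is a hash keyed by column name
result = @pg.execute("select sum(col1) AS col1, sum(col2) AS col2 from messages").first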
