Sheets: Average range if other column cell contains substring - google-sheets

I have a social media performance report in Google Sheets, and want to average the engagement rate % for posts that use a specific tag (as assigned in SproutSocial) in a central 'campaigns' sheet.
Here's an example of what the data looks like
Col A | Col B | Col C
-------------------------------------
Reach | Engagement Rate | Tags
-------------------------------------
1200 | 3.4% | Black Friday
480 | 1.3% | Black Friday, Blog Post
480 | 3.1% | Landing Page, Black Friday
480 | 5.6% | Blog Post
So let's say I want to see the average Engagement Rate (ER%) for our Black Friday posts. So far, I've used this formula in the campaigns sheet:
=AVERAGEIF('Nov 2021'!$C:$C, "Black Friday", 'Nov 2021'!$B:$B)
This only does an exact match on column C, so returns 3.4% as there's only one matching post.
So, how do I match substrings? I've tried a RegexMatch on the AVERAGEIF criterion, but it returns zero results...

You can use wildcards:
=averageif(C:C,"*Black Friday*",B:B)
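To sanity-check what the wildcard criterion should return, here is a minimal Python sketch of the same "contains" logic on the four sample rows from the question. The function name `average_if_contains` is hypothetical, and the lower-casing is an assumption meant to mimic the case-insensitive matching that Sheets criteria appear to use.

```python
def average_if_contains(rows, needle):
    """Average the rate of every row whose tag string contains `needle`."""
    needle = needle.lower()
    matches = [rate for rate, tags in rows if needle in tags.lower()]
    if not matches:
        raise ValueError("no rows matched")  # AVERAGEIF would show #DIV/0!
    return sum(matches) / len(matches)


# (engagement rate in %, tags) taken from the sample table in the question
rows = [
    (3.4, "Black Friday"),
    (1.3, "Black Friday, Blog Post"),
    (3.1, "Landing Page, Black Friday"),
    (5.6, "Blog Post"),
]

print(round(average_if_contains(rows, "Black Friday"), 2))  # 2.6
```

So with the wildcard version the sample data should give 2.6% rather than 3.4%.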

Related

google sheet countif and filter out other columns if the value is lotto

I have a column
COLUMN H | Column R
trading type | Closed P&L
Lotto | 100%
Lotto | 200%
| 100%
Day | -50%
I'm trying to exclude any rows that have trading type = Lotto and calculate the "non-lotto" winning rate. How could I do that?
So in this case the total non-lotto trades are 2, and only 1 is a win (>0%) => 50% winning rate.
This doesn't seem to do the job:
=countif(R2:R148,">0")/COUNT(R2:R165, H:H, "Lotto")
And this returns N/A:
=countif(R2:R148,">0", H:H, "Lotto")/COUNT(R2:R165, H:H, "Lotto")
Use COUNTIFS to get the non Lotto cases where the value is positive and divide it with the total number of non Lotto cases:
=COUNTIFS(H2:H5,"<>Lotto",R2:R5,">0")/COUNTIF(H2:H5,"<>Lotto")
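As a cross-check of the COUNTIFS / COUNTIF ratio, here is a small Python sketch on the four sample rows. The function name is hypothetical, and P&L values are written as fractions (100% = 1.0), which is an assumption about how the cells are stored.

```python
def non_lotto_win_rate(trades):
    """trades: list of (trading_type, closed_pnl) pairs."""
    # Everything that is not "Lotto" counts, including rows with a blank type
    non_lotto = [pnl for kind, pnl in trades if kind != "Lotto"]
    wins = [pnl for pnl in non_lotto if pnl > 0]
    return len(wins) / len(non_lotto)


trades = [("Lotto", 1.0), ("Lotto", 2.0), ("", 1.0), ("Day", -0.5)]

print(non_lotto_win_rate(trades))  # 0.5
```

Like the formula, this divides by zero if every trade is a Lotto trade, so a guard may be worth adding for real data.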

Combine 2 queries (different columns from 2 different sheets) and filter based on matching results

As usual, I have set a goal, way beyond my skills...
I need to get data from 2 sheets, One has a lot more entries than the other (a master list I guess you could say). Any entry in the smaller sheet will always have a matching entry in the Master, but not necessarily the other way round.
I have written what I need in pseudo query syntax, but I need help getting this to work...
QUERY the 'Catalog' sheet and get TITLE, SUBTITLE, STATUS, TITLE-ID WHERE the STATUS does NOT have the word 'Retired' in it.
Then Query 'Report_Dec 2017' and get UNITS, USD, GPB, EUR WHERE TITLE-ID from 'Report_Dec 2017' Matches TITLE-ID from 'Catalog'
Catalog (master)
| TITLE | SUBTITLE | STATUS | TITLE-ID |
Report_Nov_2017
| UNITS | USD | GPB | EUR | (has TITLE-ID also, but don't need this twice)
Final result should look like this:
| TITLE | SUBTITLE | STATUS | TITLE-ID | UNITS | USD | GPB | EUR |
The end result should never have more entries than 'Report_Nov 2017'. So the Catalog might have 100 total entries, but since only 20 titles had sales in November, the result will only show 20.
First of all is that possible? And secondly, if it is, can someone point me in the right direction?
EDIT UPDATE
I have made some progress with this, but I am stuck on a strange issue...
This is my google sheet:https://docs.google.com/spreadsheets/d/10uXJVilUqAnSE_ZPlA6VKMBl0DCFRt_WqzYYl-c4Syc/edit?usp=sharing
This is my current formula:
=ArrayFormula(query({to_text(Catalog!B:J),to_text('Report_Nov 2017'!A:J)},"SELECT Col1,Col3,Col4,Col9,Col16,Col17,Col18,Col19 where Col4 != 'Retired' and Col15 MATCHES '"&textjoin("|", TRUE, Catalog!J2:J)&"'",1))
I am getting a result where the entries returned from Catalog do not match the entries returned from ReportNov2017. It just seems to be grabbing the first 25 results from Catalog instead of checking whether the TITLE ID matches in ReportNov2017. Any ideas where I'm going wrong?
I suggest you split the task into smaller tasks:
Add some columns to the report sheet: | TITLE | SUBTITLE | STATUS |. Get their values from Catalog (master). You could try VLOOKUP inside ARRAYFORMULA to automate this. See the article.
Then use a simple QUERY formula to get the rest.
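The two-step idea (look up the catalog fields per report row, then filter) can be sketched in Python as below. This is only an illustration of the logic: the function name and the sample rows are hypothetical and are not taken from the linked sheet.

```python
def join_report_with_catalog(catalog, report):
    """Attach catalog fields to each report row by TITLE-ID, skipping retired titles."""
    by_id = {row["TITLE-ID"]: row for row in catalog}  # plays the role of VLOOKUP
    joined = []
    for sale in report:
        title = by_id.get(sale["TITLE-ID"])
        if title is None or "Retired" in title["STATUS"]:
            continue  # no catalog match, or the title is retired
        joined.append({**title, **sale})
    return joined


# Hypothetical sample rows for illustration only
catalog = [
    {"TITLE": "Alpha", "SUBTITLE": "One", "STATUS": "Active", "TITLE-ID": "T1"},
    {"TITLE": "Beta", "SUBTITLE": "Two", "STATUS": "Retired", "TITLE-ID": "T2"},
    {"TITLE": "Gamma", "SUBTITLE": "Three", "STATUS": "Active", "TITLE-ID": "T3"},
]
report = [
    {"TITLE-ID": "T1", "UNITS": 5, "USD": 10.0},
    {"TITLE-ID": "T2", "UNITS": 2, "USD": 4.0},
]

for row in join_report_with_catalog(catalog, report):
    print(row["TITLE"], row["UNITS"])  # Alpha 5
```

Because the loop runs over the report rows, the result can never have more rows than the report, which is the constraint described in the question.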

Filter to the latest month and then filter to the best score per person

I've got a Google Sheet which holds the results of a monthly competition. The format is
Name | Date | Score
--------------------------------
Alan Smith | 14/01/2016 | 500
Bob Dow | 14/01/2016 | 450
Bob Dow | 16/01/2016 | 470
Clare Allie| 16/01/2016 | 550
Declan Ham | 16/01/2016 | 350
Alan Smith | 10/02/2016 | 490
Bob Dow | 10/02/2016 | 425
Declan Ham | 12/02/2016 | 400
Declan Ham | 12/02/2016 | 390
Clare Allie| 12/02/2016 | 560
I want to do 2 things with this data
I want to create a new sheet which holds the latest 'best' results. For the data presented here that would be
Alan Smith | 10/02/2016 | 490
Bob Dow | 10/02/2016 | 425
Declan Ham | 12/02/2016 | 400
Clare Allie| 12/02/2016 | 560
i.e. The results from February with the 'best' score per person. Here Declan Ham's lower score of '390' was removed.
I want another sheet to hold the tournament ranking. People are ranked by their top 3 monthly scores. i.e. The best score for each person for each month is obtained and the top 3 scores are combined to give their place in the tournament.
So far I've attempted to use Google queries, vlookups, filters to get these new sheets. But, just focusing on 1), the best I've been able to achieve is
=FILTER(Results!$A:$B, MONTH(Results!$B:$B) = MONTH(MAX(Results!$B:$B)))
This gets me the results from the latest month, but it does not remove duplicate entries per person.
Does anyone have a suggestion for how I can achieve these requirements? Feel like I'm treading water at the moment.
Rather than trying to remove duplicates, you need to identify the maximum score by each person; you can do that by grouping values by person, then aggregating using max(). Here's how that would look for the month of February 2016 (using >= so that a score recorded on the 1st of the month is included):
=query(Results!A1:C,"select A,max(C) where todate(B) >= date '2016-2-1' group by A")
Instead of using a fixed value for the start of the latest month, we can get the year and month using spreadsheet formulas, and concatenate our query with them:
=query(Results!A1:C,"select A,max(C) where todate(B) >= date '"&year(max(Results!B2:B))&"-"&month(max(Results!B2:B))&"-1' group by A")
That addresses your first question.
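For anyone who wants to verify the expected output of the group-by query, here is a minimal Python sketch of the same idea (filter to the latest month, then take the max score per person) on the sample data from the question. The names `RESULTS` and `latest_month_best` are hypothetical.

```python
from datetime import date

# (name, date, score) rows from the question; dates are dd/mm/yyyy there
RESULTS = [
    ("Alan Smith", date(2016, 1, 14), 500),
    ("Bob Dow", date(2016, 1, 14), 450),
    ("Bob Dow", date(2016, 1, 16), 470),
    ("Clare Allie", date(2016, 1, 16), 550),
    ("Declan Ham", date(2016, 1, 16), 350),
    ("Alan Smith", date(2016, 2, 10), 490),
    ("Bob Dow", date(2016, 2, 10), 425),
    ("Declan Ham", date(2016, 2, 12), 400),
    ("Declan Ham", date(2016, 2, 12), 390),
    ("Clare Allie", date(2016, 2, 12), 560),
]


def latest_month_best(results):
    """Best score per person, restricted to the month of the latest date."""
    latest = max(day for _, day, _ in results)
    best = {}
    for name, day, score in results:
        if (day.year, day.month) == (latest.year, latest.month):
            best[name] = max(score, best.get(name, score))
    return best


print(latest_month_best(RESULTS))
# {'Alan Smith': 490, 'Bob Dow': 425, 'Declan Ham': 400, 'Clare Allie': 560}
```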
Tournament ranking
Your second goal is too complex for a single spreadsheet formula, in my opinion. Here's a way to accomplish it with multiple formulas, though!
The X & Y axes are filled out by spreadsheet formulas. On the X axis (orange), we populate participants' names using this in cell A3:
=unique(Results!A2:A)
The Y axis consists of dates (green). These are the start dates of each unique month that there are scores for, calculated using the following formula in cell D2. This results in strings, e.g. 2016-01-1, and that format is specifically required for the later formulas to work.
=TRANSPOSE(SORT(UNIQUE(ARRAYFORMULA(TEXT(Results!B2:B13,"YYYY-MM-1")))))
Here's the formula for cell D3, which will calculate the sum of the 3 highest scores recorded for the user whose name appears in A3, for the month appearing in D2. (Copy and paste the formula across the full range of participants and months, and the relative references will adjust.)
=sum(query(Results!$A$1:$C,"select C where A='"&$A3&"' and todate(B) >= date '"&D$2&"' and todate(B) < date '"&IF(ISBLANK(E$2),TEXT(TODAY()+1,"yyyy-mm-dd"),E$2)&"' order by C desc limit 3 label C ''"))
Key points about that formula:
The query range needs to use absolute references so it doesn't shift when copied to additional cells. However, it's still open-ended, to absorb additional rows of scores on the "Results" sheet.
Results!$A$1:$C
A WHERE clause is used to select rows from the Results sheet that are for the given participant (A='"&$A3&"') and fall within the month that heads the column (D$2). The upper bound is the start of the next month (E$2) or, when there is no later month column, tomorrow's date.
...and todate(B) < date '"&IF(ISBLANK(E$2),TEXT(TODAY()+1,"yyyy-mm-dd"),E$2)&"'
The best 3 scores for the month are found by first sorting the above result descending, then limiting the result to 3 rows.
...order by C desc limit 3
Finally, the QUERY headers are suppressed by this little trick, so that we get a single number as the result:
...label C ''
Individual tournament totals appear in column C, with a range SUM across the row, e.g. for cell C3:
=SUM(D3:3)
The corresponding ranking in column B is then:
=RANK(C3,C$3:C)
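To check the grid against known numbers, here is a rough Python sketch of the same steps as described in this answer: sum each person's 3 highest scores per month, total those across months, then rank the totals. The function and variable names are hypothetical, and the ranking assumes ties share a rank, as RANK does.

```python
from collections import defaultdict
from datetime import date

RESULTS = [
    ("Alan Smith", date(2016, 1, 14), 500), ("Bob Dow", date(2016, 1, 14), 450),
    ("Bob Dow", date(2016, 1, 16), 470), ("Clare Allie", date(2016, 1, 16), 550),
    ("Declan Ham", date(2016, 1, 16), 350), ("Alan Smith", date(2016, 2, 10), 490),
    ("Bob Dow", date(2016, 2, 10), 425), ("Declan Ham", date(2016, 2, 12), 400),
    ("Declan Ham", date(2016, 2, 12), 390), ("Clare Allie", date(2016, 2, 12), 560),
]


def tournament_table(results, top_n=3):
    """Return {name: (total, rank)} with rank 1 for the highest total."""
    by_person_month = defaultdict(list)
    for name, day, score in results:
        by_person_month[(name, day.year, day.month)].append(score)

    totals = defaultdict(int)
    for (name, _, _), scores in by_person_month.items():
        # one grid cell: sum of the top_n scores for this person and month
        totals[name] += sum(sorted(scores, reverse=True)[:top_n])

    # rank = 1 + number of strictly larger totals, so ties share a rank
    return {
        name: (total, 1 + sum(other > total for other in totals.values()))
        for name, total in totals.items()
    }


for name, (total, rank) in sorted(tournament_table(RESULTS).items()):
    print(name, total, rank)
# Alan Smith 990 4
# Bob Dow 1345 1
# Clare Allie 1110 3
# Declan Ham 1140 2
```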
Tidy
For simpler copy/paste, you can do some error checking in these formulas, so that they can be placed in the sheet before the corresponding data is - for example, at the start of your season. Using IF(ISBLANK(... or IFERROR(... can be very effective for this.
B3 & down:
=IFERROR(RANK(C3,C$3:C))
C3 & down:
=IF(ISBLANK(A3),"",sum(D3:3))
D3 & rest of field:
=IFERROR(sum(query(Results!$A$1:$C,"select C where A='"&$A3&"' and todate(B) >= date '"&D$2&"' and todate(B) < date '"&IF(ISBLANK(E$2),TEXT(TODAY()+1,"yyyy-mm-dd"),E$2)&"' order by C desc limit 3 label C ''")))
Alternatively, for the first part of your question (the latest 'best' results), in addition to the solution provided by Mogsdad, this should also work:
=ArrayFormula(iferror(vlookup(unique(A2:A), sort(A2:C, 2, 0, 3, 0), {1,3}, 0)))
EDIT: This formula sorts the table with dates (col B) descending and col C descending and then (ab)uses the fact that vlookup only returns the first match to return the first and last column.
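The "sort, then keep the first match" trick can be illustrated with a short Python sketch on the question's sample data. The names are hypothetical; `setdefault` stands in for VLOOKUP returning only the first matching row.

```python
from datetime import date

RESULTS = [
    ("Alan Smith", date(2016, 1, 14), 500), ("Bob Dow", date(2016, 1, 14), 450),
    ("Bob Dow", date(2016, 1, 16), 470), ("Clare Allie", date(2016, 1, 16), 550),
    ("Declan Ham", date(2016, 1, 16), 350), ("Alan Smith", date(2016, 2, 10), 490),
    ("Bob Dow", date(2016, 2, 10), 425), ("Declan Ham", date(2016, 2, 12), 400),
    ("Declan Ham", date(2016, 2, 12), 390), ("Clare Allie", date(2016, 2, 12), 560),
]


def latest_best_per_person(results):
    """Sort by date desc, then score desc, and keep the first row per name."""
    ordered = sorted(results, key=lambda row: (row[1], row[2]), reverse=True)
    first_match = {}
    for name, _, score in ordered:
        first_match.setdefault(name, score)  # later (older/lower) rows are ignored
    return first_match


print(sorted(latest_best_per_person(RESULTS).items()))
# [('Alan Smith', 490), ('Bob Dow', 425), ('Clare Allie', 560), ('Declan Ham', 400)]
```

Note that this picks each person's best score on their own most recent date, so someone who skipped the latest month would still appear with an older result.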

correlate a demographic column with answers from multiple columns

I have a spreadsheet from a Google Consumer Survey. The survey captured demographics as well as the responses to a question. Acceptable responses could have chosen zero or more 'answers'. The response for each answer is in a unique column. For example,
user id | gender | age | income | answer 1 | answer 2 | answer 3 |
0001 | Female | 20-30 | 50-75 | [empty] | Right | Never |
0002 | Male | 20-30 | 30-50 | Up | Left | [empty] |
I would like to know how to correlate a column of demographic info with each of the possible answers. For example, I want to be able to answer questions like, Were males more likely than females to choose X for answer 1? and Which age group was more likely to choose Y for answer 2?
I prefer an answer using Google Sheets functions, but I am open to learning other ways to understand the data. Thank you for any help!
A good way is to use the QUERY function. Let's first assume your data is stored in the range A:G:
A | B | C | D | E | F | G |
user id | gender | age | income | answer 1 | answer 2 | answer 3 |
0001 | Female |20-30| 50-75 | [empty] | Right | Never |
0002 | Male |20-30| 30-50 | Up | Left | [empty] |
you may write simple query functions.
For example, to count all answer 1, group them by gender and age, pivot by answer 1:
=query(A:G,"select B, C, count(D) where not A is null group by B, C pivot E")
where not A is null -- prevents empty rows from being used in the query
count(D) -- can count any column that isn't already used elsewhere in the query
group by B, C -- must contain all selected items except aggregates (count, sum, etc.)
pivot E -- makes each answer show up in a separate column.
The result will look like this:
Left Never Right Up
Female 20-30 1 1 1
Female 30-40 1
Male 20-30 1 1 1
Male 30-40 1
See the complete Query Language Reference to learn more.
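If it helps to see the group-and-pivot counting spelled out, here is a minimal Python sketch using the two sample rows from the question. The function name `pivot_counts` is hypothetical, and skipping empty answers is a simplification of this sketch, not necessarily what QUERY does with blank cells.

```python
from collections import defaultdict


def pivot_counts(rows, group_keys, pivot_key):
    """Count rows per group, with one inner entry per distinct pivot value."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        answer = row[pivot_key]
        if answer:  # simplification: ignore empty answers
            group = tuple(row[key] for key in group_keys)
            table[group][answer] += 1
    return {group: dict(counts) for group, counts in table.items()}


rows = [
    {"gender": "Female", "age": "20-30", "answer 1": "", "answer 2": "Right", "answer 3": "Never"},
    {"gender": "Male", "age": "20-30", "answer 1": "Up", "answer 2": "Left", "answer 3": ""},
]

print(pivot_counts(rows, ["gender", "age"], "answer 2"))
# {('Female', '20-30'): {'Right': 1}, ('Male', '20-30'): {'Left': 1}}
```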
Have you tried using the Pivot Table function of Google Sheets?
Download the data in Excel format after the survey is complete and open it with Google Sheets.
Select the tab with the resulting data from the Google Consumer Survey after it is run.
From the menu, select Data -> Pivot Table. This opens a new tab in your spreadsheet.
For the Values area of the pivot table, select User ID and from the "Summarize by" dropdown, select COUNTUNIQUE.
For the columns and rows, select whichever dimensions you are interested in. For instance, in your example, you would pick
"Gender" and "Answer 1" as a row and column.
"Age" and "Answer 2" as a row and column.
This should answer these kinds of questions easily.
Hope this helps!
What I think I needed was the COUNTIFS function (in Google Sheets). Notice the plural use, which is different than countif (singular).
COUNTIFS allowed me to specify multiple criteria to make a score for each demographic segment. For example, I could count all the Males that responded Up in the answer 1 column.

Sort pie chart's slices without sorting the data in the columns

I haven't found something similar to that on the web, and it seems that the only way for the data to be sorted on the pie chart is if they're pre-sorted.
The problem is that I have a sheet populated by a third party software (Typeform) that places random data, which I then aggregate to present to the pie chart.
More specifically, Typeform writes
town | salary | cost
London | 1000 | 500
Bristol | 700 | 300
London | 900 | 400
Leeds | 600 | 200
Leeds | 500 | 300
Leeds | 400 | 200
Then I aggregate the data in another sheet (Sheet2) so that I have
town | occurrences
London 2
Bristol 1
Leeds 3
Obviously the pie chart will draw London first, then Bristol, and then Leeds. These are only 3 entries, however in my example, I have 20, and the data in the pie chart are not ordered.
Sheet2's data cannot be sorted descending, since I am using =UNIQUE(Sheet1!A2:A) and then, in the column next to it, =countif(Sheet1!A:A,A2) to populate them from Sheet1, where the third-party software writes them. In fact, when I select them and click sort they don't get sorted; they reappear as they were.
Is there any way to sort them (and keep them sorted) in Sheet2, or by writing them in a new sheet?
If town is in A1 of Sheet1 please try:
=query(Sheet1!A2:D7, "select A, count(C) group by A")
Assuming Typeform data is in Sheet1!A:C, the following function in Sheet2!A1 should do the trick:
=QUERY(Sheet1!A:C,"select A, count(B) where A is not null group by A order by count(B) desc",1)
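The count-per-town, largest-first ordering that the pie chart needs can be checked with a few lines of Python on the sample Typeform rows. The function name `town_counts_desc` is hypothetical.

```python
from collections import Counter


def town_counts_desc(towns):
    """Occurrences per town, ordered from most to least frequent."""
    return Counter(towns).most_common()


# The "town" column from the sample Typeform data in the question
towns = ["London", "Bristol", "London", "Leeds", "Leeds", "Leeds"]

print(town_counts_desc(towns))
# [('Leeds', 3), ('London', 2), ('Bristol', 1)]
```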
Another way to do it is to simply apply a Filter to your data, then sort from A-Z (for lowest-highest percentage) or Z-A (for highest-lowest percentage).
Then you create your pie chart from that and it comes out sorted!
