quick question because I'm getting crazy on this issue.
I have a table where I save a lot of analytics data, and I have to put this data on a table.
Every record has this structure:
id: 1,
user_id: 4,
store_id: 490,
company_id: 1,
completed_courses_percentage: 87
At this moment I'm grouping the user by company and then I use the average method to create a table with the average completed course percentage grouped by company.
UserActivity.group(:company_id).average(:completed_courses_percentage)
If I have three person for a company and the averages are 0, 50 and 100 the total average now is 50%.
By the way I need to change the way I calculate this average: I must calculate the average of people that belong to a company with a completed_courses_percentage > 70.
I try to use
UserActivity.group(:company_id).having("completed_courses_percentage > 70")
but the result is 0.
You can try this.
UserActivity.where("completed_courses_percentage > 70").group(:company_id).average(:completed_courses_percentage)
P.S: Not tried this.
Related
I would like to get an average percentage out of my sample, however, I need to use several conditions. I tried to use the AVERAGE and AVERAGEIF together with FILTER but everything returns an error and I think I'm incorrectly "merging" formulas.
You can find my test sheet here.
The rules I need to apply:
The score for individual rows is possible to find in the "Data" sheet in cell N and the total results should be visible in the sheet "Calculation" cell E.
As the sample is huge in real life, I need to filter out several pieces of information and add conditions:
to filter out all items where the code/ID starts with 0: Data!A:A&"", "^0.+"
to filter out all items that are matching the date in the Calculation sheet: Data!C:C=$B3
to filter all items with the specific name: Data!B:B=$A3
Any idea how to get the average % out of items with specific filters?
UPDATE
Expected results: I want to see the total average for a specific date, name, and ID, and let's say I would use these filters, then I would see only the final average percentage.
Test =100%
Test = 0%
Test = 100%
Total Average %: 66.7%
Also, I think the best way would be to use AVERAGEIFS, but I'm getting the error "Array arguments to AVERAGEIFS are of different size".
=AVERAGEIFS(Data!N:N,Data!B:B=$A3,Data!C:C=$B3,Data!A:A&"", "^0.+")
=IFERROR(AVERAGEIFS(Data!N3:N,Data!B3:B,A3,Data!C3:C,B3,ARRAYFORMULA(if(LEN(Data!A3:A),REGEXMATCH(Data!A3:A,"^0.+"),"")),TRUE),"")
or
=IFERROR(AVERAGE(FILTER(Data!N3:N,Data!B3:B=A3,Data!C3:C=B3,REGEXMATCH(Data!A3:A,"^0.+"))),"")
or
=IFERROR(INDEX(QUERY({Data!A3:C,Data!N3:N},"select avg(Col4) where Col1 starts with '0' and Col2 = '"&A3&"' and Col3 = '"&B3&"'"),2,0),"")
Ok so the need - I have about 3700 lines of email addresses, names, schools, and professions(those are column headers) I want to split this sheet into 4 with 1000 lines(I understand one will be short) in each but here is the catch I can only have 25 lines/emails from each school. So how would someone go about doing this? Keep in mind each sheet needs to have its own unique emails not repeated on the other sheets.
There are 2 problems here and as I don't know how many schools are on the list and if it's possible to have always less than 25 people from one school (for example - if there are only 30 schools, it would be impossible to distribute them in 1000 row batches).
First task:
Distribute database into 4 sheets, 1000 rows each:
It's simple.
Let's say my data has 4 columns from A to D
I make sheets named 1-1000, 1001-2000, etc.
In each one I put a formula:
1)
=query(Master!A1:D,"select * limit 1000 offset 0")
=query(Master!A1:D,"select * limit 1000 offset 1000")
=query(Master!A1:D,"select * limit 1000 offset 2000")
=query(Master!A1:D,"select * limit 1000 offset 3000")
Etc.
In order to limit number of occurences of each schools, I have to count these occurences and define what is the minimal page number on which this student can be displayed (for example - 17th student from certain school can be on 1st page, but 27th can be at least on 2nd page. 60th student can be on third or further.
When I determine minimal page number, I can sort my data accordingly and display sorted by minimal number:
In this situation my query on next pages have additional parameters:
=query(Master!A1:G,"select A,B,C,D order by G limit 1000 offset 0")
I use column G for sorting, but I don't display it.
You can find my solution here:
https://docs.google.com/spreadsheets/d/1TP6MlMmLiUExOELFhgZnti7LR7VQouMg3h-X7QRcHzQ/copy
Names are generated randomly from polish names generator.
I have a scoring spreadsheet for a competition I'm working on. Competitors' place/rank are converted into points towards the overall series based on a chart of corresponding values. For ties, the sum of the points covered by all of the tied places are split evenly among the tied competitors (i.e. 2-way tie for 3rd; if 3rd usually gets 10 points and 4th usually gets 8, these competitors will receive (10+8)/2 (2 being the # of tied competitors), so they each receive 9 points).
I have a formula which does this exact calculation:
=IFERROR(IF(ISBLANK($A4:$A),,SUM(INDEX(SeriesPoints, E4:E):INDEX(SeriesPoints, MIN(E4:E + COUNTIF(E$4:E, E4:E) - 1, ROWS(SeriesPoints)))) / COUNTIF(E$4:E, E4:E), 0))
Where 'SeriesPoints' is a 2 column array; column 1 is the places/ranks (1:125) and column 2 is their corresponding point values. Column 'E' is the competitors' rank from the competition.
I have been unable to convert this formula to an ARRAYFORMULA() so I can avoid dragging it down the entire sheet (possibly up to 1000+ competitors over the series).
I'm mildly proficient with MMULT(), so I understood that would be a good approach for switching out SUM(), however, I haven't been able to create a matrix of the values to be summed.
INDEX():INDEX() doesn't work with ARRAYFORMULA() so I've tried switching to VLOOKUP(). With VLOOKUP() I've been able to produce the start and end values of the range of values for a tie, but not the full list. For example, if there is a 3-way tie for 4th, I can produce the respective points for 4th and 6th (the bounds of the tie).
In an attempt to list out even just the numbers from 4:6, I've hit a wall converting what would be a simple ROW() or SEQUENCE() formula to a matrix/array.
The following formula produces an array of the upper and lower bounds of ties or the single place should there be no tie, although the single place gets repeated.
=ARRAYFORMULA(IF(COUNTIF(E$4:E,E4:E)=1,E4:E,{E4:E,E4:E+COUNTIF(E$4:E,E4:E)-1}))
I'm assuming if I can get VLOOKUP({#:#}) to fill properly, I'll be where I need to be.
From here, I feel confident in my abilities to wrap a VLOOKUP() for the actual point values, an MMULT() to sum across these rows for the total, then a simple division to produce the correct point value.
Spreadsheet: https://docs.google.com/spreadsheets/d/1lpNewR3p4i7ZHmlFGLlG1tLuxgO-6onSeH8mWTeclBw/edit?usp=sharing
Currently, my workspace is off to the right. The original formula is in F4 and my test codes are working on column G instead of E.
So for sample placements of 1,1,3,3,3,6,7,8 and sample points values of 1000, 850,738,663,633,603,573,550 I expect the output to be 925 for the two 1st place tied competitors, 678 for the tied 3rd places, 603 for 6th, 573 for 7th, and 550 for 8th.
I'd appreciate any and all help!
=ARRAYFORMULA(IFERROR(IFERROR(VLOOKUP(G4:G, QUERY({INDIRECT("G4:G"&counta(A4:A)+3),
VLOOKUP(ROW(INDIRECT("A1:A"&COUNTA(A4:A))), SeriesPoints, 2, 0)},
"select Col1,sum(Col2) group by Col1 label sum(Col2)''", 0), 2, 0))/
IFERROR(VLOOKUP(G4:G, QUERY(G4:G,
"select G,count(G) where G is not NULL group by G label count(G)''", 0), 2, 0))))
Simply put I am trying to take a single column query result and output it into a 5 wide by × long table. This is how the main table is organized.
On separate tabs, I want to list all of the caught and seen Pokemon on their own for easy search. While I can get it to output something like this with
=query(NatDex, "Select C Where F <> ''",1)
I would like it to output the data something like this for easy reading so it's not eventually 100+ entries long:
Bonus points if you can give me formula/something to do it where I can vary how wide the second table is. But this is far less important to me. I've tried looking up stuff like Pivot tables or Transpose, but neither of them seems to have the functions I need to pull this off.
if you put your query output in some auxiliary column, you can use this formula and drag down:
=ARRAY_CONSTRAIN(TRANSPOSE(INDIRECT("A"&6+(ROW()-ROW($A$2))*5&":A")), 1, 5)
for 6 columns:
=ARRAY_CONSTRAIN(TRANSPOSE(INDIRECT("A"&7+(ROW()-ROW($A$2))*6&":A")), 1, 6)
for 3 columns:
=ARRAY_CONSTRAIN(TRANSPOSE(INDIRECT("A"&4+(ROW()-ROW($A$2))*3&":A")), 1, 3)
for 5 columns but starting on 10th row:
=ARRAY_CONSTRAIN(TRANSPOSE(INDIRECT("A"&6+(ROW()-ROW($A$2)-9)*5&":A")), 1, 5)
etc.
I am trying to rank the data in one column in my google sheet so that there are no duplicate rankings. I've seen some solutions such as =RANK(A2,$A$2:$A$10)+COUNTIF($A$2:A2,A2)-1, but the problem is that it increments the duplicates based on occurrence in the sheet.
Let's say my data that I'd like ranked is as follows:
1
1
1
2
The rank order would be 2, 3, 4, 1. The problem is, if I change the second entry to 2 (so that my data is now 1, 2, 1, 2) the ranking order becomes 3, 1, 4, 2 instead of 3, 2, 4, 1 like I want. In the original data, the fourth entry was initially the highest and I'd like it to still have the higher rank, but since the formula counts occurrences it gets demoted. Any way to accomplish this?
No, not with native spreadsheet functions. Spreadsheet formulae have no "awareness" of which values were entered most recently.
You would need to resort to Google Apps Script run on an "on edit" trigger.