Finding the average value - mean

I'm trying to create a data frame (df1) with n columns (3 in this case). Column 1 should be a random column from data frame df0. Column 2 should be the average of that same random column plus four other random columns from df0. Column 3 should be the average of the earlier five plus another five random columns.

I'll answer your questions one at a time, starting with the first.
total <- 15    # Total number of columns in df0
n_sample <- 10 # Number of columns to extract from df0 (renamed so it does not shadow sample())
values <- 4    # Number of rows
df0 <- data.frame(matrix(data = rexp(values * total, rate = total), nrow = values, ncol = total))
# First, select 10 random columns from df0 (sample() draws without replacement by default)
df1 <- df0[, sample(ncol(df0), n_sample)]
# Create an empty data frame for the output
df2 <- data.frame(matrix(NA, nrow = values, ncol = 3))
# Assign the first column of df1 to the output
df2$X1 <- df1[, 1]
# The second column is the row-wise average of the first five randomly selected columns
df2$X2 <- rowMeans(df1[, 1:5])
# The third column is the row-wise average of all 10 selected columns
df2$X3 <- rowMeans(df1[, 1:10])
> df2
# X1 X2 X3
#1 0.18816542 0.12617238 0.08728368
#2 0.09855574 0.07592763 0.06069351
#3 0.12022571 0.06045562 0.07964574
#4 0.00260806 0.06172300 0.06225859
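The same idea can be sketched outside R. The following is a minimal Python version using only the standard library; the function name nested_column_means, the seeds, and the 4 x 15 sample table are assumptions made for illustration, not part of the original answer. It picks 10 distinct columns once, then reuses the first one and the first five of them, so the three output columns are nested as the question asks.

```python
import random
from statistics import mean


def nested_column_means(rows, n_pick=10, seed=None):
    """Pick n_pick distinct columns at random and return, per row,
    (first picked value, mean of first 5 picked, mean of all picked)."""
    rng = random.Random(seed)
    n_cols = len(rows[0])
    picked = rng.sample(range(n_cols), n_pick)  # column indices, no repeats
    result = []
    for row in rows:
        chosen = [row[j] for j in picked]
        result.append((chosen[0], mean(chosen[:5]), mean(chosen)))
    return picked, result


# Hypothetical 4 x 15 table of exponential draws, mirroring df0 above
data_rng = random.Random(0)
table = [[data_rng.expovariate(15) for _ in range(15)] for _ in range(4)]
picked, summary = nested_column_means(table, seed=1)
```

Sampling the column indices once and slicing that single selection is what keeps the three columns nested; sampling three separate times would not.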
To remove all unwanted columns, I personally use something like the following,
though I'm sure there are other ways to do it:
# for example, if you only want to keep columns X3 and X5
col_list <- c("X3", "X5")
dfm <- df0[, col_list]

Related

Sheets: Count Row data Based on Key

I'm trying to find a row based on a value, and count the number of entries (max 5) in that row.
   A     B  C  D  E  F  G
1  John  2  2  5  4
2  Mary  1  3  1  6  7
=COUNTA(B1:GZ) would work if I knew the row for "John",
but I'm trying to get the row based on a value from another cell:
    Z
10  John
In this case, with the key in Z10, the pseudo-formula would be: =COUNTA(Find Z10 in Col A, then count entries in Row starting at B to Z)
Any ideas? Thanks.
Recommendation:
You can try this formula:
=COUNTA(QUERY(A1:Z, "Select * where A = '"&Z10&"'"))-1
Sample sheet:
"John" was entered in cell Z10.
The recommended formula was entered in cell Z11.
Result: 4, which is the COUNTA of the entries in columns B:Z for the row whose column A contains "John". The -1 excludes the name in column A itself.
NOTE:
You may need to turn on Iterative calculation on your Spreadsheet settings.
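For readers who want to check the logic outside a spreadsheet, here is a minimal Python sketch of the same lookup-then-count step. The function name count_entries_for_key and the list-of-rows layout are assumptions made for illustration; empty cells are represented as empty strings.

```python
def count_entries_for_key(rows, key):
    """Count the non-empty cells after the first cell in the row
    whose first cell equals key. Returns 0 if the key is not found."""
    for row in rows:
        if row and row[0] == key:
            return sum(1 for cell in row[1:] if cell not in ("", None))
    return 0


# The sample data from the question (columns A to G)
sheet = [
    ["John", 2, 2, 5, 4, "", ""],
    ["Mary", 1, 3, 1, 6, 7, ""],
]
john_count = count_entries_for_key(sheet, "John")
mary_count = count_entries_for_key(sheet, "Mary")
```

Skipping the first cell of the matched row plays the same role as the -1 in the formula: the key itself is not counted as an entry.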

Create single formula for several different lines (array) (Google Sheets)

In the AD column I have this sequence of values:
2
3
4
These values refer to rows in a column on another page.
In each row of column AE I use this formula:
=IF(AD1="","",IFERROR(SUM(FILTER(INDIRECT("'Registro Geral'!O2:O"&AD1)/100,REGEXMATCH(INDIRECT("'Registro Geral'!H2:H"&AD1),SUBSTITUTE(SUBSTITUTE(JOIN("|",$V$1:$V$4),"||",""),"|||",""))=TRUE))))
=IF(AD2="","",IFERROR(SUM(FILTER(INDIRECT("'Registro Geral'!O2:O"&AD2)/100,REGEXMATCH(INDIRECT("'Registro Geral'!H2:H"&AD2),SUBSTITUTE(SUBSTITUTE(JOIN("|",$V$1:$V$4),"||",""),"|||",""))=TRUE))))
=IF(AD3="","",IFERROR(SUM(FILTER(INDIRECT("'Registro Geral'!O2:O"&AD3)/100,REGEXMATCH(INDIRECT("'Registro Geral'!H2:H"&AD3),SUBSTITUTE(SUBSTITUTE(JOIN("|",$V$1:$V$4),"||",""),"|||",""))=TRUE))))
In short, this formula computes a running sum of values in the other sheet, based on whether or not the corresponding cell in another column of that sheet appears in a set of values.
When I try to add ARRAYFORMULA so that I don't need a formula on each line, leaving one only in AE1, every line returns exactly the same value.
Failed test formula:
=ARRAYFORMULA(IF(AD1:AD="","",IFERROR(SUM(FILTER(INDIRECT("'Registro Geral'!O2:O"&AD1:AD)/100,REGEXMATCH(INDIRECT("'Registro Geral'!H2:H"&AD1:AD),SUBSTITUTE(SUBSTITUTE(JOIN("|",$V$1:$V$4),"||",""),"|||",""))=TRUE)))))
Link to spreadsheet example:
https://docs.google.com/spreadsheets/d/1qIv6KnLv-EwJQXRrk7ucuqY-XuJhkIHOCtih9FpAg6U/edit?usp=sharing
You're trying to do a running summation on O based on whether the corresponding value in the H column appears in the Filtered values.
We can do this with a matrix multiplication using a lower-triangular matrix and the listed values, selecting which ones to zero out based on certain conditions using IF.
=ArrayFormula(MMULT(
  N(SEQUENCE(D2)>=SEQUENCE(1,D2)),
  ARRAY_CONSTRAIN(
    IF(
      ('Registro Geral'!O2:O<>"")*
      IFNA(MATCH('Registro Geral'!H2:H,V:V,0)),
      'Registro Geral'!O2:O
    )/100,
    D2,
    1
  )
))
Why this works
The lower-triangular matrix looks like
1 0 0 0 0 ... up to N columns
1 1 0 0 0
1 1 1 0 0
1 1 1 1 0
1 1 1 1 1
... up to N rows
The Column you want to sum looks like
Value 1
Value 2
...
Value N
So when you multiply the two, you get a new matrix of dimension N x 1:
Value 1
Value 1 + Value 2
...
Value 1 + ... + Value N
If we don't want to sum a value, then we can zero it out with a conditional so that it never gets added.
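The explanation above can be checked with a small Python sketch. The function name running_sum_via_triangular and the keep flags are assumptions made for illustration; the flags stand in for the IF condition that decides whether a value is summed or zeroed out.

```python
def running_sum_via_triangular(values, keep):
    """Running sum computed as (lower-triangular matrix of ones) x (masked column)."""
    n = len(values)
    # Zero out the values that should not contribute to the sum
    masked = [v if k else 0 for v, k in zip(values, keep)]
    # lower[row][col] is 1 on and below the diagonal, 0 above it
    lower = [[1 if col <= row else 0 for col in range(n)] for row in range(n)]
    # Matrix-vector product: entry r is the sum of masked[0..r]
    return [sum(lower[r][c] * masked[c] for c in range(n)) for r in range(n)]


totals = running_sum_via_triangular([10, 20, 30, 40], [True, False, True, True])
# totals == [10, 10, 40, 80]
```

The second value (20) is masked out, so the running total stays at 10 on that row and continues from there.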

running totals automatically inserted reset every 9 rows

I have a Google Sheet that 4 users use to log hours and other info. The columns are totalised and displayed at the top.
What I would like to do is, in one column (say L), number the rows 1 - 9 every nine rows, starting at row 6 (row number to be defined), and in another column (say M), insert the running total over those nine rows of the value in column I.
The reason for the running total is that it needs to be transferred in paper-based logbooks which have 9 lines per page.
Hope that makes sense?
How would you go about doing this and is it an equation or a macro I would need?
In the screenshot I would like to automatically add columns L and M. M is based on the value in I
L and M are the values under Page totals and I is the flight time.
test sheet
update 1: I have solved the counting with the following in L7 and then dragged down
=if(L6+1 < 10, L6+1,1)
Update 2: nearly working using the below in M7 and dragging it down, however, when column I is blank it still adds a number instead of leaving the cell blank!
=if(L7=1,
I7,
if(ISBLANK(I7), " ", M6+I7)
)
Formulae:
For column L, you can use an ARRAYFORMULA with the MOD operator, paste the following into L6:
=ARRAYFORMULA(MOD((ROW(L6:L) + 3),9) + 1)
For column M, test I6 = 0 instead of ISBLANK(I6) (a blank cell compares equal to 0). Paste this in M6 and drag down:
=IF(L6 = 1, I6, if(I6 = 0, " ", M5 + I6))
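As a cross-check of the two formulas, here is a minimal Python sketch of the same bookkeeping. The function name page_counters_and_totals is an assumption made for illustration, and blank cells are represented as None. Unlike the sheet formula, which leaves the cell blank, this sketch carries the previous total forward on a blank row.

```python
def page_counters_and_totals(flight_times, page_size=9):
    """Return (counters, totals): a 1..page_size line counter per row and a
    running total that restarts whenever the counter returns to 1."""
    counters, totals = [], []
    running = 0
    for i, t in enumerate(flight_times):
        position = i % page_size + 1  # same cycle as MOD(ROW() + 3, 9) + 1 from row 6
        if position == 1:
            running = 0               # new logbook page: restart the total
        running += t or 0             # treat a blank (None) as zero
        counters.append(position)
        totals.append(running)
    return counters, totals


counters, totals = page_counters_and_totals([1] * 10)
# counters == [1, 2, 3, 4, 5, 6, 7, 8, 9, 1]
# totals   == [1, 2, 3, 4, 5, 6, 7, 8, 9, 1]
```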

How to get the sum of a column up to a certain value?

I have a google sheet that I am using to try and calculate leveling and experience points. Column A has the level and Column B has the exp needed to reach the next level. i.e. To get to Level 3 you need 600 exp.
A B
1 200
2 400
3 600
...
99 19800
In cell I2 I have an integer amount of exp (e.g. 2000); in cell J2 I want to figure out what level someone would be at if they started from 0.
Put this in column J and drag down as required. ROUNDDOWN(I2,-2) rounds I2 down to the nearest 100. INDEX/MATCH then finds an exact match in column B and returns the value in column A of the matched row.
=index(A2:A100,match(ROUNDDOWN(I2,-2),B2:B100,0))
Using a helper column (for example Z): put =sum(B$1:B1) in cell Z1 and drag down. This computes the cumulative exp required for each level. In J2, use the formula
=vlookup(I2, {Z:Z, A:A}, 2) + 1
which looks up I2 in column Z and returns the level from column A for the nearest match that is less than or equal to the search key. It adds 1 to find the level that would be reached, because your table is offset by one: the entry against level N is about achieving level N+1.
You may want to put 0 0 on top of the table, to correctly handle the amounts under 200. Or treat them with a separate if condition.
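The helper-column approach amounts to a sorted lookup on cumulative sums. A minimal Python sketch, assuming the table from the question (200 exp to leave level 1, 400 to leave level 2, and so on); the function name level_for_exp is an assumption made for illustration:

```python
from bisect import bisect_right
from itertools import accumulate


def level_for_exp(exp, exp_to_next):
    """Level reached with a total of exp points, starting at level 1.
    exp_to_next[i] is the exp needed to go from level i+1 to level i+2."""
    # Cumulative totals: exp needed to reach level 2, level 3, ...
    thresholds = list(accumulate(exp_to_next))
    # Number of thresholds <= exp is the number of level-ups achieved
    return bisect_right(thresholds, exp) + 1


exp_to_next = [200 * n for n in range(1, 100)]  # column B: 200, 400, ..., 19800
level = level_for_exp(2000, exp_to_next)
# level == 5, since 200 + 400 + 600 + 800 = 2000
```

Because bisect_right returns 0 when the amount is below the first threshold, amounts under 200 fall out as level 1 without a separate condition.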
Using algebra
In your specific scenario, the point amount required for level N can be computed as
200*(1+2+3+...+(N-1)) = 200*(N-1)*N/2 = 100*(N-1/2)^2 - 25
So, given x amount of points, we can find N directly with algebra:
N = floor(sqrt((x+25)/100)+1/2)
which means that the formula
=floor(sqrt((I2 + 25) / 100) + 1/2)
will have the desired effect in cell J2, without the need for an extra column and vlookup.
However, the second approach only works for these specific point values.
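The closed form can be sanity-checked against a direct count. This is a small Python sketch under the same assumption as above (leaving level N costs 200*N exp); both function names are made up for illustration.

```python
from math import floor, sqrt


def level_closed_form(x):
    """Level from the algebraic formula N = floor(sqrt((x + 25) / 100) + 1/2)."""
    return floor(sqrt((x + 25) / 100) + 0.5)


def level_by_counting(x):
    """Level found by spending exp one level at a time (200 * level per step)."""
    level, spent = 1, 0
    while spent + 200 * level <= x:
        spent += 200 * level
        level += 1
    return level


# Compare the two over a range of exp amounts
mismatches = [x for x in range(0, 20001) if level_closed_form(x) != level_by_counting(x)]
```

An empty mismatches list indicates the two methods agree over the tested range; a table with different per-level costs would need the lookup approach instead.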

Formula for summation on amount of equal column values

Given a spreadsheet with two columns, say A and B, each containing n values under it, all text; is there a formula that allows me to fill just one cell containing the amount of equal values in columns A and B?
Example:
A B
-----
1 M M
2 L M
3 L L
4 M M
5 M L
-----
3
Since columns A and B both contain an M in rows 1 and 4, and an L in row 3, the result is 3 (i.e. 2+1).
A simple solution is to use the QUERY function in Google Sheets:
=SUM(QUERY(A1:B5, "Select 1 where A = B"))
Or using ARRAYFORMULA with SUM, restricted to the data rows so that pairs of empty cells are not counted as equal:
=ARRAYFORMULA(SUM(((A1:A5)=(B1:B5)) * (1) ))
Another possible solution is to add the following formula in column C: =N(EXACT(A1,B1)),
copy it down the column to the last row, and then sum the column C values using =SUM(C1:C5).
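The row-by-row comparison used by these formulas can be sketched in a few lines of Python. The function name count_equal_pairs is an assumption made for illustration; like EXACT, the == comparison on strings is case-sensitive.

```python
def count_equal_pairs(col_a, col_b):
    """Count the rows where the two columns hold the same value."""
    return sum(1 for a, b in zip(col_a, col_b) if a == b)


# The example from the question
col_a = ["M", "L", "L", "M", "M"]
col_b = ["M", "M", "L", "M", "L"]
matches = count_equal_pairs(col_a, col_b)
# matches == 3 (rows 1, 3 and 4)
```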
Here we go:
=IF(EQ(LEFT(A0, 1), "A"),
SUM(ARRAYFORMULA(N(EXACT(TRANSPOSE(A1:A5), TRANSPOSE(B1:B5))))),
"")
Reading: if the value in row 0 (it doesn't exist, but my example above does ;) ) starts with the text "A", take the sum of an array N; otherwise put in an empty string ("").
Array N is built by taking the transpose of columns A and B (turning them so they look like a row) and comparing the values. (Burnash gave me the options "N" and "EXACT".) The function N transforms each comparison result into a 1 or 0.
Copy-paste the formula across an entire row and what do you know... it worked! That was hellish for something so trivial.
Thanks anyway.
