Duplicates in Google Sheets - google-sheets

I am using Google Sheets and trying to concatenate column A values into column C whenever column B contains a duplicate:
Sample data:
Column A Column B Column C
1 1247 Santa Fe 1250/1150
2 1250 Santa Fe 1247/1150
3 1258 North Shore 1354
4 1341 Hogan 1255
5 1255 Hogan 1341
6 1354 North Shore 1258
7 1150 Santa Fe 1247/1250
Here, column C needs to hold the column A values of the other rows that share the same column B value, joined with "/".

C1:
=JOIN("/",FILTER($A$1:$A$7,$B$1:$B$7=B1,ROW($B$1:$B$7)<>ROW(B1)))
Drag fill down.
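For readers who want to check the logic outside the spreadsheet, here is a minimal Python sketch of the same idea: for each row, collect the column A values of the other rows with the same column B value. The function name `join_duplicates` is made up for illustration; the data is the sample from the question.

```python
from collections import defaultdict

def join_duplicates(rows, sep="/"):
    """For each (a, b) row, join the a values of the OTHER rows sharing b."""
    by_b = defaultdict(list)
    for index, (a, b) in enumerate(rows):
        by_b[b].append((index, a))

    result = []
    for index, (_a, b) in enumerate(rows):
        # Same role as ROW(...)<>ROW(B1): skip the current row itself.
        others = [str(other_a) for other_index, other_a in by_b[b]
                  if other_index != index]
        result.append(sep.join(others))
    return result

rows = [
    (1247, "Santa Fe"),
    (1250, "Santa Fe"),
    (1258, "North Shore"),
    (1341, "Hogan"),
    (1255, "Hogan"),
    (1354, "North Shore"),
    (1150, "Santa Fe"),
]
column_c = join_duplicates(rows)
print(column_c)
```

A row whose column B value has no duplicate gets an empty string in this sketch.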

Related

Google Sheets: Convert Horizontal Transaction Data into Chronological Statement + Combining Columns of Data

On a sheet named, "Performance," I have data concerning stock trades in a row like so:
A B C D E F G H I J
1 TICKER TRADE OPEN DATE TRADE CLOSED DATE SHARES AVG BUY INVESTMENT AVG SALE PROCEEDS PROFIT/LOSS ROIC:
2 ABC 01/05/22 03/31/22 107 $14.22 -$1,521.54 $15.00 $1,605.00 $83.46 5.49%
3 BCA 01/05/22 03/31/22 344 $14.52 -$4,994.88 $15.00 $5,160.00 $165.12 3.31%
4 CAB 01/05/22 03/31/22 526 $12.55 -$6,601.30 $13.00 $6,838.00 $236.70 3.59%
... and so forth ...
Within the same workbook but on a separate sheet named, "Contributions/Withdrawals," I have a list of contributions and withdrawals like so:
A B
1 DATE AMOUNT
2 01/05/22 $700.00
3 02/05/22 $700.00
4 03/05/22 $400.00
5 03/15/22 -$7,000.00
... and so forth ...
I need to convert the first table of trade transactions into a vertical column format exactly like what is in the Contributions/Withdrawals table. (Note that each trade transaction actually represents two transactions, one for opening with its own date, and one for closing with its date.) Finally, I need to stack both tables of transactions in date order to make a combined chronological list of transactions so that I can run an XIRR formula on it.
The resulting table on a sheet named, "Cash Flows," needs to look like this:
A B
1 DATE AMOUNT
2 01/05/22 -$1,521.54
3 01/05/22 -$4,994.88
4 01/05/22 -$6,601.30
5 01/05/22 $700.00
6 02/05/22 $700.00
7 03/05/22 $700.00
8 03/10/22 $400.00
9 03/15/22 -$7000.00
10 03/31/22 $1,605.00
11 03/31/22 $5,160.00
12 03/31/22 $6,838.00
Using the following in cell A2 and B2...
A2 =SORT({Performance!$B$2:$B;Performance!$C$2:$C;'Contributions/Withdrawals'!$A$2:$A})
B2 =SORT({Performance!$F$2:$F;Performance!$H$2:$H;'Contributions/Withdrawals'!$B$2:$B})
...almost gets me there, but the transactions are not lining up with the correct dates. Google Sheets is ordering the amounts from smallest to largest. What I end up with is this:
A B
1 DATE AMOUNT
2 01/05/22 -$7,000.00
3 01/05/22 -$6,602.72
4 01/05/22 -$6,602.39
5 01/05/22 -$6,601.30
6 01/05/22 -$6,596.40
7 01/05/22 -$6,587.10
8 01/05/22 -$4,994.88
9 01/05/22 -$3,315.26
10 01/05/22 -$3,284.91
11 01/05/22 -$1,521.54
12 02/05/22 $400.00
13 03/05/22 $700.00
14 03/10/22 $700.00
15 03/15/22 $700.00
16 03/31/22 $1,605.00
17 03/31/22 $3,249.00
18 03/31/22 $3,731.00
19 03/31/22 $5,160.00
20 03/31/22 $6,348.00
21 03/31/22 $6,532.00
22 03/31/22 $6,786.00
23 03/31/22 $6,838.00
Any help would be appreciated. Thanks!
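You are very close indeed! Sorting the dates and the amounts in two separate formulas orders each column independently. Instead, join both ranges into a single two-column array so that each amount stays with its date when the array is sorted by the first column: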
=SORT({Performance!$B$2:$B;Performance!$C$2:$C;'Contributions/Withdrawals'!$A$2:$A,Performance!$F$2:$F;Performance!$H$2:$H;'Contributions/Withdrawals'!$B$2:$B})
(If your locale uses a comma as the decimal separator, you may need to replace the one comma in the formula with a backslash.)
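The same point can be shown with a small Python sketch: each trade is split into an opening and a closing cash flow, the contributions are appended, and the (date, amount) pairs are sorted together so that the amounts never separate from their dates. The function name `cash_flows` and the tuple layout are assumptions for illustration; the numbers are the sample rows from the question.

```python
from datetime import date

# (open date, close date, investment, proceeds) from the "Performance" sample
trades = [
    (date(2022, 1, 5), date(2022, 3, 31), -1521.54, 1605.00),
    (date(2022, 1, 5), date(2022, 3, 31), -4994.88, 5160.00),
    (date(2022, 1, 5), date(2022, 3, 31), -6601.30, 6838.00),
]

# (date, amount) from the "Contributions/Withdrawals" sample
contributions = [
    (date(2022, 1, 5), 700.00),
    (date(2022, 2, 5), 700.00),
    (date(2022, 3, 5), 400.00),
    (date(2022, 3, 15), -7000.00),
]

def cash_flows(trades, contributions):
    """Stack all flows as (date, amount) pairs and sort the pairs as a unit."""
    flows = []
    for opened, closed, investment, proceeds in trades:
        flows.append((opened, investment))  # opening transaction
        flows.append((closed, proceeds))    # closing transaction
    flows.extend(contributions)
    # Tuples sort by date first, then by amount, so pairs stay intact.
    return sorted(flows)

flows = cash_flows(trades, contributions)
for day, amount in flows:
    print(day.strftime("%m/%d/%y"), f"{amount:,.2f}")
```

Sorting two independent lists, as in the original A2 and B2 formulas, would correspond to calling `sorted` separately on the dates and on the amounts, which is what breaks the pairing.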

Sum the values of each group of identical names and show the total on the first value row of the group

I want to total the values in column "D" for each member, that is, for each group of identical names in column "B". Each total should appear once per group, on the group's first row that has a value (the first match, or the first row after a blank), exactly as in the EXPECT OUTPUT that I built by hand in column "E". In other words, the result has to sit one row below the group's first occurrence in column "B", so that it lines up with the first entries of that group in columns "C" and "D". Is this possible?
What I've tried: I believe I solved this once before, but I have forgotten how. The formula looked something like this:
=FILTER(IF(IFERROR(MATCH($B$3:$B;$B:$B;0);0)=ROW($B$3:$B);SUMIF($B$3:$B;$B$3:$B;$D$3:$D);"");$B$3:$B<>"0")
I don't know whether this is right. Please see the table below for my data and the output I expect, and feel free to edit the Google Sheet I have attached below.
You can edit my sample Google Sheet to solve this. Thanks in advance!
    A      B            C          D        E
1
2   NUMB   ID - MEMBER  ID - CODE  VALUES   EXPECT OUTPUT
3
4   4      JYFI7
5          JYFI7        J3573      3        6
6          JYFI7        IYR        1
7          JYFI7        F498S      2
8
9   3      DFJ9F11
10         DFJ9F11      C684J      7        8
11         DFJ9F11      J58        1
12
13  2      H684K
14         H684K        JF585      2        2
15
16  1      FJSR
17         FJSR         4684       7        16
18         FJSR         834        1
19         FJSR         49         2
20         FJSR         9835       6
Here's a possible solution:
=ARRAYFORMULA(LAMBDA(cusum,IF(SCAN(,cusum,
LAMBDA(acc,cur,if(cur="",,acc+1)))=1,cusum,))
(SORT(SCAN(,SORT(D3:D,ROW(D3:D),0),
LAMBDA(acc,cur,if(cur="",,acc+cur))),ROW(D3:D),0)))
You can find it in tab 'z' cell F3.
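As I read it, the formula computes a running sum from the bottom of column D upward, resetting at every blank cell, and then keeps that sum only on the first non-blank row of each block. A minimal Python sketch of that reading follows; the function name `group_totals` is made up, and `None` stands in for a blank cell.

```python
def group_totals(values):
    """Put each blank-separated block's total on the block's first row."""
    # Pass 1: walk bottom to top with a running sum that resets at blanks.
    # This mirrors SCAN over the reversed column in the formula.
    suffix = [None] * len(values)
    running = 0
    for i in range(len(values) - 1, -1, -1):
        if values[i] is None:
            running = 0
        else:
            running += values[i]
            suffix[i] = running

    # Pass 2: keep the sum only where a block starts.
    # This mirrors the second SCAN, which counts rows within a block.
    totals = [None] * len(values)
    for i, value in enumerate(values):
        starts_block = value is not None and (i == 0 or values[i - 1] is None)
        if starts_block:
            totals[i] = suffix[i]
    return totals

# Column D, rows 3 to 20, from the question's table (None = blank cell)
column_d = [None, None, 3, 1, 2, None, None, 7, 1,
            None, None, 2, None, None, 7, 1, 2, 6]
column_e = group_totals(column_d)
print(column_e)
```

Note that this approach relies only on the blanks in column D to separate groups; it never looks at the names in column B.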

Google Sheets auto increment column B with empty cells restarting from 1 at each new category String in Column A instead of continuous incrementing

I found this partial solution to my problem:
Google Sheets auto increment column A if column B is not empty
With this formula:
=ARRAYFORMULA(IFERROR(MATCH($B$2:$B&ROW($B$2:$B),FILTER($B$2:$B&ROW($B$2:$B),$B$2:$B<>""),0)))
What I need is the same but instead of continuous numbers I'd need it to restart incrementing from 1 at each new category string on an adjacent column (column A in example below, categories strings are A, B, C, D etc.).
For example:
Column C shows the problem with the current formula (C12 and C15 contain the added numbers 1 and 2).
Column D shows the needed result (at D11 and D19 the numbering restarts from 1 at a new category string).
    A   B   C   D
1               needed result
2   A   1   1   1
3   A
4   A
5   A   1   2   2
6   A
7   A
8   A
9   A   1   3   3
10  A
11  B   1   4   1
12  B       1
13  B
14  C   1   5   2
15  C       2
16  C
17  C   1   6   3
18  C
19  D   1   7   1
20  D
21  D
22  D   1   8   2
23  D
24  D   1   9   3
25  D
26  D
27  D   1   10  4
28  D
29  D
try:
=INDEX(IF(B2:B="",,COUNTIFS(A2:A&B2:B, A2:A&B2:B, ROW(A2:A), "<="&ROW(A2:A))))
or:
=INDEX(IF(B2:B="",,COUNTIFS(A2:A&IF(B2:B<>"", 1, ), A2:A&IF(B2:B<>"", 1, ), ROW(A2:A), "<="&ROW(A2:A))))
Here's another similar solution.
=ArrayFormula(if(B2:B="",,countifs(A2:A,A2:A,B2:B,"<>",row(A2:A),"<="&row(A2:A))))
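The COUNTIFS formulas above all count, for each non-empty row, how many non-empty rows of the same category occur up to and including that row. A minimal Python sketch of that counting rule, with the hypothetical function name `numbered` and an empty string standing in for a blank cell in column B:

```python
def numbered(rows):
    """Number the non-empty rows, restarting from 1 for each category."""
    counts = {}   # category -> how many non-empty rows seen so far
    result = []
    for category, marker in rows:
        if marker in ("", None):
            result.append(None)  # blank in column B: leave the cell empty
        else:
            counts[category] = counts.get(category, 0) + 1
            result.append(counts[category])
    return result

# Rows 2 to 13 of the example: (column A, column B)
rows = [
    ("A", 1), ("A", ""), ("A", ""), ("A", 1), ("A", ""), ("A", ""),
    ("A", ""), ("A", 1), ("A", ""), ("B", 1), ("B", ""), ("B", ""),
]
numbering = numbered(rows)
print(numbering)
```

Like the COUNTIFS versions, this counts per category value, so a category that reappears further down the column would continue its earlier count rather than restart.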

Google Sheets: Compare each cell of a column separately and check another cell in the found row for conditional formatting

Hello all Sheet users out there.
I have a sheet with a list of resources with their production and usage being calculated on the left side and the overall prod/use being monitored on the right side.
A B C D | E F G H
1 Input In Output Out | Resource totIn totOut effective
2 Iron 20 FeIngot 30 | Iron 30 =SUMIF(...) =totIn-totOut
3 Copper 20 CuIngot 20 | Copper 25 =SUMIF(...) =totIn-totOut
4 Stone 10 Gravel 50 | CuIngot =SUMIF(...) =SUMIF(...) =totIn-totOut
5 FeIngot 10 FePlate 5 | FeIngot =SUMIF(...) =SUMIF(...) =totIn-totOut
6 CuIngot 25 Wire 75 | Stone 45 =SUMIF(...) =totIn-totOut
7 CuIngot 10 Cable 20 | Gravel =SUMIF(...) =SUMIF(...) =totIn-totOut
The actual sheet would look more like this:
A B C D | E F G H
1 Input In Output Out | Resource totIn totOut effective
2 Iron 20 FeIngot 30 | Iron 30 20 10
3 Copper 20 CuIngot 20 | Copper 25 20 5
4 Stone 10 Gravel 50 | CuIngot 20 35 -15
5 FeIngot 10 FePlate 5 | FeIngot 30 10 20
6 CuIngot 25 Wire 75 | Stone 45 10 35
7 CuIngot 10 Cable 20 | Gravel 50 0 50
On the left side, I want to mark all cells in column "In" red that have a negative effective production calculated on the right side. I thought about using the conditional formatting, looping through every text cell in the "Resource" column to find the one that equals the "Input" of the same row the cell I want to check is in and then check if the "effective" value of the "Resource" I found is less than 0. The problem is that I don't know how to loop through the values and store the matching row to check if the H value is negative.
Example 1: B6 is checked. A6 needs to be compared to every cell in E2:E and when there is a match, in this case E4, check if H4 is negative. It is, so there is formatting applied.
Example 2: B3 is checked. A3 needs to be compared to every cell in E2:E and when there is a match, in this case E3, check if H3 is negative. It is not, so there is no formatting applied.
Is there any way that I can apply this formatting in the conditional formatting tool?
Keep in mind that my sheet is much more complex than these examples and it has about 120 resources that can't all be moved in order with the left side because multiple rows can use the same resource as input or output.
Thank you in advance for every ounce of your help.
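Try this as a custom formula in conditional formatting: =VLOOKUP($A1,$E:$H,4,false)<0. VLOOKUP does the "loop" for you: it finds the row in column E that matches the input name and returns the "effective" value from column H. The row number in $A1 should match the first row of the range the rule is applied to (for example, use $A2 if the range starts at B2).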
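To make the lookup logic concrete, here is a minimal Python sketch using the example values from the question. A dictionary plays the role of the E:H lookup table, and the function name `should_highlight` is made up for illustration.

```python
# "effective" per resource, taken from columns E and H of the example
effective = {
    "Iron": 10,
    "Copper": 5,
    "CuIngot": -15,
    "FeIngot": 20,
    "Stone": 35,
    "Gravel": 50,
}

# Column A (Input), rows 2 to 7 of the example
inputs = ["Iron", "Copper", "Stone", "FeIngot", "CuIngot", "CuIngot"]

def should_highlight(resource, effective):
    """True when the resource is found and its effective value is negative."""
    value = effective.get(resource)  # None when the resource is not listed
    return value is not None and value < 0

flags = [should_highlight(name, effective) for name in inputs]
for row_number, (name, flag) in enumerate(zip(inputs, flags), start=2):
    print(f"B{row_number} ({name}): {'red' if flag else 'no formatting'}")
```

This reproduces both examples from the question: B6 (CuIngot) is marked and B3 (Copper) is not.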

What to do if response (or label) columns are in another data frame?

I'm a newbie in machine learning, so I need your advice.
Imagine we have two data sets (df1 and df2).
The first data set includes about 5000 observations and some features. Simplified:
name age company degree_of_skill average_working_time alma_mater
1 John 39 A 89 38 Harvard
2 Steve 35 B 56 46 UCB
3 Ivan 27 C 88 42 MIT
4 Jack 26 A 87 37 MIT
5 Oliver 23 B 76 36 MIT
6 Daniel 45 C 79 39 Harvard
7 James 34 A 60 40 MIT
8 Thomas 28 B 89 39 Stanford
9 Charlie 29 C 83 43 Oxford
The learning problem is to predict the productivity of the companies in the second data set (df2) for the next period (june-2016), based on data from the first data set (df1).
df2:
company productivity date
1 A 1240 april-2016
2 B 1389 april-2016
3 C 1388 april-2016
4 A 1350 may-2016
5 B 1647 may-2016
6 C 1272 may-2016
As we can see, both data sets include the feature "company", but I don't understand how to create a link between the two data sets through it. What should I do with the two data sets to solve the learning problem? Is it possible?
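One possible way to link the two tables, sketched here in plain Python with the sample rows from the question, is to aggregate df1 to one row per company and then attach those aggregates to each df2 row through the shared "company" key. Using the mean as the aggregate is an assumption made only for illustration, as are the names `company_features` and `training_rows`; nothing in the question confirms that this is the right feature design.

```python
from collections import defaultdict
from statistics import mean

# df1 sample: (name, age, company, degree_of_skill, average_working_time)
people = [
    ("John", 39, "A", 89, 38), ("Steve", 35, "B", 56, 46),
    ("Ivan", 27, "C", 88, 42), ("Jack", 26, "A", 87, 37),
    ("Oliver", 23, "B", 76, 36), ("Daniel", 45, "C", 79, 39),
    ("James", 34, "A", 60, 40), ("Thomas", 28, "B", 89, 39),
    ("Charlie", 29, "C", 83, 43),
]

# df2 sample: (company, productivity, date)
productivity = [
    ("A", 1240, "april-2016"), ("B", 1389, "april-2016"),
    ("C", 1388, "april-2016"), ("A", 1350, "may-2016"),
    ("B", 1647, "may-2016"), ("C", 1272, "may-2016"),
]

def company_features(people):
    """Aggregate the per-person rows to one feature row per company."""
    grouped = defaultdict(list)
    for _name, age, company, skill, hours in people:
        grouped[company].append((age, skill, hours))
    return {
        company: {
            "mean_age": mean(row[0] for row in rows),
            "mean_skill": mean(row[1] for row in rows),
            "mean_hours": mean(row[2] for row in rows),
        }
        for company, rows in grouped.items()
    }

features = company_features(people)

# Join: every df2 row receives the features of its company.
training_rows = [
    {"company": company, "date": period, "productivity": value,
     **features[company]}
    for company, value, period in productivity
]
print(training_rows[0])
```

With this layout the company features are identical for every period, so only the date and past productivity vary over time within a company.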
