Join two pandas dataframes based on line order

I have two dataframes, df1 and df2, that I want to join. Their indexes are not the same and they don't have any common columns. What I want is to join them based on the order of the rows, i.e. join the first row of df1 with the first row of df2, the second row of df1 with the second row of df2, etc.
Example:
df1:
'A' 'B'
0 1 2
1 3 4
2 5 6
df2:
'C' 'D'
0 7 8
3 9 10
5 11 12
Should give
'A' 'B' 'C' 'D'
0 1 2 7 8
3 3 4 9 10
5 5 6 11 12
I don't care about the indexes in the final dataframe. I tried reindexing df1 with the indexes of df2 but could not make it work.

You could assign df2's index to df1 and then use join:
df1.index = df2.index
res = df1.join(df2)
In [86]: res
Out[86]:
'A' 'B' 'C' 'D'
0 1 2 7 8
3 3 4 9 10
5 5 6 11 12
Or you could do it in one line with set_index:
In [91]: df1.set_index(df2.index).join(df2)
Out[91]:
'A' 'B' 'C' 'D'
0 1 2 7 8
3 3 4 9 10
5 5 6 11 12
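A minimal, self-contained sketch of the set_index approach, assuming the two frames from the question (the construction of df1 and df2 is reconstructed from the printed example, and the length check is an added safeguard, not part of the original answer):

```python
import pandas as pd

# Frames from the question: unrelated indexes, no shared columns.
df1 = pd.DataFrame({"A": [1, 3, 5], "B": [2, 4, 6]})
df2 = pd.DataFrame({"C": [7, 9, 11], "D": [8, 10, 12]}, index=[0, 3, 5])

# A positional join only makes sense when both frames have the same length.
assert len(df1) == len(df2)

# set_index returns a new frame, so df1 itself keeps its original index.
res = df1.set_index(df2.index).join(df2)
print(res)
```

Unlike assigning to df1.index, this leaves df1 unmodified, which may matter if df1 is reused later.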

Try concat:
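pd.concat([df1.reset_index(drop=True), df2.reset_index(drop=True)], axis=1)
The reset_index(drop=True) calls give both frames the same default 0..n-1 index (drop=True keeps the old index from being added as a column), and concat with axis=1 then simply joins them horizontally.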
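A small check of the concat approach, assuming the frames from the question (their construction is reconstructed from the printed example). Note that drop=True matters here: a plain reset_index() would insert the old index into each frame as an extra "index" column.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 3, 5], "B": [2, 4, 6]})
df2 = pd.DataFrame({"C": [7, 9, 11], "D": [8, 10, 12]}, index=[0, 3, 5])

# drop=True discards the old index instead of turning it into a column.
res = pd.concat(
    [df1.reset_index(drop=True), df2.reset_index(drop=True)],
    axis=1,
)
print(res)
```

The result carries a fresh 0..n-1 index, which fits the question since the final indexes do not matter.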

You can also join them directly. join matches on the index, which is the same for both DataFrames once df2's index is reset (this assumes df1 already has the default 0..n-1 index, as in the example):
In [18]: df1.join(df2.reset_index(drop=True))
Out[18]:
'A' 'B' 'C' 'D'
0 1 2 7 8
1 3 4 9 10
2 5 6 11 12

Related

Intercalate columns when they are in pairs

Using this table:
A B C D
1 2 3 4
5 6 7 8
9 10 11 12
In Google Sheets if I do this here in column E:
={A1:B3;C1:D3}
We get:
E F
1 2
5 6
9 10
3 4
7 8
11 12
But the result I want is this:
E F
1 2
3 4
5 6
7 8
9 10
11 12
I tried multiple options with FLATTEN, but none of them returned what I wanted.
Well you can try:
=WRAPROWS(TOCOL(A1:D3),2)
You could try with MAKEARRAY
=MAKEARRAY(ROWS(A1:D3)*2,2,LAMBDA(r,c,INDEX(FLATTEN(A1:D3),c+(r-1)*2)))
GENERAL ANSWER
For you or anyone else: to do something similar with a variable number of source or destination columns, you can use this formula, changing the range and the number of columns at the end of the LAMBDA:
=LAMBDA(range,cols,MAKEARRAY(ROWS(range)*ROUNDUP(COLUMNS(range)/cols),cols,LAMBDA(r,c,IFERROR(INDEX(FLATTEN(range),c+(r-1)*cols)))))(A1:D3,2)
You can do:
={FLATTEN({A1:A3, C1:C3}), FLATTEN({B1:B3, D1:D3})}
For more columns, it could be automated with MOD.
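For readers who want to see the underlying idea outside of Sheets, here is a rough Python sketch of what the WRAPROWS(TOCOL(...),2) answer does: flatten the range row by row, then cut the flat list into rows of two cells. The helper name wrap_rows is hypothetical, not a Sheets or library function.

```python
from itertools import chain


def wrap_rows(table, width):
    """Flatten `table` row by row, then cut the result into rows of `width` cells."""
    flat = list(chain.from_iterable(table))
    return [flat[i:i + width] for i in range(0, len(flat), width)]


# The A1:D3 range from the question, one inner list per sheet row.
grid = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
pairs = wrap_rows(grid, 2)
print(pairs)
```

Because the flattening is row-major, each pair from columns A:B is immediately followed by the pair from columns C:D of the same row, which is the interleaving asked for.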

How To Skip Down by 1 Row/Cell The Formula Output and Remove The Last Sequential Output Before 1's Google Sheets?

I've got these 3 groups of data in range F2:G22 as below
(3 groups as minimal example, in reality many thousands of groups, and recurrent similar datasets expected in the future):
I need to number each group's rows sequentially, starting over at 1 at each new group.
The expected result would be like in range E1:E22.
I tried the following formula in cell C2, then in cell D3:
=INDEX(IF(A2:A22="",COUNTIFS(B2:B22&A2:A22, B2:B22&A2:A22, ROW(B2:B22), "<="&ROW(B2:B22)),1))
In C2:
In D3:
That partially fixed the sequence issue, but there are still two issues I can't find a remedy for.
1st remaining issue:
I'd prefer not to have to do the C2-to-D3 step manually each time I get new similar data (but would accommodate it if there's no simple solution to this issue).
Is there a simple way to modify the formula to make it output the correct sequencing from C2 ?
2nd remaining issue:
At rows 7, 14 and 23 there is still unnecessary trailing numbering for these intermediary rows, in D7, D14 and D23:
I could only think of an extra manual step of filtering out the non-blank rows in Column A to fix this 2nd issue (i.e. Highlighting Column A > Data tab > Create Filter > Untick all > Tick Blanks > Copy All > Paste In new Sheet).
But would there be a way to do it in the same formula? I'm not seeing the way to add the proper filter or using another method in the formula.
Any help is greatly appreciated.
EDIT (Sorry for Forgotten Sample):
Formula Input A
Formula Input B
Formula Output 1
Formula Output 2
EXPECTED RESULT
rockinfreakshow
ztiaa
DATA
DATA BY GROUP
7
1
1
7
7
2
1
1
1
2
Element-1
Group-1
7
3
2
2
2
3
Element-2
Group-1
7
4
3
3
3
4
Element-3
Group-1
7
5
4
4
4
5
Element-4
Group-1
8
1
5
6
8
8
2
1
1
1
7
Element-1
Group-2
8
3
2
2
2
8
Element-2
Group-2
8
4
3
3
3
9
Element-3
Group-2
8
5
4
4
4
10
Element-4
Group-2
8
6
5
5
5
11
Element-5
Group-2
8
7
6
6
6
12
Element-6
Group-2
9
1
7
13
9
9
2
1
1
1
14
Element-1
Group-3
9
3
2
2
2
15
Element-2
Group-3
9
4
3
3
3
16
Element-3
Group-3
9
5
4
4
4
17
Element-4
Group-3
9
6
5
5
5
18
Element-5
Group-3
9
7
6
6
6
19
Element-6
Group-3
9
8
7
7
7
20
Element-7
Group-3
9
9
8
8
8
21
Element-8
Group-3
9
Can you try:
=INDEX(LAMBDA(y,z,
IF(LEN(z),COUNTIFS(y,y,ROW(z),"<="&ROW(z)),))
(LOOKUP(ROW(G2:G),FILTER(ROW(G2:G),BYROW(G2:G,LAMBDA(z,IF(z<>OFFSET(z,-1,0),row(z),0))))),G2:G))
You can simply use SCAN.
=SCAN(,G2:G,LAMBDA(a,c,IF(c="",,a+1)))
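The logic of the SCAN formula can be sketched in Python as a running counter that resets whenever a blank cell is met. The sample column below is hypothetical (it is not the asker's sheet), and blanks are represented as empty strings.

```python
def number_within_groups(cells):
    """Number non-blank cells 1, 2, 3, ... and restart after every blank cell."""
    numbers = []
    count = 0
    for cell in cells:
        if cell == "":
            count = 0             # a blank row ends the current group
            numbers.append(None)  # and gets no number itself
        else:
            count += 1
            numbers.append(count)
    return numbers


# Hypothetical column: blank separator rows between groups.
column = ["", "Group-1", "Group-1", "", "Group-2", "Group-2", "Group-2"]
print(number_within_groups(column))
```

This relies on the groups being separated by blank cells, as the formula does; it does not compare group names.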

Google Sheets auto increment column B with empty cells restarting from 1 at each new category String in Column A instead of continuous incrementing

I found this partial solution to my problem:
Google Sheets auto increment column A if column B is not empty
With this formula:
=ARRAYFORMULA(IFERROR(MATCH($B$2:$B&ROW($B$2:$B),FILTER($B$2:$B&ROW($B$2:$B),$B$2:$B<>""),0)))
What I need is the same but instead of continuous numbers I'd need it to restart incrementing from 1 at each new category string on an adjacent column (column A in example below, categories strings are A, B, C, D etc.).
For example:
The problem with the formula shows in C12 and C15 (the added numbers 1 and 2).
The needed result is in column D, where the count restarts from 1 at each new category string (as in D11 and D19).
     A   B   C   D
1                needed result
2    A   1   1   1
3    A
4    A
5    A   1   2   2
6    A
7    A
8    A
9    A   1   3   3
10   A
11   B   1   4   1
12   B       1
13   B
14   C   1   5   2
15   C       2
16   C
17   C   1   6   3
18   C
19   D   1   7   1
20   D
21   D
22   D   1   8   2
23   D
24   D   1   9   3
25   D
26   D
27   D   1   10  4
28   D
29   D
try:
=INDEX(IF(B2:B="",,COUNTIFS(A2:A&B2:B, A2:A&B2:B, ROW(A2:A), "<="&ROW(A2:A))))
or:
=INDEX(IF(B2:B="",,COUNTIFS(A2:A&IF(B2:B<>"", 1, ), A2:A&IF(B2:B<>"", 1, ), ROW(A2:A), "<="&ROW(A2:A))))
Here's another similar solution.
=ArrayFormula(if(B2:B="",,countifs(A2:A,A2:A,B2:B,"<>",row(A2:A),"<="&row(A2:A))))
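As a rough illustration of what these COUNTIFS formulas compute, the Python sketch below keeps one counter per category and only numbers rows whose value cell is non-empty. The sample data and the function name are hypothetical; blanks are represented as empty strings.

```python
from collections import defaultdict


def number_per_category(categories, values):
    """Running count of non-empty values, kept separately for each category."""
    seen = defaultdict(int)
    numbers = []
    for category, value in zip(categories, values):
        if value == "":
            numbers.append(None)  # empty cells stay unnumbered
        else:
            seen[category] += 1
            numbers.append(seen[category])
    return numbers


# Hypothetical columns A (category) and B (value).
cats = ["A", "A", "A", "A", "B", "B", "C"]
vals = [1, "", "", 1, 1, "", 1]
print(number_per_category(cats, vals))
```

The per-category counter plays the role of the A2:A criterion, and the blank check plays the role of the B2:B<>"" criterion.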

Octave Conditional Merging of matrices

I have searched for an Octave function that facilitates conditional merging of matrices but haven't found one so far. My goal is to do this with vectorized operations, without looping. Here is an example of what I am trying to do.
A= [1 1
2 2
3 1
5 2];
B= [1 9
2 10];
I would like to get C as
C= [1 1 9
2 2 10
3 1 9
5 2 10];
Is there a function that takes A, B and the list of column(s) to join on and then produce C?
You can use the second output of ismember to find the occurrences of the second column of A in the first column of B and then use that to grab specific entries from the second column of B to construct C.
[~, inds] = ismember(A(:,2), B(:,1));
C = [A, B(inds,2)];
%// 1 1 9
%// 2 2 10
%// 3 1 9
%// 5 2 10
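The same lookup-style join can be sketched in Python, shown here only to illustrate the idea behind the ismember answer (it is not Octave code). A dictionary built from B stands in for the index vector returned by ismember.

```python
# Matrices from the question, one inner list per row.
A = [[1, 1], [2, 2], [3, 1], [5, 2]]
B = [[1, 9], [2, 10]]

# Map each key in B's first column to the value in its second column.
lookup = {key: value for key, value in B}

# Append the looked-up value to every row of A, matching on A's second column.
C = [row + [lookup[row[1]]] for row in A]
print(C)
```

This sketch raises a KeyError if A contains a key that is missing from B; the Octave version has a comparable limitation, since ismember returns 0 for a missing key and 0 is not a valid index.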

Using COUNTIFS on 3 different columns and then need to SUM a 4th column?

I have written the formula below, but I don't know how to change it so that it adds up the numbers in column AB2:AB552. As it is, the formula counts the cells in that range that contain numbers, but I need the total of those numbers as my final result. Any help would be great.
=COUNTIFS(Cases!B2:B552,"1",Cases!G2:G552,"c*",Cases!X2:X552,"No",Cases!AB2:AB552,">0")
Assuming you don't actually need the intermediate counts, the SUMIFS function should give you the final result:
=SUMIFS(Cases!AB2:AB552,Cases!B2:B552,1,Cases!G2:G552,"c*",Cases!X2:X552,"No",Cases!AB2:AB552,">0")
Testing this with some limited data:
Row B G X AB
2 2 a No 10
3 1 c No 24
4 2 c No 4
5 1 c No 0
6 1 a Yes 9
7 2 c No 12
8 2 c No 6
9 2 b No 0
10 1 b No 0
11 1 a No 10
12 2 c No 6
13 1 c No 20
14 1 c No 4
15 1 b Yes 22
16 1 b Yes 22
The formula above returned 48, the sum of AB3, AB13, and AB14, which were the only rows matching all four criteria.
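The arithmetic can be double-checked with a short Python sketch over the same test rows. The tuple layout (B, G, X, AB) is an assumption made for this illustration, and the "c*" wildcard from the question is approximated with a case-insensitive prefix check.

```python
# Test rows from the table above, as (B, G, X, AB) tuples for sheet rows 2 to 16.
rows = [
    (2, "a", "No", 10), (1, "c", "No", 24), (2, "c", "No", 4),
    (1, "c", "No", 0), (1, "a", "Yes", 9), (2, "c", "No", 12),
    (2, "c", "No", 6), (2, "b", "No", 0), (1, "b", "No", 0),
    (1, "a", "No", 10), (2, "c", "No", 6), (1, "c", "No", 20),
    (1, "c", "No", 4), (1, "b", "Yes", 22), (1, "b", "Yes", 22),
]

# Sum AB only where all four criteria hold, mirroring the SUMIFS arguments.
total = sum(
    ab
    for b, g, x, ab in rows
    if b == 1 and g.lower().startswith("c") and x == "No" and ab > 0
)
print(total)  # 48
```

Counting the matching rows instead of summing them gives 3, which is what the original COUNTIFS would have returned on this data.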
