import pandas as pd
a = [0.0, 0.6900000000000001, 1.3800000000000001, 1.3800000000000001, 2.0700000000000003, 2.7600000000000002, 2.7600000000000002, 3.45, 4.1262, 4.140000000000001, 4.140000000000001, 4.83, 5.5200000000000005, 5.5200000000000005, 5.9754000000000005, 6.4308000000000005, 6.6654, 6.9]
b = [0.0, 0.6900000000000001, 1.3800000000000001, 1.3800000000000001, 2.0700000000000003, 2.7600000000000002, 2.7600000000000002, 3.45, 4.1262, 4.140000000000001, 4.140000000000001, 4.83, 5.5200000000000005, 5.5200000000000005, 5.9754000000000005, 6.4308000000000005, 6.4308000000000005, 6.6654, 6.9]
df1 = pd.DataFrame(a, columns=['x(m)'])
df2 = pd.DataFrame(b, columns=['x(m)'])
df1:
x(m)
0 0.0000
1 0.6900
2 1.3800
3 1.3800
4 2.0700
5 2.7600
6 2.7600
7 3.4500
8 4.1262
9 4.1400
10 4.1400
11 4.8300
12 5.5200
13 5.5200
14 5.9754
15 6.4308
16 6.6654
17 6.9000
df2:
x(m)
0 0.0000
1 0.6900
2 1.3800
3 1.3800
4 2.0700
5 2.7600
6 2.7600
7 3.4500
8 4.1262
9 4.1400
10 4.1400
11 4.8300
12 5.5200
13 5.5200
14 5.9754
15 6.4308
16 6.4308
17 6.6654
18 6.9000
The difference of the two dataframes is row 16 in df2 is not in df1. How can I get the difference? The output can be the row index 16 plus the value 6.4308 , or a combined dataframe with the third column specifying not in df1 or anything similar. Thanks.
IIUC, do you want something like this?
Output:
Note:
pd.DataFrame.mergewithout theondefined will merge dataframes on all identical columns in the dataframe. If you want to keep the indexing in both dataframes,reset_indexfirst.And, if you only one the difference then add
.query('in_df != "both"')