I am working with panda dataframes, having different columns, but let's focus on depth and value1. I used to concatenate different dataframes matching a condition, and then dropping the duplicates over one column. The problem is that the depths can be different and the dropping doesn't work. Is there an efficient way to check the intersection depths between the dataframes and then prioritizing one of the two over that depth?
here is a snippet
for i in range(loc_list_unique.__len__()):
###initialization
for j in range(list.__len__()):
if loc_list[j] == loc_list_unique[i]:
item = list[j]
temp = full.loc[full["Location"]==item].sort_values(by="depth")
temp_concat = pd.concat([temp_concat, temp])
###Merge multiple CPTs at 1 location and drops last values
cond = ['38','42']
if loc_list_unique[i] in cond:
item_be_kept = 'first'
else:
item_be_kept = 'last'
temp_concat =temp_concat.drop_duplicates(subset='depth', keep=item_be_kept).sort_values(by="depth").reset_index(drop=True)
Thanks in advance! F
I am trying to find the intersection and manually copying the DF
if temp_concat.__len__()>0:
min1 = temp_concat['SCPT_DPTH'].min()
max1 = temp_concat['SCPT_DPTH'].max()
else:
min1=0
max1=0
min2 = temp['SCPT_DPTH'].min()
max2 = temp['SCPT_DPTH'].max()
depth_intersection_min = max(min1, min2)
depth_intersection_max = min(max1, max2)
if depth_intersection_min < depth_intersection_max:
df1_within_intersection = df1[(df1['depth'] >= depth_intersection_min) & (df1['depth'] <= depth_intersection_max)]
df2_within_intersection = df2[(df2['depth'] >= depth_intersection_min) & (df2['depth'] <= depth_intersection_max)]
# Prioritize one dataframe over the other based on specified conditions within the intersection
prioritized_within_intersection = df1_within_intersection # or df2_within_intersection based on your conditions
temp_concat = pd.concat([prioritized_within_intersection, df1[~df1['depth'].between(depth_intersection_min, depth_intersection_max)], df2[~df2['depth'].between(depth_intersection_min, depth_intersection_max)]])
```