Identify stocks with the same feature values using np.array_equal() in a nested loop

67 Views Asked by user141624 At 17 August 2025 at 15:32

I would like to understand if my code is working correctly.

The dataframe, df2, holds vertically stacked time series of a feature, one series per stock.

stock_id	log_target_vol_corr_32_clusters_stnd
1	0.4
1	0.8
1	0.7
2	0.3
2	0.4
2	0.0
3	0.4
3	0.8
3	0.7
4	0.9
4	0.9
4	0.1
5	0.9
5	0.9
5	0.1

Notice that stocks (1 & 3) and (4 & 5) have the same feature values, so I want to group each pair together into a cluster. Ultimately, I want to find all the stock ids belonging to each cluster.
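For reference, the grouping described here can also be sketched without explicit nested loops. This is only one possible approach, not necessarily what the asker needs: it assumes the rows of each stock are already in the same time order, and that "same feature values" means exact float equality. The names signatures and groups are made up for this illustration.

```python
from collections import defaultdict

import pandas as pd

# Sample data copied from the question.
df2 = pd.DataFrame({
    "stock_id": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5],
    "log_target_vol_corr_32_clusters_stnd": [
        0.4, 0.8, 0.7, 0.3, 0.4, 0.0, 0.4, 0.8, 0.7,
        0.9, 0.9, 0.1, 0.9, 0.9, 0.1,
    ],
})
column = "log_target_vol_corr_32_clusters_stnd"

# One tuple of feature values per stock, in row order (hypothetical name).
signatures = df2.groupby("stock_id")[column].apply(tuple)

# Stocks that share the same tuple end up in the same list.
groups = defaultdict(list)
for stock_id, values in signatures.items():
    groups[values].append(int(stock_id))

clusters = list(groups.values())
print(clusters)
```

Because tuples are hashable, each stock is looked up once instead of being compared against every other stock. If the values come from floating-point computations, rounding them before building the tuples may be needed.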

## find stock ids of clusters having same feature values
column = 'log_target_vol_corr_32_clusters_stnd'
remaining_stocks = df2['stock_id'].unique().astype(int)
clusters = {}
for s in remaining_stocks:
    print(s)
    clusters[s] = []
    a1 = df2[df2['stock_id'] == s ][column]
    remaining_stocks = np.delete(remaining_stocks,np.where(remaining_stocks==s))
    for s1 in remaining_stocks:
        a2 = df2[df2['stock_id'] == s1 ][column]
        if np.array_equal(a1,a2):
            print(s1)
            remaining_stocks = np.delete(remaining_stocks,np.where(remaining_stocks==s1))
            clusters[s].append(s1)
            print(remaining_stocks)

I wrote the code above and seem to get more clusters than actually exist in the dataframe.

Could you please explain what the error in this code is?
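A likely cause: np.delete returns a new array, so reassigning remaining_stocks inside the loop does not change the array that `for s in remaining_stocks` is already iterating over. The outer loop still visits every original stock id, including ones that were already matched (for example stock 3), and creates an empty clusters entry for each of them. With the sample data that gives five keys instead of three. Below is a minimal sketch of one way to fix it, keeping the nested-loop structure and np.array_equal; the assigned set is a name introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df2 = pd.DataFrame({
    "stock_id": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5],
    "log_target_vol_corr_32_clusters_stnd": [
        0.4, 0.8, 0.7, 0.3, 0.4, 0.0, 0.4, 0.8, 0.7,
        0.9, 0.9, 0.1, 0.9, 0.9, 0.1,
    ],
})
column = "log_target_vol_corr_32_clusters_stnd"

stock_ids = df2["stock_id"].unique().astype(int)
assigned = set()  # stocks already placed in some cluster (hypothetical name)
clusters = {}     # first stock of a cluster -> other stocks with identical values

for i, s in enumerate(stock_ids):
    if s in assigned:
        continue  # already a member of an earlier cluster, so not a new key
    a1 = df2.loc[df2["stock_id"] == s, column].to_numpy()
    clusters[int(s)] = []
    assigned.add(s)
    # Only later stocks need checking; earlier ones were handled already.
    for s1 in stock_ids[i + 1:]:
        if s1 in assigned:
            continue
        a2 = df2.loc[df2["stock_id"] == s1, column].to_numpy()
        if np.array_equal(a1, a2):
            clusters[int(s)].append(int(s1))
            assigned.add(s1)

print(clusters)  # {1: [3], 2: [], 4: [5]}
```

Here the array being iterated is never modified; membership in assigned decides whether a stock starts a new cluster. Note that np.array_equal tests exact equality, so series that differ only by floating-point rounding would not be grouped.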
