SMOGN raises TypeError or UFuncTypeError when I try to balance my data. How can I solve this?

39 Views Asked by Рим At 23 January 2024 at 12:37

I have image data (arrays) with labels, loaded from folders. The data is imbalanced, and I'm trying to balance it with SMOGN after building a DataFrame.

Here's the code:

    import os
    import cv2 as cv
    import numpy as np
    import pandas as pd
    import smogn

    r_labels = []
    im = []
    # `folder` is the directory that holds the image files
    for filename in os.listdir(folder):
        img = cv.imread(os.path.join(folder, filename))
        if img is not None:
            # the flowering time is the third "_"-separated field of the filename
            aio_plant = filename.split("_")
            flowering_time = aio_plant[2].split(".")[0]
            im.append(np.asarray(img).astype(np.float32))
            r_labels.append(np.uint8(flowering_time))
    df = pd.DataFrame({'images': im, 'labels': r_labels})
    sm = smogn.smoter(
        data = df,  ## pandas dataframe
        y = 'labels'  ## string ('header name')
        )

This raises an error: TypeError: unhashable type: 'numpy.ndarray'. I tried changing the label type like this:

            r_labels.append(flowering_time)

That gives a different error: UFuncTypeError: ufunc 'subtract' did not contain a loop with signature matching types (dtype('<U2'), dtype('<U2')) -> None
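
For context, this second error can be reproduced without SMOGN. The following is a minimal sketch, assuming the labels end up as a NumPy array of strings (dtype '<U2' in the message means 2-character Unicode strings); the variable names and sample values are illustrative only.

```python
import numpy as np

# Labels parsed from filenames stay as strings unless they are cast.
str_labels = np.array(["86", "53", "46"])   # dtype '<U2'

try:
    str_labels - str_labels                  # no 'subtract' loop exists for strings
    subtract_failed = False
except TypeError as exc:                     # UFuncTypeError is a subclass of TypeError
    subtract_failed = True
    print(type(exc).__name__, exc)

# Casting to a numeric dtype makes arithmetic on the labels possible.
int_labels = str_labels.astype(int)
diff = int_labels - int_labels[0]
print(diff)
```

Any distance or interpolation step that subtracts label values would fail the same way on string labels, which is why a numeric label column is needed.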

The data looks like this:

                                                 images  labels
0     [[[0.0, 0.0, 255.0], [0.0, 255.0, 0.0], [0.0, ...      86
1     [[[255.0, 0.0, 0.0], [255.0, 0.0, 0.0], [0.0, ...      53
2     [[[255.0, 0.0, 0.0], [0.0, 255.0, 0.0], [255.0...      46
3     [[[255.0, 0.0, 0.0], [0.0, 255.0, 0.0], [0.0, ...      44
4     [[[255.0, 0.0, 0.0], [255.0, 0.0, 0.0], [255.0...      63
...                                                 ...     ...
998   [[[0.0, 0.0, 255.0], [0.0, 255.0, 0.0], [255.0...      86
999   [[[255.0, 0.0, 0.0], [0.0, 255.0, 0.0], [255.0...     215
1000  [[[0.0, 0.0, 255.0], [0.0, 0.0, 255.0], [0.0, ...      92
1001  [[[255.0, 0.0, 0.0], [0.0, 255.0, 0.0], [255.0...      61
1002  [[[255.0, 0.0, 0.0], [0.0, 255.0, 0.0], [255.0...     183
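
The first error can also be reproduced in isolation. This is a minimal sketch, assuming the failure comes from some step that hashes the values in the 'images' column (where exactly SMOGN does this is not visible from the message alone); the tiny array is a made-up stand-in for one image.

```python
import numpy as np

img = np.zeros((2, 2, 3), dtype=np.float32)   # stand-in for one image

try:
    hash(img)                 # ndarrays are mutable, so they are not hashable
    array_hashable = True
except TypeError as exc:
    array_hashable = False
    print(exc)

# A string version of the same data is hashable.
as_text = np.array2string(img.flatten(), separator=',')
text_hashable = isinstance(hash(as_text), int)
print(array_hashable, text_hashable)
```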

Рим On 23 January 2024 at 14:12 BEST ANSWER

I solved the problem by converting the labels to hashable integers and the images column to a string representation of each NumPy array, then converting them back after running SMOGN.

    # Convert labels to hashable integers
    df['labels'] = df['labels'].astype(int)
    # Convert each image to a string representation of its flattened NumPy array;
    # threshold=x.size keeps array2string from abbreviating long arrays with "..."
    df['images'] = df['images'].apply(
        lambda x: np.array2string(x.flatten(), separator=',', threshold=x.size))

    sm = smogn.smoter(
        data = df,  ## pandas dataframe
        y = 'labels',  ## string ('header name')
        )
    # Convert the strings back to (flattened) NumPy arrays and the labels back to integers
    sm['images'] = sm['images'].apply(lambda x: np.fromstring(x[1:-1], sep=','))
    sm['labels'] = sm['labels'].astype(int)
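
One caveat about the string round trip, shown here as a hedged sketch rather than as part of the solution itself: np.array2string abbreviates arrays with more elements than NumPy's print threshold (1000 by default) using "...", and np.fromstring returns a flat float64 array, so the image shape and dtype have to be restored separately. The helper names to_text and from_text and the 20x20x3 random test image are hypothetical.

```python
import numpy as np

def to_text(arr):
    # threshold=arr.size stops array2string from summarizing with "..."
    return np.array2string(arr.flatten(), separator=',', threshold=arr.size)

def from_text(text, shape, dtype=np.float32):
    # strip the surrounding brackets, parse, then restore shape and dtype
    flat = np.fromstring(text[1:-1], sep=',')
    return flat.astype(dtype).reshape(shape)

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(20, 20, 3)).astype(np.float32)  # 1200 values

truncated = np.array2string(img.flatten(), separator=',')  # default threshold
restored = from_text(to_text(img), img.shape)

print("..." in truncated)             # whether the default call abbreviated the data
print(np.array_equal(restored, img))  # whether the round trip preserved the image
```

This assumes all images share a known shape; if they differ, the shape of each image would need to be stored alongside its string.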
