String not converting to numeric


Hi, I have strings that I want to convert to numeric values so that I can compute the difference in area under the two curves in my graph. (I have nothing to add but I have to because StackOverflow says so)

The graph looks something like this: [image]

Code I have tried

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "TimeStamp": TimeStamp,
    "average_pv_conc": average_pv_conc.values,
    "average_pv_green": average_pv_green.values
})

df['SortingTime'] = pd.to_datetime(df['TimeStamp'], format='%H:%M')
sorted_indices = df['SortingTime'].argsort()
df = df.loc[sorted_indices].reset_index(drop=True)
df = df.drop(columns='SortingTime')

# Convert to numeric types
df['average_pv_conc'] = pd.to_numeric(df['average_pv_conc'], errors='coerce')
df['average_pv_green'] = pd.to_numeric(df['average_pv_green'], errors='coerce')

# Use '.' as the marker for the scatter plot
plt.scatter(df['TimeStamp'], df['average_pv_conc'], label='PV conc', marker='.', color='b', s=marker_size)
plt.scatter(df['TimeStamp'], df['average_pv_green'], label = 'PV green', marker='.', color='c', s=marker_size)

# Adding labels and title
plt.xlabel('TimeStamp')
plt.ylabel('Values')
plt.title('Plot of Column1 and Column2 against TimeStamp')

# Adding a legend
plt.legend()

# Display the plot
plt.show()

# Calculate the area under the curves using the trapezoidal rule
area_column1 = np.trapz(df['average_pv_conc'], df['TimeStamp'])
area_column2 = np.trapz(df['average_pv_green'], df['TimeStamp'])

# Find the difference in areas
area_difference = abs(area_column1 - area_column2)

print(f"Difference in areas under the curves: {area_difference}")

The error is: TypeError: unsupported operand type(s) for -: 'str' and 'str'
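
A note on where this error most likely comes from: the two `pd.to_numeric` calls only convert the y columns, but `np.trapz` is also given `df['TimeStamp']` as its x argument, and that column still holds strings such as '08:30'. Internally `np.trapz` takes the differences between consecutive x values, which is where the `'str' - 'str'` failure appears. Below is a minimal sketch of one way around it, assuming the timestamps are in '%H:%M' format as in the question. The sample values and the `minutes` column are hypothetical, made up only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the question's columns
# (values are strings on purpose, and the rows are out of order).
df = pd.DataFrame({
    "TimeStamp": ["01:00", "00:00", "01:30", "00:30"],
    "average_pv_conc": ["3.0", "1.0", "2.0", "2.0"],
    "average_pv_green": ["1.5", "0.5", "1.0", "1.0"],
})

# Turn the 'HH:MM' strings into a numeric x axis: minutes since midnight.
# 'minutes' is an illustrative column name, not part of the original code.
times = pd.to_datetime(df["TimeStamp"], format="%H:%M")
df["minutes"] = times.dt.hour * 60 + times.dt.minute
df = df.sort_values("minutes").reset_index(drop=True)

# Convert the y columns to numbers, as in the question.
for col in ("average_pv_conc", "average_pv_green"):
    df[col] = pd.to_numeric(df[col], errors="coerce")

# NumPy 2.0 renamed trapz to trapezoid; use whichever is available.
trapezoid = np.trapezoid if hasattr(np, "trapezoid") else np.trapz

# Integrate against the numeric x axis instead of the string column.
x = df["minutes"].to_numpy(dtype=float)
area_conc = trapezoid(df["average_pv_conc"].to_numpy(), x)
area_green = trapezoid(df["average_pv_green"].to_numpy(), x)

area_difference = abs(area_conc - area_green)
print(f"Difference in areas under the curves: {area_difference}")
```

With this approach the area is expressed in value times minutes. Note that `errors='coerce'` turns unparseable entries into NaN, which would make the integral NaN, so it may be worth checking the y columns with `isna()` before integrating.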
