Time-series trend analysis in Python


I have data like that shown in the table below. I want to find the point where the trend changes when a line is plotted with x = date_code and y = weight_kg, something like the attached image. After that point, the weight_kg values should show a consistent decline or increase.

This is what my data looks like:

   date_code  weight_kg
0        354     215.16
1        355     502.59
2        356     568.15
3        357     328.20
4        358     824.07

I'm trying to find the change point in the trend when the data is plotted. The original data contains many date_code and weight_kg values. I want to divide date_code into specific periods and identify the change point in the weight_kg trend for each period. You can access the data at the link below. I am using Python. The shared data is an example of one date_code period.

This is what the plot would look like: [image: trend plot]

Here's the link to my dataset

Best answer, by Musabbir Arrafi

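Here's one solution. It takes the change point to be the date where the trailing moving average of weight_kg peaks.

You can change the moving-average window as you like.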
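To see what the window controls, here is a minimal sketch using only the five sample rows from the question. It assumes a window of 3 (the sample is too short for the window of 5 used in the code below), and the names `trailing`, `forward`, and `summary` are illustrative only. A trailing average leaves the first `window - 1` rows empty, and a forward-looking average leaves the last `window - 1` rows empty.

```python
import pandas as pd

# The five sample rows from the question
df = pd.DataFrame({
    "date_code": [354, 355, 356, 357, 358],
    "weight_kg": [215.16, 502.59, 568.15, 328.20, 824.07],
})

window = 3  # assumption: smaller than 5 because the sample has only five rows

# Trailing average: mean of the current row and the rows before it
trailing = df["weight_kg"].rolling(window=window).mean()

# Forward-looking average: reverse the series, roll, then reverse back
forward = df["weight_kg"][::-1].rolling(window=window).mean()[::-1]

# Put both averages next to the raw values for comparison
summary = df.assign(trailing=trailing, forward=forward)
print(summary)
```

A larger window smooths more but reacts later to a change, so the detected change point tends to shift to a later date_code as the window grows.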

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Read data from Excel file
df = pd.read_excel("stckovflw.xlsx")

# Extract date_code and weight_kg columns
date_code = df['date_code']
weight_kg = df['weight_kg']

# Calculate differences between consecutive weight values
weight_diff = np.diff(weight_kg)

# Index of the largest absolute row-to-row change. This is informational only:
# the changepoint used below is taken from the moving average instead.
largest_jump_index = np.argmax(np.abs(weight_diff))


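# Moving-average window, in rows (e.g., 5); change it as needed
window = 5

# Trailing moving average: mean of the current row and the previous window - 1 rows
positive_trend_avg = weight_kg.rolling(window=window).mean()
# Forward-looking moving average: reverse the series, roll, then reverse back
negative_trend_avg = weight_kg[::-1].rolling(window=window).mean()[::-1]

# Overall moving average (identical to the trailing average; not used below)
overall_moving_avg = weight_kg.rolling(window=window).mean()

# Take the changepoint to be where the trailing moving average peaks.
# idxmax() returns an index label; it matches the .iloc position used below
# only because read_excel produces a default 0..n-1 index.
max_diff_index = positive_trend_avg.idxmax()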

# Slope of the straight line from each moving-average value at the changepoint
# to the last observed weight (change in weight per unit of date_code)
slope_positive = (weight_kg.iloc[-1] - positive_trend_avg.iloc[max_diff_index]) / (date_code.iloc[-1] - date_code.iloc[max_diff_index])
slope_negative = (weight_kg.iloc[-1] - negative_trend_avg.iloc[max_diff_index]) / (date_code.iloc[-1] - date_code.iloc[max_diff_index])

# Plot the data, the detected changepoint, both moving averages,
# and the lines from the changepoint to the last day's weight
plt.plot(date_code, weight_kg, marker='o', linestyle='-', color='b', label='Weight')
plt.plot(date_code, positive_trend_avg, color='orange', linestyle='--', label=f'{window}-Day Positive Trend')
plt.plot(date_code, negative_trend_avg, color='green', linestyle='--', label=f'{window}-Day Negative Trend')
plt.plot([date_code.iloc[max_diff_index], date_code.iloc[-1]], [positive_trend_avg.iloc[max_diff_index], weight_kg.iloc[-1]], color='purple', linestyle='-', label='Positive Trend Slope')
plt.plot([date_code.iloc[max_diff_index], date_code.iloc[-1]], [negative_trend_avg.iloc[max_diff_index], weight_kg.iloc[-1]], color='red', linestyle='-', label='Negative Trend Slope')
plt.axvline(x=date_code.iloc[max_diff_index], color='r', linestyle='--', label='Changepoint')
plt.xlabel('Date Code')
plt.ylabel('Weight (kg)')
plt.title('Weight Trend with Moving Averages, Slopes, and Detected Changepoint')
plt.legend()
plt.show()

# Print the detected changepoint
print("Detected changepoint:")
print("Date Code:", date_code.iloc[max_diff_index])
print("Weight (kg):", weight_kg.iloc[max_diff_index])

# Print the calculated slopes
print("Positive Trend Slope:", slope_positive)
print("Negative Trend Slope:", slope_negative)

Output:

Detected changepoint:
Date Code: 396
Weight (kg): 3155.09
Positive Trend Slope: -16.937824999999997
Negative Trend Slope: -12.673725000000001

[image: trend analysis output]
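The question also asks for one change point per date_code period, which the script above does not cover. The following is a rough sketch of how the same moving-average-peak heuristic could be applied per period. The function names `find_changepoint` and `changepoints_by_period`, the fixed-width `period_length` parameter, and the synthetic `demo` data are all assumptions made for illustration, not part of the original dataset or answer.

```python
import numpy as np
import pandas as pd


def find_changepoint(weights, window=5):
    """Return the row position where the trailing moving average peaks.

    Returns None when the series is shorter than the window, because the
    rolling mean is then entirely NaN.
    """
    avg = pd.Series(np.asarray(weights, dtype=float)).rolling(window=window).mean()
    if avg.isna().all():
        return None
    # The Series has a fresh 0..n-1 index, so the label equals the position
    return int(avg.idxmax())


def changepoints_by_period(df, period_length, window=5):
    """Split date_code into fixed-width periods and find one changepoint in each."""
    period = (df["date_code"] - df["date_code"].min()) // period_length
    rows = []
    for period_id, group in df.groupby(period):
        group = group.sort_values("date_code")
        pos = find_changepoint(group["weight_kg"], window)
        if pos is None:
            continue  # period too short for the chosen window
        rows.append({
            "period": int(period_id),
            "date_code": int(group["date_code"].iloc[pos]),
            "weight_kg": float(group["weight_kg"].iloc[pos]),
        })
    return pd.DataFrame(rows)


# Synthetic data (not the original dataset): two 20-row periods,
# each containing a rise followed by a fall
ramp = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100,
        90, 80, 70, 60, 50, 40, 30, 20, 10, 0]
demo = pd.DataFrame({
    "date_code": range(354, 394),
    "weight_kg": ramp + ramp[5:] + ramp[:5],
})

result = changepoints_by_period(demo, period_length=20, window=3)
print(result)
```

Because the heuristic looks for a peak, it suits data that rises and then falls. For data that falls and then rises, replacing `idxmax()` with `idxmin()` would be the corresponding choice. The detected point also lags the true turning point by roughly the window length.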