I've used Gekko to solve an MILP problem, where I am expected to select the optimal price for each product, in order to maximize my objective function (the total profit). However, I have had to slice the dataframe, as if I include the whole data set, I get a Max Equation Length Error. Is there any way to resolve this to input a larger data frame in to the solver? I was originally using the PuLP package, but the execution time of the code was taking unreasonably long for larger datasets. I've attached my code below.
# Preprocess data
price_matrix = []
profit_matrix = []
revenue_matrix = []
grevenue_matrix = []
num_prices_per_product = []
df = df_org[:200]
# Iterate over unique products
for product in df['Product'].unique():
# Filter data for the current product
product_data = df[df['Product'] == product]
# Extract unique price points and sort in descending order
price_points = np.sort(product_data['Sales Price'].unique())[::-1]
num_prices = len(price_points)
# Preallocate arrays
product_profit_matrix = np.zeros(num_prices)
product_revenue_matrix = np.zeros(num_prices)
product_grevenue_matrix = np.zeros(num_prices)
# Iterate over price points
for i, price in enumerate(price_points):
# Filter data for the current price point
price_data = product_data[product_data['Sales Price'] == price]
if not price_data.empty:
# Assign values to matrices
product_profit_matrix[i] = price_data['Profit Margin'].iloc[0]
product_revenue_matrix[i] = price_data['Revenue'].iloc[0]
product_grevenue_matrix[i] = price_data['Gross revenue'].iloc[0]
# Append matrices and other information
price_matrix.append(price_points)
profit_matrix.append(product_profit_matrix)
revenue_matrix.append(product_revenue_matrix)
grevenue_matrix.append(product_grevenue_matrix)
num_prices_per_product.append(num_prices)
start = time.time()
# Initialize gekko model
m = GEKKO(remote=False)
# Decision variables
x = {}
for i, product_name in enumerate(df['Product'].unique()):
for j in range(max(num_prices_per_product)):
x[(product_name, j)] = m.Var(value=0, lb=0, ub=1, integer=True)
# Objective function (maximize total profit)
total_profit = 0
for i, product_name in enumerate(df['Product'].unique()):
for j in range(num_prices_per_product[i]):
total_profit += profit_matrix[i][j] * x[(product_name, j)]
m.Maximize(total_profit)
# Constraint: Each product must have exactly one selected price
for i, product_name in enumerate(df['Product'].unique()):
m.Equation(sum(x[(product_name, j)] for j in range(num_prices_per_product[i])) == 1)
# Discount Constraint
revenue_difference = 0
for i, product_name in enumerate(df['Product'].unique()):
for j in range(num_prices_per_product[i]):
revenue_difference += (grevenue_matrix[i][j] - revenue_matrix[i][j]) * x[(product_name, j)]
discount_constraint = 0.13 # Set your desired maximum discount
discount_tolerance = 0.01
total_gross_revenue = 0
for i, product_name in enumerate(df['Product'].unique()):
for j in range(num_prices_per_product[i]):
total_gross_revenue += grevenue_matrix[i][j] * x[(product_name, j)]
m.Equation(revenue_difference <= (discount_constraint + discount_tolerance) * total_gross_revenue)
m.Equation(revenue_difference >= (discount_constraint - discount_tolerance) * total_gross_revenue)
# Profit Constraint
profit_constraint = 6000
profit_tolerance = 0.05
m.Equation(total_profit <= profit_constraint + (profit_tolerance * profit_constraint))
m.Equation(total_profit >= profit_constraint - (profit_tolerance * profit_constraint))
# Solve the optimization problem
m.solve(disp=True)
end = time.time()
print("Optimization Time:", end - start)
# Print the results
print("Status:", m.options.SOLVESTATUS)
print("Total Profit ($):", total_profit)
# Print the selected prices
results_df = pd.DataFrame(selected_prices, columns=["Product", "Selected Price Point", "Value (AED)"])
print(tabulate(results_df, headers='keys', tablefmt='fancy_grid', showindex=False))
I tried using Gekko and PuLP to solve a MILP problemn, but ran in to issues with using my algorithm on large datasets.
A simple problem demonstrates the issue and a potential solution. There are
nrandomly generated products, each with a randompriceandrevenue. The objective is to maximize profit by selecting the best 10 products. The solution selects the 10 products with the highest margins.When
n>=413, there is the error that a symbolic expression is over 15,000 characters.There are two changes that address the long expression problem.
m.sum()instead ofsum(). This uses thegekkosummation that takes longer to compile, but doesn't generate a single large expression that may violate the 15,000 character limit.Now the model can solve with
n=500or many more products.Here is the solution with
n=5000products.You can inspect the symbolic expressions in the temporary run folder
m._pathor open the folder withm.open_folder(). Open the text filegk0_model.apmto see the variables, equations, and objective function as an archive of what was compiled to byte-code.Large-scale problems with many binary decision variables may solve slowly. There are APOPT solver options that may help to speed up the solution with the MINLP solver. If the problem is MILP, there are good solvers such as
CPLEXandGurobithat are likely more efficient than using anMINLPsolver.