Modelling the profitability of credit cards with a Markov Decision Process

404 Views Asked by AnkitPandey At 27 July 2025 at 23:14

This question refers to a published paper on modelling the profitability of credit cards with a Markov decision process. I am trying to implement the same approach in Python using mdptoolbox, but I am not getting the output in the format I expect.

My states are combinations of a customer's current risk score and current credit limit. My actions are increases to the customer's limit.
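One way to make that combined state explicit is to give every (risk score, credit limit) pair its own flat index. This is only a minimal sketch: the sizes (4 risk bands, 4 limit levels) and the names `N_RISK`, `N_LIMIT`, `encode_state` and `decode_state` are assumptions for illustration, not part of the paper or of mdptoolbox.

```python
# Hypothetical sizes: 4 risk bands x 4 credit limit levels = 16 states.
N_RISK = 4
N_LIMIT = 4


def encode_state(risk, limit):
    """Map a (risk band, limit level) pair to a single flat state index."""
    return risk * N_LIMIT + limit


def decode_state(index):
    """Recover the (risk band, limit level) pair from a flat state index."""
    return divmod(index, N_LIMIT)


n_states = N_RISK * N_LIMIT
print(n_states)            # 16
print(encode_state(2, 1))  # 9
print(decode_state(9))     # (2, 1)
```

Under this encoding, each per-action transition matrix would be 16 x 16 rather than 4 x 4, with one row per (risk, limit) pair.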

I have prepared my transition probabilities for each state.
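A quick sanity check on those probabilities can be sketched as follows: each per-action matrix should be square, non-negative, and have rows that sum to 1. The helper name `is_row_stochastic` and the tolerance are assumptions for illustration; the sample matrix is the Limit1 block from the listing in this question.

```python
import math


def is_row_stochastic(matrix, tol=1e-9):
    """Return True if the matrix is square, non-negative, and each row sums to 1."""
    n = len(matrix)
    for row in matrix:
        if len(row) != n:
            return False
        if any(p < 0 for p in row):
            return False
        if not math.isclose(sum(row), 1.0, abs_tol=tol):
            return False
    return True


# Limit1 matrix from the question (rows are states s1..s4).
limit1 = [
    [0.2, 0.798, 0.001, 0.001],
    [0.001, 0.1, 0.2, 0.699],
    [0.099, 0.001, 0.8, 0.1],
    [0.001, 0.001, 0.898, 0.1],
]

print(is_row_stochastic(limit1))
```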

When I run the MDP code with the Python mdptoolbox, I get a policy vector that is insufficient for my purposes, because I need the optimal policy for each combination of risk score and credit limit. My current output tells me to move a particular risk band to a new limit, which is too generic.

import mdptoolbox
import numpy as np
# Shape (actions, states, states): transitions[a][s][s2] = P(s2 | s, a)
transitions = np.array([
#Limit1
    [
            [0.2, 0.798, 0.001, 0.001], #s1
            [0.001, 0.1, 0.2, 0.699], #s2
            [0.099, 0.001, 0.8, 0.1], #s3
            [0.001, 0.001, 0.898, 0.1] #s4
    ],
#Limit2
    [
            [0.2, 0.001, 0.001, 0.798], #s1
            [0.001, 0.2, 0.798, 0.001], #s2
            [0.001, 0.4, 0.1, 0.499], #s3
            [0.1, 0.2, 0.001, 0.699] #s4
    ],
#Limit3
    [
            [0.001, 0.1, 0.001, 0.898], #s1
            [0.798, 0.2, 0.001, 0.001], #s2
            [0.001, 0.001, 0.001, 0.997], #s3
            [0.001, 0.2, 0.5, 0.299] #s4
    ],
#Limit4
    [
            [0.2, 0.001, 0.001, 0.798], #s1
            [0.1, 0.001, 0.299, 0.6], #s2
            [0.001, 0.1, 0.001, 0.898], #s3
            [0.001, 0.001, 0.1, 0.898] #s4
    ]
])
# Shape (states, actions): rewards[s][a] = reward for taking action a in state s
rewards = np.array([
        [0, 0, 0.9, 0.1],
        [0, 0.8, 0, 0.2],
        [0.1, 0, 0.7, 0.2],
        [0, 0, 0, 1.0]
        ])
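# Value iteration with a discount factor of 0.995
vi = mdptoolbox.mdp.ValueIteration(transitions, rewards, 0.995)
vi.run()
print(vi.policy)  # one action index per state
print(vi.V)       # value of each state under that policy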

The final policy I get is:

vi.policy
Out[86]: (2, 1, 2, 3)

This only says to move a customer with risk score S1 to limit 2, and so on, which is too generic. What I expect is a policy matrix that tells me how much to increase the limit for each combination of credit risk score and current limit.
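To illustrate the shape of output described here: if the model had one state per (risk score, limit) pair, the flat policy vector could be reshaped into a risk-by-limit matrix. This is a sketch under assumptions, not a fix verified against the paper. The grid size, the helper `policy_as_matrix`, and the values in `flat_policy` are all made up for illustration; in practice `flat_policy` would be `vi.policy` from a 16-state model.

```python
N_RISK, N_LIMIT = 4, 4  # hypothetical grid: risk bands x current limit levels

# Placeholder policy of length N_RISK * N_LIMIT, standing in for vi.policy
# from a 16-state model. These values are invented for illustration only.
flat_policy = (1, 2, 3, 3,
               1, 2, 2, 3,
               0, 1, 2, 3,
               0, 1, 2, 3)


def policy_as_matrix(policy, n_risk, n_limit):
    """Reshape a flat policy into rows = risk band, columns = current limit."""
    if len(policy) != n_risk * n_limit:
        raise ValueError("policy length must equal n_risk * n_limit")
    return [list(policy[r * n_limit:(r + 1) * n_limit]) for r in range(n_risk)]


policy_matrix = policy_as_matrix(flat_policy, N_RISK, N_LIMIT)
for risk, row in enumerate(policy_matrix):
    print(f"risk band {risk}: {row}")
```

Each entry `policy_matrix[risk][limit]` is then the action index chosen for that combination. With NumPy, `np.asarray(flat_policy).reshape(N_RISK, N_LIMIT)` gives the same layout, provided the states were flattened as `risk * N_LIMIT + limit`.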
