Feature engineering for fraud detection

568 Views Asked by Diego At 17 August 2025 at 20:41

I'm doing some research into fraud detection for academic purposes. I'd like to know specifically about techniques for feature selection/engineering from a transactional dataset. In more detail: given a dataset of transactions (credit card, for example), what kinds of features are selected for use in the model, and how are they engineered?

All the papers I've come across focus on the model itself (SVM, NN, ...) and don't really touch on this subject.

Also, if anyone knows of public datasets that are not anonymized - that would also help.

Thanks


There is 1 best solution below

Hendouz On 14 May 2018 at 14:20

A good understanding of feature selection and ranking is a great asset for a data scientist or machine learning practitioner. It leads to better-performing models, a better understanding of the underlying structure and characteristics of the data, and better intuition about the algorithms that underlie many machine learning models.

In general, feature selection is used for two reasons:

1. Reducing the number of features, to reduce overfitting and improve the generalization of models.
2. Gaining a better understanding of the features and their relationship to the response variable.

Possible methods:

Univariate feature selection:

Pearson correlation
Mutual information and maximal information coefficient (MIC)
Distance correlation
Model-based ranking
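
As a rough illustration of univariate ranking, the sketch below scores each feature against the fraud label independently, using the absolute Pearson correlation and, for a discrete feature, plain mutual information (not MIC). The toy transactions, the feature names (`amount`, `hour_of_day`, `is_foreign`) and the helper functions are all hypothetical and only meant to show the mechanics; a real dataset would need far more rows and a proper train/test split.

```python
import math
from collections import Counter

# Hypothetical toy transactions: (amount, hour_of_day, is_foreign); label 1 = fraud.
rows = [
    (12.0, 14, 0), (850.0, 3, 1), (40.0, 10, 0), (990.0, 2, 1),
    (25.0, 16, 0), (700.0, 4, 0), (15.0, 12, 0), (920.0, 1, 1),
]
labels = [0, 1, 0, 1, 0, 1, 0, 1]
feature_names = ["amount", "hour_of_day", "is_foreign"]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


def rank_by_abs_pearson(rows, labels, names):
    """Return (name, |r|) pairs sorted from strongest to weakest association."""
    scores = []
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        scores.append((name, abs(pearson(column, labels))))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = rank_by_abs_pearson(rows, labels, feature_names)
mi_foreign = mutual_information([row[2] for row in rows], labels)

for name, score in ranking:
    print(f"{name}: |r| = {score:.3f}")
print(f"is_foreign: MI = {mi_foreign:.3f} nats")
```

Pearson correlation only captures linear association, which is one reason mutual information and distance correlation are listed alongside it: they can pick up non-linear relationships that a correlation score would miss.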

Tree-based methods:

Random forest feature importance (mean decrease impurity, mean decrease accuracy)
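
Mean decrease accuracy is commonly computed as a permutation importance: shuffle one feature column at a time and measure how much the model's accuracy drops. The sketch below shows that idea with the standard library only. Note the assumptions: the `predict` function is a hand-written stand-in rule rather than a fitted random forest, and the data and feature names are hypothetical. In practice the same loop would wrap a trained forest and be evaluated on held-out data.

```python
import random

# Hypothetical toy data: columns are (amount, hour_of_day); label 1 = fraud.
X = [
    [12.0, 14], [850.0, 3], [40.0, 10], [990.0, 2],
    [25.0, 16], [700.0, 4], [15.0, 12], [920.0, 1],
]
y = [0, 1, 0, 1, 0, 1, 0, 1]
feature_names = ["amount", "hour_of_day"]


def predict(row):
    """Stand-in for a fitted model (assumption): flag large amounts as fraud."""
    return 1 if row[0] > 500 else 0


def accuracy(X, y):
    """Fraction of rows whose prediction matches the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when one column at a time is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(X, y)
    importances = {}
    for j, name in enumerate(feature_names):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Copy the data with only column j replaced by its shuffled version.
            X_perm = [row[:j] + [value] + row[j + 1:] for row, value in zip(X, column)]
            drops.append(baseline - accuracy(X_perm, y))
        importances[name] = sum(drops) / n_repeats
    return importances


importances = permutation_importance(X, y)
for name, drop in importances.items():
    print(f"{name}: mean accuracy drop = {drop:.3f}")
```

Because the stand-in rule never looks at `hour_of_day`, shuffling that column cannot change any prediction and its importance is exactly zero, while shuffling `amount` breaks the rule and lowers accuracy. Mean decrease impurity, by contrast, is read off the tree splits themselves and needs no shuffling.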

Others:

Stability selection
Recursive feature elimination (RFE)
