Temporal and Behavioural Feature Attribution Analysis for E-commerce Transaction Fraud Detection: A Comparative Study of Explainability Methods
DOI:
https://doi.org/10.66372/JGER.v2i2.2
Keywords:
explainable artificial intelligence, feature attribution, transaction fraud detection, class imbalance
Abstract
Online transaction fraud imposes substantial financial losses on digital commerce ecosystems worldwide, with reported annual losses exceeding USD 48 billion as of 2024. Although gradient boosting classifiers achieve strong detection performance, the interpretability of their predictions remains inadequately addressed, particularly the relative contribution of temporal and behavioral features to fraud identification. This study presents a systematic comparative analysis of three mainstream explainability methods: SHapley Additive exPlanations (SHAP), Local Interpretable Model-agnostic Explanations (LIME), and Permutation Feature Importance (PFI). Each method is applied to XGBoost and LightGBM classifiers trained on two publicly available benchmark datasets: the ULB Credit Card Fraud Detection Dataset (284,807 transactions; fraud prevalence: 0.172%) and the IEEE-CIS Fraud Detection Dataset (590,540 transactions; fraud prevalence: 3.5%). A structured feature engineering pipeline constructs temporal interval features, cyclic time encodings, and rolling behavioral statistics. Class imbalance is addressed via the Synthetic Minority Over-sampling Technique (SMOTE). Detection performance is evaluated using the area under the precision-recall curve (PR-AUC) and the F1-score. Cross-method attribution consistency is quantified via Spearman rank correlation across top-10 feature rankings. Results indicate that engineered temporal and behavioral features yield consistent PR-AUC improvements of 4.3–5.6 percentage points, and that SHAP demonstrates substantially higher attribution stability than LIME and PFI across all experimental configurations.
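Two of the steps named above, cyclic time encoding and the Spearman-based consistency check, can be illustrated with a minimal standard-library sketch. It is not the study's implementation. The function names, the feature names, and the attribution values are hypothetical. The abstract does not say how the two top-10 lists are aligned, so the sketch assumes the correlation is taken over the union of both methods' top-k features, with a feature missing from one method scored as zero.

```python
import math


def cyclic_encode(hour, period=24.0):
    """Map a time-of-day value onto the unit circle so 23:00 and 00:00 end up close."""
    angle = 2.0 * math.pi * (hour % period) / period
    return math.sin(angle), math.cos(angle)


def average_ranks(values):
    """Rank values (1 = smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when one ranking is constant
    return cov / math.sqrt(vx * vy)


def top_k_consistency(imp_a, imp_b, k=10):
    """Assumed alignment: correlate scores over the union of both top-k sets."""
    top_a = sorted(imp_a, key=imp_a.get, reverse=True)[:k]
    top_b = sorted(imp_b, key=imp_b.get, reverse=True)[:k]
    features = sorted(set(top_a) | set(top_b))
    return spearman(
        [imp_a.get(f, 0.0) for f in features],
        [imp_b.get(f, 0.0) for f in features],
    )


# Hypothetical attribution magnitudes, for illustration only.
shap_scores = {"hour_sin": 0.30, "amount": 0.25, "txn_gap_s": 0.20, "roll_mean_24h": 0.10}
lime_scores = {"amount": 0.40, "hour_sin": 0.20, "txn_gap_s": 0.15, "roll_mean_24h": 0.05}
print(cyclic_encode(23), top_k_consistency(shap_scores, lime_scores))
```

The sine/cosine pair removes the artificial jump between 23:00 and 00:00 that a raw hour value would introduce. A value near 1 from `top_k_consistency` would indicate that two attribution methods order the leading features similarly, and a value near -1 that they order them in reverse.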

