MODEL LIVE · v2.3.1
LightGBM · PR-AUC 0.924
PR-AUC Score
0.924
↑ 0.038 vs baseline
Fraud Recall
91.2%
at threshold 0.42
Precision
84.7%
false positive rate 2.1%
F1 Score
0.878
evaluated on 590k transactions
📈 Precision-Recall Curve
🎯 Score Distribution by Class
Confusion Matrix (threshold = 0.42)
              | Pred: Legit  | Pred: Fraud
Actual: Legit | 536,812 (TN) | 11,521 (FP)
Actual: Fraud | 1,847 (FN)   | 19,043 (TP)
True Negatives: 536,812
Legitimate transactions correctly cleared
True Positives: 19,043
Fraud correctly caught
False Positives: 11,521
Legitimate txns flagged (2.1%)
False Negatives: 1,847
Missed fraud (8.8%)
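The rates quoted with the matrix can be re-derived from the four counts. The sketch below is a minimal check, not the project's evaluation code; the function name `confusion_metrics` is an assumption introduced here. With the counts as listed, recall (about 0.912), miss rate (about 0.088), and false positive rate (about 0.021) match the figures above, while precision works out to roughly 0.623 and F1 to roughly 0.740, so the headline precision of 84.7% and F1 of 0.878 may have been computed on a different evaluation slice.

```python
# Counts copied from the confusion matrix above (threshold = 0.42).
TN, FP, FN, TP = 536_812, 11_521, 1_847, 19_043


def confusion_metrics(tn: int, fp: int, fn: int, tp: int) -> dict[str, float]:
    """Derive standard binary-classification rates from raw counts."""
    precision = tp / (tp + fp)   # share of flagged transactions that are fraud
    recall = tp / (tp + fn)      # share of fraud that gets flagged
    return {
        "recall": recall,
        "miss_rate": fn / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
        "precision": precision,
        "f1": 2 * precision * recall / (precision + recall),
    }


metrics = confusion_metrics(TN, FP, FN, TP)
for name, value in metrics.items():
    print(f"{name:>20}: {value:.3f}")
```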
🧬 Top SHAP Features (Global)
Recent Transactions Stream
TXN ID | Amount | Merchant | Country | Hour | Velocity 24h | Score | Verdict
Best Model
LightGBM
PR-AUC: 0.924
Experiments Run
47
MLflow tracked
Training Time
4m 12s
590k samples · 8 cores
๐Ÿ† Model Comparison
Model
PR-AUC
Recall
Precision
F1
Train (s)
📊 Threshold Analysis
A threshold of 0.42 maximizes F1. Raising the threshold yields fewer false positives but misses more fraud.
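One common way to find an F1-maximizing threshold is to sweep the candidate thresholds returned by scikit-learn's `precision_recall_curve`. The sketch below assumes that approach; the helper name `best_f1_threshold` and the synthetic validation data are illustrative only and do not come from the project.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve


def best_f1_threshold(y_true, y_score):
    """Return (threshold, f1) for the candidate threshold with the highest F1."""
    precision, recall, thresholds = precision_recall_curve(y_true, y_score)
    # The final precision/recall pair has no matching threshold, so drop it.
    precision, recall = precision[:-1], recall[:-1]
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom,
                   out=np.zeros_like(denom), where=denom > 0)
    best = int(np.argmax(f1))
    return float(thresholds[best]), float(f1[best])


# Synthetic validation set with a ~3.5% positive rate (illustrative only).
rng = np.random.default_rng(0)
y_val = (rng.random(5000) < 0.035).astype(int)
scores = np.clip(rng.normal(0.15 + 0.55 * y_val, 0.15), 0.0, 1.0)

threshold, f1 = best_f1_threshold(y_val, scores)
print(f"best threshold={threshold:.2f}, F1={f1:.3f}")
```

Because the threshold is tuned on validation data, it should be applied unchanged to the held-out test set rather than re-tuned there.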
Cross-Validation Stability
5-fold time-series CV. Low variance across folds indicates stable performance over time; the chronological splits are what guard against leakage. Mean PR-AUC: 0.921 ± 0.008
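A chronological split can be sketched with scikit-learn's `TimeSeriesSplit`, which produces expanding training windows followed by a later validation block. Whether the project uses this exact splitter is an assumption; the helper name `chronological_folds` and the toy timestamps are hypothetical.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit


def chronological_folds(timestamps, n_splits=5):
    """Yield (train_idx, val_idx) pairs in which every training row
    is earlier than every validation row."""
    order = np.argsort(timestamps, kind="stable")  # row positions, oldest first
    for train_pos, val_pos in TimeSeriesSplit(n_splits=n_splits).split(order):
        yield order[train_pos], order[val_pos]


# Toy data: 1,000 rows stored in shuffled order with distinct timestamps.
timestamps = np.random.default_rng(0).permutation(1000)
folds = list(chronological_folds(timestamps))

for i, (train_idx, val_idx) in enumerate(folds, start=1):
    print(f"fold {i}: train={len(train_idx)} rows, val={len(val_idx)} rows, "
          f"train ends at t={timestamps[train_idx].max()}, "
          f"val starts at t={timestamps[val_idx].min()}")
```

Sorting before splitting matters: `TimeSeriesSplit` splits by row position, so rows that are not already in time order would otherwise leak future transactions into training.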
๐Ÿ—๏ธ Feature Engineering
STEP 01 ยท RAW FEATURES
Base Transaction Fields
Amount, merchant, timestamp, location, card metadata
STEP 02 ยท TEMPORAL
Time-Based Features
Hour, day-of-week, is_weekend, time_since_last_txn, is_night
STEP 03 ยท VELOCITY
Rolling Window Aggregates
txn_count_1h, txn_count_24h, total_spend_7d, unique_merchants_24h
STEP 04 ยท BEHAVIORAL
User Baseline Deviation
amount_vs_avg_ratio, new_merchant_flag, unusual_country, amount_z_score
STEP 05 ยท ENCODING
Categorical Encoding
Target encoding for merchant_category, frequency encoding for country
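A few of the temporal, velocity, and behavioral features above can be sketched in pandas as follows. The input column names (`card_id`, `timestamp`, `amount`), the night-time definition (before 06:00 or from 22:00), and the function name `add_features` are assumptions made for illustration; the project's actual definitions may differ.

```python
import pandas as pd


def add_features(txns: pd.DataFrame) -> pd.DataFrame:
    """Add a subset of the temporal, velocity, and behavioral features."""
    df = txns.sort_values(["card_id", "timestamp"]).reset_index(drop=True)

    # STEP 02: time-based features
    df["hour"] = df["timestamp"].dt.hour
    df["day_of_week"] = df["timestamp"].dt.dayofweek      # Monday = 0
    df["is_weekend"] = df["day_of_week"] >= 5
    df["is_night"] = (df["hour"] < 6) | (df["hour"] >= 22)  # assumed definition
    df["time_since_last_txn"] = (
        df.groupby("card_id")["timestamp"].diff().dt.total_seconds()
    )

    # STEP 03: transactions per card in the trailing 24 hours, current one included.
    rolled = (
        df.set_index("timestamp").groupby("card_id")["amount"].rolling("24h").count()
    )
    # Rows are already sorted by card and time, so positions line up with df.
    df["txn_count_24h"] = rolled.to_numpy().astype(int)

    # STEP 04: amount relative to the card's average over *earlier* transactions only.
    prior_avg = df.groupby("card_id")["amount"].transform(
        lambda s: s.expanding().mean().shift(1)
    )
    df["amount_vs_avg_ratio"] = df["amount"] / prior_avg
    return df


txns = pd.DataFrame({
    "card_id": ["B", "A", "A", "A"],
    "timestamp": pd.to_datetime([
        "2024-01-06 02:00", "2024-01-01 23:30",
        "2024-01-01 10:00", "2024-01-03 09:00",
    ]),
    "amount": [50.0, 30.0, 10.0, 20.0],
})
features = add_features(txns)
print(features[["card_id", "timestamp", "txn_count_24h", "amount_vs_avg_ratio"]])
```

Shifting the expanding mean by one row keeps the current transaction out of its own baseline, which is the same leakage concern the chronological CV addresses.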
โš–๏ธ Class Imbalance Strategy
Dataset: 3.5% fraud rate (highly imbalanced)
โ–  Legitimate 96.5%โ–  Fraud 3.5%
APPROACH 01
Cost-Sensitive Learning
class_weight = {0: 1, 1: 28}, which penalizes a missed fraud 28x more heavily than a false positive
APPROACH 02
PR-AUC as Primary Metric
ROC-AUC can look deceptively high on imbalanced data because the large legitimate class dominates the false positive rate. PR-AUC focuses on the minority (fraud) class.
APPROACH 03
Threshold Calibration
The default threshold of 0.5 is rarely optimal on imbalanced data. The optimal threshold, 0.42, was found by maximizing F1 on the validation set.
APPROACH 04
Time-Based CV Splits
No random splits. 5-fold chronological CV prevents leakage from future to past.
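The weight of 28 in Approach 01 is consistent with the usual negative-to-positive ratio heuristic applied to the 3.5% fraud rate. The sketch below shows that arithmetic; treating the weight as a rounded class ratio is an assumption about how the value was chosen, and `positive_class_weight` is a hypothetical helper.

```python
def positive_class_weight(fraud_rate: float) -> float:
    """Ratio of legitimate to fraudulent transactions for a given fraud rate."""
    return (1.0 - fraud_rate) / fraud_rate


ratio = positive_class_weight(0.035)          # 96.5 / 3.5
class_weight = {0: 1, 1: round(ratio)}

print(f"legit:fraud ratio = {ratio:.2f}")     # about 27.57
print(f"class_weight = {class_weight}")
```

For binary tasks, the LightGBM documentation recommends `scale_pos_weight` or `is_unbalance` over `class_weight`; the hyperparameter list on this page uses `scale_pos_weight = 28`, which expresses the same ratio.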
💻 LightGBM Hyperparameters (Best Run)
n_estimators
1200
learning_rate
0.05
num_leaves
127
max_depth
8
min_child_samples
50
subsample
0.8
colsample_bytree
0.7
reg_alpha
0.1
reg_lambda
0.2
scale_pos_weight
28
early_stopping
50
metric
average_precision
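The values above map onto LightGBM's scikit-learn wrapper roughly as sketched below. This is an assumed wiring, not the contents of `configs/lgbm_best.yaml`: the names `BEST_PARAMS` and `fit_best_model` are hypothetical, and early stopping is passed as a callback, which is how LightGBM 4.x expects it.

```python
# Hyperparameters copied from the list above.
BEST_PARAMS = {
    "n_estimators": 1200,
    "learning_rate": 0.05,
    "num_leaves": 127,
    "max_depth": 8,
    "min_child_samples": 50,
    # Note: in LightGBM, subsample only takes effect when subsample_freq > 0.
    # The list above does not give subsample_freq, so it is left at its default.
    "subsample": 0.8,
    "colsample_bytree": 0.7,
    "reg_alpha": 0.1,
    "reg_lambda": 0.2,
    "scale_pos_weight": 28,
}
EARLY_STOPPING_ROUNDS = 50
EVAL_METRIC = "average_precision"


def fit_best_model(X_train, y_train, X_val, y_val):
    """Fit an LGBMClassifier with early stopping on a validation set."""
    # Imported lazily so the config can be inspected without LightGBM installed.
    import lightgbm as lgb

    model = lgb.LGBMClassifier(**BEST_PARAMS)
    model.fit(
        X_train,
        y_train,
        eval_set=[(X_val, y_val)],
        eval_metric=EVAL_METRIC,
        callbacks=[lgb.early_stopping(stopping_rounds=EARLY_STOPPING_ROUNDS)],
    )
    return model
```

With `max_depth = 8` a tree can hold at most 2^8 = 256 leaves, so `num_leaves = 127` is the binding limit on tree size here.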
📦 Tech Stack
Python 3.11 · LightGBM 4.x · scikit-learn · SHAP · pandas · numpy · FastAPI · MLflow · Docker · pytest · evidently
$ pip install lightgbm shap scikit-learn pandas fastapi uvicorn mlflow evidently
$ python src/train.py --config configs/lgbm_best.yaml
$ mlflow ui  # experiment tracking
$ uvicorn src.api:app --reload --port 8000
$ docker compose up --build