FraudGuard — ML Fraud Detection System

PR-AUC Score: 0.924 (↑ 0.038 vs baseline)
Fraud Recall: 91.2% (at threshold 0.42)
Precision: 84.7% (false positive rate 2.1%)
F1 Score: 0.878 (evaluated on 590k transactions)

📈 Precision-Recall Curve

🎯 Score Distribution by Class

🔍 Confusion Matrix (threshold = 0.42)

	Pred: Legit	Pred: Fraud
Actual: Legit	536,812	11,521
Actual: Fraud	1,847	19,043

True Negatives: 536,812 (legitimate transactions correctly cleared)
True Positives: 19,043 (fraud correctly caught)
False Positives: 11,521 (legitimate transactions flagged, 2.1% of all legitimate)
False Negatives: 1,847 (missed fraud, 8.8% of all fraud)
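The rates quoted for the confusion matrix follow directly from its four cell counts. The sketch below recomputes them in plain Python; the variable names are illustrative and not taken from the project. Recall (91.2%), the miss rate (8.8%), and the false positive rate (2.1%) reproduce the figures shown. Precision computed from these same cells comes to roughly 62%, below the 84.7% headline value, so the headline precision and F1 may come from a different evaluation set or setting than this matrix.

```python
# Cell counts copied from the confusion matrix above (threshold = 0.42).
tn, fp = 536_812, 11_521   # actual legitimate: cleared / flagged
fn, tp = 1_847, 19_043     # actual fraud: missed / caught

recall = tp / (tp + fn)                # share of fraud that was caught
miss_rate = fn / (tp + fn)             # share of fraud that was missed
false_positive_rate = fp / (fp + tn)   # share of legitimate txns flagged
precision = tp / (tp + fp)             # share of flagged txns that are fraud
f1 = 2 * precision * recall / (precision + recall)

for name, value in [
    ("recall", recall),
    ("miss rate", miss_rate),
    ("false positive rate", false_positive_rate),
    ("precision", precision),
    ("F1", f1),
]:
    print(f"{name:>20}: {value:.3f}")
```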

🧬 Top SHAP Features (Global)

Recent Transactions Stream

TXN ID	Amount	Merchant	Country	Hour	Velocity 24h	Score	Verdict

⚡ Score a Transaction

Transaction Amount ($)

Merchant Category

Hour of Day (0–23)

Day of Week (0=Mon)

Txns Last 1h (velocity)

Txns Last 24h (velocity)

Avg Spend This Merchant ($)

Distance from Home (km)

Card Age (months)

Foreign Transaction

New Device

PIN Used

🎯 Fraud Score

Enter transaction details and click Score

Best Model

LightGBM

PR-AUC: 0.924

Experiments Run

MLflow tracked

Training Time

4m 12s

590k samples · 8 cores

🏆 Model Comparison

Model	PR-AUC	Recall	Precision	F1	Train (s)

📊 Threshold Analysis

A threshold of 0.42 maximizes F1. Raising the threshold produces fewer false positives but misses more fraud; lowering it catches more fraud at the cost of more false positives.
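A minimal sketch of how an F1-maximizing threshold could be found by sweeping a grid over validation scores. It uses only NumPy; the helper names, the grid, and the synthetic scores are assumptions for illustration and are not the project's actual code or data.

```python
import numpy as np

def f1_at_threshold(y_true, scores, threshold):
    """F1 for the rule 'flag as fraud when score >= threshold'."""
    pred = scores >= threshold
    tp = np.sum(pred & (y_true == 1))
    fp = np.sum(pred & (y_true == 0))
    fn = np.sum(~pred & (y_true == 1))
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0

def best_f1_threshold(y_true, scores, grid=None):
    """Return (threshold, f1) for the grid point with the highest F1."""
    if grid is None:
        grid = np.linspace(0.01, 0.99, 99)
    f1s = np.array([f1_at_threshold(y_true, scores, t) for t in grid])
    best = int(np.argmax(f1s))
    return float(grid[best]), float(f1s[best])

# Synthetic validation set with ~3.5% fraud (illustrative only).
rng = np.random.default_rng(0)
y_val = (rng.random(20_000) < 0.035).astype(int)
scores = np.where(
    y_val == 1,
    rng.normal(0.70, 0.20, y_val.size),   # fraud tends to score high
    rng.normal(0.15, 0.12, y_val.size),   # legitimate tends to score low
).clip(0.0, 1.0)

threshold, f1 = best_f1_threshold(y_val, scores)
print(f"best threshold = {threshold:.2f}, F1 = {f1:.3f}")
```

The threshold should be chosen on validation data only and then applied unchanged to the test set, otherwise the reported F1 is optimistic.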

🔁 Cross-Validation Stability

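5-fold time-series CV. Mean PR-AUC: 0.921 ± 0.008. Low variance across folds indicates stable performance over time; it does not by itself rule out data leakage, which the chronological splits are designed to prevent.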
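A minimal sketch of chronological splitting, assuming scikit-learn's TimeSeriesSplit. The source says only "5-fold time-series CV", so the exact splitter and the toy data here are assumptions. Each fold trains on an earlier window and validates on the window that follows it.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Stand-in for transactions already sorted by timestamp (illustrative only).
n_samples = 1_000
timestamps = np.arange(n_samples)
X = timestamps.reshape(-1, 1)

splitter = TimeSeriesSplit(n_splits=5)
folds = list(splitter.split(X))

for fold, (train_idx, val_idx) in enumerate(folds, start=1):
    # Every training row precedes every validation row in time.
    assert timestamps[train_idx].max() < timestamps[val_idx].min()
    print(
        f"fold {fold}: train rows 0..{train_idx.max()}, "
        f"validate rows {val_idx.min()}..{val_idx.max()}"
    )
```

Per-fold PR-AUC values from a loop like this are what a mean and standard deviation such as 0.921 ± 0.008 would summarize.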

🏗️ Feature Engineering

STEP 01 · RAW FEATURES

Base Transaction Fields

Amount, merchant, timestamp, location, card metadata

STEP 02 · TEMPORAL

Time-Based Features

Hour, day-of-week, is_weekend, time_since_last_txn, is_night

STEP 03 · VELOCITY

Rolling Window Aggregates

txn_count_1h, txn_count_24h, total_spend_7d, unique_merchants_24h

STEP 04 · BEHAVIORAL

User Baseline Deviation

amount_vs_avg_ratio, new_merchant_flag, unusual_country, amount_z_score

STEP 05 · ENCODING

Categorical Encoding

Target encoding for merchant_category, frequency encoding for country
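The temporal, velocity, and behavioral steps can be sketched with pandas as follows. This is an illustration under stated assumptions, not the project's pipeline: the column names (card_id, timestamp, amount), the night definition (00:00 to 05:59), the helper rolling_txn_count, and the exact formula for amount_vs_avg_ratio are all hypothetical. Each feature uses only the current and earlier transactions of the same card, so nothing from the future leaks in.

```python
import pandas as pd

# Toy transactions, sorted chronologically (illustrative data only).
txns = pd.DataFrame({
    "card_id": ["A", "A", "A", "B", "A"],
    "timestamp": pd.to_datetime([
        "2024-01-06 01:10", "2024-01-06 01:40", "2024-01-06 02:30",
        "2024-01-06 09:00", "2024-01-07 01:00",
    ]),
    "amount": [20.0, 35.0, 500.0, 12.5, 40.0],
}).sort_values("timestamp").reset_index(drop=True)

# STEP 02: temporal features
txns["hour"] = txns["timestamp"].dt.hour
txns["day_of_week"] = txns["timestamp"].dt.dayofweek      # 0 = Monday
txns["is_weekend"] = txns["day_of_week"] >= 5
txns["is_night"] = txns["hour"].between(0, 5)             # assumed definition
txns["time_since_last_txn"] = (
    txns.groupby("card_id")["timestamp"].diff().dt.total_seconds()
)

# STEP 03: velocity features (count includes the current transaction)
def rolling_txn_count(frame, window):
    """Per-card count of transactions in the trailing time window."""
    parts = []
    for _, group in frame.groupby("card_id"):
        counts = group.set_index("timestamp")["amount"].rolling(window).count()
        parts.append(pd.Series(counts.to_numpy(), index=group.index))
    return pd.concat(parts).sort_index()

txns["txn_count_1h"] = rolling_txn_count(txns, "60min")
txns["txn_count_24h"] = rolling_txn_count(txns, "1D")

# STEP 04: behavioral feature, amount relative to the card's prior average
prior_mean = txns.groupby("card_id")["amount"].transform(
    lambda s: s.shift().expanding().mean()   # shift() excludes the current txn
)
txns["amount_vs_avg_ratio"] = txns["amount"] / prior_mean

print(txns[["card_id", "timestamp", "txn_count_1h",
            "txn_count_24h", "amount_vs_avg_ratio"]])
```

The first transaction of each card has no history, so time_since_last_txn and amount_vs_avg_ratio are NaN there; LightGBM can consume missing values directly, though an explicit fill value is also a reasonable choice.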

⚖️ Class Imbalance Strategy

Dataset: 3.5% fraud rate (highly imbalanced)

■ Legitimate 96.5%   ■ Fraud 3.5%

APPROACH 01

Cost-Sensitive Learning

class_weight = {0: 1, 1: 28}: errors on fraud cases cost 28x more than errors on legitimate ones, so missed fraud is penalized far more heavily than false positives

APPROACH 02

PR-AUC as Primary Metric

ROC-AUC can look optimistic on imbalanced data because the large number of true negatives keeps the false positive rate low. PR-AUC focuses on the minority (fraud) class.

APPROACH 03

Threshold Calibration

The default 0.5 threshold is not optimal for this model. The optimal threshold of 0.42 was found by maximizing F1 on the validation set.

APPROACH 04

Time-Based CV Splits

No random splits. 5-fold chronological CV prevents leakage from future to past.
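A small worked example for the cost-sensitive approach above. The weight of 28 is close to the 96.5 : 3.5 class ratio (about 27.6); the source does not say it was derived that way, so treat that link as an assumption. The sketch also reads the class weight as a per-sample multiplier on log loss, which is one common interpretation; weighted_log_loss is a hypothetical helper, not a library function.

```python
import math

fraud_rate = 0.035                                  # from the dataset summary
balanced_weight = (1 - fraud_rate) / fraud_rate     # legit-to-fraud ratio
class_weight = {0: 1, 1: 28}                        # as listed under Approach 01

def weighted_log_loss(y_true, p_fraud, weights):
    """Log loss for one sample, scaled by the weight of its true class."""
    p_correct = p_fraud if y_true == 1 else 1.0 - p_fraud
    return -weights[y_true] * math.log(p_correct)

# Two equally confident mistakes: each gives the true class probability 0.1.
missed_fraud = weighted_log_loss(1, 0.1, class_weight)   # fraud scored 0.1
false_alarm = weighted_log_loss(0, 0.9, class_weight)    # legit scored 0.9

print(f"balanced weight  = {balanced_weight:.2f}")
print(f"missed fraud loss / false alarm loss = {missed_fraud / false_alarm:.1f}")
```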

💻 LightGBM Hyperparameters (Best Run)

n_estimators	1200
learning_rate	0.05
num_leaves	127
max_depth
min_child_samples
subsample	0.8
colsample_bytree	0.7
reg_alpha	0.1
reg_lambda	0.2
scale_pos_weight
early_stopping
metric	average_precision

📦 Tech Stack

Python 3.11 · LightGBM 4.x · scikit-learn · SHAP · pandas · numpy · FastAPI · MLflow · Docker · pytest · evidently

$ pip install lightgbm shap scikit-learn pandas fastapi uvicorn mlflow evidently
$ python src/train.py --config configs/lgbm_best.yaml
$ mlflow ui  # experiment tracking
$ uvicorn src.api:app --reload --port 8000
$ docker compose up --build