Unified Real-Time Anomaly Detection Across Retail Fraud and Network Intrusion Streams Using Dependency-Aware Feature Extraction
Abstract
Real-time monitoring requires detecting rare, high-impact anomalies across heterogeneous streams such as retail transactions and network traffic, under severe class imbalance and strict latency budgets. We propose a unified, domain-aware anomaly detection pipeline that maps both domains to a common event schema with domain masking, and fulfills two goals: (i) efficient feature extraction that captures temporal and contextual dependencies and (ii) real-time deployability. Temporal features are computed in a past-only manner per entity (time since last event and capped recent-activity counts), while contextual typicality is encoded via a train-derived entity-frequency feature mapped to validation/test without leakage. Using time-aware splits, we train gradient-boosted decision trees (LightGBM; XGBoost for comparison) and evaluate AUROC/AUPRC with validation-selected operating thresholds. On the unified test stream, the full LightGBM configuration (base+temporal+context) achieves AUROC = 0.9546 and AUPRC = 0.9042, improving over base-only (AUPRC = 0.8366) and temporal-only (AUPRC = 0.8925). Additional baselines (Logistic Regression, Isolation Forest, Random Forest, and an LSTM sequence model with seq_len = 10) confirm the competitiveness of the proposed approach, with LightGBM remaining best overall. Micro-batched inference benchmarking demonstrates operational feasibility, sustaining 55k–62k events/s with p99 latency < 0.026 ms/event. These results show that dependency-aware feature extraction combined with efficient tree ensembles enables accurate and practical unified detection for retail and network monitoring.
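The two feature families described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 60-second window, the cap of 5, the -1.0 sentinel for a first-seen entity, and the default of 0.0 for entities absent from training are all assumptions chosen for illustration, since the abstract does not specify them. The sketch assumes events arrive sorted by timestamp and computes each feature from strictly earlier events of the same entity, while the frequency table is fitted on training data only and then looked up for validation/test.

```python
from collections import Counter, defaultdict, deque


def past_only_temporal_features(events, window=60.0, cap=5):
    """Compute (time_since_last, capped_recent_count) for each event.

    `events` is an iterable of (entity_id, timestamp) pairs sorted by
    timestamp. Only events that appear earlier in the stream for the same
    entity contribute, so the current event never sees itself or the future.
    `window`, `cap`, and the -1.0 sentinel are illustrative assumptions.
    """
    last_seen = {}                 # entity -> timestamp of its previous event
    recent = defaultdict(deque)    # entity -> timestamps inside the window
    features = []
    for entity, ts in events:
        prev = last_seen.get(entity)
        since_last = ts - prev if prev is not None else -1.0
        q = recent[entity]
        # Evict timestamps that have fallen out of the look-back window.
        while q and ts - q[0] > window:
            q.popleft()
        features.append((since_last, min(len(q), cap)))
        # Update state only after the features for this event are emitted.
        q.append(ts)
        last_seen[entity] = ts
    return features


def fit_entity_frequency(train_entities):
    """Relative frequency of each entity, estimated on training data only."""
    counts = Counter(train_entities)
    total = sum(counts.values())
    return {entity: count / total for entity, count in counts.items()}


def apply_entity_frequency(freq, entities, default=0.0):
    """Look up train-derived frequencies; unseen entities get `default`."""
    return [freq.get(entity, default) for entity in entities]


# Hypothetical toy stream: (entity_id, timestamp in seconds).
train_events = [("a", 0.0), ("b", 5.0), ("a", 10.0), ("a", 20.0)]
temporal = past_only_temporal_features(train_events, window=15.0, cap=5)

freq_table = fit_entity_frequency(entity for entity, _ in train_events)
test_context = apply_entity_frequency(freq_table, ["a", "c"])
```

Because per-entity state is updated only after an event's features are emitted, the same function can run over a live stream without revisiting past events, which is the property that makes this style of feature compatible with a latency budget.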