Unified Real-Time Anomaly Detection Across Retail Fraud and Network Intrusion Streams Using Dependency-Aware Feature Extraction

Abstract

Real-time monitoring requires detecting rare, high-impact anomalies across heterogeneous streams such as retail transactions and network traffic, under severe class imbalance and strict latency budgets. We propose a unified, domain-aware anomaly detection pipeline that maps both domains to a common event schema with domain masking, and fulfills two goals: (i) efficient feature extraction that captures temporal and contextual dependencies and (ii) real-time deployability. Temporal features are computed in a past-only manner per entity (time since last event and capped recent-activity counts), while contextual typicality is encoded via a train-derived entity-frequency feature mapped to validation/test without leakage. Using time-aware splits, we train gradient-boosted decision trees (LightGBM; XGBoost for comparison) and evaluate AUROC/AUPRC with validation-selected operating thresholds. On the unified test stream, the full LightGBM configuration (base+temporal+context) achieves AUROC = 0.9546 and AUPRC = 0.9042, improving over base-only (AUPRC = 0.8366) and temporal-only (AUPRC = 0.8925). Additional baselines (Logistic Regression, Isolation Forest, Random Forest, and an LSTM sequence model with seq_len = 10) confirm the competitiveness of the proposed approach, with LightGBM remaining best overall. Micro-batched inference benchmarking demonstrates operational feasibility, sustaining 55k–62k events/s with p99 latency < 0.026 ms/event. These results show that dependency-aware feature extraction combined with efficient tree ensembles enables accurate and practical unified detection for retail and network monitoring.
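The two dependency-aware feature families described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the 60-second window, the cap value, the -1.0 sentinel for an entity's first event, and the default of 0.0 for entities unseen in training are all assumptions chosen for illustration. The sketch only shows the two properties the abstract emphasizes: temporal features use strictly earlier events for the same entity, and the entity-frequency table is fitted on the training split and then looked up for validation/test.

```python
from collections import Counter, defaultdict, deque


def temporal_features(events, window=60.0, cap=5):
    """Past-only temporal features per entity.

    events: iterable of (entity, timestamp) pairs, sorted by timestamp.
    Returns one (time_since_last, capped_recent_count) tuple per event.
    window and cap are illustrative assumptions, not values from the paper.
    """
    last_seen = {}                 # entity -> timestamp of its previous event
    recent = defaultdict(deque)    # entity -> timestamps still inside the window
    out = []
    for entity, ts in events:
        prev = last_seen.get(entity)
        # Assumed sentinel: -1.0 marks "no earlier event for this entity".
        since_last = ts - prev if prev is not None else -1.0
        q = recent[entity]
        # Drop earlier events that have fallen out of the look-back window.
        while q and ts - q[0] > window:
            q.popleft()
        # Count only earlier events; the current one is appended afterwards.
        out.append((since_last, min(len(q), cap)))
        q.append(ts)
        last_seen[entity] = ts
    return out


def fit_entity_frequency(train_entities):
    """Relative frequency of each entity, computed on the training split only."""
    counts = Counter(train_entities)
    total = len(train_entities)
    return {entity: count / total for entity, count in counts.items()}


def map_entity_frequency(freq, entities, default=0.0):
    """Look up train-derived frequencies; unseen entities get an assumed default."""
    return [freq.get(entity, default) for entity in entities]


events = [("a", 0.0), ("b", 5.0), ("a", 10.0), ("a", 20.0), ("a", 100.0)]
temporal = temporal_features(events, window=60.0, cap=2)

freq = fit_entity_frequency(["a", "a", "b", "c"])
test_freq = map_entity_frequency(freq, ["a", "z"])
```

Because each feature is computed before the current event is added to the per-entity state, the same loop can run unchanged on a live stream. The frequency table is frozen after fitting, so validation and test events never contribute to their own typicality score.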
