Modeling Publisher Behavior Through Conditional Feature Patterns for Fraudulent Activity Detection
Abstract
Detecting click fraud is challenging because user-click datasets lack labeled ground truth, which limits the ability to classify publishers as genuine or fraudulent. Existing feature sets, while helpful, lack the granularity to capture evolving behavioral patterns. This work proposes 16 novel composite features obtained by statistically aggregating clickstream attributes (mean, variance, skewness, and standard deviation) over finer time intervals. Experiments on the FDMA2012 dataset were conducted in three stages: using baseline features only, combining baseline and proposed features, and applying Kruskal–Wallis tests and ranking techniques to assess feature relevance. Evaluated via 10-fold cross-validation, the enhanced feature set achieved an average precision of 86.12%, showing improved discrimination between fraudulent and legitimate publishers.
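
As a rough illustration of the aggregation idea in the abstract, the sketch below counts each publisher's clicks in fixed-width time intervals and summarizes the resulting series with the four named statistics. The abstract does not say which clickstream attributes are aggregated, how wide the intervals are, or how the 16 features are composed, so the click-count attribute, the 15-second interval, the `(publisher_id, timestamp)` record layout, and all function names here are assumptions made for the example rather than the authors' method.

```python
from collections import defaultdict
from statistics import fmean, pvariance, pstdev


def interval_counts(clicks, interval_seconds, n_intervals):
    """Count clicks per publisher in each fixed-width time interval.

    `clicks` is an iterable of (publisher_id, timestamp_in_seconds) pairs;
    this record layout is hypothetical, not the FDMA2012 schema.
    """
    counts = defaultdict(lambda: [0] * n_intervals)
    for publisher_id, timestamp in clicks:
        idx = int(timestamp // interval_seconds)
        if 0 <= idx < n_intervals:  # ignore clicks outside the window
            counts[publisher_id][idx] += 1
    return dict(counts)


def skewness(values):
    """Population skewness; defined as 0.0 when the series has no spread."""
    mu = fmean(values)
    sd = pstdev(values)
    if sd == 0:
        return 0.0
    return fmean([((v - mu) / sd) ** 3 for v in values])


def composite_features(series):
    """Summarize one publisher's per-interval series with four statistics."""
    return {
        "mean": fmean(series),
        "variance": pvariance(series),
        "std": pstdev(series),
        "skewness": skewness(series),
    }


# Toy click log (assumed data): pub_a clicks in a burst, pub_b is spread out.
clicks = [
    ("pub_a", 5), ("pub_a", 12), ("pub_a", 14), ("pub_a", 50),
    ("pub_b", 3), ("pub_b", 33),
]
per_publisher = interval_counts(clicks, interval_seconds=15, n_intervals=4)
features = {pub: composite_features(s) for pub, s in per_publisher.items()}
```

In this toy log the bursty publisher ends up with a positively skewed, higher-variance series than the evenly spread one, which is the kind of behavioral difference such interval-level statistics are meant to expose.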