An Adaptive Attention-Based GRU Framework with Reinforcement Learning for Cold-Start Prediction and Mitigation in Serverless Computing

Abstract

Serverless computing, exemplified by AWS Lambda, suffers from cold-start latencies that impair performance under dynamic workloads. This study proposes an Adaptive Attention-GRU framework integrated with a Deep Deterministic Policy Gradient (DDPG) reinforcement learning agent to predict cold starts and preempt them by dynamically adjusting provisioned concurrency. The Adaptive Attention-GRU uses hierarchical attention and multi-scale feature encoding to capture short- and long-term patterns in Lambda metrics collected from AWS CloudWatch, while a dynamic thresholding mechanism triggers pre-warming via AWS Step Functions, with PyTorch Lightning and Ray RLlib integration. Evaluated on 90 days of real-world traces across 127 functions, the model achieves 93.63% accuracy, a ROC-AUC of 98.61%, and 92.46% recall, reducing cold starts by 94.7% with only a 23.1% cost increase over baseline scaling. Key innovations include adaptive instantaneous feature weighting, multi-scale temporal modeling, and cost-aware RL autoscaling, offering a validated tool for optimizing latency, cost, and resource efficiency in production serverless environments.
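
The abstract does not specify how the dynamic threshold is computed, so the following is only a minimal sketch of one plausible form, not the authors' method. It assumes the predictor emits a cold-start probability per time step and that the trigger threshold tracks the rolling mean plus a multiple of the rolling standard deviation of recent predictions, clamped to a fixed range. The class name `DynamicThreshold` and the parameters `window`, `k`, `floor`, and `ceil` are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


class DynamicThreshold:
    """Hypothetical pre-warm trigger whose threshold adapts to recent predictions."""

    def __init__(self, window=5, k=1.0, floor=0.3, ceil=0.9):
        self.history = deque(maxlen=window)  # most recent predicted probabilities
        self.k = k                           # assumed sensitivity multiplier
        self.floor = floor                   # assumed lower bound on the threshold
        self.ceil = ceil                     # assumed upper bound on the threshold

    def threshold(self):
        # With fewer than two observations, stay conservative and use the upper bound.
        if len(self.history) < 2:
            return self.ceil
        raw = mean(self.history) + self.k * pstdev(self.history)
        return min(self.ceil, max(self.floor, raw))

    def should_prewarm(self, prob):
        # Compare against the threshold derived from past values, then record prob.
        fire = prob >= self.threshold()
        self.history.append(prob)
        return fire


trigger = DynamicThreshold()
decisions = [trigger.should_prewarm(p) for p in [0.1, 0.1, 0.1, 0.8]]
```

In a deployment along the lines the abstract describes, a positive decision would start the pre-warming workflow (for example a Step Functions execution that raises provisioned concurrency); that wiring is omitted here because the source gives no details.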
