CP-LLM: Conformal Calibration for Time Series Interval Forecasting with Frozen Large Language Model

Abstract

Reliable prediction intervals are essential for deploying time-series forecasting systems in high-stakes domains, yet uncertainty estimates from frozen large language models (LLMs) often deviate from nominal confidence levels. This paper studies interval calibration for multi-step forecasting with frozen LLMs and proposes CP-LLM, a fine-tuning-free two-stage post-hoc framework. In Stage 1, base intervals are constructed from sampled LLM forecasts and augmented with three complementary uncertainty signals: sampling dispersion, temperature sensitivity, and serialization perturbation sensitivity. In Stage 2, base intervals are calibrated on a held-out calibration segment using either Conformalized Quantile Regression (CQR) for relatively stable regimes or Adaptive Conformal Inference (ACI) under temporal distribution shift. Experiments on five public univariate datasets with four API-accessed frozen LLMs show that CP-LLM substantially reduces coverage bias while maintaining competitive interval sharpness. Additional comparisons with traditional forecasting methods, together with ablation and parameter-sensitivity analyses, provide further evidence for the effectiveness of the proposed framework.
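To make the Stage 2 calibration concrete, the following sketch illustrates the two calibration mechanisms the abstract names: a CQR-style split-conformal adjustment of base quantile intervals (Romano et al., 2019) and the online miscoverage-level update of Adaptive Conformal Inference (Gibbs and Candès, 2021). This is an illustrative implementation of the generic methods, not the authors' code; the function names and arguments are placeholders, and the base intervals would come from the Stage 1 LLM sampling procedure described above.

```python
import numpy as np

def cqr_calibrate(cal_lo, cal_hi, cal_y, test_lo, test_hi, alpha=0.1):
    """CQR-style split-conformal calibration: widen (or narrow) base
    quantile intervals by the finite-sample-corrected quantile of the
    conformity scores computed on a held-out calibration segment."""
    # Conformity score: signed distance of each calibration point
    # outside its base interval (negative if strictly inside).
    scores = np.maximum(cal_lo - cal_y, cal_y - cal_hi)
    n = len(cal_y)
    # Finite-sample corrected quantile level, clipped to 1.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, level, method="higher")
    return test_lo - q, test_hi + q

def aci_update(alpha_t, covered, alpha=0.1, gamma=0.01):
    """One step of Adaptive Conformal Inference: move the working
    miscoverage level toward the target alpha, reacting to whether
    the most recent interval covered the realized value."""
    err = 0.0 if covered else 1.0
    return alpha_t + gamma * (alpha - err)
```

In a rolling evaluation, `cqr_calibrate` would be applied once per calibration window in stable regimes, while `aci_update` would be called at every time step under distribution shift, with the updated level `alpha_t` fed back into the interval construction.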
