Interpretability requires interaction and integration of complexity-theoretic and experimental efforts
Abstract
System opacity underlies many of the risks we currently worry about in AI and undermines many of its intended scientific applications. A central concern is therefore to understand the conditions under which we can provably, or otherwise reasonably, guarantee that interpretability methods meet scientific and societal needs. We argue that the interpretability field is at a critical juncture: a minimal foundation of theoretical and empirical results exists, but the field is not well positioned to seize the opportunities or meet the challenges ahead. Doing so will require an approach so far unexplored: interaction between complexity-theoretic and experimental efforts, and continual integration of their formal and empirical results. We present a comprehensive, actionable research strategy for this purpose, based on computational modeling and parameterized complexity analysis combined with algorithmic design and parametric experimentation. We illustrate its potential and feasibility with case studies of input and component attribution.
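As a concrete illustration of what "input attribution" involves (this sketch is not part of the paper and does not reproduce its case studies), the code below scores each input feature by occlusion: replace one feature with a baseline value and record how the model's output changes. The toy model, feature values, and baseline are hypothetical choices made only to keep the example self-contained and runnable.

```python
# Minimal occlusion-based input attribution (illustrative sketch, assumed setup).
import numpy as np


def toy_model(x: np.ndarray) -> float:
    """Stand-in for an opaque model: a fixed nonlinear function of the input."""
    w = np.array([0.5, -1.2, 2.0, 0.1])
    return float(np.tanh(w @ x))


def occlusion_attribution(model, x: np.ndarray, baseline: float = 0.0) -> np.ndarray:
    """Attribute the model output to each input feature by single-feature occlusion."""
    reference = model(x)
    scores = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        occluded = x.copy()
        occluded[i] = baseline                     # replace one feature with the baseline
        scores[i] = reference - model(occluded)    # output change serves as the attribution score
    return scores


if __name__ == "__main__":
    x = np.array([1.0, 0.5, -0.3, 2.0])
    print(occlusion_attribution(toy_model, x))
```

Even this simple scheme raises the kinds of questions the paper targets: whether such scores are faithful, and at what computational cost guarantees about them can be obtained.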