Persistence Landscapes Across Privacy Budgets for Explanation Methods under Differential Privacy Mechanisms

Abstract

Machine-learning credit scoring must be both auditable and privacy-preserving, yet post-hoc explainers may rely on sensitive records or privileged model access that privacy constraints restrict. We study how local explanations for a feed-forward neural-network credit-risk classifier on the Home Equity Line of Credit (HELOC) dataset change when the data or learning pipeline is sanitized with differential privacy (DP) via additive noise, differentially private stochastic gradient descent (DP-SGD), synthetic data generation, and differentially private principal component analysis (DP-PCA). While Local Interpretable Model-agnostic Explanations (LIME), SHapley Additive exPlanations (SHAP), and gradient-based attributions are widely used, their behavior under DP-induced noise remains poorly characterized. Topology-based comparisons using Mapper and persistence summaries can capture explanation structure beyond per-feature averages, but they raise sensitivity and estimation challenges. To close this gap, we treat per-instance attribution vectors as a point cloud, build Mapper graphs using the predicted probability as the lens, and convert them to persistence diagrams and persistence landscapes. We introduce a variance-reduced, generalized control-variate Monte Carlo (CVMC) estimator for mean landscapes and an adaptive epsilon grid that concentrates computation where stability changes most. Across 49 explainer-mechanism combinations, mean landscapes vary smoothly with the privacy budget and exhibit a small set of recurring motifs; in this setting, first-homology landscapes are consistently zero. These results suggest that privatized explanations can remain informative proxies for model behavior, enabling auditing without direct access to raw records and providing a quantitative tool for monitoring privacy-interpretability trade-offs and explanation drift in regulated deployments.
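To make the pipeline concrete, the sketch below shows one way to go from an attribution point cloud to a variance-reduced mean landscape in Python, assuming the gudhi library. It is illustrative only: it substitutes a Vietoris-Rips filtration for the paper's Mapper-based construction, uses synthetic attribution vectors, and applies a textbook scalar control variate (mean pairwise distance) rather than the generalized CVMC estimator, whose details the abstract does not specify. All parameter values are assumptions.

```python
# Minimal sketch: attribution point cloud -> H0 persistence landscapes ->
# control-variate Monte Carlo estimate of the mean landscape.
# Requires: numpy, gudhi (with scikit-learn for gudhi.representations).
import numpy as np
import gudhi
from gudhi.representations import Landscape

rng = np.random.default_rng(0)
# Stand-in for per-instance SHAP/LIME/gradient attribution vectors.
attributions = rng.normal(size=(500, 10))

def landscape_of(points, num_landscapes=3, resolution=100):
    """H0 persistence landscape of a point cloud via a Vietoris-Rips filtration."""
    st = gudhi.RipsComplex(points=points).create_simplex_tree(max_dimension=1)
    st.compute_persistence()
    diag = st.persistence_intervals_in_dimension(0)
    diag = diag[np.isfinite(diag[:, 1])]  # drop the single infinite H0 bar
    # Fixed sample_range so landscapes from different subsamples share a grid;
    # the range [0, 4] is chosen for this synthetic data, not a general default.
    return Landscape(num_landscapes=num_landscapes, resolution=resolution,
                     sample_range=[0.0, 4.0]).fit_transform([diag])[0]

def mean_pairwise_dist(points):
    """Cheap scalar control variate: mean pairwise distance of the subsample."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    return d[np.triu_indices(len(points), k=1)].mean()

def cv_mean_landscape(X, n_mc=30, subsample=100, n_cheap=500):
    """Monte Carlo mean landscape over subsampled clouds, with a classical
    control-variate correction applied coordinate-wise."""
    draw = lambda: rng.choice(len(X), size=subsample, replace=False)
    Y = np.stack([landscape_of(X[draw()]) for _ in range(n_mc)])        # expensive
    W = np.array([mean_pairwise_dist(X[draw()]) for _ in range(n_mc)])  # paired controls
    # E[W] approximated from many cheap draws that skip the persistence step.
    mu_W = np.mean([mean_pairwise_dist(X[draw()]) for _ in range(n_cheap)])
    Wc = W - W.mean()
    beta = (Wc @ (Y - Y.mean(axis=0))) / (Wc @ Wc)  # per-coordinate optimal coefficient
    return Y.mean(axis=0) - beta * (W.mean() - mu_W)

mean_landscape = cv_mean_landscape(attributions)
print(mean_landscape.shape)  # (num_landscapes * resolution,) = (300,)
```

Fixing the landscape sample range keeps all subsample landscapes on a common grid, so the averaging and the control-variate correction are consistent coordinate by coordinate.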
