Efficient Conceptual Knowledge Removal in Large Language Models: Methods and Evaluations
Abstract
The increasing use of deep neural networks has produced models that accumulate vast amounts of knowledge from their training data, often retaining outdated or biased information that must be selectively removed. Erasing specific conceptual knowledge from such models while maintaining overall performance, and without resorting to computationally expensive retraining, requires new techniques. This paper introduces a scalable framework for conceptual knowledge removal based on targeted weight modification and sparse fine-tuning, demonstrating how specific knowledge representations can be isolated and erased without significantly degrading the model's broader capabilities. The method achieves high precision in knowledge suppression by leveraging probing techniques and gradient-based optimization, ensuring minimal disruption to general task performance. Extensive experimental evaluations confirm the effectiveness of the proposed approach and highlight its application to scenarios where adaptive model refinement is essential for maintaining both accuracy and ethical integrity. Contributions include a flexible and efficient knowledge-erasure mechanism, applicable across architectures, that minimizes computational overhead while keeping the model responsive to changing knowledge requirements.
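To make the described pipeline concrete, the following is a minimal sketch of one plausible realization of "probing plus gradient-based sparse fine-tuning" for knowledge removal. It is not the paper's actual implementation: the toy model (ToyLM), the data tensors, the 5% sparsity level, and the retention-loss weight are all illustrative assumptions, and gradient-magnitude attribution stands in here for a full probing classifier when locating concept-relevant weights.

```python
# Sketch: locate concept-sensitive weights, then erase the concept via
# sparse gradient ascent while anchoring general behavior with a retain loss.
# ToyLM, the batches, and all hyperparameters are hypothetical placeholders.
import torch
import torch.nn as nn

torch.manual_seed(0)

class ToyLM(nn.Module):
    """Stand-in for a language model: embedding -> hidden layer -> vocab logits."""
    def __init__(self, vocab=100, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.hidden = nn.Linear(dim, dim)
        self.out = nn.Linear(dim, vocab)

    def forward(self, ids):
        h = torch.tanh(self.hidden(self.embed(ids)))
        return self.out(h)

model = ToyLM()
loss_fn = nn.CrossEntropyLoss()

# Toy data: pairs encoding the target concept, and a general batch
# whose behavior we want to preserve during erasure.
concept_x = torch.randint(0, 100, (16,)); concept_y = torch.randint(0, 100, (16,))
retain_x  = torch.randint(0, 100, (64,)); retain_y  = torch.randint(0, 100, (64,))

# --- Step 1: localize concept-relevant weights via gradient attribution ---
model.zero_grad()
loss_fn(model(concept_x), concept_y).backward()

masks = {}
for name, p in model.named_parameters():
    g = p.grad.abs()
    k = max(1, int(0.05 * g.numel()))       # assumed sparsity: top 5% of weights
    thresh = g.flatten().topk(k).values.min()
    masks[name] = (g >= thresh).float()     # binary mask restricting later updates

# --- Step 2: sparse fine-tuning with a combined erase/retain objective ---
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
for step in range(50):
    opt.zero_grad()
    # Negative concept loss drives erasure (gradient ascent on the concept);
    # the retain term anchors performance on general data.
    loss = -loss_fn(model(concept_x), concept_y) \
           + 1.0 * loss_fn(model(retain_x), retain_y)
    loss.backward()
    with torch.no_grad():
        for name, p in model.named_parameters():
            if p.grad is not None:
                p.grad.mul_(masks[name])    # update only the masked, concept-linked weights
    opt.step()
```

In practice the ascent term is usually capped or scheduled so the forget loss does not grow without bound, and the retention weight is tuned so that erasure does not bleed into general capabilities; both refinements are omitted here for brevity.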